This repository has been archived by the owner on Sep 27, 2019. It is now read-only.
[15721] Index Suggestion #1347
Open: sivaprasadsudhir wants to merge 182 commits into cmu-db:master from sivaprasadsudhir:auto_index
Changes from 160 commits
Commits
d18033d
added the files for cost evaluation
pbollimp 5fdadea
llvm for mac
vkonagar ec6c94b
Basic classes
sivaprasadsudhir 492b95f
added the configuration enumeration files
pbollimp 8410136
Add Whatif API
vkonagar 96eadf4
Add optimizer cost query func skeleton
vkonagar 9087931
Complete what if API implementation. Testing pending.
vkonagar 0908588
Ignore query planning
vkonagar 5e2cbff
Analyze tables was missing. Fixed it
vkonagar fcfe058
fix the query
vkonagar 04e49f8
add comments, fix some code style
vkonagar d62462b
Fix whatif API test
vkonagar 2e19c1c
run formatter
sivaprasadsudhir ac653aa
Add index selection module skeleton
vkonagar 4d44009
skeleton for admissible column parsing
vkonagar 371fd38
adding cost model classes
sivaprasadsudhir c23cc36
cleanup and reorganize the code
sivaprasadsudhir 4d694ec
Intermediate changes. Query parser not complete.
vkonagar a51fe84
Intermediate changes. Query parser not complete.
vkonagar d043128
removed cost model class
sivaprasadsudhir 32f9040
Add IndexObject Pool
vkonagar 324e430
Memoization support completed
sivaprasadsudhir 5978d32
Complete query parser
vkonagar a24ded7
Complete query parser
vkonagar 11bc159
multi column index, wip
sivaprasadsudhir e0cac79
Add tests for admissible indexes
vkonagar 83c1b44
Fix what if index and admissive indexes test
vkonagar 1e5925c
added outline for naive enumeration method
pbollimp 4b463dc
Fix get admissible indexes test
vkonagar 96a41b1
Fix get admissible indexes test
vkonagar 12a343a
Added the IndexConfiguration set difference
pbollimp e98461a
Minor BUg Fix
sivaprasadsudhir 1ec6f55
Split computing and getting const
sivaprasadsudhir d23d0dc
Fix compilation error and typos
vkonagar a94cac9
Finish Configuration Enumeration module
pbollimp 11adba0
Fix the main index selection algorithm
vkonagar 4c8dce7
Finish Merging
pbollimp 6f67e0c
Merge
vkonagar aa63a5f
cleanup
sivaprasadsudhir f8a8180
Restructure code
vkonagar b619333
More refactoring
vkonagar d01d018
added comments to index selection context
sivaprasadsudhir d9d0cfc
Added the comparator for the candidate index enumeration
pbollimp d984e89
Adding comments
pbollimp 11fdce2
Restructure generate candidate indexes
vkonagar afa1582
Fix merge
vkonagar 3178695
partial test for multi columnindex generation
sivaprasadsudhir 5f4a822
Add candidate index gen test
vkonagar fd2de46
Minor change to ComputeCost. Formatting and comments.
pbollimp 3db49a7
Add comments
vkonagar b7c4f9c
comments
sivaprasadsudhir 756ecb8
More formatting and comments.
pbollimp 0d336d0
more comments
vkonagar f58cf77
brief comments.
pbollimp 213a351
rename pl_assert to peloton_assert
sivaprasadsudhir e846956
Remove GetCost and rename ComputeCost to GetCost
pbollimp 85705dd
fix multicolumnindex generation
sivaprasadsudhir 920083a
minor fixes
sivaprasadsudhir 93b2214
Fix admissible index and candidate pruning tests
vkonagar e3b43d0
Fix unused variables
vkonagar c907ef3
Add more tests to WhatIfAPI and IndexSelection
vkonagar 342f6a3
Implement the suggestions mentioned in the code review
vkonagar c54f4e0
Uncomment the choose best plan call
vkonagar 39259fb
Fix tests
vkonagar f323ed9
Add support for multi-column index
chenboy 6330ab6
Fix conflicts after merge
chenboy b291f58
nit fixes
sivaprasadsudhir f4ce787
Fix what-if index tests
vkonagar c6915f7
Add more multi-column index sets in the test cases.
vkonagar 49b95df
Add testing utility class for index suggestion tests
vkonagar a6da36d
Add to cmake for the files in the previous commit
vkonagar 01c994e
Modify what-if tests to use the utility class
vkonagar e1dad43
Fix formatting
vkonagar 90e7d65
Code review fix
vkonagar 57c1c83
fix tests
sivaprasadsudhir 4b4e256
nit
sivaprasadsudhir 61786ae
Fix memory leaks and misc nit fixes
vkonagar fa1dbba
fixed the test temportarily for the index bug
sivaprasadsudhir 6bbaa94
Rename IndexObject to HypotheticalIndexObject
vkonagar 5591755
debugging the shared pointer issue
sivaprasadsudhir 5d0d2b8
Fix segfault. Some more Renames
vkonagar 28e818b
check the exact indexes
sivaprasadsudhir 8fd0bf4
Fix the tests to use the util
vkonagar 3f394f7
fixing the index selection
sivaprasadsudhir 8f1b897
Fix formatting
vkonagar 40576fe
Rebase and fix conflicts while rebasing
vkonagar 10843ca
latest tests
sivaprasadsudhir 3085a58
Better tests
sivaprasadsudhir 1e9b959
Add get workload support to the testing utility class.
vkonagar 55354b9
Fix stray
vkonagar 96f500b
Comment out the debug code in optimizer
vkonagar eb3da24
Add index suggestion task skeleton
vkonagar 2657e76
Add query history catalog GET methods.
vkonagar a564372
Fix formatting
vkonagar 9f5bdc5
Update index suggestion task
vkonagar e290797
Add new workload
vkonagar 57955b4
Add new test - incomplete
vkonagar ecec9ce
Add more than 3 columns cost model test
vkonagar 4e3370c
Fix join query parsing for table name extraction
vkonagar 818c583
Add more queries to workload D
vkonagar e4865c4
DEBUG -> TRACE
vkonagar 53c1101
Changed the columns from a set to vector
sivaprasadsudhir ae3e26b
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir 7152d46
Fix compilation error
vkonagar 0062cc5
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir fee2bea
Complete the index suggestion task - RPC is pending.
vkonagar 4642b34
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar 490677f
Get args at RPC handler
vkonagar 51d7f56
Refactored the tests
sivaprasadsudhir fc0d60e
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir a48e085
Fix compilation issue and list serialization
vkonagar a3ac507
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar f6b18d0
Complete RPC handler
vkonagar eb5239f
fix logs
sivaprasadsudhir 693516b
Fix compilation error in peloton-bin
vkonagar 6017790
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar b024304
Add dropIndex RPC
vkonagar 8b2169c
run brain and server together in one process for testing
sivaprasadsudhir f718511
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir 8639124
MOved tunable knobs into a separate structure
sivaprasadsudhir 3a5227a
changed the arguments of the constructor
sivaprasadsudhir aeabd94
completed the refactor
sivaprasadsudhir 7ee9b0f
Fix index selection job -- rename some stuff
vkonagar 99be940
Merge branch 'auto_index' of github.com:sivaprasadsudhir/peloton into…
vkonagar 1e3cd9c
minor style changes
sivaprasadsudhir bd4593b
Rename more stuff
vkonagar 5fe0108
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar a8af555
More renames
vkonagar 273b89b
Fix DML statement handling in workload
vkonagar 7091c7f
Fix cost model bug for more than 2 column indexes
vkonagar 67ff655
Add an extensive test on multi-column optimizer cost model test
vkonagar 51139e6
concrete test case to show the issues with non-deterministic set of i…
sivaprasadsudhir f9b2c5e
Add drop indexes RPC
vkonagar cb8d209
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir 3c3559e
Run formatter
vkonagar 2da21af
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar 71d4213
Fix drop indexes
vkonagar 7d6fc37
Fix a bug in config enumeration for case where no index is better
pbollimp 6d48e80
Fix formatter issue
vkonagar d22b7bb
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar 1060627
Fix travis error
vkonagar 0b12801
Fix the test that is failing non-deteministically due to the optimize…
pbollimp 5029ed1
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
pbollimp 1e31d2a
Use only one transaction for the entire run of the job. Also, generat…
vkonagar 8b937da
hopefully, final version of the algorithm
sivaprasadsudhir f8262cd
added multiple choices for the output
sivaprasadsudhir f4bca42
more index selection tests
sivaprasadsudhir 4c37855
Add missing populate index
vkonagar 38757ac
Consider non-equality predicates for index scan in the cost model
chenboy 4792d91
Drop the indexes only if it is not suggested this time
vkonagar 5460082
fixed precision issues
sivaprasadsudhir 3b757f1
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir 8bc5170
minor fixes
sivaprasadsudhir 51f5a1a
Fix the AnalyzeStats crash
vkonagar 5c322c1
Fix: Index Selection returns empty set because the
vkonagar 3ef9128
Fix a bug during where clause parsing to make it work with TPCC
pbollimp 146100d
Fix the compilation error
vkonagar d250fbe
Address some of the code review comments
pbollimp 3230ec3
Fix create/drop index -- running TPCC
vkonagar dc424ea
Fix analyze stats crash. Fix query history logging for PREPARED state…
vkonagar 43b742b
Change knobs
vkonagar c422a63
More misc
vkonagar 27a0df0
addressing commits
sivaprasadsudhir a06189a
Restructure code
vkonagar 332543f
Reformat code
vkonagar 9d0a005
small correction to make it compile in debug mode
pbollimp 11d2f3e
remove the unnecessary commented parts of test and code
pbollimp 59ee8d3
Restructure code, fix nits
vkonagar 6817300
remove #define
pbollimp 3546f6a
Merge remote-tracking branch 'origin/auto_index' into auto_index
pbollimp e2e4578
Restructure code
vkonagar 4f48831
Run formatter
vkonagar 4dc06ac
fix errors for compilation in debug mode
pbollimp 65d5a06
Merge remote-tracking branch 'origin/auto_index' into auto_index
pbollimp 480ae4d
fix query logger test
pbollimp 81420e7
trying to pass the compilation on travis
pbollimp 28483e5
change debug logging to trace level logging
pbollimp e1bd8ba
Fix warning in IndexConfigComparator
vkonagar f8e6eda
trace-->debug
vkonagar 597e798
Hack to make travis pass the build.
vkonagar b99312a
Hack to make travis pass the build.
vkonagar 50db015
remove multiple of unnecessary debug statements
pbollimp
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
//===----------------------------------------------------------------------===// | ||
// | ||
// Peloton | ||
// | ||
// index_selection_context.cpp | ||
// | ||
// Identification: src/brain/index_selection_context.cpp | ||
// | ||
// Copyright (c) 2015-2018, Carnegie Mellon University Database Group | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#include "brain/index_selection_context.h" | ||
#include "common/logger.h" | ||
|
||
namespace peloton { | ||
namespace brain { | ||
|
||
IndexSelectionContext::IndexSelectionContext(IndexSelectionKnobs knobs) | ||
: knobs_(knobs) {} | ||
|
||
} // namespace brain | ||
} // namespace peloton |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,174 @@ | ||
//===----------------------------------------------------------------------===// | ||
// | ||
// Peloton | ||
// | ||
// index_selection_job.cpp | ||
// | ||
// Identification: src/brain/index_selection_job.cpp | ||
// | ||
// Copyright (c) 2015-2018, Carnegie Mellon University Database Group | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#include "brain/index_selection_util.h" | ||
#include "brain/index_selection_job.h" | ||
#include "brain/index_selection.h" | ||
#include "catalog/query_history_catalog.h" | ||
#include "catalog/system_catalogs.h" | ||
#include "optimizer/stats/stats_storage.h" | ||
|
||
namespace peloton { | ||
namespace brain { | ||
|
||
#define BRAIN_SUGGESTED_INDEX_MAGIC_STR "brain_suggested_index" | ||
|
||
void IndexSelectionJob::OnJobInvocation(BrainEnvironment *env) { | ||
auto &txn_manager = concurrency::TransactionManagerFactory::GetInstance(); | ||
auto txn = txn_manager.BeginTransaction(); | ||
LOG_INFO("Started Index Suggestion Task"); | ||
|
||
optimizer::StatsStorage *stats_storage = | ||
optimizer::StatsStorage::GetInstance(); | ||
|
||
ResultType stats_result = stats_storage->AnalyzeStatsForAllTables(txn); | ||
if (stats_result != ResultType::SUCCESS) { | ||
LOG_ERROR( | ||
"Cannot generate stats for table columns. Not performing index " | ||
"suggestion..."); | ||
txn_manager.AbortTransaction(txn); | ||
return; | ||
} | ||
|
||
// Query the catalog for new SQL queries. | ||
// New SQL queries are the queries that were added to the system | ||
// after the last_timestamp_ | ||
auto &query_catalog = catalog::QueryHistoryCatalog::GetInstance(txn); | ||
auto query_history = | ||
query_catalog.GetQueryStringsAfterTimestamp(last_timestamp_, txn); | ||
if (query_history->size() > num_queries_threshold_) { | ||
LOG_INFO("Tuning threshold has crossed. Time to tune the DB!"); | ||
|
||
// Run the index selection. | ||
std::vector<std::string> queries; | ||
for (auto query_pair : *query_history) { | ||
queries.push_back(query_pair.second); | ||
} | ||
|
||
// TODO: Handle multiple databases | ||
brain::Workload workload(queries, DEFAULT_DB_NAME, txn); | ||
LOG_INFO("Knob Num Indexes: %zu", env->GetIndexSelectionKnobs().num_indexes_); | ||
LOG_INFO("Knob Naive: %zu", env->GetIndexSelectionKnobs().naive_enumeration_threshold_); | ||
LOG_INFO("Knob Num Iterations: %zu", env->GetIndexSelectionKnobs().num_iterations_); | ||
brain::IndexSelection is = {workload, env->GetIndexSelectionKnobs(), txn}; | ||
brain::IndexConfiguration best_config; | ||
is.GetBestIndexes(best_config); | ||
|
||
if (best_config.IsEmpty()) { | ||
LOG_INFO("Best config is empty"); | ||
} | ||
|
||
// Get the existing indexes and drop them. | ||
// TODO: Handle multiple databases | ||
auto database_object = catalog::Catalog::GetInstance()->GetDatabaseObject( | ||
DEFAULT_DB_NAME, txn); | ||
auto pg_index = catalog::Catalog::GetInstance() | ||
->GetSystemCatalogs(database_object->GetDatabaseOid()) | ||
->GetIndexCatalog(); | ||
auto indexes = pg_index->GetIndexObjects(txn); | ||
for (auto index : indexes) { | ||
auto index_name = index.second->GetIndexName(); | ||
// TODO [vamshi]: REMOVE THIS IN THE FINAL CODE | ||
// This is a hack for now. Add a boolean to the index catalog to | ||
// find out if an index is a brain suggested index/user created index. | ||
if (index_name.find(BRAIN_SUGGESTED_INDEX_MAGIC_STR) != | ||
std::string::npos) { | ||
bool found = false; | ||
for (auto installed_index: best_config.GetIndexes()) { | ||
if ((index.second.get()->GetTableOid() == installed_index.get()->table_oid) && | ||
(index.second.get()->GetKeyAttrs() == installed_index.get()->column_oids)) { | ||
found = true; | ||
Review comment: nit: using a |
||
} | ||
} | ||
// Drop only indexes which are not suggested this time. | ||
if (!found) { | ||
LOG_DEBUG("Dropping Index: %s", index_name.c_str()); | ||
DropIndexRPC(database_object->GetDatabaseOid(), index.second.get()); | ||
} | ||
} | ||
} | ||
|
||
for (auto index : best_config.GetIndexes()) { | ||
// Create RPC for index creation on the server side. | ||
CreateIndexRPC(index.get()); | ||
} | ||
|
||
// Update last_timestamp_ to be the latest query's timestamp in | ||
// the current workload, so that we fetch only the new queries next time. | ||
// TODO[vamshi]: Make this efficient. Currently assuming that the latest | ||
// query can be anywhere in the vector. If the latest query is always at the | ||
// end, then we can avoid scanning all the queries. | ||
last_timestamp_ = GetLatestQueryTimestamp(query_history.get()); | ||
} else { | ||
LOG_INFO("Tuning - not this time"); | ||
} | ||
txn_manager.CommitTransaction(txn); | ||
} | ||
|
||
void IndexSelectionJob::CreateIndexRPC(brain::HypotheticalIndexObject *index) { | ||
// TODO: Remove hardcoded database name and server end point. | ||
capnp::EzRpcClient client("localhost:15445"); | ||
PelotonService::Client peloton_service = client.getMain<PelotonService>(); | ||
|
||
// Create the index name: concat - db_id, table_id, col_ids | ||
std::stringstream sstream; | ||
sstream << BRAIN_SUGGESTED_INDEX_MAGIC_STR << "_" << index->db_oid << "_" | ||
<< index->table_oid << "_"; | ||
std::vector<oid_t> col_oid_vector; | ||
for (auto col : index->column_oids) { | ||
col_oid_vector.push_back(col); | ||
sstream << col << "_"; | ||
} | ||
auto index_name = sstream.str(); | ||
|
||
auto request = peloton_service.createIndexRequest(); | ||
request.getRequest().setDatabaseOid(index->db_oid); | ||
request.getRequest().setTableOid(index->table_oid); | ||
request.getRequest().setIndexName(index_name); | ||
request.getRequest().setUniqueKeys(false); | ||
|
||
auto col_list = | ||
request.getRequest().initKeyAttrOids(index->column_oids.size()); | ||
for (auto i = 0UL; i < index->column_oids.size(); i++) { | ||
col_list.set(i, index->column_oids[i]); | ||
} | ||
|
||
PELOTON_ASSERT(index->column_oids.size() > 0); | ||
auto response = request.send().wait(client.getWaitScope()); | ||
Review comment: Can you check the response and throw a warning if it does not succeed? |
||
} | ||
|
||
void IndexSelectionJob::DropIndexRPC(oid_t database_oid, | ||
catalog::IndexCatalogObject *index) { | ||
// TODO: Remove hardcoded database name and server end point. | ||
capnp::EzRpcClient client("localhost:15445"); | ||
PelotonService::Client peloton_service = client.getMain<PelotonService>(); | ||
|
||
auto request = peloton_service.dropIndexRequest(); | ||
request.getRequest().setDatabaseOid(database_oid); | ||
request.getRequest().setIndexOid(index->GetIndexOid()); | ||
|
||
auto response = request.send().wait(client.getWaitScope()); | ||
} | ||
|
||
uint64_t IndexSelectionJob::GetLatestQueryTimestamp( | ||
std::vector<std::pair<uint64_t, std::string>> *queries) { | ||
uint64_t latest_time = 0; | ||
for (auto query : *queries) { | ||
if (query.first > latest_time) { | ||
latest_time = query.first; | ||
} | ||
} | ||
return latest_time; | ||
} | ||
|
||
} // namespace brain | ||
} // namespace peloton |
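The drop logic in `OnJobInvocation` keeps a brain-suggested index only if the new configuration contains an index on the same table with the same key columns. A minimal standalone sketch of that reconciliation step follows. It assumes simplified stand-in types: `ExistingIndex`, `SuggestedIndex`, `kBrainIndexTag`, and `IndexesToDrop` are hypothetical names for illustration, not part of Peloton, and the catalog and RPC calls are left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the catalog index object and the
// hypothetical index object used in the diff above.
struct ExistingIndex {
  std::string name;
  uint32_t table_oid;
  std::vector<uint32_t> key_attrs;
};

struct SuggestedIndex {
  uint32_t table_oid;
  std::vector<uint32_t> column_oids;
};

// Same marker string as BRAIN_SUGGESTED_INDEX_MAGIC_STR in the diff.
const std::string kBrainIndexTag = "brain_suggested_index";

// Returns the brain-created indexes that are absent from the new suggestion.
// Indexes whose names lack the tag are treated as user-created and skipped.
std::vector<ExistingIndex> IndexesToDrop(
    const std::vector<ExistingIndex> &existing,
    const std::vector<SuggestedIndex> &suggested) {
  std::vector<ExistingIndex> to_drop;
  for (const auto &index : existing) {
    if (index.name.find(kBrainIndexTag) == std::string::npos) continue;
    // An index survives if some suggestion matches its table and columns.
    const bool still_suggested = std::any_of(
        suggested.begin(), suggested.end(), [&](const SuggestedIndex &s) {
          return s.table_oid == index.table_oid &&
                 s.column_oids == index.key_attrs;
        });
    if (!still_suggested) to_drop.push_back(index);
  }
  return to_drop;
}
```

Separating the decision (which indexes to drop) from the side effect (the drop RPC) would let the matching rule be unit-tested without a running server. `std::any_of` also stops at the first match, whereas the loop in the diff keeps scanning after setting `found`.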
This should be addressed, at least by having a wrapper function that takes the database name as an argument. Handling multiple databases is important, especially after the catalog refactor.
-- Tianyu, Justin & Tianyi
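One possible shape for the wrapper the reviewers ask for is sketched below, assuming the per-database work (workload construction, index selection, create/drop RPCs) is factored into a callable that takes the database name. `TuneFn` and `TuneAllDatabases` are hypothetical names introduced here for illustration; how the list of database names would be obtained from the catalog is not shown.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical signature: runs index selection for one database and
// returns true if tuning was actually performed for it.
using TuneFn = std::function<bool(const std::string &database_name)>;

// Applies the per-database routine to every database name instead of
// relying on a single hardcoded default, and returns how many databases
// were tuned.
std::size_t TuneAllDatabases(const std::vector<std::string> &database_names,
                             const TuneFn &tune_one) {
  std::size_t tuned = 0;
  for (const auto &name : database_names) {
    if (tune_one(name)) ++tuned;
  }
  return tuned;
}
```

With this split, the existing single-database path could be expressed as a call with a one-element list, so the hardcoded `DEFAULT_DB_NAME` would appear in only one place.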