sql/performance: independent filter expressions on JOINs should be pushed down to the operands #8566
Comments
Try this: select * from (select * from a where a.id='70000000') as a, (select * from b where b.id='70000000') as b; We don't yet propagate the filters in the outer select into the inner scans, so you have to do it manually to get the expected performance. We plan to use a number of techniques to optimize query performance, and they are on the roadmap, but we're not there yet. For an overview of the current state of our JOIN implementation you can peruse the following blog post:
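To make the suggested rewrite concrete, here is a minimal sketch that runs both forms of the query and checks that they return the same rows. It uses Python's built-in sqlite3 module purely as a stand-in engine. That is an assumption for illustration: it is not CockroachDB and says nothing about CockroachDB's performance. The tables are a trimmed-down version of the ones in the report, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in SQL engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE b (id TEXT PRIMARY KEY, name TEXT);
""")

# Made-up sample rows; the report used 800,000 rows per table.
sample_ids = ["69999999", "70000000", "70000001"]
conn.executemany("INSERT INTO a VALUES (?, ?)", [(i, "a-" + i) for i in sample_ids])
conn.executemany("INSERT INTO b VALUES (?, ?)", [(i, "b-" + i) for i in sample_ids])

# Original form: filters sit on top of the cross join.
original = """
SELECT * FROM a, b WHERE a.id = '70000000' AND b.id = '70000000'
"""

# Manual rewrite: each filter is moved into a subquery on its own table.
rewritten = """
SELECT * FROM (SELECT * FROM a WHERE a.id = '70000000') AS a,
              (SELECT * FROM b WHERE b.id = '70000000') AS b
"""

rows_original = conn.execute(original).fetchall()
rows_rewritten = conn.execute(rewritten).fetchall()
print(rows_original == rows_rewritten)
```

The two forms are logically equivalent because each filter refers to columns of only one table, which is what makes it safe to apply before the join.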
@petermattis, thank you very much. It works now. Our service depends heavily on relational query optimization.
@knz, @petermattis, thank you very much.
@zhaoyuxi if you have time, we'd be interested to know which types of JOINs your service uses most often. This way we could put them higher in our list of objectives.
@knz It is my pleasure. Our service has 50 kinds of structured objects. Each object and its index are stored separately. Strong consistency is not required. Reads far outnumber writes. A query first selects from the index table to get the object ID, then fetches the object. The great challenges for us are:
1. A query result has about 10,000 records, but a page usually takes only the top N (typically 10) each time. We need 1,500 SELECT TPS on a 4C36G VM.
2. Several objects may span multiple nodes, and the efficiency of distributed SQL is even worse. We need not only flexibility but also low CPU consumption.
It's our pleasure to get your ping. Thank you very much. Cockroach is amazing for us: Scalable, Survivable, SQL, and Consistent (not strongly) are what we need. But we worry about SQL performance, especially distributed SQL.
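The access pattern described above (look up object IDs in an index table, then fetch only the first page of objects) can be sketched as follows. This is a minimal illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in engine, and the table names `objects` and `object_index`, the column names, the `widget` type, and the `first_page` helper are all hypothetical, not taken from the service in question.

```python
import sqlite3

# In-memory SQLite database as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (id TEXT PRIMARY KEY, payload TEXT);
CREATE TABLE object_index (type TEXT, object_id TEXT);
CREATE INDEX object_index_type ON object_index (type, object_id);
""")

# About 10,000 matching records, as in the description; the data is made up.
rows = [(f"{i:08d}", f"payload-{i}") for i in range(10000)]
conn.executemany("INSERT INTO objects VALUES (?, ?)", rows)
conn.executemany("INSERT INTO object_index VALUES ('widget', ?)",
                 [(r[0],) for r in rows])

def first_page(conn, obj_type, page_size=10):
    """Step 1: read only page_size IDs from the index table.
    Step 2: fetch just those objects by primary key."""
    ids = [r[0] for r in conn.execute(
        "SELECT object_id FROM object_index WHERE type = ? "
        "ORDER BY object_id LIMIT ?", (obj_type, page_size))]
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    return conn.execute(
        f"SELECT id, payload FROM objects WHERE id IN ({placeholders}) "
        "ORDER BY id", ids).fetchall()

page = first_page(conn, "widget")
print(len(page))
```

Putting the `LIMIT` on the index lookup means the second step touches only the objects that will be displayed, rather than all matching records.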
@zhaoyuxi thanks for sharing.
@knz From the sample query, each sub1-sub4 table has a filter that can be evaluated on that table. @zhaoyuxi Thanks for sharing. Please note that our current JOIN implementation is meant to be just functional (only on very small data sets); it is not anywhere close to what we plan to have in 1.0. We also plan to distribute SQL computation among the nodes that store relevant data. I would like to hear more detail on 3. It sounds like you want to know exactly how data is distributed and stored? That is not something we have planned. Assuming you had a way of knowing how table data is sliced and where each piece is stored, what would you do with that information? What is the "deep optimization" regarding?
Now the main has 6,000,000 records. The sub about 10 |
@RaduBerinde Thanks for your input.
1 Symptoms
I created two tables and inserted 800,000 rows into each. A relational query had been running for more than 20 minutes and still had not finished. One CPU was 100% busy, with the work switching between threads, as in the following two stacks:
Thread 11 (Thread 0x2af3f2c00700 (LWP 11476)):
#0 0x000000000058ca11 in runtime.getitab ()
#1 0x000000000058dbe5 in runtime.assertI2I ()
#2 0x0000000000934cd7 in github.com/cockroachdb/cockroach/sql/parser.(*ComparisonExpr).Eval ()
#3 0x000000000092f7c0 in github.com/cockroachdb/cockroach/sql/parser.(*AndExpr).Eval ()
#4 0x0000000000c21b83 in github.com/cockroachdb/cockroach/sql/sqlbase.RunFilter ()
#5 0x0000000000f1c0e1 in github.com/cockroachdb/cockroach/sql.(*selectNode).Next ()
#6 0x0000000000f23fa6 in github.com/cockroachdb/cockroach/sql.(*selectTopNode).Next ()
#7 0x0000000000ece0e6 in github.com/cockroachdb/cockroach/sql.(*Executor).execStmt ()
#8 0x0000000000ecb9e9 in github.com/cockroachdb/cockroach/sql.(*Executor).execStmtInOpenTxn ()
#9 0x0000000000eca5e4 in github.com/cockroachdb/cockroach/sql.(*Executor).execStmtsInCurrentTxn ()
#10 0x0000000000ec9aad in github.com/cockroachdb/cockroach/sql.runTxnAttempt ()
#11 0x0000000000f5cb60 in github.com/cockroachdb/cockroach/sql.(*Executor).execRequest.func2 ()
#12 0x0000000000ae8c4b in github.com/cockroachdb/cockroach/internal/client.(*Txn).Exec ()
#13 0x0000000000ec8450 in github.com/cockroachdb/cockroach/sql.(*Executor).execRequest ()
#14 0x0000000000ec7656 in github.com/cockroachdb/cockroach/sql.(*Executor).ExecuteStatements ()
#15 0x0000000000fec6a8 in github.com/cockroachdb/cockroach/sql/pgwire.(*v3Conn).executeStatements ()
#16 0x0000000000fe8548 in github.com/cockroachdb/cockroach/sql/pgwire.(*v3Conn).handleSimpleQuery ()
#17 0x0000000000fe813a in github.com/cockroachdb/cockroach/sql/pgwire.(*v3Conn).serve ()
#18 0x0000000000fde369 in github.com/cockroachdb/cockroach/sql/pgwire.(*Server).ServeConn ()
#19 0x0000000000ad298d in github.com/cockroachdb/cockroach/server.(*Server).Start.func8.1 ()
#20 0x0000000001000c7d in github.com/cockroachdb/cockroach/util/netutil.(*Server).ServeWith.func1 ()
#21 0x00000000005dfbd1 in runtime.goexit ()
#22 0x000000c82037ce10 in ?? ()
#23 0x000000c8201581f0 in ?? ()
#24 0x00002af38040a180 in ?? ()
#25 0x000000c82d16ed10 in ?? ()
#26 0x000000c82035e000 in ?? ()
#27 0x7568746967203866 in ?? ()
#28 0x616d2f6d6f632e62 in ?? ()
#29 0x692d6f672f6e7474 in ?? ()
#30 0x36353a7974746173 in ?? ()
#31 0x000000c82543a958 in ?? ()
#32 0x0000000000eca5e4 in github.com/cockroachdb/cockroach/sql.(*Executor).execStmtsInCurrentTxn ()
#33 0x0000000000000000 in ?? ()
Thread 8 (Thread 0x2af3f3203700 (LWP 11479)):
#0 0x0000000000589cb3 in runtime.mapiternext ()
#1 0x0000000000f2265a in github.com/cockroachdb/cockroach/sql.qvalMap.populateQVals ()
#2 0x0000000000f1c099 in github.com/cockroachdb/cockroach/sql.(*selectNode).Next ()
#3 0x0000000000f23fa6 in github.com/cockroachdb/cockroach/sql.(*selectTopNode).Next ()
#4 0x0000000000ece0e6 in github.com/cockroachdb/cockroach/sql.(*Executor).execStmt ()
#5 0x0000000000ecb9e9 in github.com/cockroachdb/cockroach/sql.(*Executor).execStmtInOpenTxn ()
#6 0x0000000000eca5e4 in github.com/cockroachdb/cockroach/sql.(*Executor).execStmtsInCurrentTxn ()
#7 0x0000000000ec9aad in github.com/cockroachdb/cockroach/sql.runTxnAttempt ()
#8 0x0000000000f5cb60 in github.com/cockroachdb/cockroach/sql.(*Executor).execRequest.func2 ()
#9 0x0000000000ae8c4b in github.com/cockroachdb/cockroach/internal/client.(*Txn).Exec ()
#10 0x0000000000ec8450 in github.com/cockroachdb/cockroach/sql.(*Executor).execRequest ()
#11 0x0000000000ec7656 in github.com/cockroachdb/cockroach/sql.(*Executor).ExecuteStatements ()
#12 0x0000000000fec6a8 in github.com/cockroachdb/cockroach/sql/pgwire.(*v3Conn).executeStatements ()
#13 0x0000000000fe8548 in github.com/cockroachdb/cockroach/sql/pgwire.(*v3Conn).handleSimpleQuery ()
#14 0x0000000000fe813a in github.com/cockroachdb/cockroach/sql/pgwire.(*v3Conn).serve ()
#15 0x0000000000fde369 in github.com/cockroachdb/cockroach/sql/pgwire.(*Server).ServeConn ()
#16 0x0000000000ad298d in github.com/cockroachdb/cockroach/server.(*Server).Start.func8.1 ()
#17 0x0000000001000c7d in github.com/cockroachdb/cockroach/util/netutil.(*Server).ServeWith.func1 ()
#18 0x00000000005dfbd1 in runtime.goexit ()
#19 0x000000c82037ce10 in ?? ()
#20 0x000000c8201581f0 in ?? ()
#21 0x00002af38040a180 in ?? ()
#22 0x000000c82d16ed10 in ?? ()
#23 0x000000c82035e000 in ?? ()
#24 0x7568746967203866 in ?? ()
#25 0x616d2f6d6f632e62 in ?? ()
#26 0x692d6f672f6e7474 in ?? ()
#27 0x36353a7974746173 in ?? ()
#28 0x000000c82543a958 in ?? ()
#29 0x0000000000eca5e4 in github.com/cockroachdb/cockroach/sql.(*Executor).execStmtsInCurrentTxn ()
#30 0x0000000000000000 in ?? ()
2 Two tables:
CREATE TABLE a(
id STRING(128) PRIMARY KEY,
name STRING(32),
type STRING(32),
int_id INT UNIQUE,
int_type INT,
INDEX type_idx (type),
INDEX int_id_idx (int_id),
INDEX int_type_idx (int_type)
);
CREATE TABLE b(
id STRING(128) PRIMARY KEY,
name STRING(32),
type STRING(32),
int_id INT UNIQUE,
int_type INT,
foreignA STRING NOT NULL REFERENCES a,
INDEX(foreignA)
);
All of the following queries were fast:
SELECT * from a where a.id='70000000';
SELECT * from a where id='70000000';
SELECT * from b where b.id='70000000';
SELECT * from b where id='70000000';
But this query had still not returned after more than 20 minutes, and I did not have the patience to keep waiting:
SELECT * from a,b where a.id='70000000' and b.id='70000000';
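A rough way to see why this query is so much slower than the single-table ones is to count filter evaluations. The sketch below is a simplified model, not CockroachDB's executor: it assumes the filter is evaluated once per pair of rows in the cross product. The stack traces above (RunFilter under selectNode.Next) are consistent with that assumption but do not prove it. The function names and the small test data are hypothetical.

```python
from itertools import product

def cross_then_filter(a_rows, b_rows, pred_a, pred_b):
    """Build every (a, b) pair first, then evaluate the filter on each pair."""
    evaluated = 0
    out = []
    for ra, rb in product(a_rows, b_rows):
        evaluated += 1
        if pred_a(ra) and pred_b(rb):
            out.append((ra, rb))
    return out, evaluated

def filter_then_cross(a_rows, b_rows, pred_a, pred_b):
    """Push each single-table filter down, then join the survivors."""
    evaluated = 0
    kept_a, kept_b = [], []
    for ra in a_rows:
        evaluated += 1
        if pred_a(ra):
            kept_a.append(ra)
    for rb in b_rows:
        evaluated += 1
        if pred_b(rb):
            kept_b.append(rb)
    return list(product(kept_a, kept_b)), evaluated

# Small made-up tables so the slow variant finishes quickly.
n = 1000
a_rows = [str(i) for i in range(n)]
b_rows = [str(i) for i in range(n)]
is_target = lambda row_id: row_id == "700"

slow, slow_evals = cross_then_filter(a_rows, b_rows, is_target, is_target)
fast, fast_evals = filter_then_cross(a_rows, b_rows, is_target, is_target)
print(slow == fast, slow_evals, fast_evals)

# Number of row pairs at the scale reported in this issue.
rows_in_issue = 800_000
pairs_in_issue = rows_in_issue ** 2
print(pairs_in_issue)
```

In this model the work grows with the product of the table sizes when the filter runs after the join, and only with their sum when it is pushed down. With a primary-key index, the pushed-down filters could become point lookups rather than full scans, which the model does not attempt to capture.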