sql: better query planning for OR expressions #2142
Comments
We really want |
@RaduBerinde @knz is this still an active issue? Should it be 1.0 if so? If not, please close. |
It's still active. We don't optimize OR. Probably won't make it for 1.0 though.
Meta: how about we don't blindly reclassify active-but-postponed issues, and instead first make sure there's a doc issue/PR marking the thing as a known limitation?
Documenting this as a known limitation with cockroachdb/docs#1381 (review). @knz, @petermattis, please review my language there and let me know if we need to add a workaround.
Unassigning myself, as I won't be working on this before a suitable IR exists.
This commit adds a new exploration rule that can produce better query plans for disjunctions (e.g. a = 1 OR b = 2). The rule transforms some Select + Scan expressions with a disjunction filter into a Union of two Select expressions, each with one side of the disjunction as a filter. This can result in faster query plans in cases where two indexes cover each side of the disjunction. This rule only applies for Scan expressions that contain a strict key.

Fixes cockroachdb#2142

Release note (performance improvement): The query optimizer now produces faster query plans for some disjunctions (OR expressions) by utilizing multiple indexes.
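To illustrate why the rule is restricted to scans with a strict key, here is a minimal sketch in plain Python (not optimizer code). It assumes the restriction exists because a distinct UNION removes duplicate rows: with a key every row is unique, so nothing is lost, while without one genuine duplicates would be collapsed. The helper names `or_filter` and `union_of_selects` and the sample rows are made up for illustration.

```python
def or_filter(rows, p, q):
    """Original plan: one scan with the filter `p OR q`."""
    return [r for r in rows if p(r) or q(r)]


def union_of_selects(rows, p, q):
    """Rewritten plan: distinct UNION of two filtered scans."""
    seen = set()
    out = []
    for r in [r for r in rows if p(r)] + [r for r in rows if q(r)]:
        if r not in seen:  # UNION removes duplicate rows
            seen.add(r)
            out.append(r)
    return out


# Predicates read the last two fields, so they work with or without an id.
a_is_1 = lambda r: r[-2] == 1
b_is_2 = lambda r: r[-1] == 2

# Rows shaped (id, a, b): id is a strict key, so every row is unique.
keyed = [(1, 1, 0), (2, 1, 0), (3, 0, 2)]
# Rows shaped (a, b): no key, so the table holds a genuine duplicate.
unkeyed = [(1, 0), (1, 0), (0, 2)]

keyed_or = or_filter(keyed, a_is_1, b_is_2)
keyed_union = union_of_selects(keyed, a_is_1, b_is_2)
unkeyed_or = or_filter(unkeyed, a_is_1, b_is_2)
unkeyed_union = union_of_selects(unkeyed, a_is_1, b_is_2)

print(sorted(keyed_or) == sorted(keyed_union))  # same rows with a key
print(len(unkeyed_or), len(unkeyed_union))      # row counts differ without one
```

With the key column included, both plans return the same three rows. Without it, the OR filter keeps both copies of `(1, 0)` while the union keeps only one, so the rewrite would change the query result.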
46923: ui: Link from statement diagnostics to details page r=dhartunian a=koorosh

Resolves: #46559

We have a Statement diagnostics history page with a list of all requested diagnostics. Previously, statement fingerprints were shown as plain text; now they are links to the statement details page. Note that it is possible to have diagnostics for statements that have already been cleared. In this case the statement is displayed as text instead of a link.

Release note (admin ui change): Add links to statement details from Statement Diagnostics history page.

Release justification: bug fixes and low-risk updates to new functionality

![out](https://user-images.githubusercontent.com/3106437/78255335-d824e300-74ff-11ea-9203-233ac8ba67fc.gif)

47094: opt: add GenerateUnionSelects exploration rule for disjunction r=mgartner a=mgartner

#### opt: add GenerateUnionSelects exploration rule for disjunction

This commit adds a new exploration rule that can produce better query plans for disjunctions (e.g. a = 1 OR b = 2). The rule transforms some Select + Scan expressions with a disjunction filter into a Union of two Select expressions, each with one side of the disjunction as a filter. This can result in faster query plans in cases where two indexes cover each side of the disjunction. This rule only applies for Scan expressions that contain a strict key.

Fixes #2142

Release note (performance improvement): The query optimizer now produces faster query plans for some disjunctions (OR expressions) by utilizing multiple indexes.

#### sql: allow UNION with hidden and non-hidden columns

This commit removes an assertion that required corresponding columns on each side of a UNION to be both hidden or non-hidden. Prior to this change, the following statements would yield the error: "UNION types cannot be matched".

CREATE TABLE ab (a INT, b INT);
SELECT a, b, rowid FROM ab UNION VALUES (1, 2, 3);

With this commit, the above statements are executed without error.

Release note (bug fix): Fixed an incorrect error that occurred when executing UNION statements with hidden and non-hidden columns.

Co-authored-by: Andrii Vorobiov <and.vorobiov@gmail.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
2nd oldest open issue in the project finally closed! |
@andy-kimball cool! But it depends on the interpretation. The issue mentions the use case of Our use case is the JSONB comparison:
This change fixes the inequality case as well, as long as the optimizer estimates that there are fewer rows to scan in the two index case than the one index case. As for the JSON case, can you give a complete example, including schema, some rows of sample data, and full SQL query? We can see if the new code handles that case. |
Here's an inequality example:
In this case, the optimizer estimates that the first index scan will return ~10K rows, and the second index scan will return ~5K rows. It decides that the UNION of these two is better than scanning all 100K rows of the primary index, and so uses the new plan enabled by this work.
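The row-count reasoning above can be written down as a toy comparison. This is only a sketch of the arithmetic in the comment: the real optimizer's cost model takes more into account than raw row counts, and the function name `prefers_union` is hypothetical.

```python
def prefers_union(index_scan_estimates, full_scan_rows):
    """Toy rule: pick the union plan when the index scans together
    are estimated to read fewer rows than one full scan would."""
    return sum(index_scan_estimates) < full_scan_rows


# Figures from the comment: ~10K and ~5K rows versus 100K in the primary index.
use_union = prefers_union([10_000, 5_000], 100_000)

# Made-up counter-case: unselective predicates make the full scan cheaper.
use_union_unselective = prefers_union([60_000, 50_000], 100_000)

print(use_union, use_union_unselective)
```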
Great, I've looked at the description and the tests and couldn't see the inequality case. It's not a complete query (I can provide the complete query, but it has other problems with the query planner: full table scan on
> select version();
version
+------------------------------------------------------------------------------------------+
CockroachDB CCL v19.2.5 (x86_64-unknown-linux-gnu, built 2020/03/16 18:27:12, go1.12.12)
CREATE TABLE t (
id UUID NOT NULL DEFAULT gen_random_uuid(),
perms JSONB NOT NULL,
CONSTRAINT "primary" PRIMARY KEY (id ASC),
INVERTED INDEX ix_perms (perms)
);
insert into t (perms) select jsonb_build_object('scope', jsonb_build_object('xx' || a::string, jsonb_build_array(jsonb_build_object('type', 'column')))) from generate_series(1,100000) temp(a);
> explain (opt) select * from t where perms->'scope'->'xx2000' @> '[{"type":"column"}]' or perms->'scope'->'xx4000' @> '[{"type":"column"}]';
text
+--------------------------------------------------------------------------------------------------------------------------------+
select
├── scan t
└── filters
└── (perms @> '{"scope": {"xx2000": [{"type": "column"}]}}') OR (perms @> '{"scope": {"xx4000": [{"type": "column"}]}}')
> select * from t where perms->'scope'->'xx2000' @> '[{"type":"column"}]' or perms->'scope'->'xx4000' @> '[{"type":"column"}]';
id | perms
+--------------------------------------+---------------------------------------------+
6815b87d-4af4-4a58-aee0-de00dc4375f4 | {"scope": {"xx4000": [{"type": "column"}]}}
9102f2a6-a26a-4139-b085-7b33948b4652 | {"scope": {"xx2000": [{"type": "column"}]}}
(2 rows)
Time: 668.315636ms
> select * from t where perms->'scope'->'xx9000' @> '[{"type":"column"}]';
id | perms
+--------------------------------------+---------------------------------------------+
93025901-3c48-4d79-94ba-ee32bf4f0a2a | {"scope": {"xx9000": [{"type": "column"}]}}
(1 row)
Time: 1.05896ms
This one could be fixed in 20.1 (not tested yet):
> select * from t where perms->'scope'->'xx20000' @> '[{"type":"column"}]' union select * from t where perms->'scope'->'xx41000' @> '[{"type":"column"}]';
pq: unable to encode table key: *tree.DJSON
It appears that 20.1 fixes #35260 and #46709, but not #35706. Also, the newly merged UNION rule (which will go into 20.2 since it missed the cutoff for 20.1) does not handle the JSON case you give, because it only splits OR clauses that operate over different columns from the input table. I opened an issue to track the inverted index case: #47340.
Consider a query like SELECT * FROM foo WHERE a > 1 OR b > 2. Even if there are appropriate indexes to satisfy both a > 1 and b > 2, the current query planner will perform a full table or index scan because it can't use both conditions at once. Instead, the query planner should check to see if 2 separate index scans can be used and their results merged using a "union" operation.

This is essentially a generalization of #2140.
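The rewrite proposed here can be checked by hand on a small table. The sketch below uses Python's built-in sqlite3 module purely as a stand-in SQL engine (it is not CockroachDB and says nothing about CockroachDB's planner). The column layout, index names, and sample data are assumptions made for illustration. It only shows that the OR query and the manual UNION of two single-predicate queries return the same rows when the table has a primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE foo (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER);
    CREATE INDEX foo_a ON foo (a);
    CREATE INDEX foo_b ON foo (b);
    """
)
# Hypothetical sample data: 100 rows with small cyclic values in a and b.
conn.executemany(
    "INSERT INTO foo (a, b) VALUES (?, ?)",
    [(i % 5, i % 7) for i in range(100)],
)

# The query from the issue, filtered with a single OR condition.
or_rows = conn.execute(
    "SELECT id, a, b FROM foo WHERE a > 1 OR b > 2 ORDER BY id"
).fetchall()

# The manual rewrite: one query per predicate, merged with UNION.
union_rows = conn.execute(
    """
    SELECT id, a, b FROM foo WHERE a > 1
    UNION
    SELECT id, a, b FROM foo WHERE b > 2
    ORDER BY id
    """
).fetchall()

print(or_rows == union_rows)
conn.close()
```

Selecting the primary key `id` in both branches matters: UNION removes duplicate rows, and the key guarantees that only rows matched by both predicates are merged.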