I ran across an interesting observation while trying to explain how queries are run in parallel within CRDB. I was using the internal KV workload and have a fair number of ranges distributed across three nodes:
root@:26257/kv> SELECT k2, count(*) FROM kv GROUP BY k2;
k2 | count
---------+----------
z | 4561620
needle | 8
I expected the above query to run in parallel, but in the collected Jaeger trace it appears to run serially.
To clarify - the query does run in parallel on the three nodes, but the traces for the actual flow and scans are missing from nodes 2 and 3. It looks like a problem with tracing when using the vectorized engine. I tried to repro locally with a simpler case and could not - I saw traces for scans from all nodes.
RaduBerinde changed the title from "vectorized query doesn't appear to run concurrently across nodes" to "sql: incomplete vectorized traces" on Jan 28, 2021
Describe the problem
I ran across an interesting observation while trying to explain how queries are run in parallel within CRDB. I was using the internal KV workload and have a fair number of ranges distributed across three nodes:
I expected the above query to run in parallel, but in the collected Jaeger trace it appears to run serially.
[stmt-bundle-625911811735650305.zip](https://github.com/cockroachdb/cockroach/files/5889843/stmt-bundle-625911811735650305.zip) [stmt-bundle-628463749489590273.zip](https://github.com/cockroachdb/cockroach/files/5889847/stmt-bundle-628463749489590273.zip)
On the advice of engineering, I reran the query with vectorization disabled
SET vectorize = off
and it appeared to run as expected, albeit more slowly.