sql: support partitioned hash sharded index #76358

chengxiong-ruan · 2022-02-10T05:43:37Z

Release note (sql change): Previously, crdb blocked users from creating
hash sharded index in all kinds of partitioned tables including implict
partitioned tables using PARTITION ALL BY or REGIONAL BY ROW. Now
we turn on the support of hash sharded index in implicit partitioned
tables. Which means primary key cannot be hash sharded if a table is
explicitly partitioned with PARTITION BY or an index cannot be hash
sharded if the index is explicitly partitioned with PARTITION BY.
Paritioning columns cannot be placed explicitly as key columns of a
hash sharded index as well, including regional-by-row table's crdb_region
column. When a hash sharded index is partitioned, ranges are pre-split
within every single possible partition on shard boundaries. Each partition
is split up to 16 ranges, otherwise split into the number bucket count ranges.

cockroach-teamcity · 2022-02-10T05:43:44Z

This change is

chengxiong-ruan · 2022-02-11T22:58:17Z

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index, line 102 at r1 (raw file):

CREATE TABLE t_parent (
  id INT PRIMARY KEY USING HASH,
  part STRING CHECK (part IN ('seattle', 'new york'))

Notice that we need this check constraint to avoid a full scan when doing FK or Uniqueness check. This is only happens to PARTITION ALL BY.

chengxiong-ruan · 2022-02-14T15:29:39Z

pkg/sql/opt/exec/execbuilder/testdata/hash_sharded_index, line 263 at r2 (raw file):

│ into: t_hash_indexed(crdb_internal_a_shard_8, a, b)
│ auto commit
│ arbiter constraints: t_hash_indexed_pkey

looks like my change to optIndex.ImplicitColumnCount made the builder able to find an hash-sharded primary key arbiter would this be a concern? looks legit to me though.

chengxiong-ruan

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @mgartner)

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index, line 1295 at r3 (raw file):

│                             estimated row count: 1 (missing stats)
│                             table: t_unique_hash_sec_key@t_unique_hash_sec_key_pkey
│                             spans: /"new york"/1-/"new york"/2 /"seattle"/1-/"seattle"/2

this test seems to be flaky on the spans to scan :( which I think is because optimizer think the cost is same between them? I'll probably need to comment it out.
cc @mgartner

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index, line 1356 at r3 (raw file):

│                 estimated row count: 1 (missing stats)
│                 table: t_unique_hash_sec_key@t_unique_hash_sec_key_pkey
│                 spans: /"new york"/2-/"new york"/3 /"seattle"/2-/"seattle"/3

same flakiness

pkg/ccl/logictestccl/testdata/logic_test/regional_by_row_hash_sharded_index, line 1179 at r3 (raw file):

│                             estimated row count: 1 (missing stats)
│                             table: t_unique_hash_sec_key@t_unique_hash_sec_key_pkey
│                             spans: /"@"/1/0 /"\x80"/1/0 /"\xc0"/1/0

same flakiness

pkg/ccl/logictestccl/testdata/logic_test/regional_by_row_hash_sharded_index, line 1240 at r3 (raw file):

│                 estimated row count: 1 (missing stats)
│                 table: t_unique_hash_sec_key@t_unique_hash_sec_key_pkey
│                 spans: /"@"/2/0 /"\x80"/2/0 /"\xc0"/2/0

same flakiness

chengxiong-ruan

The 2nd commit add support of pre-splitting on shard boundaries for each possible partitioning prefix.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @mgartner)

postamar

This will need another, more in-depth review but I gave it a look because I was curious.

Reviewed 13 of 21 files at r1, 3 of 7 files at r2, 1 of 1 files at r3, 5 of 5 files at r4, 2 of 2 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, and @mgartner)

-- commits, line 25 at r5:
Your second commit is missing a title.

pkg/sql/create_index.go, line 521 at r4 (raw file):

			return nil, nil, pgerror.New(
				pgcode.FeatureNotSupported,
				"partitioning field cannot be hash sharded index key column",

I suggest you reword this because it might be confusing. What you're trying to convey is that you cannot create this sharded index on these columns unless none of them are featured in the PARTITION ALL BY clause.

pkg/sql/create_table.go, line 1255 at r4 (raw file):

// A non-nil regionConfig represents the region configuration of a multi-region
// db. A nil regionConfig means current db is not multi-regional.
//

I appreciate this comment. Doesn't n.Locality != nil imply that regionConfig is also not nil, though? This isn't your doing and I may be wrong so feel free to ignore this comment.

pkg/sql/create_table.go, line 1677 at r4 (raw file):

			return nil, pgerror.New(
				pgcode.FeatureNotSupported,
				"partitioning field cannot be hash sharded index key column",

Same comment as previously.

pkg/sql/schema_changer.go, line 2681 at r5 (raw file):

							return nil
						},
					)

Shouldn't there be a ForEachRange loop also? Or is that simply not relevant here because sharding & partitioning can only simultaneously be a thing for REGIONAL BY ROW and hence list-partitioning?

pkg/sql/catalog/descpb/structured.proto, line 1172 at r4 (raw file):

  // set on the table. This means that all indexes implicitly inherit all
  // partitioning from the PARTITION ALL BY clause or region configuration.
  optional bool partition_all_by = 44 [(gogoproto.nullable)=false];

Thanks for updating this comment. This fact was hardly obvious, sadly.

pkg/ccl/logictestccl/testdata/logic_test/regional_by_row_hash_sharded_index, line 1266 at r4 (raw file):

                    └── • scan buffer
                          columns: (id, email, crdb_region, crdb_internal_email_shard_16, email_new, crdb_internal_email_shard_16_cast, check1)
                          label: buffer 1

We don't typically have a lot of EXPLAINs in logic tests, presumably because the results are so brittle. I wonder if these are just going to be a nuisance in the long run.

chengxiong-ruan

Thanks for the review! Replied to some of your comments and will fix the others.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, @mgartner, and @postamar)

pkg/sql/create_table.go, line 1255 at r4 (raw file):

Previously, postamar (Marius Posta) wrote…

I appreciate this comment. Doesn't n.Locality != nil imply that regionConfig is also not nil, though? This isn't your doing and I may be wrong so feel free to ignore this comment.

Oh, n.Locality != nil only means user input a SQL to create a table with LOCALITY, but it doesn't mean the db is multi-regional. If n.Locality != nil and regionConfig, it errors out. This happens about 40 lines below, but please allow me to paste the code here:

if n.Locality != nil && regionConfig == nil &&
  !opts.bypassLocalityOnNonMultiRegionDatabaseCheck {
  return nil, errors.WithHint(pgerror.Newf(
    pgcode.InvalidTableDefinition,
    "cannot set LOCALITY on a table in a database that is not multi-region enabled",
  ),
    "database must first be multi-region enabled using ALTER DATABASE ... SET PRIMARY REGION <region>",
  )
}

pkg/sql/schema_changer.go, line 2681 at r5 (raw file):

Previously, postamar (Marius Posta) wrote…

Shouldn't there be a ForEachRange loop also? Or is that simply not relevant here because sharding & partitioning can only simultaneously be a thing for REGIONAL BY ROW and hence list-partitioning?

good callout. The reasoning here is that we'd need to sample within each range, and this implies we need a bunch of interesting logic to sampling between a pair of keys (lower bound and upper bound). We may do that but it might not be very helpful. Because the traffic tends to hit a single range still. For example, we sample 2 points a and b between [lower_bound, upper_bound] and we split each into ranges of 3 shards, then we'll have ranges:

.... some previous ranges...
[/idx_id, /idx_id/a/0)
[/idx_id/a/0, /idx_id/a/1)
[/idx_id/a/1, /idx_id/a/2)
[/idx_id/a/2, /idx_id/b/0)
[/idx_id/b/0, /idx_id/b/1)
[/idx_id/b/1, /idx_id/b/2)
..... other ranges after

now, say, traffic coming in inserting with some key with value between a and b, it would only hit range[/idx_id/a/2, /idx_id/b/0), and the other ranges are not helpful here. We may presplit only on the sampling points instead of shard boundaries, but that's out of the scope of this pr. I'll add some comment about this, and also I forgot to add some comment we don't consider subpartition here subpartition is not supported by implicit partitioning yet.

pkg/sql/catalog/descpb/structured.proto, line 1172 at r4 (raw file):

Previously, postamar (Marius Posta) wrote…

Thanks for updating this comment. This fact was hardly obvious, sadly.

:)

pkg/ccl/logictestccl/testdata/logic_test/regional_by_row_hash_sharded_index, line 1266 at r4 (raw file):

Previously, postamar (Marius Posta) wrote…

We don't typically have a lot of EXPLAINs in logic tests, presumably because the results are so brittle. I wonder if these are just going to be a nuisance in the long run.

yeah, agree! I originally thought I should put it here because I'm testing a enterprise feature. @mgartner do you want me to move these test into tests of execbuilder?

postamar

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, and @mgartner)

pkg/sql/create_table.go, line 1255 at r4 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

Oh, n.Locality != nil only means user input a SQL to create a table with LOCALITY, but it doesn't mean the db is multi-regional. If n.Locality != nil and regionConfig, it errors out. This happens about 40 lines below, but please allow me to paste the code here:
if n.Locality != nil && regionConfig == nil &&
  !opts.bypassLocalityOnNonMultiRegionDatabaseCheck {
  return nil, errors.WithHint(pgerror.Newf(
    pgcode.InvalidTableDefinition,
    "cannot set LOCALITY on a table in a database that is not multi-region enabled",
  ),
    "database must first be multi-region enabled using ALTER DATABASE ... SET PRIMARY REGION <region>",
  )
}

Ah! Makes sense. Thanks.

pkg/sql/schema_changer.go, line 2681 at r5 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

good callout. The reasoning here is that we'd need to sample within each range, and this implies we need a bunch of interesting logic to sampling between a pair of keys (lower bound and upper bound). We may do that but it might not be very helpful. Because the traffic tends to hit a single range still. For example, we sample 2 points a and b between [lower_bound, upper_bound] and we split each into ranges of 3 shards, then we'll have ranges:
.... some previous ranges...
[/idx_id, /idx_id/a/0)
[/idx_id/a/0, /idx_id/a/1)
[/idx_id/a/1, /idx_id/a/2)
[/idx_id/a/2, /idx_id/b/0)
[/idx_id/b/0, /idx_id/b/1)
[/idx_id/b/1, /idx_id/b/2)
..... other ranges after
now, say, traffic coming in inserting with some key with value between a and b, it would only hit range[/idx_id/a/2, /idx_id/b/0), and the other ranges are not helpful here. We may presplit only on the sampling points instead of shard boundaries, but that's out of the scope of this pr. I'll add some comment about this, and also I forgot to add some comment we don't consider subpartition here subpartition is not supported by implicit partitioning yet.

Interesting! 👍

chengxiong-ruan

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @mgartner, and @postamar)

-- commits, line 25 at r5:

Previously, postamar (Marius Posta) wrote…

Your second commit is missing a title.

Done.

pkg/sql/create_index.go, line 521 at r4 (raw file):

Previously, postamar (Marius Posta) wrote…

I suggest you reword this because it might be confusing. What you're trying to convey is that you cannot create this sharded index on these columns unless none of them are featured in the PARTITION ALL BY clause.

Done.

pkg/sql/create_table.go, line 1255 at r4 (raw file):

Previously, postamar (Marius Posta) wrote…

Ah! Makes sense. Thanks.

Done.

pkg/sql/create_table.go, line 1677 at r4 (raw file):

Previously, postamar (Marius Posta) wrote…

Same comment as previously.

Done.

pkg/sql/schema_changer.go, line 2681 at r5 (raw file):

Previously, postamar (Marius Posta) wrote…

Interesting! 👍

Done.

rytaft

Reviewed 1 of 21 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, @mgartner, and @postamar)

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index, line 1295 at r3 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

this test seems to be flaky on the spans to scan :( which I think is because optimizer think the cost is same between them? I'll probably need to comment it out.
cc @mgartner

You probably need to be explicit about the column families in the schema, because otherwise the logic tests will randomize them -- that can be a source of flakiness. The optimizer plan is deterministic.

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index, line 859 at r8 (raw file):

  id INT PRIMARY KEY,
  email STRING,
  part STRING CHECK (part IN ('seattle', 'new york'))

add the column family here and in other tables where you're showing the explain plan

pkg/ccl/logictestccl/testdata/logic_test/regional_by_row_hash_sharded_index, line 1266 at r4 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

yeah, agree! I originally thought I should put it here because I'm testing a enterprise feature. @mgartner do you want me to move these test into tests of execbuilder?

You won't be able to move this to execbuilder since as you noted partitioning is an enterprise feature. But you should probably try to put the EXPLAINs in different files from the normal query results, so that we can test more configurations for the normal query results without making the EXPLAINs flaky. Notice that we have different files for this already (e.g., regional_by_row and regional_by_row_query_behavior), so you might want to follow that example.

chengxiong-ruan

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @mgartner, @postamar, and @rytaft)

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index, line 1295 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

You probably need to be explicit about the column families in the schema, because otherwise the logic tests will randomize them -- that can be a source of flakiness. The optimizer plan is deterministic.

ah, thanks for pointing out the monster here! explicit families are added and it's not flaky anymore

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index, line 859 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

add the column family here and in other tables where you're showing the explain plan

it worked, thanks again!

pkg/ccl/logictestccl/testdata/logic_test/regional_by_row_hash_sharded_index, line 1266 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

You won't be able to move this to execbuilder since as you noted partitioning is an enterprise feature. But you should probably try to put the EXPLAINs in different files from the normal query results, so that we can test more configurations for the normal query results without making the EXPLAINs flaky. Notice that we have different files for this already (e.g., regional_by_row and regional_by_row_query_behavior), so you might want to follow that example.

thanks! I moved all tests with EXPLAIN to separate files.

chengxiong-ruan

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, @mgartner, @postamar, and @rytaft)

-- commits, line 6 at r9:

Previously, mgartner (Marcus Gartner) wrote…

nit: "kye" => "key"

good eyes! fixed

pkg/sql/opt/optbuilder/insert.go, line 368 at r9 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

In the case of a non-partitioned hash-sharded primary index, I don't think we need to fetch existing rows. The shard column is a function of the other PK columns, so it should be possible to perform the UPSERT fast path which blindly write the primary index entry without fetching the existing row. I don't think this is a huge concern for now, but I think it's worth leaving a todo: TODO(mgartner): It should be possible to perform an UPSERT fast path for a non-partitioned hash-sharded primary index, but this restriction disallows it..

Thanks for calling it out. TODO added

pkg/ccl/logictestccl/testdata/logic_test/regional_by_row_hash_sharded_index_query_plan, line 198 at r9 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

It would be interesting to have multi-insert-row variations for all of the INSERT ON CONFLICT ... and UPSERT tests, like we did in your previous PR. These cases are interesting because different optimizations need to be applied to make the query plan efficient.

test cases added.

pkg/ccl/logictestccl/testdata/logic_test/regional_by_row_hash_sharded_index_query_plan, line 703 at r10 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Does it pass both configurations for you locally? It looks like only multiregion-9node-3region-3azs is failing in CI. Otherwise, I don't know off the top of my head, sorry!

ha, no worries :) yeah, it passes in both configs locally

pkg/sql/opt/exec/execbuilder/testdata/hash_sharded_index, line 263 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

It looks semantically correct to me, but less efficient. The previous plan was a special case, the UPSERT fast path, which avoids reading from the table before performing an UPSERT. See my comment in pkg/sql/opt/optbuilder/insert.go. I don't think this is a huge concern, but worth leaving a TODO above this test: TODO(mgartner): We should be able to perform the UPSERT fast path in this case, because the shard column is a function of the other PK columns..

TODO added, thanks!

chengxiong-ruan

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, @mgartner, @postamar, and @rytaft)

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index_query_plan, line 40 at r12 (raw file):

);

# TODO (mgartner): there is a lookup join with lookup condition checking every

Apologies if I added too many such TODOs. I wanted to add a single comment for the file, but still wanted to be more accurate and consistent with the convention on how other TODOs were added :(

chengxiong-ruan · 2022-02-21T19:00:40Z

There is still a test failed in CI due to some discrepancy on scan planning:
at local, with the same test configuration, I always get one scan scanning the 3 regions:

└── • scan
      columns: (email)
      estimated row count: 1 (missing stats)
      table: t_unique_hash_sec_key@idx_uniq_hash_email
      spans: /"@"/13/"some_email"/0 /"\x80"/13/"some_email"/0 /"\xc0"/13/"some_email"/0
      parallel

in CI it also scans the 3 regions, but it has 2 scans then make a union:

└── • union all
    │ columns: (email)
    │ estimated row count: 1 (missing stats)
    │ limit: 1
    │
    ├── • scan
    │     columns: (email)
    │     estimated row count: 1 (missing stats)
    │     table: t_unique_hash_sec_key@idx_uniq_hash_email
    │     spans: /"@"/13/"some_email"/0
    │
    └── • scan
          columns: (email)
          estimated row count: 1 (missing stats)
          table: t_unique_hash_sec_key@idx_uniq_hash_email
          spans: /"\x80"/13/"some_email"/0 /"\xc0"/13/"some_email"/0
          parallel

I've been scratching my head because could repro it :(

chengxiong-ruan · 2022-02-21T19:01:02Z

could -> couldn't

mgartner

Nice work! First commit . I'll let someone with more experience with splitting sign off on the second commit.

I'll take a look at the flaky test and let you know what I find.

Reviewed 2 of 5 files at r9, 15 of 15 files at r11, 3 of 3 files at r12, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, @postamar, and @rytaft)

pkg/sql/create_index.go, line 521 at r4 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

Done.

nit: Maybe we can shorten this a bit? "hash sharded indexes cannot include implicit partitioning columns from "PARTITION ALL BY" or "LOCALITY REGIONAL BY ROW"

pkg/sql/create_table.go, line 1677 at r4 (raw file):

nit: Maybe we can shorten this a bit? "hash sharded indexes cannot include implicit partitioning columns from "PARTITION ALL BY" or "LOCALITY REGIONAL BY ROW"

pkg/sql/create_table.go, line 1894 at r11 (raw file):

					return nil, pgerror.New(
						pgcode.FeatureNotSupported,
						"hash sharded indexes don't support explicit partitioning with PARTITION BY",

nit: Maybe: "hash sharded indexes cannot be explicitly partitioned"

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index, line 134 at r11 (raw file):

  crdb_internal_c_shard_16 INT4 NOT VISIBLE NOT NULL AS (mod(fnv32(crdb_internal.datums_to_bytes(c)), 16:::INT8)) VIRTUAL,
  CONSTRAINT t_to_be_hashed_pkey PRIMARY KEY (a ASC),
  INDEX t_to_be_hashed_c_idx (c ASC) USING HASH WITH (bucket_count=16),

Normally there is a CHECK constraint created for shard columns. Why is it missing in this test and the ones below?

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index_query_plan, line 40 at r12 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

Apologies if I added too many such TODOs. I wanted to add a single comment for the file, but still wanted to be more accurate and consistent with the convention on how other TODOs were added :(

This is fine, thanks for leaving them. After this is merged I'll rebase #76817 and look into any plans that don't improve. If they don't improve, it might be related to this known issue: #63882. If so, I'll try to revive #65093.

rytaft · 2022-02-21T20:58:01Z

For the flaky test: the plan with the UNION ALL is how we plan locality optimized search. That is the correct plan for multiregion-9node-3region-3azs. However, we can't yet plan locality optimized search with the multiregion-9node-3region-3azs-tenant configuration due to #75864, so you should remove that configuration from the top of regional_by_row_hash_sharded_index_query_plan (Similar to the regional_by_row_query_behavior file).

Also it seems like you need these two lines to avoid flakes (also copied from regional_by_row_query_behavior):

statement ok
SET CLUSTER SETTING kv.closed_timestamp.side_transport_interval = '10ms';
SET CLUSTER SETTING kv.closed_timestamp.target_duration = '10ms';

chengxiong-ruan

Thanks @rytaft, you're right! after setting those cluster settings, looks like lookups and scans are split into 2 ops consistently, one for local and another for remote.

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @chengxiong-ruan, @mgartner, @postamar, and @rytaft)

pkg/sql/create_index.go, line 521 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: Maybe we can shorten this a bit? "hash sharded indexes cannot include implicit partitioning columns from "PARTITION ALL BY" or "LOCALITY REGIONAL BY ROW"

Done.

pkg/sql/create_table.go, line 1677 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: Maybe we can shorten this a bit? "hash sharded indexes cannot include implicit partitioning columns from "PARTITION ALL BY" or "LOCALITY REGIONAL BY ROW"

Done.

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index, line 134 at r11 (raw file):

hash sharded indexes cannot be explicitly partitioned
yes, it's still there, we just chose not to print a hidden constraint now since USING HASH implies the constraint with this fix #74179
let me know if that makes sense.

pkg/ccl/logictestccl/testdata/logic_test/partitioning_hash_sharded_index_query_plan, line 40 at r12 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

This is fine, thanks for leaving them. After this is merged I'll rebase #76817 and look into any plans that don't improve. If they don't improve, it might be related to this known issue: #63882. If so, I'll try to revive #65093.

Thank you!

Validations are added based on 2 rules: 1. hash sharded index cannot work with any explicit partitioning 2. hash sharded index cannot have partitioning field as explicit key column Also have a minor fix on how we calculate number of implicit columns. Release note (sql change): Previously, crdb blocked users from creating hash sharded index in all kinds of partitioned tables including implict partitioned tables using `PARTITION ALL BY` or `REGIONAL BY ROW`. Now we turn on the support of hash sharded index in implicit partitioned tables. Which means primary key cannot be hash sharded if a table is explicitly partitioned with `PARTITION BY` or an index cannot be hash sharded if the index is explicitly partitioned with `PARTITION BY`. Paritioning columns cannot be placed explicitly as key columns of a hash sharded index as well, including regional-by-row table's `crdb_region` column.

Release note (sql change): When a hash sharded index is partitioned, ranges are pre-split within every single possible partition on shard boundaries. Each partition is split up to 16 ranges, otherwise split into the number bucket count ranges. Note that, only list partition is being presplit, we don't presplit for range partition.

mgartner

Reviewed 7 of 7 files at r13, 3 of 3 files at r14, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, @mgartner, @postamar, and @rytaft)

postamar · 2022-02-22T21:38:36Z

LGTM as far as I can tell.

chengxiong-ruan · 2022-02-22T21:39:17Z

bors r+

craig · 2022-02-22T23:44:24Z

Build succeeded:

GitHub CI (Cockroach)

In cockroachdb#76358, a change was made to consider both partitioning and hash-shard columns as implicit, which was required for finding arbiters for UPSERT statements in tables that have partitioned, hash-sharded primary indexes. A side-effect of this change is that it prevents the planning of the fast-path of UPSERTs into tables with hash-sharded primary indexes. This commit fixes this regression. Release justification: This fixes a performance regression for hash-sharded indexes, which are being released as non-experimental in the upcoming release. There is no release note because this regression is not present in any releases. Release note: None

77561: changefeedccl: set server.child_metrics.enabled public column to true r=sherman-grewal a=sherman-grewal changefeedccl: set server.child_metrics.enabled public column to true Resolves #77536 Since we're expecting people to use this setting to use metrics labels, it should show up when running SHOW ALL CLUSTER SETTINGS. Release note (enterprise change): The public column of setting server.child_metrics.enabled will be set to true. Release justification: This change is very small and low risk 77775: opt: fix upsert fast path for hash-sharded primary indexes r=mgartner a=mgartner In #76358, a change was made to consider both partitioning and hash-shard columns as implicit, which was required for finding arbiters for UPSERT statements in tables that have partitioned, hash-sharded primary indexes. A side-effect of this change is that it prevents the planning of the fast-path of UPSERTs into tables with hash-sharded primary indexes. This commit fixes this regression. Release justification: This fixes a performance regression for hash-sharded indexes, which are being released as non-experimental in the upcoming release. There is no release note because this regression is not present in any releases. Release note: None Co-authored-by: Sherman Grewal <sherman.grewal@cockroachlabs.com> Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch 5 times, most recently from f321e31 to 13c2bee Compare February 11, 2022 22:25

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch 3 times, most recently from 6fcaac8 to db2412f Compare February 14, 2022 15:10

chengxiong-ruan marked this pull request as ready for review February 14, 2022 15:29

chengxiong-ruan requested a review from a team February 14, 2022 15:29

chengxiong-ruan requested a review from a team as a code owner February 14, 2022 15:29

chengxiong-ruan requested review from mgartner and ajwerner February 14, 2022 15:30

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch from db2412f to 2a19f7b Compare February 14, 2022 16:06

chengxiong-ruan commented Feb 14, 2022

View reviewed changes

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch 2 times, most recently from c8c31de to 16d4473 Compare February 15, 2022 03:40

chengxiong-ruan commented Feb 15, 2022

View reviewed changes

postamar reviewed Feb 15, 2022

View reviewed changes

chengxiong-ruan commented Feb 15, 2022

View reviewed changes

postamar reviewed Feb 15, 2022

View reviewed changes

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch from 16d4473 to 40faba4 Compare February 15, 2022 15:16

chengxiong-ruan commented Feb 15, 2022

View reviewed changes

rytaft reviewed Feb 15, 2022

View reviewed changes

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch from 40faba4 to 7fa5c71 Compare February 15, 2022 18:10

chengxiong-ruan commented Feb 15, 2022

View reviewed changes

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch from 7fa5c71 to 3cfb251 Compare February 15, 2022 19:30

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch from 01e9395 to c6dd0ac Compare February 17, 2022 21:41

chengxiong-ruan commented Feb 17, 2022

View reviewed changes

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch 2 times, most recently from 9ae8a97 to 8234562 Compare February 21, 2022 16:53

chengxiong-ruan commented Feb 21, 2022

View reviewed changes

mgartner approved these changes Feb 21, 2022

View reviewed changes

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch from 8234562 to d1cc0a8 Compare February 21, 2022 22:40

chengxiong-ruan commented Feb 21, 2022

View reviewed changes

chengxiong-ruan added 2 commits February 21, 2022 18:38

chengxiong-ruan force-pushed the support-hash-sharded-index-with-implicit-partitioning branch from d1cc0a8 to e7caa94 Compare February 21, 2022 23:39

mgartner approved these changes Feb 22, 2022

View reviewed changes

craig bot merged commit 67c8277 into cockroachdb:master Feb 22, 2022

This was referenced Feb 22, 2022

sql: support partitioned hash sharded index cockroachdb/docs#13087

Closed

sql: support partitioned hash sharded index cockroachdb/docs#13088

Closed

chengxiong-ruan deleted the support-hash-sharded-index-with-implicit-partitioning branch February 23, 2022 14:34

postamar mentioned this pull request Feb 23, 2022

Support hash-sharded index in partitioned tables #74626

Closed

This was linked to issues Feb 23, 2022

Support hash-sharded index in partitioned tables #74626

Closed

presplit partitioned hash sharded index #76507

Closed

chengxiong-ruan mentioned this pull request Feb 23, 2022

presplit partitioned hash sharded index #76507

Closed

mgartner mentioned this pull request Mar 14, 2022

opt: fix upsert fast path for hash-sharded primary indexes #77775

Merged

livio-a mentioned this pull request Apr 5, 2022

Handle locality in eventstore and projections zitadel/zitadel#3224

Closed

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: support partitioned hash sharded index #76358

sql: support partitioned hash sharded index #76358

chengxiong-ruan commented Feb 10, 2022 •

edited

Loading

cockroach-teamcity commented Feb 10, 2022

chengxiong-ruan commented Feb 11, 2022

chengxiong-ruan commented Feb 14, 2022

chengxiong-ruan left a comment

chengxiong-ruan left a comment

postamar left a comment

chengxiong-ruan left a comment

postamar left a comment

chengxiong-ruan left a comment

rytaft left a comment

chengxiong-ruan left a comment

chengxiong-ruan left a comment

chengxiong-ruan left a comment

chengxiong-ruan commented Feb 21, 2022

chengxiong-ruan commented Feb 21, 2022

mgartner left a comment

rytaft commented Feb 21, 2022

chengxiong-ruan left a comment

mgartner left a comment

postamar commented Feb 22, 2022

chengxiong-ruan commented Feb 22, 2022

craig bot commented Feb 22, 2022

sql: support partitioned hash sharded index #76358

sql: support partitioned hash sharded index #76358

Conversation

chengxiong-ruan commented Feb 10, 2022 • edited Loading

cockroach-teamcity commented Feb 10, 2022

chengxiong-ruan commented Feb 11, 2022

chengxiong-ruan commented Feb 14, 2022

chengxiong-ruan left a comment

Choose a reason for hiding this comment

chengxiong-ruan left a comment

Choose a reason for hiding this comment

postamar left a comment

Choose a reason for hiding this comment

chengxiong-ruan left a comment

Choose a reason for hiding this comment

postamar left a comment

Choose a reason for hiding this comment

chengxiong-ruan left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

chengxiong-ruan left a comment

Choose a reason for hiding this comment

chengxiong-ruan left a comment

Choose a reason for hiding this comment

chengxiong-ruan left a comment

Choose a reason for hiding this comment

chengxiong-ruan commented Feb 21, 2022

chengxiong-ruan commented Feb 21, 2022

mgartner left a comment

Choose a reason for hiding this comment

rytaft commented Feb 21, 2022

chengxiong-ruan left a comment

Choose a reason for hiding this comment

mgartner left a comment

Choose a reason for hiding this comment

postamar commented Feb 22, 2022

chengxiong-ruan commented Feb 22, 2022

craig bot commented Feb 22, 2022

chengxiong-ruan commented Feb 10, 2022 •

edited

Loading