Commit
76012: server, sql: add VIEWCLUSTERSETTING user privilege r=koorosh a=koorosh

Previously, only users with the `admin` role or the `MODIFYCLUSTERSETTING` privilege could view cluster settings. This change adds a new privilege that gives users view-only access to cluster settings from the SQL shell and in the DB Console (Advanced Debugging > Cluster Settings).

The behavior of `MODIFYCLUSTERSETTING` is unchanged: it still allows both viewing and modifying cluster settings.

Release note (sql change): New user privileges are added: `VIEWCLUSTERSETTING` and `NOVIEWCLUSTERSETTING`, which control whether a user can view (but not modify) cluster settings.

Resolves: #74692

76215: kvserver: loosely couple raft log truncation r=tbg a=sumeerbhola

In the ReplicasStorage design we stop making any assumptions about what is durable in the state machine when syncing a batch that commits changes to the raft log. This means raft log truncation must be more loosely coupled than it is now, since we can truncate only when we are certain that the state machine is durable up to the truncation index.

Current raft log truncation flows through raft. Even though the RaftTruncatedStateKey is not a replicated key, it is coupled in the sense that the truncation is done below raft when processing the corresponding log entry (the one that asked for the truncation).

The current setup also has correctness issues with respect to maintaining the raft log size when passing the delta bytes for a truncation. We compute the delta at proposal time (to avoid repeating the iteration over the entries in all replicas), but we do not pass the first index corresponding to the truncation, so gaps or overlaps cannot be noticed at truncation time.

We do want the raft leader to continue guiding the truncation, since we do not want either the leader or the followers to over-truncate, given our desire to serve snapshots from any replica.
In the loosely coupled approach implemented here, the truncation request that flows through raft serves as an upper bound on what can be truncated.

The truncation request includes an ExpectedFirstIndex, which is propagated further using ReplicatedEvalResult.RaftExpectedFirstIndex. The ExpectedFirstIndex makes it possible to notice gaps or overlaps when enacting a sequence of truncations; when one is noticed, Replica.raftLogSizeTrusted is set to false. The correctness issue with Replica.raftLogSize is not fully addressed, since there are existing consistency issues when evaluating a TruncateLogRequest (these are now noted in a code comment).

Below raft, the truncation requests are queued onto a Replica in pendingLogTruncations. The queueing and dequeuing is managed by a raftLogTruncator, which merges pending truncation requests and enacts the truncations when the durability of the state machine advances.

The pending truncation requests are taken into account in the raftLogQueue when deciding whether to do another truncation. Most of the behavior of the raftLogQueue is unchanged.

The new behavior is gated on a LooselyCoupledRaftLogTruncation cluster version. Additionally, the new behavior can be turned off using the kv.raft_log.loosely_coupled_truncation.enabled cluster setting, which is true by default. The setting is expected to be a safety switch for one release, after which we expect to remove it. That removal will also clean up some duplicated code (which was non-trivial to refactor and share) between the previous coupled and the new loosely coupled truncation.

Note that this PR is the first of two: loosely coupled truncation is turned off via a constant in this PR. The next PR will eliminate the constant and put the behavior under the control of the cluster setting.

Informs #36262
Informs #16624

Release note (ops change): The cluster setting kv.raft_log.loosely_coupled_truncation.enabled can be used to disable loosely coupled truncation.
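The queueing behavior described for 76215 can be illustrated with a minimal Go sketch. It is not the raftLogTruncator code from this PR: the `pendingTruncation` and `replica` types, their fields, and the `enqueue` and `durabilityAdvanced` methods are hypothetical names chosen for illustration, and the merging of pending requests is omitted. The sketch only shows the two ideas from the description: a truncation is enacted once the state machine is durable up to its index, and a mismatch against the expected first index marks the log size as untrusted.

```go
package main

import "fmt"

// pendingTruncation is a hypothetical, simplified stand-in for a queued
// truncation request. Field names are illustrative only.
type pendingTruncation struct {
	expectedFirstIndex uint64 // first log index the proposer assumed
	index              uint64 // truncate entries up to and including index
	logDeltaBytes      int64  // size delta computed at proposal time (negative)
}

// replica holds only the state needed for this sketch.
type replica struct {
	firstIndex     uint64 // first index still present in the raft log
	logSize        int64
	logSizeTrusted bool
	pending        []pendingTruncation
}

// enqueue records a truncation below raft without enacting it.
func (r *replica) enqueue(t pendingTruncation) {
	r.pending = append(r.pending, t)
}

// durabilityAdvanced enacts, in order, every queued truncation whose index
// is covered by the durable state machine index.
func (r *replica) durabilityAdvanced(durableIndex uint64) {
	i := 0
	for ; i < len(r.pending); i++ {
		t := r.pending[i]
		if t.index > durableIndex {
			break // not yet safe to truncate this far
		}
		if t.index < r.firstIndex {
			continue // log is already truncated past this request
		}
		if t.expectedFirstIndex != r.firstIndex {
			// Gap or overlap: the proposal-time delta no longer matches
			// the entries actually being removed.
			r.logSizeTrusted = false
		}
		r.logSize += t.logDeltaBytes
		if r.logSize < 0 {
			r.logSize = 0
		}
		r.firstIndex = t.index + 1
	}
	r.pending = r.pending[i:]
}

func main() {
	r := &replica{firstIndex: 10, logSize: 1000, logSizeTrusted: true}
	r.enqueue(pendingTruncation{expectedFirstIndex: 10, index: 19, logDeltaBytes: -400})
	// The second request assumes first index 25, but it will be 20: a gap.
	r.enqueue(pendingTruncation{expectedFirstIndex: 25, index: 29, logDeltaBytes: -300})

	r.durabilityAdvanced(15) // nothing is durable far enough yet
	fmt.Println(r.firstIndex, r.logSize, r.logSizeTrusted, len(r.pending))
	// prints: 10 1000 true 2

	r.durabilityAdvanced(30) // both requests can now be enacted
	fmt.Println(r.firstIndex, r.logSize, r.logSizeTrusted, len(r.pending))
	// prints: 30 300 false 0
}
```

In this sketch the size delta is still applied after a mismatch, so the tracked size stays approximately right while the trusted flag records that it may be off.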
76358: sql: support partitioned hash sharded index r=chengxiong-ruan a=chengxiong-ruan

Release note (sql change): Previously, crdb blocked users from creating hash sharded indexes in all kinds of partitioned tables, including implicitly partitioned tables using `PARTITION ALL BY` or `REGIONAL BY ROW`. Hash sharded indexes are now supported in implicitly partitioned tables. Explicit partitioning remains unsupported: a primary key cannot be hash sharded if the table is explicitly partitioned with `PARTITION BY`, and an index cannot be hash sharded if the index is explicitly partitioned with `PARTITION BY`. Partitioning columns, including a regional-by-row table's `crdb_region` column, also cannot be placed explicitly as key columns of a hash sharded index. When a hash sharded index is partitioned, ranges are pre-split on shard boundaries within every possible partition. Each partition is split into at most 16 ranges; if the bucket count is smaller than 16, it is split into bucket-count ranges.

Co-authored-by: Andrii Vorobiov <and.vorobiov@gmail.com>
Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com>
Co-authored-by: Chengxiong Ruan <chengxiongruan@gmail.com>