ttljob: introduce sql.ttl.default_select_rate_limit cluster setting, ttl_select_rate_limit storage param #110742
Labels
A-row-level-ttl
branch-release-23.2
Used to mark GA and release blockers, technical advisories, and bugs for 23.2
branch-release-23.2.0-rc
(deleted)
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
GA-blocker
T-sql-foundations
SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Ideally, admission control (AC) should handle all shaping of traffic that affects a bottleneck resource, and AC is well suited to shaping TTL traffic when the bottleneck is CPU or write bandwidth. But we have seen production cases where the TTL job runs frequently, scans a large table quickly, and saturates the provisioned read bandwidth and read IOPS. Admission control does not currently consider these two resource dimensions, and the work involved to change that is substantial (tracked in #107623).
For the TTL case, we need a quick workaround for the above cases. The TTL job has an optional sql.ttl.default_delete_rate_limit that can be set to shape the writes, but we need one for the reads too, in cases where there are few writes (e.g. the TTL job running every 24h, but usual expiry happening after 365 days). We should introduce an optional sql.ttl.default_select_rate_limit for these scenarios. @ecwall
Additionally, add a ttl_select_rate_limit storage param similar to ttl_delete_rate_limit.
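For illustration, here is a rough sketch of how the proposed knobs might be used, assuming they mirror the syntax of the existing sql.ttl.default_delete_rate_limit cluster setting and ttl_delete_rate_limit storage param. The table name events and the numeric values are hypothetical, and the unit (rows per second) is an assumption carried over from the delete limit, not something this issue specifies.

```sql
-- Proposed cluster-wide default for TTL SELECT batches.
-- Assumption: same shape and unit as sql.ttl.default_delete_rate_limit.
SET CLUSTER SETTING sql.ttl.default_select_rate_limit = 1000;

-- Proposed per-table override, analogous to ttl_delete_rate_limit.
-- "events" is a hypothetical table that already has row-level TTL enabled.
ALTER TABLE events SET (ttl_select_rate_limit = 500);

-- Remove the per-table override so the table falls back to the cluster default.
ALTER TABLE events RESET (ttl_select_rate_limit);
```

Under this sketch, a table that expires few rows but is scanned in full on every run could cap only its reads, leaving the delete limit untouched.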
Jira issue: CRDB-31587
Epic CRDB-18322