write caching - configurations #16924

bharathv · 2024-03-06T22:16:11Z

This PR adds all the configurations/tunables needed for the write caching project as described in https://docs.google.com/document/d/1gD2vXeQuv9gslagl6rDxx3XQsBI9BkXT2Ktpz9my2V0/edit#heading=h.5zvj0etnl8iy

The code that uses them will come in a separate PR, so until that lands these configurations are unused placeholders. They are split out into their own PR for reviewers' convenience.

Summary of configurations

Cluster level:

write_caching - cluster level write caching default - [on, off, disabled]
raft_replica_max_flush_delay_ms - for timer based flush - default 100ms
(to be deprecated) raft_flush_timer_interval_ms

Topic level:

write.caching - topic level override for write_caching - [on, off]
flush.ms - topic level override for raft_replica_max_flush_delay_ms
flush.bytes - topic level override for raft_replica_max_pending_flush_bytes

write_caching=disabled is intended to be the feature kill switch that takes precedence over topic overrides.
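The precedence rules in this summary can be sketched as a small resolution function. This is only an illustration, not the code from this PR: `write_caching_mode` here is a local stand-in for `model::write_caching_mode`, and `write_caching_enabled` is a hypothetical helper name.

```cpp
#include <cassert>
#include <optional>

// Local stand-in for model::write_caching_mode (assumption: same three values).
enum class write_caching_mode { on, off, disabled };

// Hypothetical helper: returns true when write caching should be active
// for a topic.
//   cluster_mode:   value of the cluster-level write_caching setting.
//   topic_override: value of the topic-level write.caching property, if set
//                   (assumed to be only on or off).
constexpr bool write_caching_enabled(
  write_caching_mode cluster_mode,
  std::optional<write_caching_mode> topic_override) {
    // Kill switch: disabled wins over any topic override.
    if (cluster_mode == write_caching_mode::disabled) {
        return false;
    }
    // Otherwise the topic override, when present, replaces the cluster default.
    return topic_override.value_or(cluster_mode) == write_caching_mode::on;
}
```

With this shape, a topic that sets write.caching=on is still not cached while the cluster value is disabled, and it picks up caching again once the cluster value goes back to on or off.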

Release Notes

Features

Adds new cluster and topic level configurations for the write caching feature.

bharathv · 2024-03-06T22:47:57Z

/dt

vbotbuildovich · 2024-03-07T01:19:47Z

new failures in https://buildkite.com/redpanda/redpanda/builds/45762#018e1636-93c3-4647-b703-644154fe715e:

"rptest.tests.describe_topics_test.DescribeTopicsTest.test_describe_topics_with_documentation_and_types"

new failures in https://buildkite.com/redpanda/redpanda/builds/45762#018e1636-93bc-4acf-aad7-3901eb2cf74c:

"rptest.tests.cluster_config_test.ClusterConfigTest.test_valid_settings"
"rptest.tests.topic_creation_test.CreateTopicsTest.test_no_log_bloat_when_recreating_existing_topics"

new failures in https://buildkite.com/redpanda/redpanda/builds/45762#018e1636-93c0-4ff3-b62e-5b541a1bc235:

"rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=False.num_to_upgrade=0.with_tiered_storage=False"
"rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=True.num_to_upgrade=0.with_tiered_storage=False"

new failures in https://buildkite.com/redpanda/redpanda/builds/45762#018e169f-f929-488e-bef0-4ef43a7fb63b:

"rptest.tests.topic_creation_test.CreateTopicsTest.test_no_log_bloat_when_recreating_existing_topics"

new failures in https://buildkite.com/redpanda/redpanda/builds/45762#018e169f-f922-44f5-8b1f-7d46ffc91219:

"rptest.tests.cluster_config_test.ClusterConfigTest.test_valid_settings"

new failures in https://buildkite.com/redpanda/redpanda/builds/45762#018e169f-f92d-4ac0-b53e-211cdff5232d:

"rptest.tests.describe_topics_test.DescribeTopicsTest.test_describe_topics_with_documentation_and_types"

new failures in https://buildkite.com/redpanda/redpanda/builds/45851#018e1ce1-2ccd-4ac4-9860-9905b3d87a97:

"rptest.tests.cloud_retention_test.CloudRetentionTest.test_gc_entire_manifest.cloud_storage_type=CloudStorageType.ABS"

vbotbuildovich · 2024-03-07T01:30:00Z

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/45762#018e1636-93bc-4acf-aad7-3901eb2cf74c

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/45785#018e1818-d8a8-417c-8e77-45a56dcf6a10

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/45809#018e1a5b-c224-4604-9534-741815d5ebff

src/v/cluster/types.cc

nvartolomei · 2024-03-07T11:26:57Z

src/v/kafka/server/handlers/configs/config_utils.h

+              model::write_caching_mode::off);
+        }
+        auto cluster_default = config::shard_local_cfg().write_caching();
+        if (cluster_default == model::write_caching_mode::disabled) {


Wondering if we should error here or just accept it and log a warning instead.

I'm thinking about the following case: someone enables write caching on some topics, disables the feature globally, then wants to change the topic settings so that when the feature is re-enabled it applies to only one topic rather than to all of them, as it did before it was disabled.

With the current implementation the above workflow won't be possible.

bharathv (reply) · 2024-03-07

I think accepting it gives the caller the impression that write caching is enabled.

Eg:

Admin disables write caching globally
User X enables write caching on their topic, command returns success

X thinks write caching is in place when in reality it isn't. Returning an explicit error forces them to check and see that it is disabled globally and cannot be used. wdyt?
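The "explicit error" option argued for in this thread could look roughly like the sketch below. It is a simplified illustration under assumptions: `alter_result`, `parse_topic_mode` and `check_write_caching_override` are hypothetical names, and the order of the two checks is a guess rather than what the PR does.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Local stand-in for model::write_caching_mode.
enum class write_caching_mode { on, off, disabled };

// Hypothetical outcome codes for a topic-level alter-config request.
enum class alter_result { ok, invalid_value, feature_disabled };

// The topic-level write.caching property accepts only "on" or "off".
inline std::optional<write_caching_mode> parse_topic_mode(std::string_view s) {
    if (s == "on") {
        return write_caching_mode::on;
    }
    if (s == "off") {
        return write_caching_mode::off;
    }
    return std::nullopt;
}

// Rejects the request outright when the cluster-level kill switch is set,
// so the caller cannot mistake a silent no-op for success.
inline alter_result check_write_caching_override(
  write_caching_mode cluster_mode, std::string_view requested) {
    if (!parse_topic_mode(requested)) {
        return alter_result::invalid_value;
    }
    if (cluster_mode == write_caching_mode::disabled) {
        return alter_result::feature_disabled;
    }
    return alter_result::ok;
}
```

The alternative raised by the reviewer (accept the value and log a warning) would return `ok` in the `disabled` branch instead, which keeps the pre-configuration workflow possible at the cost of a less visible failure.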

nvartolomei · 2024-03-07T11:29:58Z

src/v/kafka/server/handlers/topics/validators.h

+        }
+        try {
+            auto val = boost::lexical_cast<size_t>(it->value.value());
+            return val > 0;


Do we always duplicate the validation logic in two places like this? I also saw this in config_utils.h.

bharathv (reply) · 2024-03-07

That's true, that's how it is done today. The two cases are topic creation and updating properties. The validators use different signatures, but we could probably consolidate them by factoring out the common validation logic.
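One way the consolidation mentioned above might look is a single parsing predicate that both call sites wrap. This is a sketch, not the PR's code: it uses `std::from_chars` instead of `boost::lexical_cast` to stay self-contained, the function names are hypothetical, and treating an absent property as valid is an assumption.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Shared rule: the whole string must be a base-10 integer greater than zero.
// Rejects empty input, signs, whitespace, trailing characters and overflow.
inline bool is_positive_integer(std::string_view s) {
    std::uint64_t value = 0;
    const char* end = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(s.data(), end, value);
    return ec == std::errc{} && ptr == end && value > 0;
}

// Create-topic style caller (hypothetical): properties arrive as a map.
inline bool validate_create(const std::map<std::string, std::string>& props) {
    auto it = props.find("flush.bytes");
    return it == props.end() || is_positive_integer(it->second);
}

// Alter-config style caller (hypothetical): a single optional value.
inline bool validate_alter(const std::optional<std::string>& value) {
    return !value.has_value() || is_positive_integer(*value);
}
```

Both wrappers keep their own signatures, so the existing validator interfaces would not need to change; only the rule itself lives in one place.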

bharathv · 2024-03-07T18:09:03Z

Force-pushed: fixed linter errors, tightened the e2e test, and squashed a couple of commits.

bharathv · 2024-03-07T21:12:19Z

Force-pushed to fix rebase conflicts with dev (which caused the last dt failure).

src/v/redpanda/admin/api-doc/debug.json

src/v/config/convert.h

src/v/model/metadata.h

ztlpn · 2024-03-07T22:40:39Z

src/v/raft/group_manager.h

@@ -49,6 +49,8 @@ class group_manager {
        config::binding<std::chrono::milliseconds> election_timeout_ms;
        config::binding<std::optional<size_t>> replica_max_not_flushed_bytes;
        config::binding<std::chrono::milliseconds> flush_timer_interval_ms;
+        config::binding<model::write_caching_mode> write_caching;
+        config::binding<std::chrono::milliseconds> write_caching_flush_ms;


I'm a bit confused about the relation between the old and new config properties. AFAIU we are going to deprecate flush_timer_interval_ms but reuse replica_max_not_flushed_bytes. Will the old code using replica_max_not_flushed_bytes be removed after write caching is introduced? Should we then add another binding, write_caching_flush_bytes, for consistency and remove the old binding when it is no longer needed?

bharathv (reply) · 2024-03-08

"AFAIU we are going to deprecate flush_timer_interval_ms but reuse replica_max_not_flushed_bytes"

Correct.

"Will the old code using replica_max_not_flushed_bytes be removed after write caching is introduced? Should we then add another binding write_caching_flush_bytes for consistency and remove the old binding when it is no longer needed?"

Correct, that is coming in the next PR. I tried to keep this PR a purely mechanical change limited to configs/properties.

src/v/config/configuration.cc

src/v/compat/cluster_compat.h

Adds cluster configuration write_caching. Accepted values are [on, off, disabled]. on/off applies to all topics in the cluster by default unless there is a topic-level override via the write.caching property (to be added in subsequent commits). write_caching=disabled takes precedence over everything, including topic overrides, and is intended to be a kill switch if something goes wrong.

write.caching can override the write_caching cluster configuration. Acceptable values: [on, off]. write_caching=disabled takes precedence over write.caching overrides.

raft_replica_max_flush_delay_ms acts as the cluster default for flush.ms. raft_flush_timer_interval_ms will be deprecated in favor of this new configuration in a future PR.

flush.ms topic property overrides raft_replica_max_flush_delay_ms.
flush.bytes topic property overrides raft_replica_max_pending_flush_bytes.
flush.ms acceptable values: >= 1ms.
flush.bytes acceptable values: > 0.

unclear if this is needed, just adding for completeness.

The configuration options are very rarely changed, so it is wasteful to compute them for every write request (reconciling cluster + topic properties). This change keeps a copy of the configs in the consensus class, which is updated using notification hooks.
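The caching approach described in this commit message can be illustrated with a minimal observer sketch. Everything here is hypothetical: `observable` is a simplified stand-in for a config binding, `replica_config_cache` stands in for the state kept by the consensus class, and only the flush delay is modelled.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Simplified stand-in for a config binding: holds a value and runs
// registered hooks after every update.
template<typename T>
class observable {
public:
    explicit observable(T v)
      : _value(std::move(v)) {}
    const T& get() const { return _value; }
    void watch(std::function<void()> hook) { _hooks.push_back(std::move(hook)); }
    void set(T v) {
        _value = std::move(v);
        for (auto& hook : _hooks) {
            hook();
        }
    }

private:
    T _value;
    std::vector<std::function<void()>> _hooks;
};

using ms = std::chrono::milliseconds;

// Hypothetical replica-side cache of the reconciled flush delay.
// The hooks capture `this`, so the object must not be copied and must
// outlive the observables' use of those hooks.
class replica_config_cache {
public:
    replica_config_cache(
      observable<ms>& cluster_default,
      observable<std::optional<ms>>& topic_override)
      : _cluster(cluster_default)
      , _topic(topic_override) {
        // Reconcile only when an input changes, not on every write.
        _cluster.watch([this] { refresh(); });
        _topic.watch([this] { refresh(); });
        refresh();
    }
    replica_config_cache(const replica_config_cache&) = delete;

    // Cheap read for the write path.
    ms flush_delay() const { return _flush_delay; }

private:
    void refresh() { _flush_delay = _topic.get().value_or(_cluster.get()); }

    observable<ms>& _cluster;
    observable<std::optional<ms>>& _topic;
    ms _flush_delay{};
};
```

The write path only reads `flush_delay()`, while the cluster plus topic reconciliation runs when either setting actually changes.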

Configuration values as seen by each replica are now a part of the debug dump for the partition.

These errors are not propagated to the clients (which needs to be fixed); until then, they are logged in debug mode.

github-actions bot added the area/redpanda label Mar 6, 2024

bharathv force-pushed the wc1 branch from 124fbc6 to 63b7d82 March 7, 2024 07:33

bharathv marked this pull request as ready for review March 7, 2024 07:34

bharathv requested review from mmaslankaprv, dotnwat, nvartolomei and ztlpn March 7, 2024 07:34

bharathv self-assigned this Mar 7, 2024

nvartolomei reviewed Mar 7, 2024

src/v/cluster/types.cc (resolved)

nvartolomei reviewed Mar 7, 2024

bharathv force-pushed the wc1 branch from 63b7d82 to 21d3e2e March 7, 2024 18:07

bharathv requested a review from nvartolomei March 7, 2024 18:10

bharathv force-pushed the wc1 branch 2 times, most recently from f2cd870 to daf12a1 March 7, 2024 21:10

ztlpn reviewed Mar 7, 2024

bharathv force-pushed the wc1 branch from daf12a1 to e5bc836 March 8, 2024 05:50

bharathv requested a review from ztlpn March 8, 2024 05:54

bharathv added 7 commits March 8, 2024 09:34

k/topic_property: Add key for write caching topic property

b8d0290

k/topic_property: write.caching implementation

ce415eb

write.caching can override the write_caching cluster configuration. Acceptable values: [on, off]. write_caching=disabled takes precedence over write.caching overrides.

configuration: Add raft_replica_max_flush_delay_ms

8421108

raft_replica_max_flush_delay_ms acts as the cluster default for flush.ms. raft_flush_timer_interval_ms will be deprecated in favor of this new configuration in a future PR.

k/topic_properties: key defs for flush.ms and flush.bytes

27db06c

k/config: add parse_and_set_optional_duration utility

7ab0862

k/topic_properties: flush.ms and flush.bytes implementations

7e3c8b2

flush.ms topic property overrides raft_replica_max_flush_delay_ms.
flush.bytes topic property overrides raft_replica_max_pending_flush_bytes.
flush.ms acceptable values: >= 1ms.
flush.bytes acceptable values: > 0.

bharathv added 9 commits March 8, 2024 09:34

k/describe_config: add plumbing for new topic properties

02a0142

k/topic_properties: add adl encoding

f8f47ed

unclear if this is needed, just adding for completeness.

k/handlers: remove dead code

efcd60e

raft/debug: add write caching configs to partition debug state

a7efd39

Configuration values as seen by each replica are now a part of the debug dump for the partition.

k/alter configs: Add more debug logging

44bbebf

These errors are not propagated to the clients (which needs to be fixed); until then, they are logged in debug mode.

tests: alter config tests for write caching properties

7805d09

tests/ducktape: e2e test for write caching configs

040cac1

offline_log_viewer: updates for write caching properties

bd9ae19

bharathv force-pushed the wc1 branch from e5bc836 to bd9ae19 March 8, 2024 17:35

ztlpn approved these changes Mar 8, 2024

bharathv merged commit 60384c8 into redpanda-data:dev Mar 8, 2024
17 checks passed
