Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v23.3.x] cluster: initialize eviction stm for tx coordinator #17380

Merged
merged 4 commits into from
Mar 26, 2024

Conversation

andrwng
Copy link
Contributor

@andrwng andrwng commented Mar 25, 2024

We previously didn't register the eviction stm before returning from
partition creation of tx coordinator partitions. This meant that despite
the efforts of storage housekeeping and space management, tx coordinator
partitions would never evict data.

NOTE: this bug[1] doesn't exist on dev, but does in v23.3, and not in v23.2[2].

[1]

if (is_tx_manager_topic(_raft->ntp()) && _is_tx_enabled) {
_tm_stm = builder.create_stm<cluster::tm_stm>(
clusterlog,
_raft.get(),
_feature_table,
_tm_stm_cache_manager.local().get(_raft->ntp().tp.partition));
_raft->log()->stm_manager()->add_stm(_tm_stm);
co_return co_await _raft->start(std::move(builder));
}
if (is_transform_offsets_topic(_raft->ntp())) {
vassert(
topic_cfg.has_value(),
"No topic configuration passed, stm requires configuration for "
"partition count.");
_transform_offsets_stm = builder.create_stm<transform_offsets_stm_t>(
topic_cfg->partition_count, clusterlog, _raft.get());
}
/**
* Data partitions
*/
if (!storage::deletion_exempt(_raft->ntp())) {
_log_eviction_stm = builder.create_stm<cluster::log_eviction_stm>(
_raft.get(), clusterlog, _kvstore);
_raft->log()->stm_manager()->add_stm(_log_eviction_stm);
}

[2]
} else if (is_tx_manager_topic(_raft->ntp())) {
if (
_raft->log_config().is_collectable()
&& !storage::deletion_exempt(_raft->ntp())) {
_log_eviction_stm = ss::make_shared<cluster::log_eviction_stm>(
_raft.get(), clusterlog, _as, _kvstore);
stm_manager->add_stm(_log_eviction_stm);
}
if (_is_tx_enabled) {
auto tm_stm_cache = _tm_stm_cache_manager.local().get(
_raft->ntp().tp.partition);
_tm_stm = ss::make_shared<cluster::tm_stm>(
clusterlog, _raft.get(), feature_table, tm_stm_cache);
stm_manager->add_stm(_tm_stm);
}

Includes a backport of #17379
Partially includes a backport of #15424 for testing

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

Bug Fixes

  • Fixes a bug that would prevent the transaction coordinator topic from reclaiming disk space.

mmaslankaprv and others added 3 commits March 25, 2024 16:55
CONFLICTS:
- tm_stm_name not required to construct tm_stm in partition::start()

Removed `get_name` method from `raft::state_machine_base` and made it a
static variable. This way called does not have to provide a name and we
can easily query state machine by type rather than by name.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 9a09680)
Added an API method that will allow caller to query for an stm with a
specific name. The `dynamic_cast` is used to provide a basic level of
runtime correctness check.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit dd5798c)
We previously didn't register the eviction stm before returning from
partition creation of tx coordinator partitions. This meant that despite
the efforts of storage housekeeping and space management, tx coordinator
partitions would never evict data.
@andrwng andrwng force-pushed the v23.3.x-tx-stm-eviction branch from a6b7773 to d76447f Compare March 25, 2024 23:56
@andrwng
Copy link
Contributor Author

andrwng commented Mar 25, 2024

Force push: pulled in some of #15424 to be able to get the tm_stm in the test

@andrwng andrwng force-pushed the v23.3.x-tx-stm-eviction branch from d76447f to c45abbd Compare March 26, 2024 01:26
CONFLICT:
- different CMakeLists surroundings

We've seen in older versions of Redpanda that despite the 'delete'
cleanup policy being enabled, eviction may not be triggered. This is
because in older versions of Redpanda[1], we would not register the
eviction STM for tx coordinator partitions.

[1] https://github.com/redpanda-data/redpanda/blob/3ce012f9ccb64810eafcc4b8b2e0df7340172879/src/v/cluster/partition.cc#L446-L471

(cherry picked from commit 761dfe9)
@andrwng andrwng force-pushed the v23.3.x-tx-stm-eviction branch from c45abbd to e6b1b69 Compare March 26, 2024 02:09
@piyushredpanda piyushredpanda merged commit 02bd8b5 into redpanda-data:v23.3.x Mar 26, 2024
16 checks passed
@piyushredpanda piyushredpanda added this to the v23.3.10 milestone Mar 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants