
partition-allocator: default shard 0 reserved partitions to 0 #22841

Merged

Conversation

travisdowns
Member

@travisdowns travisdowns commented Aug 11, 2024

We have the topic_partitions_reserve_shard0 property, which makes the partition allocator behave as if that many partitions were already allocated to shard 0, beyond the amount actually allocated. The idea is to slightly bias partition-related work away from shard 0 in order to account for the extra duties it has to perform.

This is set to 2 by default. This change sets it to 0 instead. In general, the default value of 2 does not do a good job of accounting for the bias: it will be either much too high or much too low in most cases. For example, if every shard has 500 partitions, placing "only" 498 on shard 0 makes no difference, even though shard 0 might be doing significantly more work in this case. On the other hand, if someone takes our "recommendation" of using 1 or 2 shards per partition for maximum throughput, they might be surprised to find that shard 0 gets no partitions at all, since it has these two reserved slots. In this case shard 0 will be seriously underutilized.

Especially because of this second effect, which does come up in practice, 0 seems like the best default value for this property. Choosing a fixed reserved value to reduce the load on shard 0 is appealing on the surface but does not work out in practice. A minimal sketch of the effect follows below.

CORE-6852

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.2.x
  • v24.1.x
  • v23.3.x

Release Notes

Improvements

  • Set the default value of topic_partitions_reserve_shard0 to zero. This means that we no longer weight shard 0 as if it has 2 more partitions than it actually has, leading to more even partition distribution in cases where the total number of partitions is close to the vCPU count.

@travisdowns travisdowns requested a review from a team as a code owner August 11, 2024 17:36
@travisdowns travisdowns changed the title partition-allocator: default shard 0 parts to 0 partition-allocator: default shard 0 reserved partitions to 0 Aug 11, 2024

@travisdowns travisdowns merged commit 2c76829 into redpanda-data:dev Aug 13, 2024
20 of 22 checks passed