[fix][storage] Autorecovery default reppDnsResolverClass to ZkBookieRackAffinityMapping #15640
@michaeljmarshall: Thanks for your contribution. For this PR, do we need to update docs?
LGTM
Please start a discussion on the dev mailing list first, since this PR changes a default value. I support the change, but the discussion should happen on the mailing list first.
As discussed offline with @codelipenghui, we should discuss this change on the ML.
It is effectively a breaking change for people who changed the configuration in their broker.conf files.
@rdhabalia FYI
I'll put together a PIP on Thursday or Friday. I agree that we should discuss this on the mailing list.
The PR had no activity for 30 days; marked with the Stale label.
Related explanation about ZkBookieRackAffinityMapping: #151 (comment). On the broker side, ZkBookieRackAffinityMapping will get used for the bookie client when
Force-pushed: b7f19be to 108a0d4
@codelipenghui @eolivelli - I agree that this PR changes a default. However, I think the current default in the
Force-pushed: edaafe7 to 804b250
The build is failing with the following error:
I wonder if we need to add a new scope? None of the available ones seem appropriate for this setting. I assume I'm supposed to use "storage", but it seems like it would be reasonable to indicate that the change is BK specific.
/pulsarbot rerun-failure-checks
Force-pushed: 804b250 to d090bc2
Rebased and force-pushed to trigger a new CI run.
Codecov Report
@@ Coverage Diff @@
## master #15640 +/- ##
=============================================
+ Coverage 34.91% 52.56% +17.65%
- Complexity 5707 7269 +1562
=============================================
Files 607 393 -214
Lines 53396 43418 -9978
Branches 5712 4462 -1250
=============================================
+ Hits 18644 22824 +4180
+ Misses 32119 18046 -14073
+ Partials 2633 2548 -85
LGTM
Fixes: #18012

### Motivation

The current BookKeeper configuration defaults to using `org.apache.bookkeeper.net.ScriptBasedMapping` for the `DNSToSwitchMapping` implementation. However, this default does not align with the broker's default, which is `org.apache.pulsar.zookeeper.ZkBookieRackAffinityMapping`. As such, the default configuration for a Pulsar cluster does not lead to ideal rack awareness when ledgers need to be recovered. The result is that a user can configure a cluster for rack awareness and the brokers will honor that configuration, but the autorecovery process will not, because it does not have the correct BookKeeper cluster topology view.

I propose we configure BookKeeper to use the broker's `ZkBookieRackAffinityMapping` class. That way, autorecovery will honor the operator's configured rack awareness policies out of the box.

### Modifications

* Add a default value for `reppDnsResolverClass` to the `conf/bookkeeper.conf` configuration. This change effectively switches the default from `org.apache.bookkeeper.net.ScriptBasedMapping` to `org.apache.pulsar.zookeeper.ZkBookieRackAffinityMapping`.

### Verifying this change

I manually verified that `ZkBookieRackAffinityMapping` works by running some tests in a minikube cluster deployed with the DataStax helm chart. I set up 3 racks, 4 bookies, and a topic with E=2, Qw=2, and Qa=2. I then verified that the autorecovery pod correctly discovered the racks and then identified when an ensemble was not following the rack placement policy after two bookies were removed. I documented my testing a bit more here: datastax/pulsar-helm-chart#214.

### Does this pull request potentially affect one of the following parts:

It changes a default value. The tradeoff is that a user relying on the `ScriptBasedMapping` default might accidentally get switched to using the `ZkBookieRackAffinityMapping` implementation. Given that `ScriptBasedMapping` doesn't work out of the box, and that the broker defaults to `ZkBookieRackAffinityMapping`, I think this is an acceptable tradeoff.

- [x] `doc`
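To make the tradeoff concrete, here is a minimal sketch of how the effective resolver could differ between a config file that omits `reppDnsResolverClass`, one that ships the new default, and one that pins the old value. It assumes the resolver is chosen by a plain key lookup that falls back to `ScriptBasedMapping` when the key is absent, as the description above states. `ResolverDefaultSketch` and `effectiveResolver` are hypothetical names for illustration only, not BookKeeper or Pulsar APIs.

```java
import java.util.Properties;

// Illustrative only: models the default-resolution behavior described in this PR.
final class ResolverDefaultSketch {
    // Class names and the config key are taken from the PR description.
    static final String KEY = "reppDnsResolverClass";
    static final String SCRIPT_BASED =
            "org.apache.bookkeeper.net.ScriptBasedMapping";
    static final String ZK_RACK_AFFINITY =
            "org.apache.pulsar.zookeeper.ZkBookieRackAffinityMapping";

    // Hypothetical helper: returns the configured class name, or the
    // assumed fallback (ScriptBasedMapping) when the key is missing or empty.
    static String effectiveResolver(Properties conf) {
        String value = conf.getProperty(KEY);
        if (value == null || value.trim().isEmpty()) {
            return SCRIPT_BASED;
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // A config file written before this change: the key is absent.
        Properties keyAbsent = new Properties();

        // The conf/bookkeeper.conf shipped with this change.
        Properties newDefault = new Properties();
        newDefault.setProperty(KEY, ZK_RACK_AFFINITY);

        // An operator who wants to keep the previous behavior pins it explicitly.
        Properties pinned = new Properties();
        pinned.setProperty(KEY, SCRIPT_BASED);

        System.out.println("key absent  -> " + effectiveResolver(keyAbsent));
        System.out.println("new default -> " + effectiveResolver(newDefault));
        System.out.println("pinned old  -> " + effectiveResolver(pinned));
    }
}
```

Under these assumptions, an operator who depends on `ScriptBasedMapping` can avoid the accidental switch by setting `reppDnsResolverClass` explicitly rather than relying on the shipped default.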