Broker change dynamic configuration based on ZK latency to reduce zk-load #342

Closed
rdhabalia opened this issue Apr 10, 2017 · 6 comments
Labels
lifecycle/stale, type/enhancement

Comments

@rdhabalia
Contributor

Existing behavior

Sometimes a broker sees high ZooKeeper (zk) latency due to GC pauses or high load on ZooKeeper, which can cause the broker to lose its zk session. That can lead to a cold restart, which in turn puts heavy pressure on zk, because a cold restart loads a large number of topics concurrently and triggers many lookup requests.

The broker can already throttle concurrent lookup requests and topic loading, which helps reduce the load on ZooKeeper. With PR #320, the broker will also be able to monitor zk latency in real time.

Change

The broker should therefore adjust its throttling dynamically when it sees zk latency rise above a configured threshold (e.g. threshold = 0.75 * zkSessionTimeOut). This reduces zk load and helps the broker keep its zk session alive and avoid a cold restart. We can implement a controller that monitors zk stats and takes appropriate action (it can later be enhanced to consider additional variables).
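
To make the idea concrete, here is a rough sketch of what such a controller could look like. It is not Pulsar code: the class name `ZkLatencyThrottleController`, its methods, and the halve-on-breach / step-up-on-recovery policy (an AIMD-style choice of mine) are all assumptions for illustration. Only the threshold formula, a ratio of the zk session timeout, comes from the proposal.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch, not Pulsar's actual API: lowers a concurrency limit
 * while observed zk latency is above a threshold and raises it again once
 * latency recovers.
 */
final class ZkLatencyThrottleController {
    private final long latencyThresholdMs;
    private final int minPermits;
    private final int maxPermits;
    private final AtomicInteger currentPermits;

    ZkLatencyThrottleController(long zkSessionTimeoutMs, double thresholdRatio,
                                int minPermits, int maxPermits) {
        if (minPermits < 1 || maxPermits < minPermits) {
            throw new IllegalArgumentException("need 1 <= minPermits <= maxPermits");
        }
        // e.g. thresholdRatio = 0.75 gives threshold = 0.75 * zkSessionTimeout
        this.latencyThresholdMs = (long) (zkSessionTimeoutMs * thresholdRatio);
        this.minPermits = minPermits;
        this.maxPermits = maxPermits;
        this.currentPermits = new AtomicInteger(maxPermits);
    }

    /**
     * Feed the latest observed zk latency; returns the new concurrency limit.
     * Assumed policy: halve the limit on a breach, otherwise step back up
     * by 10% of the maximum (at least 1), clamped to [minPermits, maxPermits].
     */
    int onLatencySample(long observedLatencyMs) {
        return currentPermits.updateAndGet(current ->
                observedLatencyMs > latencyThresholdMs
                        ? Math.max(minPermits, current / 2)
                        : Math.min(maxPermits, current + Math.max(1, maxPermits / 10)));
    }

    int currentLimit() {
        return currentPermits.get();
    }

    long latencyThresholdMs() {
        return latencyThresholdMs;
    }
}
```

A periodic task would call `onLatencySample` with the latest zk latency and push the returned value into whatever limits topic loading and lookup requests.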

@merlimat @saandrews any thoughts?

@saandrews
Contributor

  1. Can we list which actions (e.g. topic load/unload, new subscription, ledger rollover) will be impacted when we implement this change?
  2. If all of them are impacted, can we prioritize which of them get access to zk? If we do this, would it indirectly give us some of the benefits we would get from having 2 zk clusters?
  3. Should we have different thresholds for read vs. write latency?

@rdhabalia
Contributor Author

> Can we list which actions (e.g. topic load/unload, new subscription, ledger rollover) will be impacted when we implement this change?

For now, we have a mechanism to control concurrent topic loading and lookup requests dynamically. So, for now, the action would be changing the throttling limits for topic loading and lookup requests.
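
As an illustration of changing a throttling limit at runtime, one possible shape is a semaphore whose permit count can be resized while requests are in flight. This is only a sketch under my own assumptions, not the broker's actual mechanism: `AdjustableLimiter` and its methods are hypothetical names, built on the standard `java.util.concurrent.Semaphore`.

```java
import java.util.concurrent.Semaphore;

/**
 * Hypothetical sketch, not the broker's real throttling code: a concurrency
 * limiter whose limit can be raised or lowered while requests are in flight.
 */
final class AdjustableLimiter {
    /** Exposes Semaphore's protected reducePermits to the enclosing class. */
    private static final class ResizableSemaphore extends Semaphore {
        ResizableSemaphore(int permits) {
            super(permits);
        }

        void shrink(int reduction) {
            super.reducePermits(reduction);
        }
    }

    private final ResizableSemaphore semaphore;
    private int limit;

    AdjustableLimiter(int initialLimit) {
        if (initialLimit < 0) {
            throw new IllegalArgumentException("limit must be >= 0");
        }
        this.semaphore = new ResizableSemaphore(initialLimit);
        this.limit = initialLimit;
    }

    /** Non-blocking: returns false when the current limit is exhausted. */
    boolean tryAcquire() {
        return semaphore.tryAcquire();
    }

    void release() {
        semaphore.release();
    }

    /**
     * Applies a new limit. When shrinking below the number of in-flight
     * requests, available permits go negative, so new requests are rejected
     * until enough in-flight requests have released.
     */
    synchronized void setLimit(int newLimit) {
        if (newLimit < 0) {
            throw new IllegalArgumentException("limit must be >= 0");
        }
        int delta = newLimit - limit;
        if (delta > 0) {
            semaphore.release(delta);
        } else if (delta < 0) {
            semaphore.shrink(-delta);
        }
        limit = newLimit;
    }

    synchronized int limit() {
        return limit;
    }

    int availablePermits() {
        return semaphore.availablePermits();
    }
}
```

In-flight requests are never interrupted when the limit drops; the lower limit only takes effect as they complete.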

> If all of them are impacted, can we prioritize which of them get access to zk? If we do this, would it indirectly give us some of the benefits we would get from having 2 zk clusters?

Actually, we don't have any gate in the broker that can be enabled dynamically to prioritize zk access, so I'm not sure what exactly we can do here.

> Should we have different thresholds for read vs. write latency?

Yes, but considering only write latency could be enough?
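
For comparison, the two options could be expressed roughly as below. This is a hypothetical sketch: `ZkLatencyThresholds`, `isOverloaded`, and `writeOnly` are made-up names, and the write-only variant simply sets the read threshold so high that it can never be exceeded.

```java
/**
 * Hypothetical sketch: separate read and write latency thresholds, with a
 * factory for the simpler "write latency only" variant.
 */
final class ZkLatencyThresholds {
    private final long readThresholdMs;
    private final long writeThresholdMs;

    ZkLatencyThresholds(long readThresholdMs, long writeThresholdMs) {
        this.readThresholdMs = readThresholdMs;
        this.writeThresholdMs = writeThresholdMs;
    }

    /** Write-only check: the read threshold can never be exceeded. */
    static ZkLatencyThresholds writeOnly(long writeThresholdMs) {
        return new ZkLatencyThresholds(Long.MAX_VALUE, writeThresholdMs);
    }

    /** True when either observed latency is above its own threshold. */
    boolean isOverloaded(long readLatencyMs, long writeLatencyMs) {
        return readLatencyMs > readThresholdMs || writeLatencyMs > writeThresholdMs;
    }
}
```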

@sijie added the type/enhancement and triage/week-34 labels Aug 22, 2018
@sijie
Member

sijie commented Nov 28, 2018

@rdhabalia are you guys actively working on this?

@jiazhai
Member

jiazhai commented Jan 3, 2020

ping @rdhabalia to confirm

@rdhabalia
Contributor Author

@jiazhai I had some changes checked into a local branch and wanted to get feedback before creating the PR. I will try to look into it again soon. Let me know if you have any thoughts on it.

hrsakai pushed a commit to hrsakai/pulsar that referenced this issue Dec 10, 2020
Fixes apache#342

### Motivation

Add support for KeySharedPolicy with AutoSplit or Sticky mode for consumers, which is useful for some use cases such as a scalable request-reply pattern.

### Modifications

Add key shared policy options for the consumer, and a helpful constructor for validating the hash range list.
@tisonkun
Member

Closed as stale.

Feel free to open a new issue if it's still relevant to the maintained versions.

@tisonkun closed this as not planned Nov 11, 2022

6 participants