introduce zk-stats: zk-latency + zk-rate #320
Conversation
@@ -222,7 +222,9 @@ protected void recoverFromLedger(final ManagedCursorInfo info, final VoidCallback
// a new ledger and write the position into it
ledger.mbean.startCursorLedgerOpenOp();
long ledgerId = info.getCursorsLedgerId();
final long now = System.currentTimeMillis();
Always use System.nanoTime() for measurements because it guarantees a monotonically increasing timestamp. currentTimeMillis() can go backward (NTP sync-up, daylight saving time adjustments, etc.)
Yes, that makes sense. Will make the change.
OK, scratch the "daylight saving time adjustment" since the timestamp is in UTC, which is not affected... though the same point still holds.
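A minimal sketch of the suggested pattern, assuming a hypothetical recordLatency() sink rather than the PR's actual API; the key point is to take differences of System.nanoTime() and convert to milliseconds only at the end:

import java.util.concurrent.TimeUnit;

public class ElapsedTimeSketch {
    // Hypothetical sink standing in for the stats recorder.
    private void recordLatency(long latencyMs) {
        System.out.println("latency-ms=" + latencyMs);
    }

    public void timeOperation(Runnable zkBackedOp) {
        // nanoTime() is monotonic, so it is unaffected by NTP or wall-clock adjustments.
        final long startNanos = System.nanoTime();
        zkBackedOp.run();
        final long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
        recordLatency(elapsedMs);
    }
}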
@@ -1823,8 +1825,10 @@ void internalFlushPendingMarkDeletes() {

void createNewMetadataLedger(final VoidCallback callback) {
ledger.mbean.startCursorLedgerCreateOp();
final long now = System.currentTimeMillis();
Creating a BK ledger involves 3 ZK write operations. Either we account for that or we treat it separately
Can we measure the latency at the lower-level call, in places like ZkUtils, instead of doing it at a higher level?
Actually, we are recording zk-stats mainly at two places:
- BookKeeper ledger create/open in ManagedLedger/ManagedCursor: we use bkClient and we can't modify bkClient, so we do it at the ManagedLedger/ManagedCursor level.
- Updating managed-ledger data in MetaStore: this directly uses the zk-client, so we do it at the MetaStore level.
Also, ZkUtils is part of the BK client.
Creating a BK ledger involves 3 ZK write operations. Either we account for that or we treat it separately
That's correct. Now we add write-op-count = 3, but we are not recording latency here because createLedger() does more than just the ZK operations, so totalLatency/3 will not give a correct latency number.
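A minimal, self-contained sketch of the count-only idea, using an illustrative counter rather than the PR's actual stats class:

import java.util.concurrent.atomic.AtomicLong;

public class ZkWriteOpCountSketch {
    // Hypothetical counter standing in for the broker's zk write-rate metric.
    private final AtomicLong zkWriteOps = new AtomicLong();

    public void onLedgerCreated() {
        // Creating a BK ledger performs roughly 3 ZK writes, so bump the count by 3
        // without recording latency: createLedger() also does non-ZK work, so
        // totalLatency/3 would be misleading.
        zkWriteOps.addAndGet(3);
    }

    public long writeOpCount() {
        return zkWriteOps.get();
    }
}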
@@ -248,6 +248,7 @@ public void operationComplete(ManagedLedgerInfo mlInfo, Stat stat) {
if (log.isDebugEnabled()) {
log.debug("[{}] Opening legder {}", name, id);
}
store.recordRead();
For reads as well, it would be interesting to know the latency.
Actually, right now we have a single histogram that records read and write latency combined. So, do you think we should introduce a separate histogram to record read latency as well?
And for this L251 instance: we are not recording latency because we don't have a callback implementation in this scope, so we couldn't record the read latency here and just record the rate.
store.recordRead();
mbean.startDataLedgerOpenOp();
bookKeeper.asyncOpenLedger(id, config.getDigestType(), config.getPassword(), opencb, null);
Yes, they need to be separate, since they have very different profiles:
- Reads are served from memory directly by the ZK server we are connected to
- Writes need to be routed through the ZK leader and be accepted by the majority of the quorum, including syncing on the local disk.
Yes, I had thought about it (reads will be quick since they come from memory directly) but then concluded to keep one, considering that we anyway have the mean and the 95, 99, 99.9, 99.99 percentiles, so writes would automatically be reflected in the 99.9 or 99.99 percentile. So I avoided having an additional histogram.
But I will introduce a separate histogram.
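A minimal sketch of keeping the two profiles apart with HdrHistogram, assuming illustrative field names and a 30-second bound (not the PR's actual classes):

import java.util.concurrent.TimeUnit;
import org.HdrHistogram.Histogram;
import org.HdrHistogram.Recorder;

public class ZkReadWriteLatencySketch {
    // Separate recorders: reads come from the connected server's memory, writes go
    // through the leader and quorum, so their latency distributions differ a lot.
    private final Recorder readLatency = new Recorder(TimeUnit.SECONDS.toMillis(30), 2);
    private final Recorder writeLatency = new Recorder(TimeUnit.SECONDS.toMillis(30), 2);

    public void recordRead(long latencyMs) {
        readLatency.recordValue(latencyMs);
    }

    public void recordWrite(long latencyMs) {
        writeLatency.recordValue(latencyMs);
    }

    public void report() {
        Histogram reads = readLatency.getIntervalHistogram();
        Histogram writes = writeLatency.getIntervalHistogram();
        System.out.printf("zk read  p99=%dms%n", reads.getValueAtPercentile(99.0));
        System.out.printf("zk write p99=%dms%n", writes.getValueAtPercentile(99.0));
    }
}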
}

public void recordValue(long dimensionLatencyMs) {
dimensionTimeRecorder.recordValue(dimensionLatencyMs);
HdrHistogram is not thread-safe by default.
Actually, new Recorder(int, int) returns an AtomicHistogram, which per the documentation is thread-safe. However, new Recorder(int) creates a ConcurrentHistogram. So, do you think we should use ConcurrentHistogram, which is also thread-safe?
Ok, then either is good :)
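For reference, a small sketch of the two constructor variants being discussed; per the comments above, both give thread-safe recording, and the values here are only illustrative:

import java.util.concurrent.TimeUnit;
import org.HdrHistogram.Recorder;

public class RecorderChoiceSketch {
    // Explicit highest trackable value + precision (atomic-histogram backed, per the discussion).
    private final Recorder boundedRecorder = new Recorder(TimeUnit.MINUTES.toMillis(10), 2);

    // Precision only, auto-ranging (concurrent-histogram backed, per the discussion).
    private final Recorder autoRangingRecorder = new Recorder(2);
}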
public double elapsedIntervalMs;

private Recorder dimensionTimeRecorder = new Recorder(TimeUnit.MINUTES.toMillis(10), 2);
Using a very high max value greatly reduces the precision of the computed quantiles. I would suggest using a max of 30 sec and capping the latency at 30 sec before sending it to the recorder.
Sure. However, I kept highestTrackableValue = 120 sec, as sometimes one can keep zkSessionTimeout = 120 sec. So, can we keep our cap at 120 sec? Any thoughts?
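A minimal sketch of the capping idea, assuming the cap matches the bound passed to the Recorder (120 sec here, per the comment above; names are illustrative):

import java.util.concurrent.TimeUnit;
import org.HdrHistogram.Recorder;

public class CappedLatencyRecorderSketch {
    // Assumption: the cap mirrors the discussed 120 sec zkSessionTimeout-style bound.
    private static final long MAX_TRACKABLE_MS = TimeUnit.SECONDS.toMillis(120);

    private final Recorder latencyRecorder = new Recorder(MAX_TRACKABLE_MS, 2);

    public void record(long latencyMs) {
        // Cap before recording so outliers never exceed the histogram's trackable range.
        latencyRecorder.recordValue(Math.min(latencyMs, MAX_TRACKABLE_MS));
    }
}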
@merlimat addressed comments.
Force-pushed from 11b7ea3 to bb43c26.
Also, I haven't added zk-stats recording at broker-startup cleanup.
Force-pushed from 9ec0b82 to 3ef7fe0.
…ime, measuring time using nano-sec, 3 write-count for create-ledger
ping @merlimat @saandrews
@rdhabalia Please take a look at #436. I think we should use the same trick on the client side: instrumenting ZooKeeper to add the metrics for rates and latencies.
Actually, on the ZooKeeper server side the request itself contains the creation time, and we can measure latency based on the difference between the creation time and the current time.
@merlimat: In many cases we need to get the zk-latency on the zk-client side. As per my previous comment, do you think we should take the AspectJ approach, or can we keep this approach?
My main concern is that this is kind of a whack-a-mole to find all the ZK client library usages. Also, this is not accounting for the ZK requests made by the BK client. That's why I'd prefer to have some automated approach to monitor all the calls. Another option could be to use the
we can close this PR once #507 is merged.
Motivation
Right now, the zk mntr -> nc stats give limited information and cannot give complete stats about the zk-latency the broker is seeing and the rate of read/write requests. So, we are adding zk-op-stats into the broker's existing stat-metrics, covering zk-latency and read/write rate.

Modifications
Add zk-op-stats into the broker's existing stat-metrics, covering zk-latency and read/write rate.

Result
Broker can provide zk-stats.