introduce zk-stats: zk-latency + zk-rate #320
Conversation
@@ -222,7 +222,9 @@ protected void recoverFromLedger(final ManagedCursorInfo info, final VoidCallback
// a new ledger and write the position into it
ledger.mbean.startCursorLedgerOpenOp();
long ledgerId = info.getCursorsLedgerId();
final long now = System.currentTimeMillis();
Always use System.nanoTime() for measurements because it guarantees a monotonically increasing timestamp. currentTimeMillis() can go backward (NTP sync-up, daylight saving time adjustments, etc.)
Yes, that makes sense. Will make the change.
OK, scratch the "daylight saving time adjustment" since the timestamp is in UTC, which is not affected... though the same point still holds.
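A minimal sketch of the suggested pattern, assuming a hypothetical recordLatency() sink rather than the PR's actual API; the key point is to take differences of System.nanoTime() and convert to milliseconds only at the end:

import java.util.concurrent.TimeUnit;

public class ElapsedTimeSketch {
    // Hypothetical sink standing in for the stats recorder.
    private void recordLatency(long latencyMs) {
        System.out.println("latency-ms=" + latencyMs);
    }

    public void timeOperation(Runnable zkBackedOp) {
        // nanoTime() is monotonic, so it is unaffected by NTP or wall-clock adjustments.
        final long startNanos = System.nanoTime();
        zkBackedOp.run();
        final long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
        recordLatency(elapsedMs);
    }
}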
@@ -1823,8 +1825,10 @@ void internalFlushPendingMarkDeletes() {

void createNewMetadataLedger(final VoidCallback callback) {
ledger.mbean.startCursorLedgerCreateOp();
final long now = System.currentTimeMillis();
Creating a BK ledger involves 3 ZK write operations. Either we account for that or we treat it separately
Can we measure the latency at the lower-level call, in places like ZkUtils, instead of doing it at a higher level?
Actually, we are recording zk-stats mainly at two places:
- BookKeeper ledger create/open in ManagedLedger/ManagedCursor: we use bkClient and we can't modify bkClient, so we do it at the ManagedLedger/ManagedCursor level.
- Updating managed-ledger data in MetaStore: this directly uses the zk-client, so we do it at the MetaStore level.
Also, ZkUtils is part of the BK client.
Creating a BK ledger involves 3 ZK write operations. Either we account for that or we treat it separately
That's correct. Now we add write-op-count = 3, but we are not recording latency here because createLedger() does more than just the ZK operations, so totalLatency/3 will not give a correct latency number.
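A minimal, self-contained sketch of the count-only idea, using an illustrative counter rather than the PR's actual stats class:

import java.util.concurrent.atomic.AtomicLong;

public class ZkWriteOpCountSketch {
    // Hypothetical counter standing in for the broker's zk write-rate metric.
    private final AtomicLong zkWriteOps = new AtomicLong();

    public void onLedgerCreated() {
        // Creating a BK ledger performs roughly 3 ZK writes, so bump the count by 3
        // without recording latency: createLedger() also does non-ZK work, so
        // totalLatency/3 would be misleading.
        zkWriteOps.addAndGet(3);
    }

    public long writeOpCount() {
        return zkWriteOps.get();
    }
}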
@@ -248,6 +248,7 @@ public void operationComplete(ManagedLedgerInfo mlInfo, Stat stat) {
if (log.isDebugEnabled()) {
log.debug("[{}] Opening legder {}", name, id);
}
store.recordRead();
For reads as well, it would be interesting to know the latency.
Actually, right now we have a single histogram that records read and write latency combined. So, do you think we should introduce a separate histogram to record read latency as well?
And for this L251 instance: we are not recording latency because we don't have a callback implementation in this scope, so we couldn't record the read latency here and just record the rate.
store.recordRead();
mbean.startDataLedgerOpenOp();
bookKeeper.asyncOpenLedger(id, config.getDigestType(), config.getPassword(), opencb, null);
Yes, they need to be separate, since they have very different profiles:
- Reads are served from memory directly by the ZK server we are connected to
- Writes need to be routed through the ZK leader and be accepted by the majority of the quorum, including syncing on the local disk.
Yes, I had thought about it (reads will be quick since they come from memory directly) but then concluded to keep one, considering that we anyway have the mean and the 95, 99, 99.9, 99.99 percentiles, so writes would automatically be reflected in the 99.9 or 99.99 percentile. So I avoided having an additional histogram.
But I will introduce a separate histogram.
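A minimal sketch of keeping the two profiles apart with HdrHistogram, assuming illustrative field names and a 30-second bound (not the PR's actual classes):

import java.util.concurrent.TimeUnit;
import org.HdrHistogram.Histogram;
import org.HdrHistogram.Recorder;

public class ZkReadWriteLatencySketch {
    // Separate recorders: reads come from the connected server's memory, writes go
    // through the leader and quorum, so their latency distributions differ a lot.
    private final Recorder readLatency = new Recorder(TimeUnit.SECONDS.toMillis(30), 2);
    private final Recorder writeLatency = new Recorder(TimeUnit.SECONDS.toMillis(30), 2);

    public void recordRead(long latencyMs) {
        readLatency.recordValue(latencyMs);
    }

    public void recordWrite(long latencyMs) {
        writeLatency.recordValue(latencyMs);
    }

    public void report() {
        Histogram reads = readLatency.getIntervalHistogram();
        Histogram writes = writeLatency.getIntervalHistogram();
        System.out.printf("zk read  p99=%dms%n", reads.getValueAtPercentile(99.0));
        System.out.printf("zk write p99=%dms%n", writes.getValueAtPercentile(99.0));
    }
}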
}

public void recordValue(long dimensionLatencyMs) {
dimensionTimeRecorder.recordValue(dimensionLatencyMs);
HdrHistogram is not thread-safe by default.
Actually, new Recorder(int, int) returns an AtomicHistogram, which per the documentation is thread-safe. However, new Recorder(int) creates a ConcurrentHistogram. So, do you think we should use ConcurrentHistogram, which is also thread-safe?
Ok, then either is good :)
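For reference, a small sketch of the two constructor variants being discussed; per the comments above, both give thread-safe recording, and the values here are only illustrative:

import java.util.concurrent.TimeUnit;
import org.HdrHistogram.Recorder;

public class RecorderChoiceSketch {
    // Explicit highest trackable value + precision (atomic-histogram backed, per the discussion).
    private final Recorder boundedRecorder = new Recorder(TimeUnit.MINUTES.toMillis(10), 2);

    // Precision only, auto-ranging (concurrent-histogram backed, per the discussion).
    private final Recorder autoRangingRecorder = new Recorder(2);
}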
public double elapsedIntervalMs;

private Recorder dimensionTimeRecorder = new Recorder(TimeUnit.MINUTES.toMillis(10), 2);
Using a very high max value greatly reduces the precision of the computed quantiles. I would suggest using a max of 30 sec and capping the latency at 30 sec before sending it to the recorder.
Sure. However, I kept highestTrackableValue = 120 sec, as sometimes one can keep zkSessionTimeout = 120 sec. So, can we keep our cap at 120 sec? Any thoughts?
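A minimal sketch of the capping idea, assuming the cap matches the bound passed to the Recorder (120 sec here, per the comment above; names are illustrative):

import java.util.concurrent.TimeUnit;
import org.HdrHistogram.Recorder;

public class CappedLatencyRecorderSketch {
    // Assumption: the cap mirrors the discussed 120 sec zkSessionTimeout-style bound.
    private static final long MAX_TRACKABLE_MS = TimeUnit.SECONDS.toMillis(120);

    private final Recorder latencyRecorder = new Recorder(MAX_TRACKABLE_MS, 2);

    public void record(long latencyMs) {
        // Cap before recording so outliers never exceed the histogram's trackable range.
        latencyRecorder.recordValue(Math.min(latencyMs, MAX_TRACKABLE_MS));
    }
}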
@merlimat addressed comments.
Force-pushed from 11b7ea3 to bb43c26.
Also, I haven't added zk-stats recording at broker-startup cleanup.
Force-pushed from 9ec0b82 to 3ef7fe0.
…ime, measuring time using nano-sec, 3 write-count for create-ledger
ping @merlimat @saandrews
@rdhabalia Please take a look at #436. I think we should use the same trick on the client side: instrumenting ZooKeeper to add the metrics for rates and latencies.
Actually, on the ZooKeeper server side the request itself contains the creation time, and we can measure latency based on the difference between the creation time and the current time.
@merlimat: In many cases we need to get the zk-latency on the zk-client side. As per my previous comment, do you think we should take the AspectJ approach, or can we keep this approach?
My main concern is that this is kind of a whack-a-mole to find all the ZK client library usages. Also, this is not accounting for the ZK requests made by the BK client. That's why I'd prefer to have some automated approach to monitor all the calls. Another option could be to use the
we can close this PR once #507 is merged.
Motivation
Right now, the zk mntr -> nc stats give limited information and cannot give complete stats about the zk-latency the broker is seeing and the rate of read/write requests. So, we are adding zk-op-stats into the broker's existing stat-metrics, covering zk-latency and read/write rate.

Modifications
Add zk-op-stats into the broker's existing stat-metrics, covering zk-latency and read/write rate.

Result
Broker can provide zk-stats.