-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CCR] Add total fetch time took stat #34577
[CCR] Add total fetch time took stat #34577
Conversation
keeps track how much time was spent on fetches from the leader cluster perspective.
Pinging @elastic/es-distributed |
I think I'm good with the following:
|
yes, this pr keeps the original total_fetch_time_millis and adds a took
time field to the shard changes api response. I will rename
total_fetch_took_time_millis to total_fetch_leader_time_millis .
…On Thu, Oct 18, 2018 at 16:38 Boaz Leskes ***@***.***> wrote:
I think I'm good with the following:
1. Changes response has a took time field
2. Shard Follow Node Task sums the value of that feed under
total_fetch_leader_time_millis
3. We have a total_fetch_time_millis as we do today.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#34577 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAjbRXSuFRdq-nFYYflgBkrVwWdxVh6Zks5umJJrgaJpZM4Xs2D6>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote more words than necessary to say we should start the clock sooner.
} | ||
|
||
@Override | ||
protected void asyncShardOperation( | ||
final Request request, | ||
final ShardId shardId, | ||
final ActionListener<Response> listener) throws IOException { | ||
request.relativeStartNanos = System.nanoTime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the wrong place to start the clock. The goal with this metric is to capture the non-network time on the leader so we want to start the clock as soon as possible. Right now these requests are handled on the network thread before we go async, so there is not much in terms of queuing time, but it would be a bug if somehow this ever changed and handling of this request was not on the network thread before going async, this would miss queuing time in that other thread pool (we miss internal Netty queuing, but I don't think there's anything obvious that we can do about that). This is also something to think about in the context of the NIO work. I also don't like that we are mutating the request, it's doing to make it harder to have immutable request objects.
I think we should start the clock the moment the request is de-serialized.
@bleskes @jasontedor I've updated this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Add total fetch time leader stat, that keeps track how much time was spent on fetches from the leader cluster perspective.
* master: (24 commits) ingest: better support for conditionals with simulate?verbose (elastic#34155) [Rollup] Job deletion should be invoked on the allocated task (elastic#34574) [DOCS] .Security index is never auto created (elastic#34589) CCR: Requires soft-deletes on the follower (elastic#34725) re-enable bwc tests (elastic#34743) Empty GetAliases authorization fix (elastic#34444) INGEST: Document Processor Conditional (elastic#33388) [CCR] Add total fetch time leader stat (elastic#34577) SQL: Support pattern against compatible indices (elastic#34718) [CCR] Auto follow pattern APIs adjustments (elastic#34518) [Test] Remove dead code from ExceptionSerializationTests (elastic#34713) A small typo in migration-assistance doc (elastic#34704) ingest: processor stats (elastic#34724) SQL: Implement IN(value1, value2, ...) expression. (elastic#34581) Tests: Add checks to GeoDistanceQueryBuilderTests (elastic#34273) INGEST: Rename Pipeline Processor Param. (elastic#34733) Core: Move IndexNameExpressionResolver to java time (elastic#34507) [DOCS] Force Merge: clarify execution and storage requirements (elastic#33882) TESTING.asciidoc fix examples using forbidden annotation (elastic#34515) SQL: Implement `CONVERT`, an alternative to `CAST` (elastic#34660) ...
Add total fetch time leader stat, that keeps track how much time was spent on fetches from the leader cluster perspective.
This new stat keeps track how much time was spent on fetches from the leader cluster perspective.
The current name of this new stat is
total_fetch_took_time_millis
and it doesn't describe well what this stat represents. I'm not sure what would be a good name. Maybetotal_fetch_leader_time_millis
ortotal_fetch_remote_time_millis
to indicate it is the same astotal_fetch_time_millis
but from a different perspective? cc: @bleskes / @jasontedor