Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tracing in etcd server to range, put and compact requests #11179

Merged
merged 6 commits into from
Oct 8, 2019

Conversation

YoyinZyc
Copy link
Contributor

@YoyinZyc YoyinZyc commented Sep 24, 2019

Fix issue #11166.
This pr creates a standalone trace pkg for record the lifecycle of the request in etcd server. We only enable tracing for range request in this pr. Please refer to issue #11166 for the motivation of it.

Here is the example output of range request with no threshold:

{"level":"info","ts":"2019-09-25T11:10:35.671-0700","caller":"traceutil/trace.go:116","msg":"trace[1475838163] range","detail":"{range_begin:foo; range_end:fooo; response_count:100000; response_revision:191496;}","duration":"131.11503ms","start":"2019-09-25T11:10:35.540-0700","end":"2019-09-25T11:10:35.671-0700","steps":["trace[1475838163] step 'agreement among raft nodes before linearized reading' (duration: 56.363µs)","trace[1475838163] step 'authentication' (duration: 7.283µs)","trace[1475838163] step 'range keys from in-memory index tree' (duration: 10.166116ms)","trace[1475838163] step 'range keys from bolt db' (duration: 91.280638ms)","trace[1475838163] step 'filter and sort the key-value pairs' (duration: 22.58693ms)","trace[1475838163] step 'assemble the response' (duration: 241.774µs)"]}

To avoid log flood, we choose to log out only when exceed the threshold. In this pr, we set the default threshold to 100ms which is as same as warnApplyDuration. Here is the example output with threshold:

{"level":"info","ts":"2019-09-25T09:59:32.744-0700","caller":"traceutil/trace.go:116","msg":"trace[633331442] range","detail":"{range_begin:foo; range_end:fooo; response_count:100000; response_revision:191496;}","duration":"132.449773ms","start":"2019-09-25T09:59:32.611-0700","end":"2019-09-25T09:59:32.744-0700","steps":["trace[633331442] step 'range keys from bolt db' (duration: 92.521911ms)","trace[633331442] step 'filter and sort the key-value pairs' (duration: 22.789099ms)"]}

Some steps disappear because the duration of these steps are smaller than stepThreshold which is threshold / (len(steps)+1).

As for performance, there is no obvious difference with and without trace when I run the benchmark with a range request of key count 100000.

@jingyih
Copy link
Contributor

jingyih commented Sep 24, 2019

Thanks @YoyinZyc. For completeness, could you also share the benchmark result?

pkg/traceutil/trace.go Outdated Show resolved Hide resolved
pkg/traceutil/trace.go Outdated Show resolved Hide resolved
@YoyinZyc
Copy link
Contributor Author

YoyinZyc commented Sep 24, 2019

Here is the benchmark results.
I ran range request to get 100001 key's value which will trigger trace logging.
Without Trace:
image
With Trace:
image

As for small range request which will not exceed the trace logging threshold, I ran request to get 100 key's value.
Without Trace:
image
With Trace
image

pkg/traceutil/trace.go Outdated Show resolved Hide resolved
@gyuho
Copy link
Contributor

gyuho commented Sep 25, 2019

@YoyinZyc Can you provide example tracing log outputs with new changes?

@YoyinZyc
Copy link
Contributor Author

@gyuho The example of new tracing log outputs were updated in my first comment.

@jingyih
Copy link
Contributor

jingyih commented Sep 25, 2019

Can you also benchmark with short range requests to make sure there is no noticeable impact on throughput?

@YoyinZyc
Copy link
Contributor Author

@jingyih The benchmark result for short range requests were updated in my benchmark comment above.

@jingyih
Copy link
Contributor

jingyih commented Sep 26, 2019

Looks good. This PR makes breaking change to the following APIs which are not user facing. I think it is fine.

Range(ctx context.Context, txn mvcc.TxnRead, r *pb.RangeRequest) (*pb.RangeResponse, error)

etcd/mvcc/kv.go

Line 106 in 59fd194

Read(trace *traceutil.Trace) TxnRead

@gyuho @xiang90 Any concern?

pkg/traceutil/trace_test.go Outdated Show resolved Hide resolved
mvcc/kvstore_txn.go Outdated Show resolved Hide resolved
pkg/traceutil/trace.go Outdated Show resolved Hide resolved
@gyuho
Copy link
Contributor

gyuho commented Sep 26, 2019

Overall approach looks good, and consistent with kubernetes tracing. /cc @xiang90 @jpbetz

@jpbetz
Copy link
Contributor

jpbetz commented Sep 26, 2019

Looks great @YoyinZyc! I pre-reviewed this so I'll let the other maintainers weigh in, but LGTM.

@jingyih I see what you were saying about the sort and filter being relatively expensive. In the above example it takes 22.58693ms! This data is going to be super useful.

cc @wojtek-t

@jingyih
Copy link
Contributor

jingyih commented Sep 26, 2019

@jingyih I see what you were saying about the sort and filter being relatively expensive. In the above example it takes 22.58693ms! This data is going to be super useful.

Potentially, the step "Range keys from in-memory index tree" could also be relatively expensive if client is listing a lot of keys in pages. (e.g. listing 1 million events in 500-sized pages).

@YoyinZyc YoyinZyc requested a review from gyuho September 26, 2019 21:04
@wojtek-t
Copy link
Contributor

@mborsz @mm4tt - FYI (it will be super useful for us)

@jingyih @jpbetz - is that something we may consider patching to 3.4 etcd? It seems like extremely useful thing for debugging both tests and production

@gyuho
Copy link
Contributor

gyuho commented Sep 28, 2019

Yeah, let's try to merge and release in v3.4.2 next week. I will take another look next week.

Copy link
Contributor

@gyuho gyuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor clean up, overall looks great!

@YoyinZyc Can we squash commits? We need cherry-pick this to 3.4, and squashing commits would make it easier.

pkg/traceutil/trace.go Outdated Show resolved Hide resolved
pkg/traceutil/trace.go Outdated Show resolved Hide resolved
pkg/traceutil/trace.go Outdated Show resolved Hide resolved
pkg/traceutil/trace_test.go Outdated Show resolved Hide resolved
pkg/traceutil/trace_test.go Outdated Show resolved Hide resolved
pkg/traceutil/trace_test.go Outdated Show resolved Hide resolved
etcdserver/v3_server.go Outdated Show resolved Hide resolved
etcdserver/v3_server.go Outdated Show resolved Hide resolved
etcdserver/v3_server.go Outdated Show resolved Hide resolved
pkg/traceutil/trace.go Outdated Show resolved Hide resolved
@xiang90
Copy link
Contributor

xiang90 commented Sep 30, 2019

do we have plan to add tracing to other operations?

steps:range from the in-memory index tree; range from boltdb.
etcdserver: add tracing steps: agreement among raft nodes before
linerized reading; authentication; filter and sort kv pairs; assemble
the response.
@gyuho
Copy link
Contributor

gyuho commented Sep 30, 2019

do we have plan to add tracing to other operations?

Any other use case @jpbetz @jingyih?

Since we are backporting this to v3.4, it would be better if we add them all at once.

@jingyih
Copy link
Contributor

jingyih commented Sep 30, 2019

Giving that K8s uses Range and Txn heavily, I think it makes sense to also add tracing for Txn.

@xiang90
Copy link
Contributor

xiang90 commented Sep 30, 2019

I guess we should add tracing to all operations to make it more consistent inside etcd.

@YoyinZyc
Copy link
Contributor Author

YoyinZyc commented Oct 1, 2019

I think it makes sense to add tracing to kv operations. I will put them into another pr. Btw, I opened an issue #11191 on txn tracing.

@wojtek-t
Copy link
Contributor

wojtek-t commented Oct 1, 2019

If we would like to cherrypick it to 3.4 release (which in my opinion would be highly valuable), I think we should try to do everything we want in a single PR to avoid cherrypicking a number of PRs.

@YoyinZyc YoyinZyc requested a review from gyuho October 1, 2019 16:41
@gyuho
Copy link
Contributor

gyuho commented Oct 1, 2019

@YoyinZyc Let's trace Write and Compaction in this PR, with additional commits. For first iteration, I think we are fine with same approach for Txn as well (just pass context to read and write transaction inside txn)?

mvcc: add put request steps; add trace to KV.Write() as input parameter.
@YoyinZyc
Copy link
Contributor Author

YoyinZyc commented Oct 2, 2019

Added trace to put and compaction.
Here are the example outputs:
Put:

"level":"info","ts":"2019-10-01T17:15:44.362-0700","caller":"traceutil/trace.go:137","msg":"trace[1967314171] put","detail":"{key:foo; value:prev; }","duration":"316.876µs","start":"2019-10-01T17:15:44.362-0700","end":"2019-10-01T17:15:44.362-0700","steps":["trace[1967314171] 'process raft request' (duration: 165.181µs)","trace[1967314171] 'get previous kv pair' (duration: 37.337µs)","trace[1967314171] 'get key's previous created_revision and leaseID' (duration: 3.716µs)","trace[1967314171] 'marshal mvccpb.KeyValue' (duration: 2.749µs)","trace[1967314171] 'store kv pair into bolt db' (duration: 38.504µs)","trace[1967314171] 'attach lease to kv pair' (duration: 198ns)"]}

Compaction:

{"level":"info","ts":"2019-10-01T17:17:11.971-0700","caller":"traceutil/trace.go:137","msg":"trace[672567340] compact","detail":"{revision:25000; }","duration":"277.299646ms","start":"2019-10-01T17:17:11.694-0700","end":"2019-10-01T17:17:11.971-0700","steps":["trace[672567340] 'process raft request' (duration: 120.154µs)","trace[672567340] 'check and update compact revision' (duration: 9.448069ms)","trace[672567340] 'compact in-memory index tree' (duration: 8.871586ms)","trace[672567340] 'schedule compaction' (duration: 15.347µs)","trace[672567340] 'physically apply compaction' (duration: 258.812035ms)"]}

Put(txn mvcc.TxnWrite, p *pb.PutRequest) (*pb.PutResponse, error)
Range(txn mvcc.TxnRead, r *pb.RangeRequest) (*pb.RangeResponse, error)
Put(txn mvcc.TxnWrite, p *pb.PutRequest) (*pb.PutResponse, *traceutil.Trace, error)
Range(ctx context.Context, txn mvcc.TxnRead, r *pb.RangeRequest) (*pb.RangeResponse, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make method signatures consistent for Put and Compaction?

For me, passing context.Context seems cleaner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is hard to make it consistent with range. For put and compaction request, they need to make raft request first. When etcdserver receives the signal from raft, it will trigger the following apply process.

resp, err := s.raftRequest(ctx, pb.InternalRaftRequest{Put: r})

case ap := <-s.r.apply():

I don't think it is a good way to change too many apis inside raft. Therefore, the way I use now is to trace in apply then measure the raft request time by calculate the time difference between apply start time and req start time
Do you think it makes sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense.

Btw, we don't really need ctx parameter to Range unless we want to trace more in auth applier? Can we just create trace object at the top level of Range method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wait. It has to be passed from v3 server. So, nvm.

Copy link
Contributor

@gyuho gyuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks for the great work.

Defer to @xiang90 @jingyih @jpbetz for final review.

Copy link
Contributor

@jingyih jingyih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @YoyinZyc! Overall looks good!

etcdserver/apply.go Outdated Show resolved Hide resolved
etcdserver/v3_server.go Outdated Show resolved Hide resolved
pkg/traceutil/trace.go Outdated Show resolved Hide resolved
etcdserver/v3_server.go Outdated Show resolved Hide resolved
pkg/traceutil/trace.go Outdated Show resolved Hide resolved
etcdserver/v3_server.go Outdated Show resolved Hide resolved
Copy link
Contributor

@jingyih jingyih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one comment. LGTM otherwise.

pkg/traceutil/trace.go Outdated Show resolved Hide resolved
… applierV3.Compaction()

mvcc: trace compaction request; add input parameter 'trace' to KV.Compact()
@jingyih
Copy link
Contributor

jingyih commented Oct 8, 2019

LGTM

@gyuho gyuho merged commit 340f0ac into etcd-io:master Oct 8, 2019
@gyuho
Copy link
Contributor

gyuho commented Oct 8, 2019

@YoyinZyc Please highlight this in our CHANGELOG. And can you work on backporting this to v3.4 branch?

@YoyinZyc YoyinZyc changed the title Add tracing to range request in etcd server. Add tracing to range, put, compact request in etcd server. Oct 9, 2019
@YoyinZyc YoyinZyc changed the title Add tracing to range, put, compact request in etcd server. Add tracing in etcd server to range, put and compact requests Oct 9, 2019
YoyinZyc added a commit to YoyinZyc/etcd that referenced this pull request Oct 9, 2019
gyuho added a commit that referenced this pull request Oct 9, 2019
gyuho added a commit that referenced this pull request Oct 9, 2019
…79-origin-release-3.4

Automated cherry pick of #11179
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

6 participants