
wal: add "etcd_wal_writes_bytes_total" #11738

Merged: 3 commits merged into etcd-io:master on Apr 1, 2020

Conversation

@gyuho (Contributor) commented Apr 1, 2020

While we track latencies around WAL fsync, the actual amount of data written is not being tracked. This will be useful when we look into the write throughput of etcd.
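
For context, a minimal illustrative sketch of how such a write-bytes counter could be declared and incremented with the Prometheus Go client. The variable name, registration, and helper below are assumptions for illustration, not necessarily the exact code in this diff:

```go
package wal

import "github.com/prometheus/client_golang/prometheus"

// walWriteBytes is an illustrative counter; the real declaration lives in
// wal/metrics.go in this PR and may use different names/options.
var walWriteBytes = prometheus.NewCounter(prometheus.CounterOpts{
	Namespace: "etcd",
	Subsystem: "wal",
	Name:      "writes_bytes_total",
	Help:      "Total number of bytes written by WAL.",
})

func init() {
	prometheus.MustRegister(walWriteBytes)
}

// reportWrittenBytes is a hypothetical helper showing where the counter
// would be incremented after a successful write/flush of n bytes.
func reportWrittenBytes(n int) {
	walWriteBytes.Add(float64(n))
}
```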

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
@gyuho (Contributor, Author) commented Apr 1, 2020

/cc @jingyih @jpbetz @xiang90 @spzala

@codecov-io

Codecov Report

Merging #11738 into master will decrease coverage by 0.34%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master   #11738      +/-   ##
==========================================
- Coverage   66.29%   65.95%   -0.34%     
==========================================
  Files         403      403              
  Lines       36771    36787      +16     
==========================================
- Hits        24376    24264     -112     
- Misses      10895    11017     +122     
- Partials     1500     1506       +6     
| Impacted Files | Coverage Δ |
|---|---|
| pkg/ioutil/pagewriter.go | 91.48% <100.00%> (+1.01%) ⬆️ |
| wal/encoder.go | 85.45% <100.00%> (+1.14%) ⬆️ |
| wal/metrics.go | 100.00% <100.00%> (ø) |
| wal/repair.go | 55.35% <100.00%> (+1.65%) ⬆️ |
| wal/wal.go | 56.07% <100.00%> (+0.37%) ⬆️ |
| pkg/transport/timeout_conn.go | 60.00% <0.00%> (-20.00%) ⬇️ |
| client/client.go | 65.03% <0.00%> (-18.96%) ⬇️ |
| auth/jwt.go | 64.04% <0.00%> (-13.49%) ⬇️ |
| proxy/grpcproxy/register.go | 72.50% <0.00%> (-10.00%) ⬇️ |
| etcdserver/api/v3rpc/lease.go | 77.21% <0.00%> (-8.87%) ⬇️ |
| ... and 24 more | |

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dd816f0...381e77c. Read the comment docs.

@spzala (Member) left a comment

Thanks @gyuho.
LGTM, with one small comment: we should document the usage of this new metric, probably in https://github.com/etcd-io/etcd/blob/master/Documentation/metrics/latest? Thanks!

@gyuho (Contributor, Author) commented Apr 1, 2020

Will do. Thanks!

@gyuho gyuho merged commit 5ceb701 into etcd-io:master Apr 1, 2020
@gyuho gyuho deleted the wal-metrics branch April 1, 2020 16:29
Comment on lines +98 to +102
// Flush flushes buffered data.
func (pw *PageWriter) Flush() error {
_, err := pw.flush()
return err
}


Should Flush() be deprecated or even removed? Otherwise the metric captured in FlushN won't be complete, since bytes written through Flush() aren't counted.

@gyuho (Contributor, Author) replied:

It is an externally consumable package, so there could be other projects importing this method, so we'd rather keep it.
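
To make the discussion concrete, here is a sketch of how the new FlushN counterpart presumably sits next to the existing Flush, based on the quoted snippet and the review comments; the exact signature and doc comments in the diff may differ:

```go
// Flush flushes buffered data. It is kept because external importers of
// pkg/ioutil may already depend on it.
func (pw *PageWriter) Flush() error {
	_, err := pw.flush()
	return err
}

// FlushN flushes buffered data and also returns the number of bytes written,
// so callers can feed the new write-bytes metric. (Signature assumed from the
// review discussion, not copied verbatim from the diff.)
func (pw *PageWriter) FlushN() (int, error) {
	return pw.flush()
}
```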

Comment on lines +89 to 90
start := time.Now()
if err = fileutil.Fsync(f.File); err != nil {


Would it make sense to calculate the metrics inside fileutil.Fsync, so that (a) consumers won't forget to add metrics in new places where Fsync is called, and (b) we capture the latency even for failed calls?

@gyuho (Contributor, Author) replied:

fileutil.Fsync is used in other packages, so we don't want to break its interface or introduce a Prometheus metrics dependency in a simple utility package. We may consider a wrapper in the next release, so we don't need to do this now.
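
To illustrate the wrapper idea being deferred here, a hypothetical sketch of recording fsync latency in a single wrapper instead of at every call site; the function and metric names below are assumptions (etcd already exposes WAL fsync latency under its own metric name):

```go
package wal

import (
	"os"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"go.etcd.io/etcd/pkg/fileutil"
)

// walFsyncSeconds is a hypothetical histogram used only for this sketch.
var walFsyncSeconds = prometheus.NewHistogram(prometheus.HistogramOpts{
	Namespace: "etcd",
	Subsystem: "wal",
	Name:      "fsync_duration_seconds",
	Help:      "Latency distribution of fsync calls made by the WAL.",
	Buckets:   prometheus.ExponentialBuckets(0.001, 2, 14),
})

func init() {
	prometheus.MustRegister(walFsyncSeconds)
}

// fsyncWithMetrics wraps fileutil.Fsync so that latency is observed in one
// place, and even when the call fails, which addresses point (b) above.
func fsyncWithMetrics(f *os.File) error {
	start := time.Now()
	err := fileutil.Fsync(f)
	walFsyncSeconds.Observe(time.Since(start).Seconds())
	return err
}
```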
