Doc: provide more clarify to the usage of check perf command #14111

Merged
merged 1 commit into from
Jun 17, 2022

patrocinio
Contributor

@patrocinio patrocinio commented Jun 13, 2022

This PR documents that the workload models of the `check perf` command use
different configurations, so the results might vary between them.

Related #13455

Member

@serathius serathius left a comment

@@ -126,7 +126,7 @@ func NewCheckPerfCommand() *cobra.Command {
}

// TODO: support customized configuration
- cmd.Flags().StringVar(&checkPerfLoad, "load", "s", "The performance check's workload model. Accepted workloads: s(small), m(medium), l(large), xl(xLarge)")
+ cmd.Flags().StringVar(&checkPerfLoad, "load", "s", "The performance check's workload model. Accepted workloads: s(small), m(medium), l(large), xl(xLarge). Different workload models use different configurations in terms of number of clients and expected throughtput.")
Contributor

@ptabor ptabor Jun 14, 2022

The load flag controls:

  • number of concurrent clients (50-1000)
  • maximal rate of issued put requests (across all clients) (150-15000)

For the check to succeed, the server's average effective throughput must be >90% of the issued request rate, and the latency of every request must be <0.5s.
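
As a rough illustration of how a `--load` value could select a client count and a request rate, here is a minimal sketch. Only the `s` entry (limit 150, 50 clients, 60 seconds) is taken from the `checkPerfCfgMap` snippet quoted in this review; the `perfCfg` type, the `perfCfgs` map, and the `lookupLoad` function are hypothetical names, and the entries for `m`, `l`, and `xl` are deliberately left out because their values are not quoted here.

```go
package main

import "fmt"

// perfCfg is a hypothetical stand-in for the checkPerfCfg type in check.go.
type perfCfg struct {
	limit    int // maximal rate of put requests across all clients (requests/s)
	clients  int // number of concurrent clients
	duration int // length of the run in seconds
}

// perfCfgs contains only the "s" model; the other models are omitted
// because their values are not quoted in this conversation.
var perfCfgs = map[string]perfCfg{
	"s": {limit: 150, clients: 50, duration: 60},
}

// lookupLoad returns the configuration for a --load value,
// or an error if the model is not in the map.
func lookupLoad(load string) (perfCfg, error) {
	cfg, ok := perfCfgs[load]
	if !ok {
		return perfCfg{}, fmt.Errorf("unknown load model %q", load)
	}
	return cfg, nil
}

func main() {
	cfg, err := lookupLoad("s")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("clients=%d limit=%d/s duration=%ds\n", cfg.clients, cfg.limit, cfg.duration)
}
```

Keeping the models in a single map means an unknown flag value can be rejected in one place, before any load is generated.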

Member

It indeed means "expected throughput" to me. We set an expected throughput, then measure the real throughput.

For example, if the actual throughput is less than 90% of the expected throughput, then the test fails.

Member

@ahrtr ahrtr Jun 14, 2022

It probably makes more sense to clearly document the criteria for passing the test.

  1. throughput > 90%
  2. s.Slowest <= 0.5
  3. s.Stddev <= 0.1

Please update the doc etcdctl/README.md to get this documented.

check.go#L242-L254
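
The three criteria listed above can be sketched as a single predicate. This is an illustration under stated assumptions, not the code in check.go: the `perfStats` type and the `passes` function are hypothetical, the field names only mirror the `s.Slowest` and `s.Stddev` names mentioned above, and the units (seconds) are assumed from the "<0.5s" figure earlier in this thread.

```go
package main

import "fmt"

// perfStats is a hypothetical summary of one check run.
type perfStats struct {
	RPS     float64 // measured throughput in requests per second
	Slowest float64 // latency of the slowest request, assumed to be in seconds
	Stddev  float64 // standard deviation of request latency, assumed to be in seconds
}

// passes applies the three documented criteria and returns the list of
// criteria that failed. limit is the expected throughput of the workload model.
func passes(s perfStats, limit int) (bool, []string) {
	var failures []string
	if s.RPS/float64(limit) <= 0.9 { // criterion 1: throughput > 90%
		failures = append(failures, "throughput too low")
	}
	if s.Slowest > 0.5 { // criterion 2: s.Slowest <= 0.5
		failures = append(failures, "slowest request too slow")
	}
	if s.Stddev > 0.1 { // criterion 3: s.Stddev <= 0.1
		failures = append(failures, "latency stddev too high")
	}
	return len(failures) == 0, failures
}

func main() {
	ok, failures := passes(perfStats{RPS: 149, Slowest: 0.2, Stddev: 0.05}, 150)
	fmt.Println(ok, failures)
}
```

Collecting every failed criterion, rather than returning on the first one, lets a single run report all the reasons it did not pass.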

Member

I like the text as it is: a concise doc. Adding more details to the README is also a good idea. Another thought is to add more documentation to the command function to clarify how it works, and then just point to that from the README. That way, if we add a new threshold or change the existing values in the future, we don't have to update the doc. Thanks @patrocinio

Member

Thanks @spzala for the feedback.

> add more doc in the command function to clarify how it works - and then just point to that in the readme

This might not work. When we update the source code, the link might go out of date. So the best approach would be to include all the details in the README directly instead of just adding a link.

Member

@ahrtr thanks, and yes, I agree with you! It's more about pointing to the function, not a particular LOC. The README sounds good to me.

@@ -1505,6 +1505,10 @@ CHECK provides commands for checking properties of the etcd cluster.

CHECK PERF checks the performance of the etcd cluster for 60 seconds. Running the `check perf` often can create a large keyspace history which can be auto compacted and defragmented using the `--auto-compact` and `--auto-defrag` options as described below.

+ Notice that different workload models use different configurations in terms of number of clients and expected throughtput.
Contributor

@ptabor ptabor Jun 14, 2022

Please reference:

var checkPerfCfgMap = map[string]checkPerfCfg{
	// TODO: support read limit
	"s": {
		limit:    150,
		clients:  50,
		duration: 60,
	},
	"m": {

Member

@ahrtr ahrtr left a comment

LGTM

But I'd leave @ptabor to merge this PR, in case he has any different opinion.

Member

@ahrtr ahrtr left a comment

Please see files#r896538376

Member

@spzala spzala left a comment

Thanks @patrocinio lgtm with one inline comment.

@codecov-commenter

codecov-commenter commented Jun 16, 2022

Codecov Report

Merging #14111 (5cc91ef) into main (fc69053) will decrease coverage by 0.23%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main   #14111      +/-   ##
==========================================
- Coverage   75.23%   75.00%   -0.24%     
==========================================
  Files         452      452              
  Lines       36781    36775       -6     
==========================================
- Hits        27674    27583      -91     
- Misses       7373     7448      +75     
- Partials     1734     1744      +10     
Flag Coverage Δ
all 75.00% <100.00%> (-0.24%) ⬇️

Impacted Files Coverage Δ
etcdctl/ctlv3/command/check.go 12.61% <100.00%> (ø)
server/etcdserver/api/rafthttp/peer_status.go 87.87% <0.00%> (-12.13%) ⬇️
server/etcdserver/api/rafthttp/peer.go 87.01% <0.00%> (-8.45%) ⬇️
client/pkg/v3/tlsutil/tlsutil.go 83.33% <0.00%> (-8.34%) ⬇️
client/v3/namespace/watch.go 87.87% <0.00%> (-6.07%) ⬇️
raft/rafttest/node.go 95.00% <0.00%> (-5.00%) ⬇️
server/etcdserver/api/v3rpc/member.go 93.54% <0.00%> (-3.23%) ⬇️
server/etcdserver/cluster_util.go 70.35% <0.00%> (-3.17%) ⬇️
server/etcdserver/api/v3rpc/interceptor.go 74.47% <0.00%> (-3.13%) ⬇️
server/etcdserver/api/v3rpc/watch.go 85.90% <0.00%> (-2.69%) ⬇️
... and 24 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fc69053...5cc91ef.

@ahrtr
Member

ahrtr commented Jun 16, 2022

Looks good to me. Please squash the commits, thx.

This PR describes the fact that different workloads in the `check perf` command
are different, and the results might vary.

Signed-off-by: Eduardo Patrocinio <epatro@gmail.com>
@patrocinio
Contributor Author

Hey @ahrtr I squashed the commits. Thanks!

Member

@ahrtr ahrtr left a comment

LGTM

Thank you! @patrocinio

@ahrtr ahrtr merged commit 43434af into etcd-io:main Jun 17, 2022
@ahrtr
Member

ahrtr commented Jun 17, 2022

@patrocinio Can you please backport this PR to 3.5?

@patrocinio
Contributor Author

@ahrtr
Member

ahrtr commented Jun 18, 2022

@patrocinio Please cherry pick this PR to release-3.5.

@serathius
Member

I'm not sure if we should do a backport as we usually backport only fixes. Is there any benefit from backporting documentation update?

@ahrtr
Member

ahrtr commented Jun 20, 2022

Yes, it isn't a big deal. But I'd prefer to backport it, because it's useful when users run etcdctl 3.5 and read the 3.5 doc.

@serathius
Member

I mean that we have an official backport policy, which states: "The [backport] commits should be restricted to bug fixes and security patches."
https://github.com/etcd-io/etcd/blob/3cde98ca8cacddecba2bf6864f5f8fae0bbdab2d/Documentation/contributor-guide/release.md#patch-version-release

I would prefer to avoid breaking our own policies without having a strong incentive.

@serathius
Member

serathius commented Jun 21, 2022

Removing backport label for now. Feel free to re-add it when you get support from other maintainers.

6 participants