Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement tools/rw-heatmaps in go #17428

Merged
merged 1 commit into from
Mar 11, 2024

Conversation

ivanvc
Copy link
Member

@ivanvc ivanvc commented Feb 14, 2024

This PR translates Python's tools/rw-heatmaps to Go. It covers the use case in which the Python script takes a single input and plots a grid of heatmap (tripcolor) charts.

The Go implementation uses gonum.org/v1/plot, a plotting solution that could generate similar style charts while creating images. Unfortunately, it doesn't provide an equivalent to Python's tripcolor, so I'm doing a heatmap instead. While trying to keep the Go implementation similar to the original one, I used the closest color palette to the one that Python used (however, the color scale for Go's palette is inverted, so it needs to be inverted to have a similar color palette).

I introduced a go.mod in the tools/rw-heatmaps directory. Because the tools/ directory doesn't have a module defined. Therefore, any dependencies from this rewrite were getting into the top-level's go.mod. In the future, adding a module in tools/ would be good if more tools are written in Go. Right now, the ones written in Go are not introducing any new dependencies, so the top level doesn't carry these dependencies.

Python implementation charts

python_read
python_write

Go implementation charts

go_read
go_write

Used CSV

The CSV file used to generate the previous plots was this one: result-202401270029.csv

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@k8s-ci-robot
Copy link

Hi @ivanvc. Thanks for your PR.

I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tools/rw-heatmaps/README.md Outdated Show resolved Hide resolved
@ivanvc
Copy link
Member Author

ivanvc commented Feb 15, 2024

@jmhbnz, any feedback on the directory structure? Is it too much? Should I flatten pkg/ subdirectories?

@jmhbnz
Copy link
Member

jmhbnz commented Feb 15, 2024

@jmhbnz, any feedback on the directory structure? Is it too much? Should I flatten pkg/ subdirectories?

We don't have many files so feel free to flatten it for now. I'm pretty comfortable with it either way. Other reviewers may have stronger opinions on it, let's see.

@jmhbnz jmhbnz requested a review from ahrtr February 15, 2024 22:40
@ahrtr
Copy link
Member

ahrtr commented Feb 22, 2024

Thanks @ivanvc for the great work.

Please upload the CSV file(s) which you used to generate the heatmap diagrams above in this PR.

Previous implementation has an performance comparison diagram as well, please refer to https://etcd.io/blog/2021/announcing-etcd-3.5/. But I do not see such diagram in this PR.

Personally I think the line chart generated in #15060 is clearer, but just as we discussed in the community meeting, it doesn't conflict with this PR. We can consider to add more visualisation in future.

@ivanvc
Copy link
Member Author

ivanvc commented Feb 22, 2024

Thanks @ivanvc for the great work.

Please upload the CSV file(s) which you used to generate the heatmap diagrams above in this PR.

Hi @ahrtr, thank you. I uploaded it in the pull request description, but it is here too: https://github.com/etcd-io/etcd/files/14378480/result-202401270029.csv

Previous implementation has an performance comparison diagram as well, please refer to https://etcd.io/blog/2021/announcing-etcd-3.5/. But I do not see such diagram in this PR.

I'll work on this, but I'm unfamiliar with how to generate this diagram, can you point me on how to do it? Seems like it wasn't covered in the original README.

@ahrtr
Copy link
Member

ahrtr commented Feb 22, 2024

I'll work on this, but I'm unfamiliar with how to generate this diagram, can you point me on how to do it? Seems like it wasn't covered in the original README.

I am not familiar with that either. But technically speaking, I think it's should be same way as you generate the heatmap diagram above; you just need to replace the each value (displayed on the right y-axis) with (new value) / (old value).

@ahrtr
Copy link
Member

ahrtr commented Feb 22, 2024

you just need to replace the each value (displayed on the right y-axis) with (new value) / (old value).

For example, assuming when connection is 32 and value size is 256, the throughput of the main branch is 1000 qps, while your PR is 1200, then use the value 1200/1000 = 1.2.

@ivanvc ivanvc force-pushed the reimplement-rw-heatmaps-in-go branch 2 times, most recently from b8c4391 to d1652ad Compare March 5, 2024 21:25
@ivanvc
Copy link
Member Author

ivanvc commented Mar 5, 2024

I finished porting the comparison charts to generate the blog post you mentioned @ahrtr. The following is the Go output of running the tool with the following two datasets:

Running

go run . -o comparison_go -t compare ../../result-202401270029.csv ../../result-202402281701.csv  -f png

Generates:

Read

comparison_go_read

Write

comparison_go_write

Passing the non-zero centered option:

go run . --zero-centered=false -o non_zero_centered_comparison_go -t compare ../../result-202401270029.csv ../../result-202402281701.csv -t png

Generates:

Read

non_zero_centered_comparison_go_write

Write

non_zero_centered_comparison_go_read

As comparison, this is the Python output:

Read

python_comparison_write

Write

python_comparison_read

By the way, I noticed a performance improvement in running this with Python and Go.

# Python
real    0m9.052s
user    0m8.520s
sys     0m2.628s
# Go
real    0m0.534s
user    0m0.672s
sys     0m0.151s

@ivanvc
Copy link
Member Author

ivanvc commented Mar 5, 2024

I squashed the commits, and I'm undrafting the PR.

@ivanvc ivanvc marked this pull request as ready for review March 5, 2024 21:39
@ivanvc ivanvc requested a review from jmhbnz March 5, 2024 21:40
@ivanvc
Copy link
Member Author

ivanvc commented Mar 5, 2024

@jmhbnz, a heads up that this is the first commit that fails the Go Vulnerability Checker because of the just-released GO-2024-2611 (see https://github.com/etcd-io/etcd/actions/runs/8163213119/job/22315941148?pr=17428). This wasn't caught by Dependabot (as it was released hours ago). But can potentially block many PRs from now on. Would it make sense to create a new issue and have a PR soon to address it? I can work on this.

@jmhbnz
Copy link
Member

jmhbnz commented Mar 5, 2024

@jmhbnz, a heads up that this is the first commit that fails the Go Vulnerability Checker because of the just-released GO-2024-2611 (see https://github.com/etcd-io/etcd/actions/runs/8163213119/job/22315941148?pr=17428). This wasn't caught by Dependabot (as it was released hours ago). But can potentially block many PRs from now on. Would it make sense to create a new issue and have a PR soon to address it? I can work on this.

If you have time please lend a hand with this weeks dependabot bumping and include an additional commit bumping that CVE impacted dependency also.

Feel free to ping me on k8s slack if you do want to tackle the weekly dependency bump and have any questions. We have instructions and the rotation worksheet link here https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/dependency_management.md

@ahrtr
Copy link
Member

ahrtr commented Mar 7, 2024

Please rebase this PR.

@ivanvc ivanvc force-pushed the reimplement-rw-heatmaps-in-go branch from d1652ad to e7ee996 Compare March 7, 2024 22:41
@ivanvc
Copy link
Member Author

ivanvc commented Mar 8, 2024

/retest

@ivanvc
Copy link
Member Author

ivanvc commented Mar 8, 2024

Done @ahrtr

@ahrtr
Copy link
Member

ahrtr commented Mar 8, 2024

@ivanvc Great work! The code looks super clean and easy to understand.

  • I did not read through the code under pkg/chart; instead I just compared a couple of values in the the CSV files and the images being generated, and they matches.
  • I suggest to integrate the line charts in Reimplement the tool rw-heatmaps using golang and rename it to rw-benchmark #15060 in the next step. They can share the same data structure defined in pkg/dataset in this PR. Please feel free to do it in a followup PR if you are interested. But we may want to rename the rw-heatmaps to something like rw-benchmark, in which there will be two visualisations: heatmap and line charts. All these can be in followup PRs.

tools/rw-heatmaps/README.md Outdated Show resolved Hide resolved
tools/rw-heatmaps/README.md Outdated Show resolved Hide resolved
@ivanvc ivanvc force-pushed the reimplement-rw-heatmaps-in-go branch from e7ee996 to 0aec52f Compare March 8, 2024 23:27
@ivanvc
Copy link
Member Author

ivanvc commented Mar 8, 2024

@ivanvc Great work! The code looks super clean and easy to understand.

* I did not read through the code under `pkg/chart`; instead I just compared a couple of values in the the CSV files and the images being generated, and they matches.

* I suggest to integrate the line charts in [Reimplement the tool `rw-heatmaps` using golang and rename it to `rw-benchmark` #15060](https://github.com/etcd-io/etcd/pull/15060) in the next step. They can share the same data structure defined in `pkg/dataset` in this PR. Please feel free to do it in a followup PR if you are interested. But we may want to rename the `rw-heatmaps` to something like `rw-benchmark`, in which there will be two visualisations: heatmap and line charts. All these can be in followup PRs.

Thank you!

I can definitely work on the line charts in a follow-up PR.

@ivanvc ivanvc requested a review from ahrtr March 8, 2024 23:32
tools/rw-heatmaps/README.md Outdated Show resolved Hide resolved
tools/rw-heatmaps/README.md Outdated Show resolved Hide resolved
Signed-off-by: Ivan Valdes <ivan@vald.es>
@ivanvc ivanvc force-pushed the reimplement-rw-heatmaps-in-go branch from 0aec52f to 1576e2e Compare March 10, 2024 03:26
Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Thank you! Great work!

Copy link
Member

@jmhbnz jmhbnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great, thanks @ivanvc.

One suggestion that I'm happy to be addressed in a follow-up would be to add a .gitignore in the rw-heatmaps subdirectory to ensure users of the tool who run it from that directory (which may be fairly common) don't inadvertantly commit resulting files.

Here is what I see after building and running rw-heatmaps:

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        result-202403111419.csv
        rw-heatmaps
        test.png_read.jpg
        test.png_write.jpg

Overall I think this is ready to merge and we can continue iterating with smaller pr's moving forward.

@ivanvc
Copy link
Member Author

ivanvc commented Mar 11, 2024

Let me know if you want me to add a .gitignore in this PR. That's fine with me.

@jmhbnz
Copy link
Member

jmhbnz commented Mar 11, 2024

Let me know if you want me to add a .gitignore in this PR. That's fine with me.

Let's leave this pr as is now to merge and come back with smaller follow-ups 🙏🏻

@ahrtr ahrtr merged commit a23bee3 into etcd-io:main Mar 11, 2024
39 checks passed
@ivanvc ivanvc deleted the reimplement-rw-heatmaps-in-go branch March 11, 2024 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants