server: set multiple concurrentReadTx instances share one txReadBuffer #12933
Conversation
Force-pushed a4bda87 to 34793bf.
Codecov Report
@@            Coverage Diff             @@
##           master   #12933      +/-   ##
==========================================
- Coverage   73.31%   72.84%    -0.47%
==========================================
  Files         430      430
  Lines       34182    34212       +30
==========================================
- Hits        25060    24922      -138
- Misses       7194     7350      +156
- Partials     1928     1940       +12
Force-pushed 0639cca to 2f9612b.
This is a patch inspired by #12529.
A different approach is proposed in #12935.
Here are the performance graphs comparing the latest master branch and the concurrent read tx optimization code. Data file:
Force-pushed 79ee062 to 0cbc33f.
Force-pushed 0cbc33f to 07aabea.
Test results look great.
Force-pushed 1484273 to bcc8aae.
I have to mention that the integration test with 4 CPUs failed once and passed this time. I want to make sure it is not a bug in my code that can make the integration test fail.
@jingyih, please take a look at this patch if you have time and give some suggestions.
Can someone help add …
Can we explain why serving from a potentially stale read buffer is safe?
Thank you, Gyuho, for the review comments. I updated the code per your suggestions. I like the new code logic you posted in the review and am following it, but please check my updated logic and decide whether the last if-condition is correct. 🙏
Ah, I see. Do you have an idea why the previous tests were much better? Either way, this was a great exercise. Once the script is contributed, we can look into some ways to automate it.
I was comparing my first patch with the latest updated patch, so ideally the ratio should be 1.0 because we did not change the real logic much beyond addressing the review issues; the difference should be within acceptable variance. I am going to post a comparison between the main branch and the latest shared_txReadBuffer branch soon. The files I compared were concurrentReadTx-optimized2.csv and concurrentReadTx-optimized.csv, so that plot is a little misleading. I will post a new update soon comparing the latest patch vs. the new main branch.
Hi guys, these are the performance graphs. The first graph is the latest main branch performance, the second is the performance of this patch, and the third is the comparison between the two. It looks like we can still get a similar performance improvement.
I will file another PR to check in the benchmark script.
@ptabor There are no new changes except merging with the latest code.
@wilsonwang371 The latest graph above is way better than #12933 (comment). Why is there so much delta when the logic is the same?
Oh, they are different comparisons: A (main branch) → B (initial pull-request patch) → C (latest pull-request patch). B vs. C is the graph in #12933 (comment).
This PR is rebased on the latest main branch, right?
So, this is based on the current …
The main branch result is based on commit 932d42b.
The result of this patch is rebased on commit aeb9b5f.
Shall I rebase it on the latest main branch and do the comparison again? It seems like the recent new changes do not make an observable difference in performance.
As long as it has your PR #12896, it shouldn't affect the results. That said, 46b49a6 is the recent commit around storage. If we can rebase on the latest commit rather than the one in …
This was already done a few hours ago; I did it through the GitHub UI, and the check results were generated after that merge. But generating the performance graphs with Python as done above will take around 24~48 hours.
Hi Gyuho, this is the performance result of the pull request based on 88d26c1. The performance is the same as the code used in the previous comments.
@wilsonwang371 Really solid testing data; greatly appreciated. Give me some time to go through this again, and I will make some suggestions for the comments.
Hi Gyuho and Piotr, is it still possible to have this patch merged, tested, and shipped in the 3.5 release? What is our next step before the final 3.5 release is available?
@wilsonwang371 I will get back to you by this weekend.
Can you add the requested comments to explain the logic inline?
@wilsonwang371 Can we also remove the merge commit and squash/rebase from the latest main branch? Thanks.
Force-pushed 88d26c1 to 9c82e8c.
lgtm. Thanks for the awesome work @wilsonwang371.
Can we also backport this to the release-3.5 branch?
Thank you, Wilson. +1 to backporting (I think there are 3 chained PRs to cherry-pick), and the performance impact of this change is very significant. According to https://www.kubernetes.dev/resources/release/, the k8s code freeze is …
Currently, concurrentReadTx makes a copy of the txReadBuffer inside the backend readTx. However, this buffer is never modified by concurrentReadTx, so a single copy of the txReadBuffer can be shared as long as no new changes have been committed to the backend DB. We therefore use a cached txReadBuffer to avoid excessive deep-copy operations when creating a new read transaction.
More performance testing results will be uploaded later.
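To make the idea concrete, here is a minimal, hypothetical Go sketch of the buffer-sharing scheme. The type and field names below (`backend`, `txReadBuffer`, `version`, `ConcurrentReadTx`, `commit`) are simplified stand-ins rather than the actual etcd implementation: the point is only that the deep copy happens once per commit instead of once per read transaction.

```go
package main

import (
	"fmt"
	"sync"
)

// txReadBuffer stands in for the backend's write-back read buffer; here it is
// just a map snapshot plus a version counter (hypothetical simplification).
type txReadBuffer struct {
	version uint64
	data    map[string]string
}

// deepCopy models the expensive per-transaction copy this patch avoids.
func (b *txReadBuffer) deepCopy() *txReadBuffer {
	cp := &txReadBuffer{version: b.version, data: make(map[string]string, len(b.data))}
	for k, v := range b.data {
		cp.data[k] = v
	}
	return cp
}

// backend caches one shared copy of the read buffer. The copy is refreshed
// only when the live buffer's version has advanced, i.e. after a commit.
type backend struct {
	mu           sync.Mutex
	buf          *txReadBuffer // live buffer, mutated by writes
	cachedBuf    *txReadBuffer // shared read-only copy handed to readers
	cachedBufVer uint64
}

// ConcurrentReadTx returns a read-only view. Instead of deep-copying the live
// buffer on every call, it reuses the cached copy while the version matches.
func (b *backend) ConcurrentReadTx() *txReadBuffer {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.cachedBuf == nil || b.cachedBufVer != b.buf.version {
		b.cachedBuf = b.buf.deepCopy() // re-copy only after new commits
		b.cachedBufVer = b.buf.version
	}
	return b.cachedBuf // many concurrent readers share this one copy
}

// commit simulates a batch commit that changes the buffer contents.
func (b *backend) commit(k, v string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.buf.data[k] = v
	b.buf.version++
}

func main() {
	be := &backend{buf: &txReadBuffer{data: map[string]string{"a": "1"}}}
	r1 := be.ConcurrentReadTx()
	r2 := be.ConcurrentReadTx()
	fmt.Println(r1 == r2) // true: both readers share the cached copy
	be.commit("b", "2")
	r3 := be.ConcurrentReadTx()
	fmt.Println(r1 == r3) // false: the commit bumped the version, forcing a fresh copy
}
```

Since readers never mutate the shared copy, sharing it is safe as long as the invalidation check catches every commit; the real invalidation condition in the PR is more involved than this version counter, which only illustrates the sharing idea.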
@gyuho @ptabor
Guys, please take a look at this and let me know if there are any concerns or issues.