Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix delete inconsistencies in read buffer #17195

Merged
merged 2 commits into from
Jan 11, 2024
Merged

Conversation

siyuanfoundation
Copy link
Contributor

@siyuanfoundation siyuanfoundation commented Jan 3, 2024

@siyuanfoundation
Copy link
Contributor Author

cc @ahrtr @serathius

server/storage/backend/tx_buffer.go Outdated Show resolved Hide resolved
server/storage/backend/batch_tx_test.go Outdated Show resolved Hide resolved
@ahrtr
Copy link
Member

ahrtr commented Jan 4, 2024

The PR includes fix for

This PR is changing one of the critical parts of etcd, please let's breakdown it into two separate PRs. Each only fixes one problem. For each PR, the first commit only includes test cases to reproduce the issue you are going to fix in the 2nd or 3rd commits. Ensure you can reproduce the issue before you adding the 2nd or 3rd commit/patch, and can't reproduce it anymore after adding the patch.

@siyuanfoundation siyuanfoundation force-pushed the txBuf1 branch 2 times, most recently from 2828c9e to 656fae1 Compare January 5, 2024 00:51
@siyuanfoundation siyuanfoundation changed the title Fix some inconsistencies in read buffer Fix delete inconsistencies in read buffer Jan 5, 2024
@siyuanfoundation
Copy link
Contributor Author

/retest

Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@serathius
Copy link
Member

This PR is changing one of the critical parts of etcd, please let's breakdown it into two separate PRs. Each only fixes one problem. For each PR, the first commit only includes test cases to reproduce the issue you are going to fix in the 2nd or 3rd commits. Ensure you can reproduce the issue before you adding the 2nd or 3rd commit/patch, and can't reproduce it anymore after adding the patch.

Agree with that, we need to make sure that each of those issues are properly understood and tested. Having a single PR for both adds unnecessary cognitive overhead.

if t.pending >= t.backend.batchLimit {
// We need to commit the transaction if
// 1. t.pending >= t.backend.batchLimit for performance,
// because We don't want buffer to grow too big as it has worse complexity.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my current understanding, however would be good to confirm. Could you check PRs when the batchlimit was introduced and see what was the motivation back then. Referencing PR would be also useful.

cc @ahrtr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is one relevant pr #10283

checkForEachResponseMatch(t, b.BatchTx(), b.ReadTx(), schema.Test, nil, nil)
}

func checkRangeResponseMatch(t *testing.T, tx backend.BatchTx, rtx backend.ReadTx, key, endKey []byte, limit int64, expectedKeys, expectedValues [][]byte) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By adding validation of expected values you mixed assertion based testing with property based testing. Let's pick one of them.

Originally I implemented the test as property testing. The property I was testing was "ReadTx should always return the same response as BatchTX". This property should always hold, no matter what operations are executed before. I did it to allow make validation independent from test scenario and potentially randomize operations before validation.

Recent changes to the PR added expectedKeys and expectedValues testing, this means you hardeded the validation to specific cases. This is still correct, however the code can be much more simplified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In unit test, the test cases are hard coded. I think it is important to test the actual key and values in case ReadTx and BatchTX make the same mistake.
Changed the code to just check the results in checkForEach

server/storage/backend/batch_tx.go Outdated Show resolved Hide resolved
@@ -55,6 +55,66 @@ func TestUserError(t *testing.T) {
}
}

func TestAddUserAfterDelete(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you confirmed that the test case can reproduce the issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed the test case if the defensive commit is removed. Because auth force commits for every change, this is not actually a problem for auth right now.

I am adding this test for completeness.

Copy link
Member

@serathius serathius Jan 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, nice find. This defensive commit seems scary, similar to #17119 we noticed some inconsistency and didn't root cause it.

I think we should consider a followup where we add adding verifiers #17158 (comment), remove the defensive commit and see if we find anything.

tests/integration/clientv3/user_test.go Outdated Show resolved Hide resolved
tests/integration/member_test.go Show resolved Hide resolved
tests/integration/member_test.go Outdated Show resolved Hide resolved
tests/integration/member_test.go Outdated Show resolved Hide resolved
@ahrtr ahrtr mentioned this pull request Jan 8, 2024
4 tasks
@siyuanfoundation siyuanfoundation force-pushed the txBuf1 branch 3 times, most recently from 923a035 to 1a249b1 Compare January 9, 2024 00:48
@siyuanfoundation
Copy link
Contributor Author

/retest

Signed-off-by: Siyuan Zhang <sizhang@google.com>
Signed-off-by: Siyuan Zhang <sizhang@google.com>
@siyuanfoundation
Copy link
Contributor Author

/retest

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Could you backport the fix to 3.5 and 3.4? thx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants