-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tests/linearizability: added sleep fail point injection #14796
Conversation
Injected inside *raftNode.start() to test functionality Signed-off-by: Ramil Mirhasanov <ramil600@yahoo.com>
fix: #14729 |
@@ -157,7 +157,7 @@ func (r *raftNode) tick() { | |||
// to modify the fields after it has been started. | |||
func (r *raftNode) start(rh *raftReadyHandler) { | |||
internalTimeout := time.Second | |||
|
|||
// gofail: var raftBeforeStart struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that this is not a useful gofail point so let's not add it. If you want to test sleep failpoint you can just use other ones like backend/defragBeforeCopy
.
Path: failpoint, | ||
} | ||
//check whether sleep was enabled | ||
r, err := http.NewRequest("GET", failpointsUrl.String(), nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think this is needed. If setupGoFailpoint
succeeded then we can assume that sleep was enabled.
I think there was a misunderstanding about #14729.
Based on http response success we can assume that failpoint was enabled. What we don't know if it really was executed. After we setup sleep failpoint we need a way to verify if it was really triggered to make sure that tests do what they claim to do. I incorrectly assumed that failpoint is cleared after it was executed. This is true for triggering What I propose is for sleep failpoint to work like:
Question is how to know that failpoint was executed. It could be either done by:
I prefer the second approach. What do you think? |
The linearizability test is based on the e2e test framework, each time each case creates a new cluster and start new processes, so no need to disable the failpoints in this case.
The test case needs to explicitly trigger the related code path. |
Since the random failpoint will be triggered multiple times in one test case, and each time it might select a different failpoint, so we need to disable the failpoint each time after it's triggered. |
One suggestion I have is resetting sleep duration value to 0 and deleting description in the terms field of the failpoint, once sleep is executed. |
Not sure I understand why this is needed. We don't need ensure exact number of sleep triggers. Sleep failpoint for linearizability tests should ensure that order of goroutine execution doesn't influence the result, so running it multiple times doesn't matter as long we can recover the state before we run other tests. Length of the sleep also matters, for rarely run code (compact loop) we might set a long sleep (100ms) and wait for one execution . On the other hand short loops (like raft/apply code) we cannot set long sleep as it could trigger probe failures. It might be better to set short sleep (1-10ms) and wait until we have 10-100 runs. This is why I would prefer implement a run counter for failpoint. Proposed flow:
This requires implementing counter for each failpoint. Please let me know if you need a help with that. |
Added PR in gofail to implement the counter: runtime: added counter to failpoint.terms field |
Thinking that maybe we should add tests to gofail to avoid merging broken PRs like it was for |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions. |
tests/linearizability: added sleep failpoint injection functionality
Injected sleep fail point inside *raftNode.start() to test functionality
Signed-off-by: Ramil Mirhasanov ramil600@yahoo.com
Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.