Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: configmgr need wait for all processed avoid tests data race #3891

Merged
merged 1 commit into from
Dec 24, 2024

Conversation

zedongh
Copy link
Contributor

@zedongh zedongh commented Dec 13, 2024

configmgr Start test need ensure all queue source be processed then do assert action. avoid call l.called when it write at SyncConfig.

@lowang-bh
/close #3884

@volcano-sh-bot
Copy link
Contributor

Welcome @zedongh!

It looks like this is your first PR to volcano-sh/volcano.

Thank you, and welcome to Volcano. 😃

@volcano-sh-bot volcano-sh-bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Dec 13, 2024
@zedongh zedongh force-pushed the master branch 2 times, most recently from e352aa2 to 0ad0054 Compare December 13, 2024 15:57
@Monokaix
Copy link
Member

seems add a rwlock is enough: )

@lowang-bh
Copy link
Member

seems add a rwlock is enough: )

Agree!

@zedongh zedongh closed this Dec 19, 2024
@volcano-sh-bot volcano-sh-bot added retest-not-required-docs-only approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 19, 2024
…be processed

Signed-off-by: zedongh <248348907@qq.com>
@zedongh zedongh reopened this Dec 19, 2024
@volcano-sh-bot volcano-sh-bot removed retest-not-required-docs-only approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Dec 19, 2024
@zedongh
Copy link
Contributor Author

zedongh commented Dec 19, 2024

seems add a rwlock is enough: )

Agree!

Thanks a lot. After review the code of fake source, I make last simple commit in Stop() replace queue.Shutdown with queue.ShutdownWithDrain. Let me to cleaify the reason.

I believe there should be no concurrency here, so it is not suitable to use locks. The only scenario where concurrency occurs is as follows:

Mgr.Start goroutine

[Order 1]   key, quit := queue.Get()  // consume item and queue size -1

[Order 3a]  m.notifyListeners()

Test goroutine

[Order 2]. s.Stop() // queue length is zero, break loop, queue Shutdown, but item not Done yet
          // if f.queue.Len() == 0 {
          //  f.queue.ShutDown()
          //  return
          // }

[Order 3b] assert.Equal(t, tc.expectedNotifyCalled, l.called)

3a and 3b have no order guarantee. may cause data race

Simple to make origin test failed is set a time out in SyncConfig

func (l *FakeListener) SyncConfig(cfg *api.ColocationConfig) error {
        // time out will make this calling after testing assert, make it failed
	// time.Sleep(1*time.Second + 500*time.Millisecond)
	l.called++

@lowang-bh
Copy link
Member

As the issue described,l.called++ at line 224 and l.called at line 305 has a read & write conflict.

@zedongh
Copy link
Contributor Author

zedongh commented Dec 21, 2024

As the issue described,l.called++ at line 224 and l.called at line 305 has a read & write conflict.

Yeah, the reason is fakeSource Stop use queue.Shutdown not queue.ShutdownWithDrain cause the last queue item processing data race. So just fixed the Stop rather than add a rwlock.

@lowang-bh
Copy link
Member

/ok-to-test

@volcano-sh-bot volcano-sh-bot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Dec 21, 2024
@JesseStutler
Copy link
Member

Can we push forward this PR? I also meet this e2e failing frequently @hwdef @Monokaix

@Monokaix
Copy link
Member

/approve

@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Monokaix

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 24, 2024
@Monokaix
Copy link
Member

I think we can merge it first, this is a reasoable change, if it still fails, we can re-work ion that. @lowang-bh

@lowang-bh
Copy link
Member

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Dec 24, 2024
@volcano-sh-bot volcano-sh-bot merged commit b169623 into volcano-sh:master Dec 24, 2024
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

data race in configmgr_test.go
5 participants