Fix race condition issue #391

Merged
merged 1 commit into from
Jul 30, 2019

Conversation

TommyLike
Contributor

For #359

@volcano-sh-bot volcano-sh-bot requested review from hex108 and k82cn July 26, 2019 03:46
@volcano-sh-bot volcano-sh-bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jul 26, 2019
@@ -253,6 +253,7 @@ func (cc *Controller) syncJob(jobInfo *apis.JobInfo, updateStatus state.UpdateSt

waitCreationGroup := sync.WaitGroup{}
waitCreationGroup.Add(len(podToCreate))
stateMutex := sync.Mutex{}
Member

Move the mutex into the function.

Contributor Author

classifyAndAddUpPodBaseOnPhase is not the only place that we need to protect.

Member

The other places should also move into a function; do not use the mutex in more than one place.
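A minimal sketch of what this suggestion could look like, assuming a hypothetical phaseCounter type (it is not part of the Volcano code base): the mutex lives next to the counters it guards, and only the methods of that type lock it, so callers never touch the lock directly.

```go
package main

import (
	"fmt"
	"sync"
)

// phaseCounter is a hypothetical stand-in for the per-job pod counters.
// The mutex is a field of the type, next to the data it protects.
type phaseCounter struct {
	mu      sync.Mutex
	pending int
	running int
}

// add classifies one pod phase and updates the matching counter.
// Locking happens here, not at the call sites.
func (c *phaseCounter) add(phase string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	switch phase {
	case "Pending":
		c.pending++
	case "Running":
		c.running++
	}
}

// snapshot reads both counters under the same lock.
func (c *phaseCounter) snapshot() (pending, running int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.pending, c.running
}

// countPhases classifies the phases concurrently, one goroutine per entry.
func countPhases(phases []string) (pending, running int) {
	counter := &phaseCounter{}
	var wg sync.WaitGroup
	wg.Add(len(phases))
	for _, p := range phases {
		go func(phase string) {
			defer wg.Done()
			counter.add(phase)
		}(p)
	}
	wg.Wait()
	return counter.snapshot()
}

func main() {
	pending, running := countPhases([]string{"Pending", "Running", "Running"})
	fmt.Println(pending, running) // prints "1 2"
}
```

With this shape, every critical section goes through add or snapshot, which is one way to address the concern that classifyAndAddUpPodBaseOnPhase is not the only place needing protection.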

Collaborator

FYI:

 // A Mutex must not be copied after first use.

So this is not right; you can change it to stateMutex := &sync.Mutex{}
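For context, the copy rule quoted above matters when a mutex crosses a function boundary. A small sketch, with addOne and sumWithSharedLock as hypothetical helper names, of passing the lock by pointer so that every caller locks the same sync.Mutex:

```go
package main

import (
	"fmt"
	"sync"
)

// addOne takes the mutex by pointer, so every caller locks the same
// sync.Mutex value. A by-value parameter (mu sync.Mutex) would give each
// call its own copy, which is what the sync documentation warns against
// and what the copylocks check in `go vet` reports.
func addOne(mu *sync.Mutex, total *int) {
	mu.Lock()
	defer mu.Unlock()
	*total++
}

// sumWithSharedLock runs n goroutines that all increment through addOne.
func sumWithSharedLock(n int) int {
	mu := &sync.Mutex{}
	total := 0
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			addOne(mu, &total)
		}()
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(sumWithSharedLock(1000)) // prints 1000
}
```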

Contributor Author

@hzxuzhonghu just curious, why would this not work? Which code leads to multiple copies of the mutex instance?

Collaborator

Actually, this variable is used by many goroutines, and if you print its address inside the goroutines, you will find that the addresses are not the same. Different locks cannot protect the same critical section.

Contributor Author

But the mutex instance is not passed as a parameter to those goroutines.

Collaborator

Makes sense, I misremembered.
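The conclusion of this thread can be checked with a short sketch. Assuming a hypothetical helper named lockAddresses, it declares the mutex by value as in the diff, lets several goroutines close over it, and records the address each one sees. Go closures capture variables by reference, so no copy is made as long as the mutex is not passed as an argument.

```go
package main

import (
	"fmt"
	"sync"
)

// lockAddresses starts n goroutines that close over a plain (non-pointer)
// sync.Mutex and records the address each goroutine observes.
func lockAddresses(n int) (outer *sync.Mutex, seen []*sync.Mutex) {
	stateMutex := sync.Mutex{} // declared by value, as in the diff
	seen = make([]*sync.Mutex, n)
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func(idx int) {
			defer wg.Done()
			stateMutex.Lock()
			defer stateMutex.Unlock()
			// The closure refers to the outer variable itself, not a copy.
			seen[idx] = &stateMutex
		}(i)
	}
	wg.Wait()
	return &stateMutex, seen
}

func main() {
	outer, seen := lockAddresses(4)
	for _, p := range seen {
		fmt.Println(p == outer) // prints true for every goroutine
	}
}
```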

@TommyLike TommyLike force-pushed the bug/fix_race_issue branch from 07c5adb to 5c44771 Compare July 29, 2019 07:09
@TravisBuddy

Hey @TommyLike,
Something went wrong with the build.

TravisCI finished with status errored, which means the build failed because of something unrelated to the tests, such as a problem with a dependency or the build process itself.

TravisBuddy Request Identifier: 07b62a00-b1d5-11e9-9ed2-8f218eb7efe2

@hzxuzhonghu
Collaborator

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 29, 2019
@k82cn
Member

k82cn commented Jul 30, 2019

/approve

@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: k82cn, TommyLike

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 30, 2019
@k82cn
Member

k82cn commented Jul 30, 2019

/lgtm

@TommyLike
Contributor Author

@asifdxtreme what's the command to retest the patch?

/retest

@volcano-sh-bot volcano-sh-bot merged commit da80e4f into volcano-sh:master Jul 30, 2019