Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix race condition in port reintroduction test #11735

Merged

Conversation

jrhizor
Copy link
Contributor

@jrhizor jrhizor commented Apr 5, 2022

fixes #9650 (comment)

This locks properly around accessing the ports now. The problem before was that it could be in the middle of a take in the background thread when it wasn't supposed to be.

@jrhizor jrhizor requested a review from davinchia April 5, 2022 17:48
@github-actions github-actions bot added area/platform issues related to the platform area/worker Related to worker labels Apr 5, 2022
@jrhizor jrhizor temporarily deployed to more-secrets April 5, 2022 17:49 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets April 5, 2022 17:49 Inactive
}

doneTakingPorts.countDown();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does the latch help this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wait on this assertTrue(doneTakingPorts.await(5, TimeUnit.SECONDS));

What this allows us to do is handle the case where it was taking an additional element in the thread before draining. Now it actually waits for the thread to complete using this mechanism so we guarantee the thread isn't touching the portsTaken list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Pity. I thought this would help with the socket hanging issue we are seeing.

Copy link
Contributor Author

@jrhizor jrhizor Apr 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly no.

Copy link
Contributor

@davinchia davinchia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you manage to run this locally in succession without running into the transient error I did?

@jrhizor
Copy link
Contributor Author

jrhizor commented Apr 5, 2022

I ran this a bunch of times and it works at least way more commonly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/platform issues related to the platform area/worker Related to worker
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Evaluate Options for fixing flakiness in K8s acceptance tests
2 participants