-
-
Notifications
You must be signed in to change notification settings - Fork 415
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strengthen the ordering for some atomic operations #4083
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
SeanTAllen
added
the
changelog - fixed
Automatically add "Fixed" CHANGELOG entry on merge
label
Apr 2, 2022
ponylang-main
added
the
discuss during sync
Should be discussed during an upcoming sync
label
Apr 2, 2022
When this is merged, the top comment should be used as the commit message when squashing. |
The changelog and release notes failures can be ignored. I opened this and then pushed a few changes quickly and we hit the secondary rate limits that GH has. Quicker than normal, but that is the cause, and we don't have any changelog or release notes changes in this to run. |
jemc
approved these changes
Apr 2, 2022
ponylang-main
removed
the
discuss during sync
Should be discussed during an upcoming sync
label
Apr 2, 2022
SeanTAllen
added a commit
that referenced
this pull request
Oct 28, 2022
In PR #4083, I reversed some relaxing of atomic orderings that were introduced in 2016 in #1209. When I was doing that change, I missed an additional strengthing that should have been done in the IOCP subsystem. If you note the original commit, I made a corresponding change in the EPoll subsystem, but missed doing it in the IOCP system.
SeanTAllen
added a commit
that referenced
this pull request
Oct 28, 2022
In PR #4083, I reversed some relaxing of atomic orderings that were introduced in 2016 in #1209. When I was doing that change, I missed an additional strengthing that should have been done in the IOCP subsystem. If you note the original commit, I made a corresponding change in the EPoll subsystem, but missed doing it in the IOCP system.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This commit undoes the changes introduced in 2016 in
#1209.
While we don't know that every change in #1209, we do know that a number of the
changes are the source of incorrect operations on Arm where we see evidence of
"bad things" like actors being run by more than 1 scheduler at a time.
Matt Parkinson from MSR Cambridge identified a number of issues with compare
and swap on Arm that Joe Eli McIlvain then found were introduced in #1209. The
issues would not impact x86, only CPUs with weaker memory models like Arm.
Given that we know that some of the changes in #1209 are wrong, Joe and I
decided that the best course of action is to roll them all back. While all
might not need to be rollback, we are following the Pony philosophy and valuing
correctness over performance.
We have much better testing now then we did 5 years ago and can put any future
relaxing of atomic order operations through a considerably more vigorous testing
process than we did years ago for #1209.
With these sorts of bugs, we can never say, they have been fixed. What we can
say is that we ran with Matt's suggested changes for over 48 hours without
encountering any crashes or other signs of incorrect behavior. As such, we
believe that these changes are an improvement but our nightly runtime stress
tests might find additional problems in the future.
It should be noted, that due to changes in the last 5 years, this is not an
exact reversion. I've done my best to make sure that all changes were reverted
as they would be seen in their current form today.
Closes #4069