Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cycle detector issue with checking for blocked actors #3154

Merged
merged 2 commits into from
May 28, 2019

Conversation

dipinhora
Copy link
Contributor

As part of PR #2800, I broke the cycle detector due to incorrect
incremental logic for querying actors for if they're blocked. I
relied on the hashmap internal array/bucket index value instead
of keeping tracking of # of actors queried for if they're blocked
during the current function call.

This commit resolves the logic issue described above by keeping
a running count of actors queried during the current function
call.

This commit also implements @plprofetes suggestion to make the
threshold for # of actors to query for if they're blocked into
a dynamic value. Specifically, the threshold is now the larger
of the CD_MAX_CHECK_BLOCKED constant (which is 1000) or 10%
of the total number of actors currently known to exist by the
cycle detector.

Resolves #2975.

As part of PR ponylang#2800, I broke the cycle detector due to incorrect
incremental logic for querying actors for if they're blocked. I
relied on the hashmap internal array/bucket index value instead
of keeping tracking of # of actors queried for if they're blocked
during the current function call.

This commit resolves the logic issue described above by keeping
a running count of actors queried during the current function
call.

This commit also implements @plprofetes suggestion to make the
threshold for # of actors to query for if they're blocked into
a dynamic value. Specifically, the threshold is now the larger
of the `CD_MAX_CHECK_BLOCKED` constant (which is 1000) or 10%
of the total number of actors currently known to exist by the
cycle detector.

Resolves ponylang#2975.
@dipinhora
Copy link
Contributor Author

This was tested using @plprofetes' example program in #2975.

The change regarding the dynamic threshold for # of actors to query if they're blocked is somewhat controversial as it can result in message storms originating from the cycle detector. Let me know if that change should be reverted or separated out into its own PR.

@dipinhora dipinhora force-pushed the i2975 branch 2 times, most recently from f95c7e9 to 3a757ae Compare May 15, 2019 10:55
@jemc
Copy link
Member

jemc commented May 21, 2019

Specifically, the threshold is now the larger
of the CD_MAX_CHECK_BLOCKED constant (which is 1000) or 10%
of the total number of actors currently known to exist by the
cycle detector.

Just wondering, does the name CD_MAX_CHECK_BLOCKED no longer make sense, then?

@dipinhora
Copy link
Contributor Author

Possibly. Got any ideas for a better name?

Copy link
Member

@jemc jemc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep coming up with bad naming ideas involving the "minimum maximum", so maybe not. To avoid confusion for future readers, I'm just suggesting we add this comment to make sure the reader knows how the constant is intended to be used, so that they can tell it is working as intended, and not see it as a bug that we may have a limit that is larger than the "maximum".

src/libponyrt/gc/cycle.c Outdated Show resolved Hide resolved
Co-Authored-By: Joe Eli McIlvain <joe.eli.mac@gmail.com>
@SeanTAllen SeanTAllen added the changelog - fixed Automatically add "Fixed" CHANGELOG entry on merge label May 28, 2019
@SeanTAllen
Copy link
Member

@dipinhora can you add release notes for this?

@SeanTAllen SeanTAllen merged commit a623447 into ponylang:master May 28, 2019
@dipinhora
Copy link
Contributor Author

release notes (possibly):

Fix issue #2975 where the cycle detector wasn't correctly reaping cycles of dead actors.
Change cycle detector to use the maximum of 1000 or 10% of all actors currently alive for how many actors to query when the cycle detector periodically checks to see which actors are currently blocked. This is based on a suggestion by @plprofetes to ensure that the cycle detector is effective even for applications with a large number (millions) of temporary actors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog - fixed Automatically add "Fixed" CHANGELOG entry on merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

no GC happening with cycles of 50k+ actors
3 participants