The "wrong" worker wakes up when other workers have designated work. #793

Open
wks opened this issue Apr 18, 2023 · 1 comment · Fixed by #794
Labels
C-bug Category: Bug P-normal Priority: Normal.

Comments

@wks
Collaborator

wks commented Apr 18, 2023

After PR #782, only the coordinator can open new buckets.

However, a worker may park and then receive designated work packets. In that case, the worker does not wake up on its own. When all workers have parked, the coordinator checks whether any worker has designated work. If so, the coordinator resets the group_sleep state and notifies all workers, hoping that they will wake up and execute their designated work.
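
To make the coordinator's decision concrete, here is a minimal sketch of what it might do on an AllParked event. The `Action` enum, the `on_all_parked` function, its parameters, and the order of the checks are all hypothetical; they are modelled on the three outcomes visible in the log in this issue, not taken from the mmtk-core source.

```rust
/// Hypothetical outcomes of an AllParked event, named after the log messages
/// "Some workers have designated work", "Bucket opened" and "GC finished".
#[derive(Debug, PartialEq)]
enum Action {
    WakeForDesignatedWork,
    OpenBucket,
    FinishGc,
}

/// Sketch of the coordinator's decision once every worker has parked.
/// `designated_counts[i]` is the number of packets designated to worker `i`;
/// `openable_buckets` is the number of buckets that could be opened now.
/// Both parameters, and the order of the checks, are assumptions.
fn on_all_parked(designated_counts: &[usize], openable_buckets: usize) -> Action {
    if designated_counts.iter().any(|&n| n > 0) {
        // Reset group_sleep and notify all workers.
        Action::WakeForDesignatedWork
    } else if openable_buckets > 0 {
        // Only the coordinator opens new buckets.
        Action::OpenBucket
    } else {
        Action::FinishGc
    }
}
```

Note that the first branch wakes every worker, not just the ones that own the designated packets.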

However, the first worker to unpark can be a "wrong" worker that has no designated work. It finds no designated work and no work in any bucket, so it sets group_sleep = true. That prevents the "right" workers, which actually have designated work, from unparking. The coordinator then notifies all workers again because some still have designated work. This can happen again and again, as shown in the following log.

[2023-04-15T08:22:46Z INFO  mmtk::util::heap::gc_trigger] [POLL] nursery: Triggering collection (128007/128000 pages)
[2023-04-15T08:22:46Z INFO  mmtk::plan::generational::global] Nursery GC
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:46Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Bucket opened
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Bucket opened
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Bucket opened
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Bucket opened
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   Some workers have designated work
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller] Controller received an event! AllParked
[2023-04-15T08:22:47Z WARN  mmtk::scheduler::controller]   ... GC finished.
[2023-04-15T08:22:47Z INFO  mmtk::scheduler::gc_work] End of GC (114450/128000 pages, took 1104 ms)

This repeats until the "right" worker happens to wake up first. The time this takes is unbounded, and the problem gets worse as the number of GC worker threads grows.
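
The loop can be reproduced with a small deterministic model. This is only a sketch under simplifying assumptions: `Worker`, `Model`, `run_round`, `rounds_needed` and `demo` are hypothetical names, there are no real threads, and the scheduling race is replaced by an explicit list saying which worker runs first in each round.

```rust
/// Hypothetical, simplified GC worker: only its designated-work queue.
struct Worker {
    designated: Vec<&'static str>,
}

/// Hypothetical model of the shared state described in this issue.
struct Model {
    workers: Vec<Worker>,
    group_sleep: bool,
}

impl Model {
    fn has_designated_work(&self) -> bool {
        self.workers.iter().any(|w| !w.designated.is_empty())
    }

    /// One wake-up round: the coordinator resets `group_sleep` and notifies
    /// everyone. Worker `first` happens to run first; the others follow in
    /// index order.
    fn run_round(&mut self, first: usize) {
        self.group_sleep = false;
        let n = self.workers.len();
        let order = std::iter::once(first).chain((0..n).filter(move |&i| i != first));
        for i in order {
            if self.group_sleep {
                // An earlier worker declared group sleep, so the remaining
                // workers stay parked, even if they have designated work.
                break;
            }
            if self.workers[i].designated.is_empty() {
                // "Wrong" worker: nothing to do, so it puts the group to sleep.
                self.group_sleep = true;
            } else {
                // "Right" worker: executes its designated packets.
                self.workers[i].designated.clear();
            }
        }
    }
}

/// Counts the rounds the coordinator needs before all designated work is
/// done, or returns `None` if `first_awake` runs out before that happens.
fn rounds_needed(model: &mut Model, first_awake: &[usize]) -> Option<usize> {
    let mut rounds = 0;
    let mut wake = first_awake.iter();
    while model.has_designated_work() {
        let &first = wake.next()?;
        model.run_round(first);
        rounds += 1;
    }
    Some(rounds)
}

/// Three workers; only worker 1 has a designated packet, and it loses the
/// wake-up race three times before finally running first.
fn demo() -> Option<usize> {
    let mut model = Model {
        workers: vec![
            Worker { designated: vec![] },
            Worker { designated: vec!["PrepareCollector"] },
            Worker { designated: vec![] },
        ],
        group_sleep: true,
    };
    rounds_needed(&mut model, &[0, 2, 0, 1])
}
```

In this model `demo()` returns `Some(4)`: three wasted rounds plus the one in which worker 1 runs first. Nothing in the protocol bounds the number of wasted rounds, since it depends only on which worker wins the race.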

@wks wks closed this as completed in #794 Apr 27, 2023
@wks wks reopened this Apr 27, 2023
@k-sareen k-sareen added C-bug Category: Bug P-normal Priority: Normal. labels Nov 6, 2023
@wks
Collaborator Author

wks commented Nov 14, 2023

A user observed, when running GCBench with ScalaNative, that Prepare and Release took an unreasonably long time due to some PrepareCollector and ReleaseCollector work packets being scheduled too late. It is likely caused by this problem.

See this Zulip conversation: https://mmtk.zulipchat.com/#narrow/stream/315620-Porting/topic/ScalaNative.2FMMTK/near/401861203
