Nim sets don't support negative ID in thief ID checks #106

Closed
mratsim opened this issue Mar 14, 2020 · 3 comments
Labels
bug 🪲 Something isn't working

Comments

Owner

mratsim commented Mar 14, 2020

From @liquid600pgm:

/path/to/.nimble/pkgs/weave-#master/weave/scheduler.nim(260) worker_entry_fn
/path/to/.nimble/pkgs/weave-#master/weave/scheduler.nim(225) schedulingLoop
/path/to/.nimble/pkgs/weave-#master/weave/workers.nim(40) runTask
/path/to/.nimble/pkgs/weave-#master/weave/parallel_for.nim(221) weaveTask_ParallelFor_
/path/to/.nimble/pkgs/weave-#master/weave/parallel_for.nim(38) weaveParallelForSection
/path/to/.nimble/pkgs/weave-#master/weave/runtime.nim(114) loadBalance
/path/to/.nimble/pkgs/weave-#master/weave/victims.nim(326) shareWork
/path/to/.nimble/pkgs/weave-#master/weave/victims.nim(302) distributeWork
/path/to/.nimble/pkgs/weave-#master/weave/victims.nim(246) splitAndSend
/path/to/.nimble/pkgs/weave-#master/weave/instrumentation/contracts.nim(86) evalSplit
/path/to/.choosenim/toolchains/nim-#devel/lib/system/fatal.nim(49) sysFatal
Error: unhandled exception: value out of range: -1 notin 0 .. 65535 [RangeError]
@mratsim mratsim added the bug 🪲 Something isn't working label Mar 14, 2020
mratsim added a commit that referenced this issue Apr 4, 2020
@mratsim mratsim changed the title Loop splitting: Weave can try to split into the negative Nim sets don't support negative ID in thief ID checks Apr 4, 2020
Owner Author

mratsim commented Apr 4, 2020

Analysis:

With the following compilation command

nim c -d:WV_debugSplit -d:WV_LazyFlowvar --verbosity:0 --hints:off --warnings:off --threads:on --outdir:build -r weave/parallel_for.nim

After running it a couple of times, you will get a stack trace similar to:

Worker 15: 44 steps left (start: 156, current: 156, stop: 200, stride: 1, 6 thieves)
Worker 14: loop task 0x54504389 (iterations [140, 142)) waiting for the remainder
Worker 30: Finished loop task 0x54504389 (iterations [144, 145)) (futures: 0x00000000)
Worker 14: loop task 0x54504389 (iterations [140, 142)) complete
Worker  4: Sending [127, 129) to worker 10 (2 steps) (hasFuture: 1, dependsOnFutures: 0x00000000)
Worker  4: Finished loop task 0x54504389 (iterations [126, 127)) (futures: 0x3001d800)
Worker 10: has 0 steal requests queued
Worker 10: workSharing split, thiefID 22, total subtree thieves 3, left{id: 21, waiting: 1, requests: 1}, right{id: 22, waiting: 1, requests: 2}
Worker 10: 2 steps left (start: 127, current: 127, stop: 129, stride: 1, 3 thieves)
/home/beta/Programming/Nim/weave/weave/scheduler.nim(260) worker_entry_fn
/home/beta/Programming/Nim/weave/weave/scheduler.nim(225) schedulingLoop
/home/beta/Programming/Nim/weave/weave/workers.nim(40) runTask
/home/beta/Programming/Nim/weave/weave/parallel_for.nim(233) weaveTask_ParallelForAwaitable_
/home/beta/Programming/Nim/weave/weave/parallel_for.nim(77) weaveParallelForAwaitableSection
/home/beta/Programming/Nim/weave/weave/runtime.nim(114) loadBalance
/home/beta/Programming/Nim/weave/weave/victims.nim(326) shareWork
/home/beta/Programming/Nim/weave/weave/victims.nim(302) distributeWork
/home/beta/Programming/Nim/weave/weave/victims.nim(246) splitAndSend
/home/beta/Programming/Nim/weave/weave/instrumentation/contracts.nim(86) evalSplit
/home/beta/.choosenim/toolchains/nim-1.2.0/lib/system/fatal.nim(49) sysFatal
Error: unhandled exception: value out of range: -1 notin 0 .. 65535 [RangeError]
Worker 10: Sending [128, 129) to worker 22 (1 steps) (hasFuture: 1, dependsOnFutures: 0x00000000)
Worker 10: Finished loop task 0x54504389 (iterations [127, 128)) (futures: 0xd400c200)
Worker 10: Finished loop task 0x54504389 (iterations [127, 128)) (futures: 0xd400c200)
Worker 10: loop task 0x54504389 (iterations [127, 128)) waiting for the remainder
Worker  0: Sending [123, 126) to worker 1 (3 steps) (hasFuture: 1, dependsOnFutures: 0x

The first thing to notice is that this is not a Weave error. Weave does not throw RangeError; everything it raises is an AssertionError. This is further confirmed by the fact that all Weave errors prepend the Worker thread ID to ease debugging, and this message has no such prefix.

The second thing is that the reported range 0 .. 65535 has 65536 values, the same count as int16, and Nim sets only accept base types of at most 16 bits.

And we happen to have a set there:

weave/weave/victims.nim

Lines 203 to 206 in 052ae40

if workSharing:
  # The real splitting will be done by the child worker
  # We need to send it enough work for its own children and all the steal requests pending
  ascertain: req.thiefID in {myWorker().left, myWorker().right}

However, I'm pretty sure that in the past sets could hold negative values from -32768 to 32767 instead of 0 to 65535. It seems that is not the case, but my particular construct wasn't properly checked.
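
To illustrate the point, here is a minimal sketch of a membership check that does not go through a set, so a negative ID is simply compared instead of being range-checked. The names `WorkerID`, `NotAWorker`, `fitsInSet` and `isChildOf` are hypothetical and chosen for this example; the sketch assumes worker IDs are 32-bit integers with -1 as a "no worker" sentinel, and it is not necessarily what the fixing commit does.

```nim
# Sketch only: hypothetical names, not Weave's actual code.
type WorkerID = int32

const NotAWorker = WorkerID(-1)  # assumed sentinel for "no such worker"

proc fitsInSet(id: WorkerID): bool =
  ## True when `id` lies in 0 .. 65535, the range reported by the RangeError.
  int(id) in 0 .. 65535

proc isChildOf(thiefID, left, right: WorkerID): bool =
  ## Set-free membership test: plain comparisons accept negative IDs.
  thiefID == left or thiefID == right

when isMainModule:
  # A missing child (-1) cannot be stored in a set built from integers...
  doAssert not fitsInSet(NotAWorker)
  # ...but direct comparison handles it without any range check.
  doAssert isChildOf(22, 21, 22)
  doAssert not isChildOf(5, 21, NotAWorker)
```

With only two candidates, the comparison form costs nothing extra and keeps the sentinel value out of the set constructor entirely.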

@mratsim mratsim closed this as completed in 0b58468 Apr 4, 2020
Owner Author

mratsim commented Apr 4, 2020

Probably linked to nim-lang/Nim#13764
