Multi-threading hangs combine on Julia nightly #3275

Closed
bkamins opened this issue Jan 25, 2023 · 2 comments

Comments

bkamins commented Jan 25, 2023

Reproducer:

julia> using DataFrames, Test, Random

julia> const ≅ = isequal
isequal (generic function with 44 methods)

julia> y = Any[1, missing, missing, 2, 4]
5-element Vector{Any}:
 1
  missing
  missing
 2
 4

julia> x = 1:length(y)
1:5

julia> df = DataFrame(x=x, y1=y, y2=reverse(y))
5×3 DataFrame
 Row │ x      y1       y2
     │ Int64  Any      Any
─────┼─────────────────────────
   1 │     1  1        4
   2 │     2  missing  2
   3 │     3  missing  missing
   4 │     4  2        missing
   5 │     5  4        1

julia> gd = groupby(df, :x)
GroupedDataFrame with 5 groups based on key: x
First Group (1 row): x = 1
 Row │ x      y1   y2
     │ Int64  Any  Any
─────┼─────────────────
   1 │     1  1    4
⋮
Last Group (1 row): x = 5
 Row │ x      y1   y2
     │ Int64  Any  Any
─────┼─────────────────
   1 │     5  4    1

julia> combine(gd, [:x, :y1] => ((x, y) -> (sleep((x == [5])/10); y[1])) => :y1,
                   [:x, :y2] => ((x, y) -> (sleep((x == [5])/10); y[end])) => :y2)
5×3 DataFrame
 Row │ x      y1       y2
     │ Int64  Int64?   Int64?
─────┼─────────────────────────
   1 │     1        1        4
   2 │     2  missing        2
   3 │     3  missing  missing
   4 │     4        2  missing
   5 │     5        4        1

julia> combine(gd, [:x, :y1] => ((x, y) -> (sleep((x == [5])/10); y[1])) => :y1,
                   [:x, :y2] => ((x, y) -> (sleep((x == [5])/10); y[end])) => :y2)
5×3 DataFrame
 Row │ x      y1       y2
     │ Int64  Int64?   Int64?
─────┼─────────────────────────
   1 │     1        1        4
   2 │     2  missing        2
   3 │     3  missing  missing
   4 │     4        2  missing
   5 │     5        4        1

julia> combine(gd, [:x, :y1] => ((x, y) -> (sleep((x == [5])/10); y[1])) => :y1,
                   [:x, :y2] => ((x, y) -> (sleep((x == [5])/10); y[end])) => :y2)

The third call never terminates (note that I ran the same operation three times, and the first two calls completed). Most likely there is a race condition in our handling of multi-threading that got exposed on Julia nightly.

For the issue to show up, two operations must be passed to combine. If only one is passed, things work OK.
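
Because the hang only appears when two operations are passed, one thing that could be tried while diagnosing is to switch off multithreading for the call. The sketch below assumes DataFrames.jl 1.4 or later, where combine accepts a threads keyword; whether it actually avoids the hang on the affected nightly build was not checked in this thread.

```julia
using DataFrames

# Same data as in the reproducer above.
y = Any[1, missing, missing, 2, 4]
df = DataFrame(x=1:length(y), y1=y, y2=reverse(y))
gd = groupby(df, :x)

# Same two operations as in the reproducer, but with `threads=false`
# (assumption: DataFrames.jl >= 1.4) so that a single thread is used.
res = combine(gd,
              [:x, :y1] => ((x, y) -> (sleep((x == [5])/10); y[1])) => :y1,
              [:x, :y2] => ((x, y) -> (sleep((x == [5])/10); y[end])) => :y2;
              threads=false)
```

If the call completes reliably with threads=false but hangs without it, that would point at the multi-threaded code path rather than at the operations themselves.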

bkamins added this to the patch milestone Jan 25, 2023

bkamins commented Jan 25, 2023

OK - I have narrowed it down. The issue is with sleep:

julia> @sync begin
           Threads.@spawn sleep(0.01)
       end
Task (done) @0x000001e7a9a71750

julia> @sync begin
           Threads.@spawn sleep(0.01)
       end
Task (done) @0x000001e7fa5c45e0

julia> @sync begin
           Threads.@spawn sleep(0.01)
       end # hangs
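
To repeat this check without blocking the REPL indefinitely, the spawn/sleep step can be wrapped in a timeout. This is only a sketch: sleep_rounds is a hypothetical helper (not part of Base or DataFrames.jl), and since timedwait itself relies on timers, it may not return on a build affected by the bug.

```julia
# Hypothetical helper: run the spawn/sleep step `n` times and record whether
# each round finished within `timeout` seconds.
function sleep_rounds(n::Integer; timeout::Real=5.0)
    results = Symbol[]
    for _ in 1:n
        t = Threads.@spawn sleep(0.01)
        # `timedwait` polls the predicate every `pollint` seconds and
        # returns :ok once it is true, or :timed_out after `timeout` seconds.
        status = timedwait(() -> istaskdone(t), Float64(timeout); pollint=0.01)
        push!(results, status)
    end
    return results
end

results = sleep_rounds(3)
println(results)
```

On a Julia build that includes the fix, every entry would be expected to be :ok; a :timed_out entry marks a round in which the spawned task did not finish in time.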


bkamins commented Feb 3, 2023

Fixed in Base Julia

bkamins closed this as completed Feb 3, 2023