Core dumped, unreachable reached #2326

Closed
pdeffebach opened this issue Jul 22, 2020 · 8 comments

@pdeffebach
Contributor

This is a scary error I get from the following operation:

## Average population of republican states
pop_party = let d = votes_bills_legislators_states
	d2 = d[end-99:end, :]

	gd = groupby(d2, "state")
	# make sure all states represented
	@assert length(gd) == 50 && all(sdf -> nrow(sdf) == 2, gd)

	d3 = @pipe d2 |>
		groupby(_, "party") |>
		combine(_, "population" => mean) # errors on this last command
end

I don't have an MWE available. If I try to isolate this by saving d2 to a CSV and reading it back in, it works fine.

Unreachable reached at 0x7fd0ec4c000d

signal (4): Illegal instruction
in expression starting at REPL[10]:1
groupreduce at /home/peterwd/.julia/packages/DataFrames/htZzm/src/groupeddataframe/splitapplycombine.jl:1023
unknown function (ip: 0x7fd0ec4c003a)
Reduce at /home/peterwd/.julia/packages/DataFrames/htZzm/src/groupeddataframe/splitapplycombine.jl:1032
unknown function (ip: 0x7fd0ec4bfef6)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2158 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2322
_combine at /home/peterwd/.julia/packages/DataFrames/htZzm/src/groupeddataframe/splitapplycombine.jl:1150
#combine_helper#391 at /home/peterwd/.julia/packages/DataFrames/htZzm/src/groupeddataframe/splitapplycombine.jl:589
combine_helper##kw at /home/peterwd/.julia/packages/DataFrames/htZzm/src/groupeddataframe/splitapplycombine.jl:585

pkg> st
Status `~/Documents/Projects/Senate Voting/Project.toml`
  [336ed68f] CSV v0.7.3
  [a93c6f00] DataFrames v0.21.4
  [da1fdf0e] FreqTables v0.4.0
  [d96e819e] Parameters v0.12.1
  [08abe8d2] PrettyTables v0.9.1
  [b8865327] UnicodePlots v1.2.0
@pdeffebach
Contributor Author

Tagging @quinnj: I think this is an issue with SentinelArrays.

julia> d2.population
100-element SentinelArrays.SentinelArray{Float64,1,Float64,Missing,Array{Float64,1}}:

@quinnj
Member

quinnj commented Jul 22, 2020

Hmmm, I'm not very familiar with the combine/reduce code; is there a more minimal repro on just the SentinelArray vector?

@pdeffebach
Contributor Author

pdeffebach commented Jul 22, 2020

Got it! It seems to happen after a call to stack:

using DataFrames, SentinelArrays, Statistics

julia> df = DataFrame(g = rand(1:100, 100), v1 = SentinelArray(Union{Float64, Missing}[rand() for i in 1:100]), v2 = rand(100));

julia> long_df = stack(df, Not(1), variable_name = "state", value_name = "population");

julia> combine(groupby(long_df, "state"), "population" => mean) # errors here

@bkamins
Member

bkamins commented Jul 22, 2020

I cannot reproduce it on 1.5:

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.0-rc1.0 (2020-06-26)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using DataFrames, CSV

julia> using Statistics

julia> df = DataFrame(g = rand(1:100, 100), v1 = CSV.SentinelArray(Union{Float64, Missing}[rand() for i in 1:100]), v2 = rand(100));

julia> long_df = stack(df, Not(1), variable_name = "state", value_name = "population");

julia> combine(groupby(long_df, "state"), "population" => mean)
2×2 DataFrame
│ Row │ state │ population_mean │
│     │ Cat…  │ Float64         │
├─────┼───────┼─────────────────┤
│ 1   │ v1    │ 0.514076        │
│ 2   │ v2    │ 0.546164        │

@quinnj
Member

quinnj commented Jul 22, 2020

I can reproduce on OSX v"1.5.0-rc1.0".

@pdeffebach
Contributor Author

julia> versioninfo()
Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
  JULIA_PKG_DEVDIR = /home/peterwd/Documents/Development
  JULIA_EDITOR = subl

@quinnj
Member

quinnj commented Jul 22, 2020

I reproduced this with an rr trace and sent it to Keno for investigation.

quinnj added a commit that referenced this issue Jul 27, 2020
Works around crash seen in #2326. The inferred return type of
`groupreduce_init` is `Union{Vector{Any}, SentinelVector{Float64}}` and
it seems the compiler then crashes when trying to correctly identify `U`
from that union of types. Part of my conclusion here is based on the
fact that if you remove all other argument type constraints and just
make `groupreduce!` return `res` directly, it still crashes; thus, by
deduction, the crash has something to do with the compiler having
trouble with the `::AbstractVector{U}` argument type
constraint/specialization.
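
For readers who want to see what an inferred union return type looks like, here is a minimal sketch. The function `init_like` is a hypothetical stand-in and is not the `groupreduce_init` from DataFrames.jl; it only imitates a function whose return type depends on a runtime value, and it does not reproduce the crash.

```julia
# Hypothetical stand-in (NOT DataFrames.jl code): the concrete return type
# depends on a runtime value, so inference cannot settle on a single
# concrete type for the result.
function init_like(flag::Bool)
    return flag ? Any[] : Float64[]
end

# Ask the compiler what it infers for the call signature `init_like(::Bool)`.
rt = first(Base.return_types(init_like, (Bool,)))
println(rt)
```

Whatever is printed must cover both `Vector{Any}` and `Vector{Float64}`, since either can be returned at run time.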

The work-around is pretty uncontroversial; we were already calling
`eltype(res)` in several other places, and I've checked that it infers
the same. I didn't add a test since this seems like such an obscure
compiler bug that it doesn't really seem necessary for DataFrames to be
testing core compiler behavior.

Also note that this bug exists in Julia <= 1.5; current Julia master
(pending 1.6), which includes a number of compiler refactorings/changes,
seems to have resolved whatever the issue was.

cc: @Keno, @vtjnash, @JeffBezanson
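
The shape of the work-around described in the commit message can be sketched as follows. The functions `fill_typed!` and `fill_eltype!` are hypothetical illustrations, not the actual `groupreduce!` source: the first binds the element type as a static parameter `U` in the signature, while the second drops that constraint and recovers the element type with `eltype(res)` inside the body.

```julia
# Hypothetical illustrations only; not the DataFrames.jl implementation.

# Before: element type captured as a static parameter in the signature.
function fill_typed!(res::AbstractVector{U}, x) where {U}
    for i in eachindex(res)
        res[i] = convert(U, x)
    end
    return res
end

# After: no `{U}` constraint on the argument; the element type is
# looked up from the value itself via `eltype(res)`.
function fill_eltype!(res::AbstractVector, x)
    U = eltype(res)
    for i in eachindex(res)
        res[i] = convert(U, x)
    end
    return res
end

a = fill_typed!(Vector{Float64}(undef, 3), 1)
b = fill_eltype!(Vector{Float64}(undef, 3), 1)
```

Both versions behave identically for ordinary vectors; the difference is only in how the method signature asks the compiler to determine `U`.
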
@quinnj
Member

quinnj commented Jul 27, 2020

Worked around in #2335
