Segfault in MurmurHash3_x64_128 #48553

Closed
MikeInnes opened this issue Feb 6, 2023 · 3 comments · Fixed by #48562
Labels
bug Indicates an unexpected problem or unintended behavior

Comments

@MikeInnes
Member

The following script segfaults (reliably on my x86 Mac, sporadically on ARM) for any input size N where N ≥ 2^31 and N % 16 ≠ 0.

N = 2^31+1

open("test.data", "w") do io
  truncate(io, N)
end

s = String(read("test.data"))

@show objectid(s)

I discovered this working with (unintentionally) large strings that were constructed in code rather than read from a file, but I haven't been able to reproduce the bug minimally without a read.

➜  dev git:(master) ✗ jd fault.jl

[43504] signal (11.1): Segmentation fault: 11
in expression starting at julia/dev/fault.jl:9
MurmurHash3_x64_128 at julia/dev/src/support/MurmurHash3.c:277
memhash_seed at julia/dev/src/support/hashing.c:74
objectid at ./reflection.jl:359
unknown function (ip: 0x119404ac2)
_jl_invoke at julia/dev/src/gf.c:0 [inlined]
ijl_apply_generic at julia/dev/src/gf.c:2873
jl_apply at julia/dev/src/./julia.h:1880 [inlined]
do_call at julia/dev/src/interpreter.c:125
eval_body at julia/dev/src/interpreter.c:0
jl_interpret_toplevel_thunk at julia/dev/src/interpreter.c:758
jl_toplevel_eval_flex at julia/dev/src/toplevel.c:910
jl_toplevel_eval_flex at julia/dev/src/toplevel.c:853
ijl_toplevel_eval at julia/dev/src/toplevel.c:919 [inlined]
ijl_toplevel_eval_in at julia/dev/src/toplevel.c:969
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1850
_jl_invoke at julia/dev/src/gf.c:0 [inlined]
ijl_apply_generic at julia/dev/src/gf.c:2873
_include at ./loading.jl:1910
include at ./Base.jl:457
jfptr_include_26413 at julia/dev/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at julia/dev/src/gf.c:0 [inlined]
ijl_apply_generic at julia/dev/src/gf.c:2873
exec_options at ./client.jl:307
_start at ./client.jl:522
jfptr__start_55134 at julia/dev/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at julia/dev/src/gf.c:0 [inlined]
ijl_apply_generic at julia/dev/src/gf.c:2873
jl_apply at julia/dev/src/./julia.h:1880 [inlined]
true_main at julia/dev/src/jlapi.c:573
jl_repl_entrypoint at julia/dev/src/jlapi.c:717
Allocations: 23154 (Pool: 23120; Big: 34); GC: 1
[1]    43504 segmentation fault  jd fault.jl
➜  dev git:(master) ✗ jd
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.0-DEV.503 (2023-02-06)
 _/ |\__'_|_|_|\__'_|  |  Commit a7317c3c72* (0 days old master)
|__/                   |

julia> versioninfo()
Julia Version 1.10.0-DEV.503
Commit a7317c3c72* (2023-02-06 13:15 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin22.2.0)
  CPU: 12 × Intel(R) Core(TM) i5-10500 CPU @ 3.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 1 on 12 virtual cores
@ViralBShah ViralBShah added the bug Indicates an unexpected problem or unintended behavior label Feb 6, 2023
@gbaraldi
Member

gbaraldi commented Feb 6, 2023

I can reproduce this on Linux as well:

[73073] signal (11.1): Segmentation fault
in expression starting at REPL[4]:1
MurmurHash3_x64_128 at /home/gabrielbaraldi/julia/src/support/MurmurHash3.c:277
memhash_seed at /home/gabrielbaraldi/julia/src/support/hashing.c:74
objectid at ./reflection.jl:361
unknown function (ip: 0x7f5f0c0b6ec2)
jl_apply at /home/gabrielbaraldi/julia/src/julia.h:1880 [inlined]
do_call at /home/gabrielbaraldi/julia/src/interpreter.c:125
eval_value at /home/gabrielbaraldi/julia/src/interpreter.c:222
eval_stmt_value at /home/gabrielbaraldi/julia/src/interpreter.c:173 [inlined]
eval_body at /home/gabrielbaraldi/julia/src/interpreter.c:620
jl_interpret_toplevel_thunk at /home/gabrielbaraldi/julia/src/interpreter.c:758
jl_toplevel_eval_flex at /home/gabrielbaraldi/julia/src/toplevel.c:909
jl_toplevel_eval_flex at /home/gabrielbaraldi/julia/src/toplevel.c:853
ijl_toplevel_eval_in at /home/gabrielbaraldi/julia/src/toplevel.c:968
eval at ./boot.jl:370 [inlined]
eval_user_input at /home/gabrielbaraldi/julia/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:153
repl_backend_loop at /home/gabrielbaraldi/julia/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:249
#start_repl_backend#46 at /home/gabrielbaraldi/julia/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:234
kwcall at /home/gabrielbaraldi/julia/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:231
#run_repl#59 at /home/gabrielbaraldi/julia/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:377
run_repl at /home/gabrielbaraldi/julia/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:363
jfptr_run_repl_60593 at /home/gabrielbaraldi/julia/usr/lib/julia/sys.so (unknown line)
#1019 at ./client.jl:421
jfptr_YY.1019_55442 at /home/gabrielbaraldi/julia/usr/lib/julia/sys.so (unknown line)
jl_apply at /home/gabrielbaraldi/julia/src/julia.h:1880 [inlined]
jl_f__call_latest at /home/gabrielbaraldi/julia/src/builtins.c:778
#invokelatest#2 at ./essentials.jl:823 [inlined]
invokelatest at ./essentials.jl:820 [inlined]
run_main_repl at ./client.jl:405
exec_options at ./client.jl:322
_start at ./client.jl:522
jfptr__start_55448 at /home/gabrielbaraldi/julia/usr/lib/julia/sys.so (unknown line)
jl_apply at /home/gabrielbaraldi/julia/src/julia.h:1880 [inlined]
true_main at /home/gabrielbaraldi/julia/src/jlapi.c:573
jl_repl_entrypoint at /home/gabrielbaraldi/julia/src/jlapi.c:717
main at /home/gabrielbaraldi/julia/cli/loader_exe.c:58
unknown function (ip: 0x7f5f28c29d8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_start at /home/gabrielbaraldi/julia/julia (unknown line)
Allocations: 169729 (Pool: 169478; Big: 251); GC: 1
fish: Job 1, 'julia +master' terminated by signal SIGSEGV (Address boundary error)

@vtjnash
Sponsor Member

vtjnash commented Feb 6, 2023

Looks like MurmurHash3_x64_128 takes int len, which presumably misbehaves once the string length no longer fits in an int. This also failed for me, and avoids the file:

julia> s = String('\0'^N);
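
To make the suspected truncation concrete, here is a minimal C sketch of what happens when a 64-bit length is squeezed into a 32-bit signed int. The helper name length_as_int32 is hypothetical and is not part of Julia's sources; it only models the usual two's-complement wraparound, which the C standard leaves implementation-defined for a direct cast.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the Julia sources): models what a
 * size_t length becomes when it is received through an `int len`
 * parameter on a platform where int is 32 bits wide. */
static int32_t length_as_int32(size_t n)
{
    /* Conversion to an unsigned type is well-defined: it keeps the
     * low 32 bits, i.e. reduces n modulo 2^32. */
    uint32_t low = (uint32_t)n;

    /* Reinterpret those 32 bits as a two's-complement signed value
     * without relying on an implementation-defined cast. */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)(low - 0x80000000u) + INT32_MIN;
}
```

Under these assumptions, any length in the range 2^31 to 2^32 - 1 comes out negative, while lengths below 2^31 pass through unchanged.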

@vtjnash
Sponsor Member

vtjnash commented Feb 6, 2023

(fwiw, for this specific input, len is going to be -2147483647)
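
A negative len would also account for the N % 16 ≠ 0 condition in the original report. The sketch below assumes Julia's copy follows the reference MurmurHash3_x64_128 layout (nblocks = len / 16, a tail pointer at data + nblocks * 16, and a switch on len & 15); that is an assumption about the file, not something confirmed in this thread. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of the index arithmetic in the reference
 * MurmurHash3_x64_128, assuming `len` arrives as a 32-bit int. */
typedef struct {
    int32_t nblocks;     /* number of 16-byte blocks: len / 16         */
    int64_t tail_offset; /* byte offset of the tail from data's start  */
    int32_t tail_bytes;  /* how many tail bytes get read: len & 15     */
} tail_layout;

static tail_layout murmur_tail_layout(int32_t len)
{
    tail_layout t;
    t.nblocks = len / 16;                     /* C division truncates toward zero */
    t.tail_offset = (int64_t)t.nblocks * 16;  /* negative when len is negative    */
    t.tail_bytes = len & 15;                  /* low four bits of len             */
    return t;
}
```

For len = -2147483647 this gives nblocks = -134217727, a tail offset of -2147483632 bytes, and one tail byte to read, so the read lands roughly 2 GiB before the start of the string. When N is a multiple of 16 the low four bits are zero, no tail byte is read, and the block loop does not run for a negative nblocks, which would be consistent with those sizes not crashing.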
