Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

no stacktrace shown when signalHandler called (eg SIGSEGV) #9

Closed
timotheecour opened this issue Mar 14, 2021 · 1 comment · Fixed by #10
Closed

no stacktrace shown when signalHandler called (eg SIGSEGV) #9

timotheecour opened this issue Mar 14, 2021 · 1 comment · Fixed by #10

Comments

@timotheecour
Copy link
Contributor

timotheecour commented Mar 14, 2021

I'm trying to get a stacktrace for nim-lang/Nim#17157:

  • it works with nim's stacktraces
  • it doesn't work with d:nimStackTraceOverride:
# regular nim stacktrace:
nim c -o:bin/nim1 --stacktrace:on compiler/nim.nim

# nimStackTraceOverride:
nim c -lib:lib -o:bin/nim2 --stacktrace:off -d:nimStackTraceOverride --import:libbacktrace -g --passc:"-fno-omit-frame-pointer -fno-optimize-sibling-calls" compiler/nim.nim

bin/nim1 c -d:case2 main # shows a complete backtrace
bin/nim2 c -d:case2 main # shows a very limited backtrace:
Traceback (most recent call last, using override)
lib/system/excpt.nim(630) signalHandler
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
[1]    50275 segmentation fault  bin/nim2 c --lib:lib main.nim

details

main.nim: example from nim-lang/Nim#17157:

template something(op: proc (v: untyped): void): void = discard

nim: 1.5.1 95697d00fa7e6f642c1e99b2baa4d76e468d1c77
nim-libbacktrace: 40be351

note 1

adding --passc:"-fno-omit-frame-pointer -fno-optimize-sibling-calls" didn't help. Note that this flag is useful in combination with optimized builds to get more complete stacktraces, however it didn't help here. It did help with the example in #5, a full stacktrace was obtained thanks to it even with -d:release or -d:danger. IMO we should mention this flag in docs for -d:nimStackTraceOverride, at least if this issue can get fixed

note 2

with -d:noSignalHandler i'll just get:

[1]    56734 segmentation fault  bin/nim.r5 c -d:case2 --lib:lib main.nim

note 3

if i add doAssert false right before the place that causes SIGSEGV in compiler, it works:

proc isMagic(sym: PSym): bool =
  if sym.ast == nil:
    doAssert false # added this
  let nPragmas = sym.ast[pragmasPos]
compiler/nim.nim(125)    nim
compiler/nim.nim(84)     handleCmdLine
compiler/main.nim(242)   mainCommand
compiler/main.nim        compileToBackend
compiler/main.nim(90)    commandCompileToC
compiler/modules.nim     compileProject
compiler/modules.nim(97) compileModule
compiler/passes.nim(180) processModule
compiler/passes.nim(73)  processTopLevelStmt
compiler/sem.nim(607)    myProcess
compiler/sem.nim(575)    semStmtAndGenerateGenerics
compiler/semstmts.nim(2306) semStmt
compiler/semexprs.nim(1042) semExprNoType
compiler/semexprs.nim(2892) semExpr
compiler/semstmts.nim(2248) semStmtList
compiler/semexprs.nim    semExpr
compiler/semexprs.nim(2340) semWhen
compiler/semexprs.nim(2892) semExpr
compiler/semstmts.nim(2248) semStmtList
compiler/semexprs.nim(2915) semExpr
compiler/semtempl.nim(630) semTemplateDef
compiler/semstmts.nim(1421) semParamList
compiler/semtypes.nim(1233) semProcTypeNode
compiler/semtypes.nim(1183) semParamType
compiler/semtypes.nim(1980) semTypeNode
compiler/semtypes.nim(1662) semProcTypeWithScope
compiler/semtypes.nim(1236) semProcTypeNode
compiler/semtypes.nim(1198) isMagic

but then interestingly, it doesn't show:

Traceback (most recent call last, using override)

on top, which also seems like a bug

without --passc:"-fno-omit-frame-pointer -fno-optimize-sibling-calls" it shows 25 frames instead of 29, which makes sense.

So it looks like SIGSEGV (which triggers signal handler) causes the stacktraces to disappear

note 4

I'm not getting absolute paths even if I specify --listfullpaths or --excessivestacktrace

note 5: reduced example

when true:
  proc fun() =
    var x: ref int
    echo x[]
  fun()

nim r --stacktrace:on t12012.nim

Traceback (most recent call last)
/Users/timothee/git_clone/nim/timn/tests/nim/all/t12012.nim(45) t12012
/Users/timothee/git_clone/nim/timn/tests/nim/all/t12012.nim(44) fun
SIGSEGV: Illegal storage access. (Attempt to read from nil?)

nim r --stacktrace:off -d:nimStackTraceOverride --import:libbacktrace -g t12012.nim

Traceback (most recent call last, using override)
/Users/timothee/git_clone/nim/Nim_devel/lib/system/excpt.nim(630) signalHandler
SIGSEGV: Illegal storage access. (Attempt to read from nil?)

note 6: lldb works

lldb t12012 (compiled as in note 5) does work and shows the relevant parts of the stacktrace

so this seems like a bug in nim-libbacktrace

(lldb) r
Process 91149 launched: '/Users/timothee/git_clone/nim/timn/build/nimcache/t12012' (x86_64)
Process 91149 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x000000010000ee73 start-addr:0x000000010000ee73  t12012 fun__1TeY7TkO6UnbBIEVdjnblg + 51 at  /Users/timothee/git_clone/nim/timn/tests/nim/all/t12012.nim:44:24
   41   when defined case5:
   42     proc fun() =
   43       var x: ref int
-> 44       echo x[]
   45     fun()
Target 0: (t12012) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x000000010000ee73 start-addr:0x000000010000ee73  t12012 fun__1TeY7TkO6UnbBIEVdjnblg + 51 at  /Users/timothee/git_clone/nim/timn/tests/nim/all/t12012.nim:44:24
    frame #1: 0x000000010000ef99 start-addr:0x000000010000ef94  t12012 NimMainModule + 9 at  /Users/timothee/git_clone/nim/timn/tests/nim/all/t12012.nim:45:2
    frame #2: 0x000000010000ef89 start-addr:0x000000010000ef84  t12012 NimMainInner + 9 at  /Users/timothee/git_clone/nim/Nim_devel/lib/system.nim:2222:2
    frame #3: 0x000000010000efca start-addr:0x000000010000efc8  t12012 NimMain + 42 at  /Users/timothee/git_clone/nim/Nim_devel/lib/system.nim:2230:2
    frame #4: 0x000000010000f00e start-addr:0x000000010000f009  t12012 main(argc=1, args=0x00007ffeefbfae88, env=0x00007ffeefbfae98) + 62 at  /Users/timothee/git_clone/nim/Nim_devel/lib/system.nim:2237:2
  frame #5: 0x00007fff6a526cc9 libdyld.dylib`start + 1
  frame #6: 0x00007fff6a526cc9 libdyld.dylib`start + 1
(lldb)

note 6

after instrumenting and playing with nim-libbacktrace code, i found NIM_LIBBACKTRACE_DEBUG, it turns out with this flag it works:

NIM_LIBBACKTRACE_DEBUG=1 nim r -b:cpp --stacktrace:off -d:nimStackTraceOverride --import:libbacktrace -g -d:case5 -f $timn_D/tests/nim/all/t12012.nim

Traceback (most recent call last, using override)
/Users/timothee/git_clone/nim/Nim_devel/lib/system.nim(2237) main
/Users/timothee/git_clone/nim/Nim_devel/lib/system.nim(2230) NimMain
/Users/timothee/git_clone/nim/Nim_devel/lib/system.nim(2222) NimMainInner
/Users/timothee/git_clone/nim/timn/tests/nim/all/t12012.nim(45) t12012
/Users/timothee/git_clone/nim/timn/tests/nim/all/t12012.nim(44) fun
/Users/timothee/git_clone/nim/Nim_devel/lib/system/excpt.nim(630) signalHandler
/Users/timothee/git_clone/nim/Nim_devel/lib/system/excpt.nim(314) rawWriteStackTrace
/Users/timothee/git_clone/nim/Nim_devel/lib/system/stacktraces.nim(59) auxWriteStackTraceWithOverride
/Users/timothee/git_clone/nim/nim-libbacktrace/libbacktrace.nim(47) getBacktrace
./libbacktrace_wrapper.c(352) get_backtrace_max_length_c
./libbacktrace_wrapper.c(286) get_program_counters_c
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
Error: execution of an external program failed: '/Users/timothee/git_clone/nim/timn/build/nimcache/t12012 '

=> the relevant part of stacktrace is now shown:

/Users/timothee/git_clone/nim/timn/tests/nim/all/t12012.nim(45) t12012
/Users/timothee/git_clone/nim/timn/tests/nim/all/t12012.nim(44) fun

so this is definitely a bug in nim-libbacktrace

note 7

going back to original example, NIM_LIBBACKTRACE_DEBUG=1 makes it work.
with nim compiled with: --passc:"-fno-omit-frame-pointer -fno-optimize-sibling-calls" -d:danger

i get:

Traceback (most recent call last, using override)
lib/system.nim(2311) main
lib/system.nim(2304) NimMain
lib/system.nim(2296) NimMainInner
compiler/nim.nim(125) nim
compiler/nim.nim(84) handleCmdLine
compiler/main.nim(242) mainCommand
compiler/main.nim(0) compileToBackend
compiler/main.nim(90) commandCompileToC
compiler/modules.nim(0) compileProject
compiler/modules.nim(97) compileModule
compiler/passes.nim(180) processModule
compiler/passes.nim(73) processTopLevelStmt
compiler/sem.nim(607) myProcess
compiler/sem.nim(575) semStmtAndGenerateGenerics
compiler/semstmts.nim(2306) semStmt
compiler/semexprs.nim(1042) semExprNoType
compiler/semexprs.nim(2892) semExpr
compiler/semstmts.nim(2248) semStmtList
compiler/semexprs.nim(0) semExpr
compiler/semexprs.nim(2340) semWhen
compiler/semexprs.nim(2892) semExpr
compiler/semstmts.nim(2248) semStmtList
compiler/semexprs.nim(2915) semExpr
compiler/semtempl.nim(630) semTemplateDef
compiler/semstmts.nim(1421) semParamList
compiler/semtypes.nim(1233) semProcTypeNode
compiler/semtypes.nim(1183) semParamType
compiler/semtypes.nim(1980) semTypeNode
compiler/semtypes.nim(1662) semProcTypeWithScope
compiler/semtypes.nim(1236) semProcTypeNode
compiler/semtypes.nim(1200) isMagic
lib/system/excpt.nim(630) signalHandler
lib/system/excpt.nim(314) rawWriteStackTrace
lib/system/stacktraces.nim(59) auxWriteStackTraceWithOverride
/Users/timothee/git_clone/nim/nim-libbacktrace/libbacktrace.nim(47) getBacktrace
/Users/timothee/git_clone/nim/nim-libbacktrace/libbacktrace_wrapper.c(352) get_backtrace_max_length_c
/Users/timothee/git_clone/nim/nim-libbacktrace/libbacktrace_wrapper.c(286) get_program_counters_c
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
  • without -d:danger i get 38 frames
  • with -d:danger i get 32 frames (32 < 38 as expected)
  • with --passc:"-fno-omit-frame-pointer -fno-optimize-sibling-calls" -d:danger i get 38 frames; notice the frames with line = 0; I'm curious whether we can report correct line numbers or whether that's the best we can do here

note that --passc:"-fno-omit-frame-pointer -fno-optimize-sibling-calls" -d:danger gives best of both worlds:

  • similar speed as nim built with -d:danger (0.6s runtime up to the crash)
  • almost same stacktrace as the one built with --stacktrace:on

note 8

with:
NIM_LIBBACKTRACE_DEBUG=1 lldb -- bin/nim.r10 c -d:case2 --lib:lib $timn_D/tests/nim/all/t12012.nim
I also get some line numbers being 0, so maybe there's the line info is unavailable for those frames:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x18)
  * frame #0: 0x000000010017d9b1 start-addr:0x000000010017d9b1  nim.r10 semProcTypeNode__MdjmOB9curTzVOmkS3WHYZg [inlined] isMagic__jKjjve5xVHAVUj9b7USTS9bA(sym=<unavailable>) + 4 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semtypes.nim:1200:35
    frame #1: 0x000000010017d9ad start-addr:0x000000010017d9ad  nim.r10 semProcTypeNode__MdjmOB9curTzVOmkS3WHYZg(c=0x0000000105dd3050, n=0x0000000105d00f90, genericParams=0x0000000000000000, prev=<unavailable>, kind=<unavailable>, isType=true) + 605
    frame #2: 0x000000010019a054 start-addr:0x000000010019a039  nim.r10 semProcTypeWithScope__VVQPTRRjDGNV17td9cICm7g(c=0x0000000105dd3050, n=0x0000000105cfb8d0, prev=0x0000000000000000, kind='\f') + 180 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semtypes.nim:1662:11
    frame #3: 0x0000000100170b7d start-addr:0x0000000100170b6a  nim.r10 semTypeNode__tYJL0pqPjh78ZSGOkWDaSA(c=0x0000000105dd3050, n=0x0000000105cfb8d0, prev=0x0000000000000000) + 1181 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semtypes.nim:1980:13
    frame #4: 0x0000000100179283 start-addr:0x0000000100179276  nim.r10 semParamType__al5PH9cqwbxAfK9bAuQImZdw(c=0x0000000105dd3050, n=0x0000000105cfb8d0, constraint=0x00007ffeefbfa000) + 179 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semtypes.nim:1183:12
    frame #5: 0x000000010017d98f start-addr:0x000000010017d980  nim.r10 semProcTypeNode__MdjmOB9curTzVOmkS3WHYZg(c=0x0000000105dd3050, n=0x0000000105d009d0, genericParams=0x0000000105bc7110, prev=<unavailable>, kind=<unavailable>, isType=false) + 575 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semtypes.nim:1233:13
    frame #6: 0x00000001001a23eb start-addr:0x00000001001a23da  nim.r10 semTemplateDef__7YxUILpPxooC5qqa89a54dw_37 [inlined] semParamList__Eyq3IMn6PJevR76ZZ2uLQQ(c=0x0000000105dd3050, n=<unavailable>, genericParams=0x0000000105bc7110, s=0x0000000105d2aad0) + 22 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semstmts.nim:1421:32
    frame #7: 0x00000001001a23d5 start-addr:0x00000001001a23d5  nim.r10 semTemplateDef__7YxUILpPxooC5qqa89a54dw_37(c=0x0000000105dd3050, n=0x0000000105d00c90) + 629
    frame #8: 0x000000010016f354 start-addr:0x000000010016f349  nim.r10 semExpr__vJZwz9bfROKrfrtM50y74CQ_3(c=0x0000000105dd3050, n=0x0000000105d00c90, flags=0) + 3604 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semexprs.nim:2915:12
    frame #9: 0x00000001001a8492 start-addr:0x00000001001a8487  nim.r10 semStmtList__vJZwz9bfROKrfrtM50y74CQ_30(c=0x0000000105dd3050, n=0x0000000105d00f10, flags=0) + 162 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semstmts.nim:2248:13
    frame #10: 0x000000010016eb46 start-addr:0x000000010016eb39  nim.r10 semExpr__vJZwz9bfROKrfrtM50y74CQ_3(c=0x0000000105dd3050, n=0x0000000105d00f10, flags=0) + 1542 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semexprs.nim:2892:12
    frame #11: 0x0000000100196e95 start-addr:0x0000000100196e88  nim.r10 semWhen__5C4HPXQuhxlMYIzPbe36lg(c=0x0000000105dd3050, n=<unavailable>, semCheck=true) + 1077 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semexprs.nim:2340:18
    frame #12: 0x000000010016fa42 start-addr:0x000000010016fa26  nim.r10 semExpr__vJZwz9bfROKrfrtM50y74CQ_3(c=0x0000000105dd3050, n=0x0000000105d00cd0, flags=64) + 5378
    frame #13: 0x00000001001a8492 start-addr:0x00000001001a8487  nim.r10 semStmtList__vJZwz9bfROKrfrtM50y74CQ_30(c=0x0000000105dd3050, n=0x0000000105d00e90, flags=64) + 162 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semstmts.nim:2248:13
    frame #14: 0x000000010016eb46 start-addr:0x000000010016eb39  nim.r10 semExpr__vJZwz9bfROKrfrtM50y74CQ_3(c=0x0000000105dd3050, n=0x0000000105d00e90, flags=64) + 1542 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semexprs.nim:2892:12
    frame #15: 0x00000001001b68d8 start-addr:0x00000001001b68c8  nim.r10 semStmtAndGenerateGenerics__7YxUILpPxooC5qqa89a54dw_65 [inlined] semExprNoType__7YxUILpPxooC5qqa89a54dw_2(c=0x0000000105dd3050, n=0x0000000105d00e90) + 71 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/semexprs.nim:1042:11
    frame #16: 0x00000001001b6891 start-addr:0x00000001001b6891  nim.r10 semStmtAndGenerateGenerics__7YxUILpPxooC5qqa89a54dw_65 [inlined] semStmt__vJZwz9bfROKrfrtM50y74CQ_4(c=0x0000000105dd3050, n=0x0000000105d00e90, flags=0)
    frame #17: 0x00000001001b6891 start-addr:0x00000001001b6891  nim.r10 semStmtAndGenerateGenerics__7YxUILpPxooC5qqa89a54dw_65(c=0x0000000105dd3050, n=<unavailable>) + 65
    frame #18: 0x00000001001b6c15 start-addr:0x00000001001b6c0a  nim.r10 myProcess__QAJdPOwMq9buTNGNGQO3ImA(context=0x0000000105dd3050, n=0x0000000105d00e90) + 69 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/sem.nim:607:12
    frame #19: 0x00000001000fe62a start-addr:0x00000001000fe625  nim.r10 processModule__5pRk9bz8P1JWU809a9b9cPz0Dg + 56 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/passes.nim:73:10
    frame #20: 0x00000001000fe5f2 start-addr:0x00000001000fe5f2  nim.r10 processModule__5pRk9bz8P1JWU809a9b9cPz0Dg(graph=0x000000010067e050, module=0x0000000105c5add0, idgen=<unavailable>, stream=<unavailable>) + 1362
    frame #21: 0x0000000100232e4d start-addr:0x0000000100232e3c  nim.r10 compileModule__JSGrRuKEntiB6El8fj8npQ(graph=0x000000010067e050, fileIdx=<unavailable>, flags=<unavailable>) + 701 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/modules.nim:97:33
    frame #22: 0x0000000100233340 start-addr:0x000000010023333b  nim.r10 compileProject__KlDcF6rnOBDQLo439cvjPjg(graph=0x000000010067e050, projectFileIdx=<unavailable>) + 544
    frame #23: 0x00000001002847d0 start-addr:0x00000001002847c3  nim.r10 commandCompileToC__ikP0vuP6oxqlYdG7q9cY7sA_2(graph=0x000000010067e050) + 192 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/main.nim:90:2
    frame #24: 0x0000000100284b3c start-addr:0x0000000100284b23  nim.r10 compileToBackend__vfArexHrJ9cN5iCnRabJjQg(ClE_0=0x0000000104d52350) + 76 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/main.nim:0:2
    frame #25: 0x00000001002855a7 start-addr:0x000000010028559f  nim.r10 mainCommand__ikP0vuP6oxqlYdG7q9cY7sA(graph=<unavailable>) + 727 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/main.nim:242:167
    frame #26: 0x0000000100287d19 start-addr:0x0000000100287d11  nim.r10 handleCmdLine__I9aitIr13Z1B6wHv5vzMf2w(cache=0x0000000100643030, conf=0x0000000100639050) + 217 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/nim.nim:84:2
    frame #27: 0x00000001002883b6 start-addr:0x00000001002883ae  nim.r10 NimMainModule + 118 at  /Users/timothee/git_clone/nim/Nim_prs/compiler/nim.nim:125:2
    frame #28: 0x0000000100288339 start-addr:0x0000000100288334  nim.r10 NimMainInner + 9 at  /Users/timothee/git_clone/nim/Nim_prs/lib/system.nim:2296:2
    frame #29: 0x000000010028844b start-addr:0x0000000100288449  nim.r10 main [inlined] NimMain + 59 at  /Users/timothee/git_clone/nim/Nim_prs/lib/system.nim:2304:2
    frame #30: 0x000000010028842c start-addr:0x000000010028842c  nim.r10 main(argc=<unavailable>, args=<unavailable>, env=<unavailable>) + 28
  frame #31: 0x00007fff6a526cc9 libdyld.dylib`start + 1

note 9

what about column numbers? it should be available as shown in lldb stacktrace, so we should expose it instead of just showing line numbers

@timotheecour timotheecour changed the title no stacktrace shown no stacktrace shown upon SIGSEGV Mar 14, 2021
@timotheecour timotheecour changed the title no stacktrace shown upon SIGSEGV no stacktrace shown when signalHandler called (eg SIGSEGV) Mar 14, 2021
@stefantalpalaru
Copy link
Contributor

You're on to something here, but I wonder what makes libbacktrace encounter program counters that have no corresponding debugging info, in the middle of a stack trace. Function inlining? Something else?

As for "t12012.nim", the problem is that both "NimMainModule()" and "fun()" are inlined into "main()", which we skip. Problem described upstream here: ianlancetaylor/libbacktrace#60 (comment)

Fortunately, there is already a mechanism in place for dealing with these inlines, but it only covers NimMainInner() and NimMain(). I extended it to main() for these extra-small programs that get better inlining. Please test it, and don't forget to pass "-f" to the Nim compiler. It fails to do some nimcache invalidation otherwise.

Every env var and Nim/GCC option you discovered by reading the code is already documented in the README. Let me know if I missed something.

Don't get too excited about "-fno-optimize-sibling-calls" - it disables the C compiler's tail-call optimisation, with serious performance hits in some programs.

We can't do all that debuggers can and we're limited to the debugging info that libbacktrace passes to us, so no column info, if we want to keep it simple.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants