lldb can't run static initializers on AArch64 Ubuntu and FreeBSD #43398

Open
rovka opened this issue Nov 19, 2019 · 22 comments
Labels
bugzilla (Issues migrated from bugzilla), confirmed (Verified by a second party), lldb, test-suite

Comments

@rovka
Collaborator

rovka commented Nov 19, 2019

Bugzilla Link: 44053
Version: unspecified
OS: Linux
CC: @JDevlieghere, @jimingham, @Teemperor

Extended Description

TestStaticInitializers.py fails on AArch64 Linux for dwarf and dwo with:
runCmd: expr -p -- struct Foo { Foo() { inc_counter(); } }; Foo f;
runCmd failed!
error: couldn't run static initializers: couldn't run static initializer:

FAILURE
FAIL: LLDB (/home/diana.picus/llvm-envs/lldb/build/bin/clang-10-aarch64) :: test_dwo (TestStaticInitializers.StaticInitializers)
<bound method SBProcess.Kill of <lldb.SBProcess; proxy of <Swig Object of type 'lldb::SBProcess *' at 0xffff9f513d20> >>: success

@Teemperor
Collaborator

It probably also doesn't help that LLDB's error message here is pretty much useless.

@Teemperor
Collaborator

It would probably help with debugging if you could add a self.runCmd("log enable lldb all -f ~/thread.log") before the line self.expect("expr -p -- struct Foo { Foo() { inc_counter(); } }; Foo f;") in the file lldb/packages/Python/lldbsuite/test/commands/expression/static-initializers/TestStaticInitializers.py.

Then rerun the test and attach the ~/thread.log file, which should contain more information about what is happening on your system.

@rovka
Collaborator Author

rovka commented Nov 19, 2019

Thanks for having a look!

I can't attach the log because it's too big, but you can download it from here:
https://drive.google.com/file/d/1__nSJw8UuCTia4_ZvwMF4K47bN8BWOWh/view?usp=sharing

@Teemperor
Collaborator

I am not sure where the error is in that log, but it does not seem to be in the expression parser; something goes wrong when we run the ThreadPlan. CC'ing Jim because he knows that code.

I added more logging to the function in 4a6d03a, so if you rerun the test with the log command we should see whether something goes wrong when we actually try to find the static initialisers (although I doubt that is the problem; more likely running them just fails).

https://reviews.llvm.org/D70433 could improve the error message in these situations, so maybe then you get a more useful error.

@jimingham
Collaborator

The interesting bit in the log related to "ThreadPlans" is:

intern-state 0xffff7c356560 'Communication::SyncronizeWithReadThread' Listener::FindNextEventInternal(broadcaster=(nil), broadcaster_names=(nil)[0], event_type_mask=0x00000000, remove=1) event 0xffff840008d0
intern-state 0xffff7c356560 Listener::Clear('Communication::SyncronizeWithReadThread')
intern-state 0xffff7c356560 Listener::~Listener('Communication::SyncronizeWithReadThread')
intern-state 0x10712ab0: tid = 0xdc2c: stop info = (stop_id = 11)
intern-state 0x10712ab0: tid = 0xdc2c: stop info = signal SIGILL: illegal instruction (stop_id = 11)
intern-state
intern-state ThreadList::ShouldStop: 1 threads, 1 unsuspended threads
intern-state Thread::ShouldStop(0x10712ab0) for tid = 0xdc2c 0xdc2c, pc = 0x0000ffffb7ff809c
intern-state ^^^^^^^^ Thread::ShouldStop Begin ^^^^^^^^
intern-state Plan stack initial state:
thread #1: tid = 0xdc2c:
Active plan stack:
Element 0: Base thread plan.
Element 1: Thread plan to call 0xffffb7ff8024
Element 2: Run to address: 0x00000000004004c0 using breakpoint: -7 -

The expression evaluation crashed while running some part of the expression (at 0x0000ffffb7ff809c). It looks like that code belongs to .text._ZN4Foo2C2Ev. We usually print the disassembly of the expression we're injecting in the expr log, but I didn't see it there.
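
For anyone digging through a log this large, here is a minimal Python sketch that pulls out the signal stops and the pc reported for the same thread. The regular expressions are modelled only on the lines quoted above; the log format is an assumption and may differ between lldb versions, and `find_signal_stops` is a made-up helper name, not part of lldb.

```python
import re

# Patterns modelled on the quoted log lines only (assumed format).
STOP_RE = re.compile(
    r"tid = (0x[0-9a-f]+): stop info = signal (\w+): (.*) \(stop_id = (\d+)\)"
)
PC_RE = re.compile(
    r"Thread::ShouldStop\(0x[0-9a-f]+\) for tid = (0x[0-9a-f]+).*pc = (0x[0-9a-f]+)"
)


def find_signal_stops(lines):
    """Return one dict per 'stop info = signal ...' line.

    The pc is filled in from the next Thread::ShouldStop line that
    mentions the same tid, if there is one.
    """
    stops = []
    for line in lines:
        m = STOP_RE.search(line)
        if m:
            stops.append({
                "tid": m.group(1),
                "signal": m.group(2),
                "description": m.group(3),
                "stop_id": int(m.group(4)),
                "pc": None,
            })
            continue
        m = PC_RE.search(line)
        if m:
            for stop in stops:
                if stop["tid"] == m.group(1) and stop["pc"] is None:
                    stop["pc"] = m.group(2)
    return stops


# Sample lines copied from the log excerpt above.
SAMPLE = [
    "intern-state 0x10712ab0: tid = 0xdc2c: stop info = (stop_id = 11)",
    "intern-state 0x10712ab0: tid = 0xdc2c: stop info = signal SIGILL: illegal instruction (stop_id = 11)",
    "intern-state Thread::ShouldStop(0x10712ab0) for tid = 0xdc2c 0xdc2c, pc = 0x0000ffffb7ff809c",
]

if __name__ == "__main__":
    for stop in find_signal_stops(SAMPLE):
        print(stop)
```

To use it on a real log, pass an open file object instead of `SAMPLE`, since the function only iterates over lines.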

@rovka
Collaborator Author

rovka commented Nov 20, 2019

> The expression evaluation crashed running some part of the expression (at 0x0000ffffb7ff809c) It looks like that code belongs to .text._ZN4Foo2C2Ev. We usually print the disassembly of the expression we're injecting in the expr log but I didn't see that there.

Is there any way I can get it? I can run lldb manually if needed, potentially with other patches on top.

@Teemperor
Collaborator

Maybe we only print that with verbose logging? You could try by changing the added line to:

self.runCmd("log enable lldb expr -F -v -f ~/expr.log")

and then upload the expr.log file.

You could also check what exact part of the expression causes us to fail.

Try changing the line
self.expect("expr -p -- struct Foo { Foo() { inc_counter(); } }; Foo f;")
first to (1):
self.expect("expr -p -- struct Foo { Foo() {} }; Foo f;")
and afterwards to (2):
self.expect("expr inc_counter();")

You also have to remove all lines after that line; otherwise the following checks will fail.
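
The narrowing-down idea above can be sketched in Python as a small data-driven loop. This is only an illustration: `first_failure` and `VARIANTS` are hypothetical names, and `run` stands for any callable that evaluates an lldb command and returns True on success (for example a thin wrapper around the test's own command runner). Nothing here is lldb API.

```python
# Expression variants taken from the suggestion above, ordered from the
# smallest piece to the full failing expression.
VARIANTS = [
    ("call only", "expr inc_counter();"),
    ("empty constructor", "expr -p -- struct Foo { Foo() {} }; Foo f;"),
    ("full", "expr -p -- struct Foo { Foo() { inc_counter(); } }; Foo f;"),
]


def first_failure(variants, run):
    """Return the label of the first variant for which run(command) is
    falsy, or None if every variant succeeds."""
    for label, command in variants:
        if not run(command):
            return label
    return None


if __name__ == "__main__":
    # Stand-in runner for demonstration: pretends that any expression
    # declaring a top-level `Foo f;` fails.
    def fake_run(command):
        return "Foo f;" not in command

    print(first_failure(VARIANTS, fake_run))
```

Ordering the variants from smallest to largest means the first failing label points at the smallest piece that still reproduces the problem.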

@rovka
Collaborator Author

rovka commented Nov 20, 2019

Verbose expr log

@rovka
Collaborator Author

rovka commented Nov 20, 2019

> Try changing the line
> self.expect("expr -p -- struct Foo { Foo() { inc_counter(); } }; Foo f;")
> to 1:
> self.expect("expr -p -- struct Foo { Foo() {} }; Foo f;")
> and afterwards 2:
> self.expect("expr inc_counter();")

It's option 1:
runCmd: expr inc_counter();
output:

runCmd: expr -p -- struct Foo { Foo() { } }; Foo f;
runCmd failed!
error: couldn't run static initializers:
error: Execution interrupted: 0xffff8c001f70 Event: broadcaster = 0x3e753028 (lldb.process), type = 0x00000001 (state-changed), data = { process = 0x3e752ff0 (pid = 28437), state = stopped} <1 threads> <0x6f15 [ip 0x4005ec] breakpoint 1.1>

@Teemperor
Collaborator

I think we had a bug where calling constructors sometimes doesn't work on some platforms/configurations. Are you using LLD by chance to link the test binaries for LLDB (i.e. LLD is your default linker or you enabled the LLD project in your LLVM build setup)?

Also does it work if you split up the expression like that?

self.expect("expr -p -- struct Foo { Foo() { inc_counter(); } };")
self.expect("expr Foo f; f")

(This way we call the constructor like a normal expression and not as a static initialiser).

@rovka
Collaborator Author

rovka commented Nov 20, 2019

> I think we had a bug where calling constructors sometimes doesn't work on some platforms/configurations. Are you using LLD by chance to link the test binaries for LLDB (i.e. LLD is your default linker or you enabled the LLD project in your LLVM build setup)?

Yes, LLD is in LLVM_ENABLE_PROJECTS.

> Also does it work if you split up the expression like that?
>
> self.expect("expr -p -- struct Foo { Foo() { inc_counter(); } };")
> self.expect("expr Foo f; f")
>
> (This way we call the constructor like a normal expression and not as a static initialiser).

Yes, this works.

@Teemperor
Collaborator

Maybe the call to __cxx_global_var_init fails because the actual program doesn't use it, yet according to the log we call it from the expression. Can you change the source file for the test (lldb/packages/Python/lldbsuite/test/commands/expression/static-initializers/main.cpp) to the code below and see whether the test then works? (Revert all changes to the other files before doing that.)

#include <cstdlib>
#include <string>
#include <iostream>

int counter = 0;

void inc_counter() { ++counter; }

void do_abort() { abort(); }

std::string s;

int main() {
  std::cout << s << std::endl;
  return 0; // break here
}

@rovka
Collaborator Author

rovka commented Nov 20, 2019

> Maybe the call to __cxx_global_var_init fails because we don't use it in the actual program but we call it from the expression according to the log. Can you change the source file for the test (lldb/packages/Python/lldbsuite/test/commands/expression/static-initializers/main.cpp) to the code below and see if the test then works (revert all other changes to the other files before doing that).
>
> #include <cstdlib>
> #include <string>
> #include <iostream>
>
> int counter = 0;
>
> void inc_counter() { ++counter; }
>
> void do_abort() { abort(); }
>
> std::string s;
>
> int main() {
>   std::cout << s << std::endl;
>   return 0; // break here
> }

Nope, still errors out :(

@jimingham
Collaborator

I don't remember now whether the static initializers get run with the expression or as a separate pass, but if it is the former, it might be worth trying to run the expression as:

(lldb) expr -u 0 --

In the case of normal expressions, if the expression crashes this will leave lldb stopped at the point of the crash, and you can actually poke around and see why more clearly.
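
For ordinary expressions driven from Python rather than the command line, the same intent as `-u 0` can be sketched as below. The helper name `keep_crash_state` is made up for illustration; it expects an object that behaves like `lldb.SBExpressionOptions`, which provides `SetUnwindOnError`. The `RecordingOptions` class is only a stand-in so the sketch runs without an lldb install.

```python
def keep_crash_state(options):
    """Ask lldb not to unwind when the expression crashes.

    `options` is assumed to behave like lldb.SBExpressionOptions.
    """
    options.SetUnwindOnError(False)  # same intent as `expr -u 0`
    return options


class RecordingOptions:
    """Stand-in that records calls; not part of lldb."""

    def __init__(self):
        self.calls = {}

    def SetUnwindOnError(self, value):
        self.calls["unwind_on_error"] = value


if __name__ == "__main__":
    opts = keep_crash_state(RecordingOptions())
    print(opts.calls)
```

With the real module available, one would pass `lldb.SBExpressionOptions()` to the helper and hand the result to `SBFrame.EvaluateExpression(expr, options)`.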

@Teemperor
Collaborator

We run them separately here from what I understand:

@jimingham
Collaborator

Most importantly we create a fresh set of options to run the function, so we're going to ignore breakpoints and clean up on crashes.

There are a bunch of these utility function calls that happen under the covers, and it's not easy to ensure that stopping for a breakpoint or crash and handing control back to the user would actually work... We get into the same situation when something causes the Object Description functions to crash, where it would be really useful to stop at the crash.

In this case, it would be interesting to try copying over the options of the containing expression, let it crash, and see what happens. It might work, or at least well enough to get a backtrace out.

If that doesn't work we could always take and log a backtrace when these expressions crash (maybe only when the "expr" log is turned on). That also might show more information and wouldn't be hard to do.

@rovka
Collaborator Author

rovka commented Nov 22, 2019

Sorry about the delay. I tried to run
(lldb) expr -u 0 -- struct Foo{ Foo() { inc_counter(); } }; Foo f;
(lldb)

It doesn't seem to do anything, it "just runs", doesn't print anything, doesn't crash, nothing. I can still step or do other things afterwards.

Am I doing something wrong here?

@jimingham
Collaborator

Sorry, I was being a little unclear in my back and forth with Raphael. The problem is that this is not crashing in the main expression evaluation, but in a separate expression evaluation that happens under the covers in lldb. That separate expression evaluation does not use the options passed to the "expr" command that triggered it. So you would have to change the lldb code Raphael pointed to to actually get the expression not to automatically unwind when it crashes.

So it is entirely expected in the current code that passing "-u 0" has no effect on your crash.

If you want to play around with this yourself, try modifying the code Raphael cited below line 1360, and just do:

options->SetUnwindOnError(false);

That will force lldb to stop when this crash is encountered. I don't actually know whether lldb will recover from stopping there gracefully, but it might...

@llvmbot
Member

llvmbot commented Mar 3, 2021

I wonder if #49407 is related, as it also has some SIGILL.

In any case, FreeBSD seems to suffer from this issue as well.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
@llvmbot llvmbot added the confirmed Verified by a second party label Jan 26, 2022
DavidSpickett added a commit that referenced this issue May 29, 2024
PR #92245 fixed these tests on Linux. They likely work on FreeBSD too
but leaving the xfail for that so it can be confirmed later.

Also updated a bugzilla link to one that redirects to Github issues.

Relates to issues #43398 and #48751.
@DavidSpickett
Collaborator

#92245 fixed this on Linux. It may be fixed on FreeBSD too, but that needs confirming.

@DavidSpickett DavidSpickett changed the title from "lldb can't run static initializers on AArch64 Ubuntu" to "lldb can't run static initializers on AArch64 Ubuntu and FreeBSD" May 29, 2024