Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libgkrust.a(webrender_bindings-ee4f1448dc823c6b.webrender_bindings.7qjxnz4e-cgu.0.rcgu.o) DWARF DIE at 0x001b2d28 (class closure) has a member variable 0x001b2d2f (__0) whose type is a forward declaration, not a complete definition. #57822

Closed
jrmuizel opened this issue Jan 21, 2019 · 28 comments · Fixed by #63875
Assignees
Labels
A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.) E-help-wanted Call for participation: Help is requested to fix this issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@jrmuizel
Copy link
Contributor

Some idea as #36185

@jrmuizel
Copy link
Contributor Author

@jrmuizel
Copy link
Contributor Author

@philipc is this something that's easy for ddbug to check for and give more information about?

@estebank estebank added A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 22, 2019
@philipc
Copy link
Contributor

philipc commented Jan 22, 2019

Maybe, but I don't understand what the problem is and what needs to be checked.

0x001b2d28:   DW_TAG_structure_type
                DW_AT_name      ("closure")
                DW_AT_byte_size (0x18)
                DW_AT_alignment (8)

0x001b2d2f:     DW_TAG_member
                  DW_AT_name    ("__0")
                  DW_AT_type    (0x001b2d3d "closure")
                  DW_AT_alignment       (8)
                  DW_AT_data_member_location    (DW_OP_plus_uconst 0x0)

0x001b2d3c:     NULL

0x001b2d3d:   DW_TAG_structure_type
                DW_AT_name      ("closure")
                DW_AT_byte_size (0x18)
                DW_AT_alignment (8)

0x001b2d44:     DW_TAG_member
                  DW_AT_name    ("__0")
                  DW_AT_type    (0x00064c79 "Vec<webrender_bindings::moz2d_renderer::Job>")
                  DW_AT_alignment       (8)
                  DW_AT_data_member_location    (DW_OP_plus_uconst 0x0)

0x001b2d51:     NULL

The type of 0x001b2d2f is 0x001b2d3d, and I don't know why it claims 0x001b2d3d is a forward declaration since its members are defined too.

@jrmuizel
Copy link
Contributor Author

So the crash doesn't happen with rust-lldb, so this probably pretty low to fix. Otherwise it seems like the next steps would be to figure out what's going on in lldb's DWARFFastParserClang and why it's sad about our info.

@xixixao
Copy link

xixixao commented Apr 7, 2019

I have this occurring with rust-lldb.

Sample code:

extern crate rexpect;
use rexpect::session::spawn_command;
use std::process::Command;
use tempfile::*;

type R<T> = Result<T, Box<std::error::Error>>;
type Session = rexpect::session::PtySession;


#[test]
fn test() -> R<()> {
    let mut p = run(&["example.txt"])?;
    p.exp_regex("Deleting example.txt")?;
    p.exp_eof()?;
    Ok(())
}


fn run(args: &[&str]) -> R<Session> {
    let directory = tempdir()?;
    let directory_path = directory.path();
    let mut command = Command::new("./target/debug/rust-rm");
    command.args(args);
    command.current_dir(directory_path);
    println!("Tmp folder: {:?}", directory_path);
    let mut p = spawn_command(command, Some(2000))?;
    Ok(p)
}

And I'm stepping into the p.exp_regex call. I have rexpect installed locally via:

[patch.crates-io]
rexpect = { path = "../rexpect" }

and am running:

sudo rust-lldb target/debug/test-38093ee6c91e9277

I'm on macOS 10.14.2 with just installed cargo and rust (cargo 1.33.0 (f099fe94b 2019-02-12)).

@EricRahm
Copy link
Contributor

@michaelwoerister It seems like #36185 has resurfaced. From what I understand this is causing some issues for WebRender devs.

@michaelwoerister
Copy link
Member

I'm going to look into this asap. Thanks for the test case, @xixixao, that should be very helpful!

@michaelwoerister michaelwoerister self-assigned this Aug 15, 2019
@michaelwoerister
Copy link
Member

I can reproduce this on macOS and Linux. On Linux I can reproduce with a recent debug build of LLDB.

@michaelwoerister
Copy link
Member

@rust-lang/compiler, maybe we should make this p-high. At least it's what I'll be working on until it's resolved.

@nikomatsakis nikomatsakis added the P-high High priority label Aug 15, 2019
@michaelwoerister
Copy link
Member

I've been able to make some progress here. I was not able to pinpoint why exactly LLDB fails to load the type info for fields in question but for the test cases I had, the bug occurred only when LLDB tries to get the type information of fields in a closure environment. I tried debugging further but with my limited understanding of LLDB's codebase makes this very time-consuming.

So, for the time being I propose to implement a workaround: If we make rustc mark closure environment structs and their fields with DW_AT_artificial then LLDB does not hit the problematic code paths. Marking those fields as artificial is the right thing to do here anyway (we already do it for things like fields of slices). It's unfortunate that we don't know the exact problem in LLDB but the workaround would not be a hack. In my local testing this did not break any debuginfo tests for either LLDB or GDB.

@philipc
Copy link
Contributor

philipc commented Aug 16, 2019

@michaelwoerister I may be able to look into the LLDB side further. Do you have a simple testcase I can use? I wasn't able to reproduce with the testcase given earlier.

At a guess, the problem may be because we use the same name (closure) for multiple types.

@michaelwoerister
Copy link
Member

That would be awesome, @philipc!

Are you familiar with compiling Firefox? Then getting a test case is straightforward. It's also somewhat lengthy initially so bear with me :)

  • Get a recent clone of Firefox, e.g. from https://github.com/mozilla/gecko-dev.
  • Make sure you've get an up-to-date build environment by running ./mach bootstrap in the root dir of the FF checkout.
  • Place the following .mozconfig file in the root dir of the checkout:
    ac_add_options --enable-debug
    ac_add_options --enable-debug-symbols
    ac_add_options --disable-optimize
    ac_add_options --disable-crashreporter
    # This might be necessary if you are building on macOS
    ac_add_options --with-macos-sdk=/Library/Developer/SDKs/MacOSX10.14.sdk
    
  • Run ./mach build in the root directory of your FF checkout.
  • Depending your machine this will take between 5 and 50 minutes :)
  • Run ./mach run in the root directory of your FF checkout. This starts a debug instance of the browser.
  • Type about:config into the address bar, look for the gfx.webrender.enabled option and make sure it is set to true. If you needed to change it, close the browser and run ./mach run again. The test browser should have remembered the setting.
  • If you are on Linux, find out the process ID of the process named GPU Process, if you are on macOS, find out the PID of the main FF process.
  • Start your version of LLDB and run attach {PID from previous point}
  • Once LLDB has successfully stopped FF, set a breakpoint via break set -n "create_frame_builder"
  • Run continue in LLDB and focus the FF window. The breakpoint should be hit immediately.
  • Now type bt into LLDB. This should trigger the problematic code path and make LLDB crash.

For me, the above repro worked on Fedora 30 as well as an up-to-date macOS. It also worked with the stock macOS LLDB version, LLDB 8 on Fedora and a self-compiled LLDB from a recent LLVM master.

At a guess, the problem may be because we use the same name (closure) for multiple types.

Maybe yeah. Unfortunately, LLDB crashed on me while debugging itself and rr seemed to get stuck indefinitely after executing attach in LLDB.

@philipc
Copy link
Contributor

philipc commented Aug 20, 2019

Thanks for the instructions, @michaelwoerister!

Here's a reduced testcase:

fn main() {
    let _: Vec<_> = [()].iter().map(break_here).collect();
}

fn break_here(x: &()) -> () {
    *x
}

(in lldb: break set -n break_here, r, bt).

@philipc
Copy link
Contributor

philipc commented Aug 20, 2019

Hex editing the DWARF to change the names of the closure types stops the error from happening.

@michaelwoerister
Copy link
Member

Very interesting! Generating some kind of unique name for the closure types shouldn't be too hard.

@michaelwoerister
Copy link
Member

Alright, so I would suggest fixing this issue by doing both:

  1. Marking closure environment structs and their fields with DW_AT_artificial, and
  2. Generating a unique type name for closure environment structs.

The same should probably done for generators (I have not really looked at those yet).

Implementing item (1) should be straightforward: Go into librustc_codegen_llvm/debuginfo/metadata.rs and make closure types and their fields get DIFlags::FlagArtificial applied. The debuginfo for closure types is created here:

ty::Closure(def_id, substs) => {
let upvar_tys : Vec<_> = substs.upvar_tys(def_id, cx.tcx).collect();
prepare_tuple_metadata(cx,
t,
&upvar_tys,
unique_type_id,
usage_site_span).finalize(cx)
}

... and implementing this will mean that create_struct_stub() needs to grow a flags parameter. The rest is just threading the right flags value through the various function calls.

Closure fields are created here (because the compiler treats them as tuple fields):

fn create_member_descriptions(&self, cx: &CodegenCx<'ll, 'tcx>)
-> Vec<MemberDescription<'ll>> {
let layout = cx.layout_of(self.ty);
self.component_types.iter().enumerate().map(|(i, &component_type)| {
let (size, align) = cx.size_and_align_of(component_type);
MemberDescription {
name: format!("__{}", i),
type_metadata: type_metadata(cx, component_type, self.span),
offset: layout.fields.offset(i),
size,
align,
flags: DIFlags::FlagZero,
discriminant: None,
}
}).collect()

Similar to the case above, TupleMemberDescriptionFactory should probably be parameterized with a field_flags: DIFlags field which can then be used to make closure fields DIFlags::FlagArtificial.

For examples, one can also look for other occurrences of DIFlags::FlagArtificial in the same file.

Implementing item (2), i.e. generating a unique type name for closures, is even simpler: Debuginfo type names are generated in librustc_codegen_ssa/debuginfo/type_names.rs, and for closures specifically the name is generated here:

ty::Closure(..) => {
output.push_str("closure");
}

The first field of the ty::Closure(..) we have available here is the DefId of the closure, which gives us access to pretty much anything we can know about it. A very simple way of making the type name unique would be to append the DefPathHash of the closure to it, e.g. closure-x83y52ng. The DefPathHash for a given DefId can very cheaply be obtained via tcx.def_path_hash.

One could also find a more human-readable unique name for closure types (e.g. by using the DefPath instead of the DefPathHash) but that would be more costly and take up more space, so I think for now the DefPathHash is a good option.

I'm gonna tag this issue as E-mentor and E-help-wanted in case somebody wants to take a stab at implementing the suggested fix. I'm happy to help out with any questions.

@michaelwoerister michaelwoerister added E-help-wanted Call for participation: Help is requested to fix this issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. labels Aug 20, 2019
@philipc
Copy link
Contributor

philipc commented Aug 21, 2019

Further reduced:

fn main() {
    let f = |x: ()| x;
    let g = move |y| f(y);
    xxx();
}

fn xxx() {}

In lldb: b /xxx/, r, p g.

The requirements are to have the struct for one closure included as a member of the struct for another closure, and then get lldb to print that struct.

I'll take a look at implementing the fix.

@michaelwoerister
Copy link
Member

Thanks, @philipc! This should also make for a good regression test.

@philipc
Copy link
Contributor

philipc commented Aug 22, 2019

@michaelwoerister Should I be using tcx.def_path_hash(def_id).0.to_hex() for the names? That's 32 bytes, which seems a bit larger than needed.

Probably not something to do right now, but maybe we should move these types inside the functions where they are defined, and then use a simple disambiguation index (are we doing something like this for mangling closure symbols?). This would be similar to using DefPath, except that we don't need to repeat the path because the location of the type means it simply inherits the path of the function.

@michaelwoerister
Copy link
Member

Yes, I also think that we don't need that many characters for disambiguation. I'd probably do something like:

ty::Closure(..) => { 
    use rustc_data_structures::base_n;
    output.push_str("closure-"); // <-- could also use a short prefix here
    let hash = tcx.def_path_hash(def_id).0.to_smaller_hash(); // 64 bits is enough
    base_n::push_str(hash as u128, 62, output); // base62 only contains alphanumeric chars
}

This should result in ~11 disambiguation chars which seems fine.

Probably not something to do right now, but maybe we should move these types inside the functions where they are defined, and then use a simple disambiguation index (are we doing something like this for mangling closure symbols?). This would be similar to using DefPath, except that we don't need to repeat the path because the location of the type means it simply inherits the path of the function.

Do DWARF & LLVM allow for type DIEs inside of function DIEs? What would be the implications? I can't remember if we already do something like this (we definitely use namespaces though).

@philipc
Copy link
Contributor

philipc commented Aug 22, 2019

I haven't checked clang, but here's some output from gcc which does that:

int main() {
        struct Foobar {
                int i;
        };
        struct Foobar foobar = { .i = 1 };
        return foobar.i;
}
< 0><0x0000000b>  DW_TAG_compile_unit
                    DW_AT_producer              GNU C17 8.3.0 -mtune=generic -march=x86-64 -g
                    DW_AT_language              DW_LANG_C99
                    DW_AT_name                  test.c
                    DW_AT_comp_dir              F:\rust
                    DW_AT_low_pc                0x00401560
                    DW_AT_high_pc               <offset-from-lowpc>29
                    DW_AT_stmt_list             0x00000729
< 1><0x0000005e>    DW_TAG_subprogram
                      DW_AT_external              yes(1)
                      DW_AT_name                  main
                      DW_AT_decl_file             0x00000001 F:\rust/F:\rust/test.c
                      DW_AT_decl_line             0x00000001
                      DW_AT_decl_column           0x00000005
                      DW_AT_type                  0x000000b0<.debug_info+0x000078f5>
                      DW_AT_low_pc                0x00401560
                      DW_AT_high_pc               <offset-from-lowpc>29
                      DW_AT_frame_base            len 0x0001: 9c: DW_OP_call_frame_cfa
                      DW_AT_GNU_all_tail_call_sites yes(1)
                      DW_AT_sibling               0x000000b0<.debug_info+0x000078f5>
< 2><0x00000081>      DW_TAG_structure_type
                        DW_AT_name                  Foobar
                        DW_AT_byte_size             0x00000004
                        DW_AT_decl_file             0x00000001 F:\rust/F:\rust/test.c
                        DW_AT_decl_line             0x00000002
                        DW_AT_decl_column           0x00000009
                        DW_AT_sibling               0x0000009d<.debug_info+0x000078e2>
< 3><0x00000091>        DW_TAG_member
                          DW_AT_name                  i
                          DW_AT_decl_file             0x00000001 F:\rust/F:\rust/test.c
                          DW_AT_decl_line             0x00000003
                          DW_AT_decl_column           0x00000007
                          DW_AT_type                  0x000000b0<.debug_info+0x000078f5>
                          DW_AT_data_member_location  0
< 2><0x0000009d>      DW_TAG_variable
                        DW_AT_name                  foobar
                        DW_AT_decl_file             0x00000001 F:\rust/F:\rust/test.c
                        DW_AT_decl_line             0x00000005
                        DW_AT_decl_column           0x00000010
                        DW_AT_type                  0x00000081<.debug_info+0x000078c6>
                        DW_AT_location              len 0x0002: 916c: DW_OP_fbreg -20
< 1><0x000000b0>    DW_TAG_base_type
                      DW_AT_byte_size             0x00000004
                      DW_AT_encoding              DW_ATE_signed
                      DW_AT_name                  int

@eddyb
Copy link
Member

eddyb commented Aug 22, 2019

Do DWARF & LLVM allow for type DIEs inside of function DIEs?

This whole thread had me confused because I just assumed we were using def-paths (like symbol names do, and we know those are unambiguous), but I thought maybe it was some subtlety everyone else was aware of.

Thanks for asking this "out loud", everything's clear now!

I wonder if there are other bugs/complications arising from debuginfo not (fully) relying on the disambiguating information that everything else is based on.

@eddyb
Copy link
Member

eddyb commented Aug 22, 2019

You should be able to create similar situations to closures via this:

let pair = (
    { struct S; S },
    { struct S; S },
);

This should create two structs with the same def-path except for the disambiguator on the last component (since blocks don't introduce their own def-path components).

@michaelwoerister
Copy link
Member

Well, I don't think DWARF actually demands that types have different names. Each DIE has some kind of ID and the DW_AT_name attribute is actually optional.

@michaelwoerister
Copy link
Member

@philipc I remember now: We are building a parallel tree of DW_TAG_namespaces and put the types into those, because at the time (5 years ago?) GDB wasn't able to handle types inside of functions. Here's what this looks like:

fn main() {
    struct XYZ {
        a: u32
    }

    let b = XYZ { a: 8 };
    println!("{:?}", b.a);
}

DWARF

< 1><0x0000002a>    DW_TAG_namespace
                      DW_AT_name                  debuginfo_nesting_test
< 2><0x0000002f>      DW_TAG_subprogram
                        DW_AT_low_pc                0x000044f0
                        DW_AT_high_pc               <offset-from-lowpc>141
                        DW_AT_frame_base            len 0x0001: 57: DW_OP_reg7
                        DW_AT_linkage_name          _ZN22debuginfo_nesting_test4main17hbbfb22ccb8684996E
                        DW_AT_name                  main
                        DW_AT_decl_file             0x00000001 /xoxo/stuff/debuginfo_nesting_test/src/main.rs
                        DW_AT_decl_line             0x00000001
                        DW_AT_main_subprogram       yes(1)
< 3><0x00000048>        DW_TAG_lexical_block
                          DW_AT_low_pc                0x000044fc
                          DW_AT_high_pc               <offset-from-lowpc>124
< 4><0x00000055>          DW_TAG_variable
                            DW_AT_location              len 0x0002: 9124: DW_OP_fbreg 36
                            DW_AT_name                  b
                            DW_AT_decl_file             0x00000001 /xoxo/stuff/debuginfo_nesting_test/src/main.rs
                            DW_AT_decl_line             0x00000006
                            DW_AT_type                  <0x00000088>
< 4><0x00000063>          DW_TAG_lexical_block
                            DW_AT_low_pc                0x00004510
                            DW_AT_high_pc               <offset-from-lowpc>54
< 5><0x00000070>            DW_TAG_variable
                              DW_AT_location              len 0x0003: 91f000: DW_OP_fbreg 112
                              DW_AT_name                  arg0
                              DW_AT_alignment             0x00000001
                              DW_AT_decl_file             0x00000001 /xoxo/stuff/debuginfo_nesting_test/src/main.rs
                              DW_AT_decl_line             0x00000007
                              DW_AT_type                  <0x000000a4>
< 2><0x00000083>      DW_TAG_namespace
                        DW_AT_name                  main
< 3><0x00000088>        DW_TAG_structure_type
                          DW_AT_name                  XYZ
                          DW_AT_byte_size             0x00000004
                          DW_AT_alignment             0x00000004
< 4><0x0000008f>          DW_TAG_member
                            DW_AT_name                  a
                            DW_AT_type                  <0x0000009d>
                            DW_AT_alignment             0x00000004
                            DW_AT_data_member_location  0
< 1><0x0000009d>    DW_TAG_base_type
                      DW_AT_name                  u32
                      DW_AT_encoding              DW_ATE_unsigned
                      DW_AT_byte_size             0x00000004
< 1><0x000000a4>    DW_TAG_pointer_type
                      DW_AT_type                  <0x0000009d>
                      DW_AT_name                  &u32

It would be great to get rid of that hack :)

@philipc
Copy link
Contributor

philipc commented Aug 22, 2019

Ok, but the closure types are all being put under the DW_TAG_compile_unit. So we should at least move them into that namespace with the rest of the types, but yes getting rid of the parallel tree would be good too.

Not sure if this will cause problems, but one possible difference from other languages is that these types are visible outside the function by returning impl Trait.

@eddyb
Copy link
Member

eddyb commented Aug 22, 2019

@philipc Pretty sure you can get the same in C++ with decltype(auto).

@michaelwoerister
Copy link
Member

@philipc I'm pretty sure get_namespace_for_item(cx, closure_def_id) would give you the expected namespace. You can find many examples of how this is used in metadata.rs. Right now we seem to re-use the tuple metadata infrastructure for closures, which is why they all end up in the top-level scope.

bors added a commit that referenced this issue Aug 28, 2019
debuginfo: give unique names to closure and generator types

Closure types have been moved to the namespace where they
are defined, and both closure and generator type names now
include the disambiguator.

This fixes an exception when lldb prints nested closures.

Fixes #57822

I haven't included the `DW_AT_artificial` changes discussed in #57822 because they make the output worse IMO, but I can easily add these if still required. For example, for the new test case the output is now:
```
(lldb) p g
(issue_57822::main::closure-1) $1 = closure-1(closure(1))
```
but adding `DW_AT_artificial` changes this to:
```
(lldb) p g
(issue_57822::main::closure-1) $0 = closure-1 {

}
```

Note that nested generators didn't cause the exception. I haven't determined why, but I think it makes sense to add the disambiguator for them too. It feels like we still don't really understand why closures were causing an error though.

r? @michaelwoerister
@bors bors closed this as completed in fbe3f3e Aug 28, 2019
pietroalbini pushed a commit to pietroalbini/rust that referenced this issue Sep 2, 2019
Closure types have been moved to the namespace where they
are defined, and both closure and generator type names now
include the disambiguator.

This fixes an exception when lldb prints nested closures.

Fixes rust-lang#57822
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.) E-help-wanted Call for participation: Help is requested to fix this issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants