Commit
Added Rustc Debugger Support Chapter
Showing 2 changed files with 322 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,321 @@ | ||
# Debugging support in the Rust compiler | ||
|
||
This document explains the state of debugging tools support in the Rust compiler (rustc). | |
It gives an overview of debugging tools such as GDB and LLDB, and of the infrastructure | |
around the Rust compiler for debugging Rust code. If you want to learn how to debug the Rust compiler | |
itself, see the [Debugging the Compiler] page. | |
|
||
This material is gathered from the YouTube video [Tom Tromey discusses debugging support in rustc]. | |
|
||
## Preliminaries | ||
|
||
### Debuggers | ||
|
||
According to Wikipedia | ||
|
||
> A [debugger or debugging tool] is a computer program that is used to test and debug | ||
> other programs (the "target" program). | ||
|
||
Writing a debugger from scratch for a language requires a lot of work, especially if | |
debuggers have to be supported on various platforms. GDB and LLDB, however, can be | |
extended to support debugging a language. This is the path that Rust has chosen. | |
This document's main goal is to describe how the Rust compiler supports these debuggers. | |
|
||
### DWARF | ||
|
||
According to the [DWARF] standard website | ||
|
||
> DWARF is a debugging file format used by many compilers and debuggers to support source level | ||
> debugging. It addresses the requirements of a number of procedural languages, | ||
> such as C, C++, and Fortran, and is designed to be extensible to other languages. | ||
> DWARF is architecture independent and applicable to any processor or operating system. | ||
> It is widely used on Unix, Linux and other operating systems, | ||
> as well as in stand-alone environments. | ||
|
||
A DWARF reader is a program that consumes the DWARF format and creates debugger-compatible output. | |
This program may live in the compiler itself. DWARF uses a data structure called the | |
Debugging Information Entry (DIE), which stores information under "tags" that denote functions, | |
variables, and so on, e.g., `DW_TAG_variable`, `DW_TAG_pointer_type`, `DW_TAG_subprogram`. | |
You can also invent your own tags and attributes. | |
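To make the DIE idea concrete, here is a minimal sketch of a DIE tree in Rust. The `Die` struct, its fields, and `example_die` are hypothetical names chosen for illustration; only the `DW_TAG_*` and `DW_AT_*` names come from DWARF. Real DWARF is a compact binary encoding, and an attribute such as `DW_AT_type` refers to another DIE rather than holding a string.

```rust
// Illustrative model of a tree of DWARF Debugging Information Entries (DIEs).
// The types and field names are hypothetical; only the DW_TAG_* / DW_AT_*
// names come from the DWARF standard.
#[derive(Debug)]
struct Die {
    tag: &'static str,
    attributes: Vec<(&'static str, String)>,
    children: Vec<Die>,
}

impl Die {
    /// Look up an attribute value by its DW_AT_* name.
    fn attr(&self, name: &str) -> Option<&str> {
        self.attributes
            .iter()
            .find(|(key, _)| *key == name)
            .map(|(_, value)| value.as_str())
    }
}

/// Build a simplified DIE tree for `fn main() { let x: i32 = 0; }`.
fn example_die() -> Die {
    Die {
        tag: "DW_TAG_subprogram",
        attributes: vec![("DW_AT_name", "main".to_string())],
        children: vec![Die {
            tag: "DW_TAG_variable",
            attributes: vec![
                ("DW_AT_name", "x".to_string()),
                // Simplification: real DWARF stores a reference to a type DIE.
                ("DW_AT_type", "i32".to_string()),
            ],
            children: vec![],
        }],
    }
}

fn main() {
    let die = example_die();
    println!("{} {:?}", die.tag, die.attr("DW_AT_name"));
    for child in &die.children {
        println!("  {} {:?}", child.tag, child.attr("DW_AT_name"));
    }
}
```

A debugger walks a tree like this to map addresses and names back to source-level functions and variables.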
|
||
## Supported debuggers | ||
|
||
### GDB | ||
|
||
We have our own fork of GDB - [https://github.com/rust-dev-tools/gdb] | ||
|
||
#### Rust expression parser | ||
|
||
To be able to show debug output, we need an expression parser. | |
The GDB expression parser is written in [Bison] and can parse only a subset of Rust expressions. | |
It was written from scratch and has no relation to any other parser. | |
In particular, it is not related to rustc's parser. | |
|
||
GDB has Rust-like value and type output: it prints values and types in a way | |
that looks like Rust syntax. Likewise, when you print a type with [ptype] in GDB, | |
the output looks like Rust source code. Check out the documentation in the [manual for GDB/Rust]. | |
|
||
#### Parser extensions | ||
|
||
The expression parser has a couple of extensions to support features that you cannot express | |
in Rust. Some limitations are listed in the [manual for GDB/Rust]. There is some special | |
code in the DWARF reader in GDB to support the extensions. | |
|
||
A couple of examples of the DWARF reader support needed are as follows: | |
|
||
1. Enum: Needed to support enum types. rustc writes the information about an enum into | |
DWARF, and GDB reads the DWARF to determine whether there is a tag field, where it is, | |
and whether the tag slot is shared via the non-zero optimization, among other details. | |
|
||
2. Dissect trait objects: DWARF extension where the trait object's description in the DWARF | ||
also points to a stub description of the corresponding vtable which in turn points to the | ||
concrete type for which this trait object exists. This means that you can do a `print *object` | ||
for that trait object, and GDB will understand how to find the correct type of the payload in | ||
the trait object. | ||
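The two layout facts behind these extensions can be observed from Rust itself. The sketch below uses only `std::mem::size_of`; the function names are made up for illustration. It shows that an `Option` of a reference needs no separate tag (the case where the tag slot is shared with the payload), and that a trait object reference carries a second pointer, the one to the vtable.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// A reference can never be null, so `Option<&u8>` can encode `None` as the
/// null pointer and needs no separate tag field.
fn option_ref_has_no_separate_tag() -> bool {
    size_of::<Option<&u8>>() == size_of::<&u8>()
}

/// Every bit pattern of `u8` is a valid value, so `Option<u8>` needs extra
/// space for a tag.
fn option_u8_needs_tag() -> bool {
    size_of::<Option<u8>>() > size_of::<u8>()
}

/// A trait object reference is a "fat" pointer: a data pointer plus a
/// pointer to the vtable for the concrete type.
fn trait_object_is_fat_pointer() -> bool {
    size_of::<&dyn Debug>() == 2 * size_of::<usize>()
}

fn main() {
    println!("Option<&u8> has no separate tag: {}", option_ref_has_no_separate_tag());
    println!("Option<u8> needs a tag: {}", option_u8_needs_tag());
    println!("&dyn Debug is a fat pointer: {}", trait_object_is_fat_pointer());
}
```

Because the layout differs from case to case, a debugger cannot guess it and has to rely on what the compiler records in the debug info.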
|
||
**TODO**: Figure out if the following should be mentioned in the GDB-Rust document rather than | ||
this guide page so there is no duplication. This is regarding the following comments: | ||
|
||
[This comment by Tom](https://github.com/rust-lang/rustc-guide/pull/316#discussion_r284027340) | ||
> gdb's Rust extensions and limitations are documented in the gdb manual: | ||
https://sourceware.org/gdb/onlinedocs/gdb/Rust.html -- however, this neglects to mention that | ||
gdb convenience variables and registers follow the gdb $ convention, and that the Rust parser | ||
implements the gdb @ extension. | ||
|
||
[This question by Aman](https://github.com/rust-lang/rustc-guide/pull/316#discussion_r285401353) | ||
> @tromey do you think we should mention this part in the GDB-Rust document rather than this | ||
document so there is no duplication etc.? | ||
|
||
#### Developer notes | ||
|
||
* This work is now upstream. Bugs can be reported in [GDB Bugzilla]. | ||
|
||
### LLDB | ||
|
||
We have our own fork of LLDB - [https://github.com/rust-lang/lldb] | ||
|
||
Fork of LLVM project - [https://github.com/rust-lang/llvm-project] | ||
|
||
LLDB currently only works on macOS because of a dependency issue. This issue was easier to | ||
solve for macOS as compared to Linux. However, Tom has a possible solution which can enable | ||
us to ship LLDB everywhere. | ||
|
||
#### Rust expression parser | ||
|
||
This expression parser is written in C++. It is a [Recursive Descent parser] | |
that implements slightly less of the Rust language than GDB's parser. LLDB has Rust-like value and type output. | |
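For readers unfamiliar with the technique, here is a minimal recursive descent evaluator written in Rust. It is not LLDB's parser (which is written in C++ and handles Rust expressions); it is a toy for integer arithmetic, and every name in it is hypothetical. The point is the shape: one function per grammar rule, each calling the functions for the rules it refers to.

```rust
// A minimal recursive descent evaluator for integer arithmetic.
// Grammar (one method per rule):
//   expr   := term (('+' | '-') term)*
//   term   := factor ('*' factor)*
//   factor := NUMBER | '(' expr ')'
struct Parser<'a> {
    input: &'a [u8],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(text: &'a str) -> Self {
        Parser { input: text.as_bytes(), pos: 0 }
    }

    /// Skip spaces, then return the next byte without consuming it.
    fn peek(&mut self) -> Option<u8> {
        while self.pos < self.input.len() && self.input[self.pos] == b' ' {
            self.pos += 1;
        }
        self.input.get(self.pos).copied()
    }

    fn expr(&mut self) -> Option<i64> {
        let mut value = self.term()?;
        while let Some(op) = self.peek() {
            if op != b'+' && op != b'-' {
                break;
            }
            self.pos += 1;
            let rhs = self.term()?;
            value = if op == b'+' { value + rhs } else { value - rhs };
        }
        Some(value)
    }

    fn term(&mut self) -> Option<i64> {
        let mut value = self.factor()?;
        while self.peek() == Some(b'*') {
            self.pos += 1;
            value *= self.factor()?;
        }
        Some(value)
    }

    fn factor(&mut self) -> Option<i64> {
        match self.peek()? {
            b'(' => {
                self.pos += 1;
                let value = self.expr()?;
                if self.peek()? != b')' {
                    return None;
                }
                self.pos += 1;
                Some(value)
            }
            b'0'..=b'9' => {
                let start = self.pos;
                while self.pos < self.input.len() && self.input[self.pos].is_ascii_digit() {
                    self.pos += 1;
                }
                std::str::from_utf8(&self.input[start..self.pos]).ok()?.parse().ok()
            }
            _ => None,
        }
    }
}

/// Parse and evaluate `text`; returns `None` on a syntax error or trailing input.
fn evaluate(text: &str) -> Option<i64> {
    let mut parser = Parser::new(text);
    let value = parser.expr()?;
    if parser.peek().is_some() { None } else { Some(value) }
}

fn main() {
    println!("{:?}", evaluate("1 + 2 * (3 - 1)"));
}
```

Operator precedence falls out of the rule nesting: `term` binds tighter than `expr` because `expr` calls `term`, not the other way around.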
|
||
#### Parser extensions | ||
|
||
There is some special code in the DWARF reader in LLDB to support the extensions. | |
One example of the DWARF reader support needed is as follows: | |
|
||
1. Enum: Needed to support enum types. rustc writes the information about an | |
enum into DWARF, and LLDB reads the DWARF to determine whether there is a tag field, | |
where it is, and whether the tag slot is shared via the non-zero optimization, among other details. | |
|
||
#### Developer notes | ||
|
||
* None of the LLDB work is upstream. This [rust-lang/lldb wiki page] explains a few details. | ||
* The reason for forking LLDB is that LLDB recently removed all the other language plugins | ||
due to lack of maintenance. | ||
* LLDB has a plugin architecture but that does not work for language support. | ||
* LLDB is available via Rust build (`rustup`). | ||
* GDB generally works better on Linux. | ||
|
||
## DWARF and Rustc | ||
|
||
[DWARF] is the standard way compilers generate debugging information that debuggers read. | ||
It is _the_ debugging format on macOS and Linux. It is a multi-language, extensible format | ||
and is mostly good enough for Rust's purposes. Hence, the current implementation reuses DWARF's | ||
concepts. This is true even if some of the concepts in DWARF do not align with Rust | ||
semantically because generally there can be some kind of mapping between the two. | ||
|
||
We have some DWARF extensions that the Rust compiler emits and the debuggers understand that | ||
are _not_ in the DWARF standard. | ||
|
||
* The Rust compiler emits DWARF for a virtual table, and this `vtable` object has a | |
`DW_AT_containing_type` that points to the real type. This lets debuggers dissect a trait object | ||
pointer to correctly find the payload. E.g., here's such a DIE, from a test case in the gdb | ||
repository: | ||
|
||
```asm | ||
<1><1a9>: Abbrev Number: 3 (DW_TAG_structure_type) | ||
<1aa> DW_AT_containing_type: <0x1b4> | ||
<1ae> DW_AT_name : (indirect string, offset: 0x23d): vtable | ||
<1b2> DW_AT_byte_size : 0 | ||
<1b3> DW_AT_alignment : 8 | ||
``` | ||
|
||
* The other extension is that the Rust compiler can emit a tagless discriminated union. | ||
See [DWARF feature request] for this item. | ||
|
||
### Current limitations of DWARF | ||
|
||
* Traits: representing traits requires a larger-than-usual change to DWARF. | |
* DWARF provides no way to differentiate between structs and tuples. The Rust compiler emits | |
tuple fields with names like `__0`, and debuggers look for a sequence of such names to overcome this limitation. | |
For example, the debugger would otherwise have to access a field via `x.__0` instead of `x.0`. | |
The Rust parser in the debugger resolves this, so you can now write `x.0`. | |
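The tuple workaround can be sketched as a small check over field names. Both functions below are hypothetical helpers written for illustration, not code from GDB or LLDB, and the exact rule a real debugger applies may differ (for instance, how it treats a type with no fields is an assumption here).

```rust
/// Return true when the field names follow the `__0`, `__1`, ... sequence
/// that marks a tuple. Hypothetical helper, not taken from GDB or LLDB.
/// Assumption: a type with no fields is not treated as a tuple.
fn looks_like_tuple(field_names: &[&str]) -> bool {
    !field_names.is_empty()
        && field_names
            .iter()
            .enumerate()
            .all(|(index, name)| *name == format!("__{}", index))
}

/// Translate the index the user types (as in `x.0`) into the emitted field name.
fn tuple_field_name(index: usize) -> String {
    format!("__{}", index)
}

fn main() {
    println!("{}", looks_like_tuple(&["__0", "__1"]));
    println!("{}", looks_like_tuple(&["width", "height"]));
    println!("{}", tuple_field_name(0));
}
```

With a check like this, the debugger's Rust parser can accept `x.0` and look up the field named `__0` behind the scenes.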
|
||
DWARF relies on debuggers to know some information about the platform ABI. | |
Rust does not always follow the platform ABI. | |
|
||
## Developer notes | ||
|
||
This section covers points from the talk about certain aspects of development. | |
|
||
## What is missing | ||
|
||
### Shipping GDB in Rustup | ||
|
||
Tracking issue: [https://github.com/rust-lang/rust/issues/34457] | ||
|
||
Shipping GDB requires a change to the Rustup delivery system. To manage Rustup build size and | |
times, we need to build GDB separately and somehow provide the resulting artifacts | |
for inclusion in the final build. However, if we can ship GDB with rustup, it will simplify | |
the development process: the compiler can emit new debug info that the shipped debugger readily consumes. | |
|
||
The main obstacle is setting up dependencies. One such dependency is Python. That | |
is why we have our own fork of GDB: one of the drivers is patched on Rust's side to | |
check for the correct version of Python (Python 2.7 in this case). *Note: Python 3 is not chosen | |
for this purpose because Python's stable ABI is limited and is not sufficient for GDB's needs. | |
See [https://docs.python.org/3/c-api/stable.html].* | |
|
||
The goal is to update the debugger as quickly as possible when we change the debugging symbols, | |
in essence shipping the debugger as soon as new debugging info is added. GDB only releases | |
every six months or so. However, changes that are | |
not related to Rust itself should ideally be merged upstream first. | |
|
||
### Code signing for LLDB debug server on macOS | ||
|
||
According to Wikipedia's article on [System Integrity Protection]: | |
|
||
> System Integrity Protection (SIP, sometimes referred to as rootless) is a security feature | ||
> of Apple's macOS operating system introduced in OS X El Capitan. It comprises a number of | ||
> mechanisms that are enforced by the kernel. A centerpiece is the protection of system-owned | ||
> files and directories against modifications by processes without a specific "entitlement", | ||
> even when executed by the root user or a user with root privileges (sudo). | ||
|
||
SIP prevents processes from using the `ptrace` syscall. If a process wants to use `ptrace`, it has to be | |
code signed, and the certificate that signs it has to be trusted on your machine. | |
|
||
See [Apple developer documentation for System Integrity Protection]. | ||
|
||
We may need to sign up with Apple and get the keys to do this signing. Tom looked into whether | |
Mozilla could do this; it cannot, because it is already at the maximum number of | |
keys it is allowed. Tom does not know if Mozilla could get more keys. | |
|
||
Alternatively, Tom suggests that maybe a Rust legal entity is needed to get the keys via Apple. | ||
This problem is not technical in nature. If we had such a key we could sign GDB as well and | ||
ship that. | ||
|
||
### DWARF and Traits | ||
|
||
Rust traits are not emitted into DWARF at all. As a result, calling a method as `x.method()` | |
does not work as is. The reason is that the method is implemented by a trait, as opposed | |
to a type. That information is not present, so the debugger cannot find trait methods. | |
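The distinction between the two kinds of methods looks like this in source code. The `Meters` type and `Describe` trait are invented for this example. To the compiler both calls in `main` are ordinary method calls; the difference only matters to a debugger, which per the text above has no description of the trait or of which types implement it.

```rust
struct Meters(f64);

impl Meters {
    /// Inherent method: defined directly on the type.
    fn value(&self) -> f64 {
        self.0
    }
}

trait Describe {
    fn describe(&self) -> String;
}

/// Trait method: the link between `Meters` and `describe` exists only
/// through this impl, which is the information the text says is missing.
impl Describe for Meters {
    fn describe(&self) -> String {
        format!("{} m", self.0)
    }
}

fn main() {
    let x = Meters(1.5);
    println!("{}", x.value());
    println!("{}", x.describe());
}
```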
|
||
DWARF has a notion of interface types (possibly added for Java). Tom's idea was to use this | |
interface type to represent traits. | |
|
||
DWARF only deals with concrete names, not the reference types. So, a given implementation of a | ||
trait for a type would be one of these interfaces (`DW_TAG_interface_type`). Also, the type for | |
which it is implemented would describe all the interfaces this type implements. This requires a | ||
DWARF extension. | ||
|
||
Issue on GitHub: [https://github.com/rust-lang/rust/issues/33014] | |
|
||
## Typical process for a Debug Info change (LLVM) | ||
|
||
LLVM has Debug Info (DI) builders. These are the primary interface that Rust calls into. | |
This is why LLVM needs to change first: rustc emits LLVM debug info metadata, not DWARF directly. | |
This is a kind of metadata that you construct and hand off to LLVM. For the rustc/LLVM hand-off, | |
some LLVM DI builder methods are called to construct the representation of a type. | |
|
||
The steps of this process are as follows: | |
|
||
1. LLVM needs changing. | ||
|
||
LLVM does not emit interface types at all, so this needs to be implemented in LLVM first. | |
|
||
Get sign-off from the LLVM maintainers that this is a good idea. | |
|
||
2. Change the DWARF extension. | ||
|
||
3. Update the debuggers. | ||
|
||
Update DWARF readers, expression evaluators. | ||
|
||
4. Update Rust compiler. | ||
|
||
Change it to emit this new information. | ||
|
||
### Procedural macro stepping | ||
|
||
A deep question is: how do you actually debug a procedural macro? | |
What location do you emit for a macro expansion? Consider the following options: | |
|
||
* You can emit location of the invocation of the macro. | ||
* You can emit the location of the definition of the macro. | ||
* You can emit locations of the content of the macro. | ||
|
||
RFC: [https://github.com/rust-lang/rfcs/pull/2117] | ||
|
||
The focus is to let macros decide what to do. This can be achieved by having some kind of attribute | |
that lets the macro tell the compiler where the line marker should be. This affects where you | |
set breakpoints and what happens when you step through the macro. | |
|
||
## Future work | ||
|
||
#### Name mangling changes | ||
|
||
* New demangler in `libiberty` (gcc source tree). | ||
* New demangler in LLVM or LLDB. | ||
|
||
**TODO**: Check the location of the demangler source. | ||
[Question on Github](https://github.com/rust-lang/rustc-guide/pull/316#discussion_r283062536). | ||
|
||
#### Reuse Rust compiler for expressions | ||
|
||
This is an important idea because debuggers by and large do not try to implement type | |
inference. You need to be much more explicit when you type into the debugger than in your | |
actual source code. So, you cannot just copy and paste an expression from your source | |
code into the debugger and expect the same answer, although this would be nice. Using the | |
compiler to evaluate expressions could help here. | |
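A small Rust example shows how much work inference does in ordinary source code. Both function names are invented for illustration. The first relies on the compiler to infer the target type of `parse` from the return type; the second spells everything out, which is closer to what an evaluator without inference would need. Whether a particular debugger accepts this exact syntax is not something this sketch claims.

```rust
/// Source-code style: no type is written on `parse`; the compiler infers
/// `u16` from the function's return type.
fn parse_count(text: &str) -> u16 {
    text.trim().parse().unwrap_or(0)
}

/// The same computation with the inferred types written out explicitly.
fn parse_count_explicit(text: &str) -> u16 {
    text.trim().parse::<u16>().unwrap_or(0_u16)
}

fn main() {
    println!("{}", parse_count(" 8080 "));
    println!("{}", parse_count_explicit(" 8080 "));
}
```

Copying `text.trim().parse().unwrap_or(0)` out of the first function loses the return type that made it unambiguous, which is the copy-and-paste problem described above.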
|
||
It is certainly doable, but it is a large project. You need a bridge to the | |
debugger because the debugger alone has access to the memory. Both GDB (via GCC) and LLDB (via Clang) | |
have this feature: LLDB uses Clang to compile code to JIT, and GDB can do the same with GCC. | |
|
||
The expression evaluators in both debuggers implement both a superset and a subset of Rust: | |
they implement just the expression language, but they also add extensions, such as GDB's | |
convenience variables. Therefore, if you take this route, you not only need | |
to build this bridge but may also have to add a mode that lets the compiler understand some of these extensions. | |
|
||
#### Windows debugging (PDB) is missing | ||
|
||
This is a complete unknown. | ||
|
||
[Tom Tromey discusses debugging support in rustc]: https://www.youtube.com/watch?v=elBxMRSNYr4 | ||
[Debugging the Compiler]: compiler-debugging.md | ||
[debugger or debugging tool]: https://en.wikipedia.org/wiki/Debugger | ||
[Bison]: https://www.gnu.org/software/bison/ | ||
[ptype]: https://ftp.gnu.org/old-gnu/Manuals/gdb/html_node/gdb_109.html | ||
[rust-lang/lldb wiki page]: https://github.com/rust-lang/lldb/wiki | ||
[DWARF]: http://dwarfstd.org | ||
[manual for GDB/Rust]: https://sourceware.org/gdb/onlinedocs/gdb/Rust.html | ||
[GDB Bugzilla]: https://sourceware.org/bugzilla/ | ||
[Recursive Descent parser]: https://en.wikipedia.org/wiki/Recursive_descent_parser | ||
[System Integrity Protection]: https://en.wikipedia.org/wiki/System_Integrity_Protection | ||
[https://github.com/rust-dev-tools/gdb]: https://github.com/rust-dev-tools/gdb | ||
[DWARF feature request]: http://dwarfstd.org/ShowIssue.php?issue=180517.2 | ||
[https://docs.python.org/3/c-api/stable.html]: https://docs.python.org/3/c-api/stable.html | ||
[https://github.com/rust-lang/rfcs/pull/2117]: https://github.com/rust-lang/rfcs/pull/2117 | ||
[https://github.com/rust-lang/rust/issues/33014]: https://github.com/rust-lang/rust/issues/33014 | ||
[https://github.com/rust-lang/rust/issues/34457]: https://github.com/rust-lang/rust/issues/34457 | ||
[Apple developer documentation for System Integrity Protection]: https://developer.apple.com/library/archive/releasenotes/MacOSX/WhatsNewInOSX/Articles/MacOSX10_11.html#//apple_ref/doc/uid/TP40016227-SW11 | ||
[https://github.com/rust-lang/lldb]: https://github.com/rust-lang/lldb | ||
[https://github.com/rust-lang/llvm-project]: https://github.com/rust-lang/llvm-project |