-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #2386 from japaric/used
RFC: #[used] static variables
- Loading branch information
Showing
1 changed file
with
245 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,245 @@ | ||
- Feature Name: `used` | ||
- Start Date: 2018-04-03 | ||
- RFC PR: [rust-lang/rfcs#2386](https://github.com/rust-lang/rfcs/pull/2386) | ||
- Rust Issue: [rust-lang/rust#40289](https://github.com/rust-lang/rust/issues/40289) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Stabilize the `#[used]` attribute which is used to force the compiler to keep static variables, | ||
even if not referenced by any other part of the program, in the output object file. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
Bare metal applications, like kernels, bootloaders and other firmware, usually need precise control | ||
over the memory layout of the program. These programs usually need to place data structures like | ||
vector (interrupt) tables in certain memory locations for the system to operate properly. | ||
|
||
The final memory layout of the program is decided by the linker; bare metal applications make use of | ||
*linker scripts* to control the placement of (linker) *sections* in memory. But for all this to work | ||
the vector table must be present in the object files passed to the linker. That's where the | ||
`#[used]` attribute comes in: without it the compiler will optimize away the vector table, as it's | ||
not directly used by the program, and it will never reach the linker. | ||
|
||
It's possible to work around the lack of the `#[used]` attribute by declaring the vector table as | ||
public: | ||
|
||
``` rust | ||
// public items are exposed in the object file | ||
#[link_section = ".vector_table.exceptions"] | ||
pub static EXCEPTIONS: [extern "C" fn(); 14] = [/* .. */]; | ||
``` | ||
|
||
But this is brittle because the compiler can still optimize the symbol away when compiling with LTO | ||
enabled -- with LTO the compiler has global knowledge about the program, and will see that | ||
`EXCEPTIONS` is unused by the program and discard it. | ||
|
||
Yet another workaround is to force a volatile load of the vector table in some part of the program, | ||
usually before main. The compiler will always keep the vector table in this case but this | ||
alternative incurs in the cost of a load operation that will never be optimized away by the | ||
compiler. | ||
|
||
``` rust | ||
#[link_section = ".vector_table.exceptions"] | ||
static EXCEPTIONS: [extern "C" fn(); 14] = [/* .. */]; | ||
|
||
// entry point of the firmware | ||
fn reset() -> ! { | ||
extern "C" { | ||
// user entry point | ||
fn main() -> !; | ||
} | ||
|
||
// this operation will never be optimized away | ||
unsafe { ptr::read_volatile(&EXCEPTIONS[0]) }; | ||
|
||
main() | ||
} | ||
``` | ||
|
||
The proper solution to keeping the vector table is to mark the vector table as a *used* variable to | ||
force the compiler to keep in one of the emitted object files. | ||
|
||
``` rust | ||
#[used] // will be present in the object file | ||
#[link_section = ".vector_table.exceptions"] | ||
static EXCEPTIONS: [extern "C" fn(); 14] = [/* .. */]; | ||
``` | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
We can think of the compilation process performed by `rustc` as a two stage process. First, `rustc` | ||
compiles a crate (source code) into *object files*, then `rustc` invokes the linker on those object | ||
files to produce a single *executable*, or shared library (e.g. `.so`) if the crate type was set to | ||
"cdylib". | ||
|
||
The `#[used]` attribute can be applied to static variables to keep them in the *object files* | ||
produced by `rustc`, even in the presence of LTO. Note that this does **not** mean that the static | ||
variable will make its way into the binary file emitted by the linker as the linker is free to drop | ||
symbols that it deems unused. In other words, the `#[used]` attribute does **not** affect the | ||
behavior of the linker. | ||
|
||
Consider the following program: | ||
|
||
``` rust | ||
#[used] | ||
static FOO: u32 = 0; | ||
static BAR: u32 = 0; | ||
|
||
fn main() {} | ||
``` | ||
|
||
The variable `FOO` marked with the `#[used]` attribute will be kept in the emitted object file | ||
regardless of the optimization level. On the other hand, the unused variable `BAR` is always | ||
optimized away. | ||
|
||
``` console | ||
$ cargo rustc -- --emit=obj # for simplicity incr. comp. has been disabled | ||
$ nm -C $(find target -name '*.o') | ||
(..) | ||
0000000000000000 r foo::FOO | ||
0000000000000000 t foo::main | ||
0000000000000000 T std::rt::lang_start | ||
(..) | ||
|
||
$ cargo clean; cargo rustc --release -- | ||
$ nm -C $(find target -name '*.o') | ||
0000000000000000 T main | ||
0000000000000000 r foo::FOO | ||
0000000000000000 t foo::main | ||
0000000000000000 T std::rt::lang_start | ||
(..) | ||
|
||
$ cargo clean; cargo rustc --release -- --emit=obj -C lto | ||
$ nm -C $(find target -name '*.o') | ||
(..) | ||
0000000000000000 r foo::FOO | ||
0000000000000000 t foo::main | ||
(..) | ||
``` | ||
|
||
`FOO` never makes it to the final executable because the linker sees that the call graph that stems | ||
from the user entry point `main` never makes use of `FOO` and discards it. | ||
|
||
``` console | ||
$ cargo clean; cargo build | ||
$ nm -C target/debug/foo | grep FOO || echo not found | ||
not found | ||
``` | ||
|
||
To keep `FOO` in the final binary assistance from the linker is required; this usually means writing | ||
a linker script. | ||
|
||
Consider the following program: | ||
|
||
``` rust | ||
#[used] | ||
#[link_section = ".init_array"] | ||
static FOO: extern "C" fn() = before_main; | ||
|
||
extern "C" fn before_main() { | ||
println!("Hello") | ||
} | ||
|
||
fn main() { | ||
println!("World") | ||
} | ||
``` | ||
|
||
When dealing with ELF files the `.init_array` section will usually be kept in the final binary by | ||
the default linker script. If the system supports it, all function pointers stored in the | ||
`.init_array` section will be called before entering `main`. Thus, the above program prints "Hello" | ||
and then "World" to the console when run on a *nix system. | ||
|
||
``` console | ||
$ cargo run --release | ||
Hello | ||
World | ||
|
||
$ nm -C target/release/foo | grep FOO | ||
000000000026b620 t foo::FOO | ||
``` | ||
|
||
If the `#[used]` attribute is removed from the source code then only "World" is printed to the | ||
console as the `FOO` variable will get optimized away by the compiler. | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
The `#[used]` attribute can only be used on static variables. Static variables marked with this | ||
attribute will be appended to the special `@llvm.used` global variable when lowered to LLVM IR. | ||
|
||
``` rust | ||
#[used] | ||
static FOO: u32 = 0; | ||
|
||
fn main() {} | ||
``` | ||
|
||
``` console | ||
$ cargo clean; cargo rustc -- --emit=llvm-ir | ||
$ grep llvm.used $(find -name '*.ll') | ||
@llvm.used = appending global [1 x i8*] [i8* getelementptr inbounds (<{ [4 x i8] }>, <{ [4 x i8] }>* @_ZN3foo3FOO17hf0af6b03a826c578E, i32 0, i32 0, i32 0)], section "llvm.metadata" | ||
``` | ||
|
||
The semantics of this operation are (quoting [LLVM reference][llvm]): | ||
|
||
[llvm]: https://llvm.org/docs/LangRef.html#the-llvm-used-global-variable | ||
|
||
> If a symbol appears in the @llvm.used list, then the compiler, assembler, ~and linker~ are | ||
> required to treat the symbol as if there is a reference to the symbol that it cannot see (which is | ||
> why they have to be named). For example, if a variable has internal linkage and no references | ||
> other than that from the @llvm.used list, it cannot be deleted. This is commonly used to represent | ||
> references from inline asms and other things the compiler cannot “see”, and corresponds to | ||
> “attribute((used))” in GNU C. | ||
*strikethrough added by the author* | ||
|
||
The part about the linker is not true (\*): from the point of view of the linker static variables | ||
marked with `#[used]` look exactly the same as variables that have not been marked with that | ||
attribute -- those are the implemented LLVM semantics. Also ELF object files have no mechanism to | ||
prevent the linker from dropping its symbols if they are not referenced by other object files. | ||
|
||
(\*) unless "linker" is actually referring to `llvm-link` (?) | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
This is yet another low level feature that alternative `rustc` implementations would have to | ||
implement to be 100% compatible with the official LLVM based `rustc` implementation. Also see | ||
`#[repr(align = "*")]`, `#[repr(*)]`, `#[link_section]`, etc. | ||
|
||
# Rationale and alternatives | ||
[alternatives]: #alternatives | ||
|
||
## Chosen design | ||
|
||
This design pretty much matches how C compilers implement this feature. See prior art section below. | ||
|
||
## Not doing this | ||
|
||
Not doing this means that people will continue to use the brittle workarounds presented in the | ||
motivation section. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
Most compilers provide a feature with the exact same semantics: usually in the form of a "used" | ||
attribute (e.g. `__attribute(used)__`) that can be applied to static variables. | ||
|
||
The following C code is an example from the [KEIL toolchain documentation][keil]: | ||
|
||
[keil]: http://www.keil.com/support/man/docs/armcc/armcc_chr1359124983230.htm | ||
|
||
``` c | ||
static int lose_this = 1; | ||
static int keep_this __attribute__((used)) = 2; // retained in object file | ||
static int keep_this_too __attribute__((used)) = 3; // retained in object file | ||
``` | ||
# Unresolved questions | ||
[unresolved]: #unresolved-questions | ||
None so far. |