Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement debuginfo path remapping #41419

Conversation

michaelwoerister
Copy link
Member

This PR adds the -Zdebug-prefix-map-from/-Zdebug-prefix-map-to commandline option pair as discussed in #38322. I think this implementation fulfills all requirements that came up during discussion.

The implementation here does simple string replacement and does not do any normalization of paths before or after applying the mapping. E.g. it makes a difference whether one compiles their crate with rustc ./src/lib.rs or rustc src/lib.rs. This is the most flexible strategy but it might also be the most tedious one, as the example above shows. I suspect it will be fine in practice, though.

cc @alexcrichton @infinity0 @jmesmon @jsgf @sanxiyn @vadimcn

This PR supersedes #38348 and #39130, both of which I'd like to close if @sanxiyn and @jsgf don't object.

@rust-highfive
Copy link
Collaborator

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@jsgf jsgf mentioned this pull request Apr 20, 2017
@jsgf
Copy link
Contributor

jsgf commented Apr 20, 2017

Wishlist: option to make error/warning messages use the remapped path rather than the compiler input path. (It's not clear to me whether that would easily fall out of this implementation.)

@michaelwoerister
Copy link
Member Author

Wishlist: option to make error/warning messages use the remapped path rather than the compiler input path.

It might be something to think about. But at that point it would become a generalized path remapping feature. In any case, I'd like this particular PR to stay focused on debuginfo.

(It's not clear to me whether that would easily fall out of this implementation.)

Not trivially but I suspect it would be doable.

@@ -1435,6 +1458,8 @@ impl<'a, 'tcx> EncodeContext<'a, 'tcx> {
def_path_table: def_path_table,
impls: impls,
exported_symbols: exported_symbols,
compiler_working_dir: tcx.sess.working_dir.to_string_lossy().into_owned(),
debug_prefix_map: debug_prefix_map,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not familiar with the surrounding code. Does this mean the mapping is going to be recorded in the output? In that case, the output would contain some data that is dependent on the build-path. But ideally we'd like the output to be reproducible regardless of the build-path, which is one of the motivations for remapping debuginfo in the first place.

We ran into similar issues with GCC's -fdebug-prefix-map - if we set this on a distribution-wide level (dpkg-buildflags in Debian unstable does this at the moment) we find that some build tools will save CFLAGS etc into other parts of the build output (such as pkg-config files etc) which makes the overall output reproducible, even though GCC's own output was reproducible. Previously, GCC was also saving the value of this flag into the DW_AT_producer field, then they accepted a patch from us to specifically filter this flag out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct, the mapping and compilation directory will end up in any rlib and Rust-dylib (but not in executables, cdylibs, or staticlibs). This is indeed a problem for reproducible builds :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From #38322 :

For this to work, we need to do what @jmesmon says: remap paths as they are stored or -- equivalently -- store the mapping with the crate and apply it to anything that comes from that crate.

I should've mentioned the above point earlier, when you wrote this comment. So I take it you have implemented the second option. Is it possible instead to implement the first option? That might avoid the problem I mentioned. (I still need to reason through the example cases you gave.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this problem could be solved by adapting the rules slightly. We could say that

  • if you specify any debug-prefix-map then debuginfo for inlined functions just does not work anymore out-of-the-box (you can still fix that manually of course).
  • Conversely, if you do not specify a debug-prefix-map then your builds will not be reproducible.

If we do it like this we should be able to transform the codemap before storing it in crate metadata and thus would not need to store the working dir and prefix mapping.

I think this would work well in practice.

@infinity0
Copy link
Contributor

(continuing on the main part of the PR):

I think this problem could be solved by adapting the rules slightly. We could say that [..]

Hm, I think that is not ideal... I'm trying to find ways to avoid forcing people to make "tradeoffs" when deciding whether they want reproducible builds.

How about, instead of storing the unmapped work_dir and src plus the mapping, you store the mapped work_dir and src plus the to value that was actually used (or empty if no mapping was applied)? Then you can still calculate relative paths from other crates when inlining.

It relies on the inlined crate having a good mapping applied when it was compiled, but distributions/people that want reproducible builds, would want to do this anyways.

Concretely:

CRATE A
working dir = /P/build
src = ../src/lib.rs

compiled with a map /P => AA gives A.rlib containing:

working_dir = AA/build
src = ../src/lib.rs
virtual_base = AA

Later, perhaps on a different machine:

CRATE B
working dir = /Q/build
src = /Q/src/lib.rs

with A.rlib located in /irrelevant/A.rlib, compiled with a map /Q => BB gives B.rlib containing:

working_dir = BB/build
src = BB/src/lib.rs
virtual_base = BB
virtual_base[A] = AA # read from A.rlib, could maybe omit this I think
Function B_func1, imported from A: AA/src/lib.rs (resolved from AA/build + ../src/lib.rs [*])

Later, perhaps on a different machine:

CRATE C
working_dir = /R/build
src = ../src/lib.rs

with A.rlib, B.rlib located wherever, compiled with a map /R => CC gives C.rlib containing:

working_dir = CC/build
src = ../src/lib.rs
virtual_base = CC
virtual_base[A] = AA # could maybe omit
virtual_base[B] = BB # could maybe omit
# not sure re-exports are possible in rust, but anyway for demonstration:
Function C_func1 = B_func1, imported from B (originally imported from A): AA/src/lib.rs

A debugger reading C.rlib later could map virtual_base[A], ..[B] and ..[C] to whatever directories it wants.

I appreciate this is a bit more complex, but I don't think it's too much more complex - instead of storing debug_prefix_map we store a (or several) virtual_bases.


[*] A minor addition for usability would be, when compiling A.rlib, check if /P/build + ../src/lib.rs is still affected by the same mapping that affects /P/build, and issue a warning if not. Either way, rustc can proceed based on "the user knows what's best".

@michaelwoerister
Copy link
Member Author

michaelwoerister commented Apr 21, 2017

@infinity0 That's an interesting idea. I think you could rephrase it to "map the source part of all source -> target mappings to something 'stable' before storing it". I'll have to think about it some more but I think that would work.

@michaelwoerister
Copy link
Member Author

Thinking about it some more: One problem with any approach that maps paths before storing them in crate metadata is that you effectively lose the original information, which would then also affect error messages. @jsgf might say that that's a feature, not a bug :) -- but it would definitely widen the scope of the change.

My suggestion would be to

  • rename the command line options to -Z remap-path-prefix-from / -Z remap-path-prefix-to, so that the name would also work for a wider implementation of the feature,
  • but otherwise land this PR as it is for now, and
  • decide later which strategy to use in order to support reproducible builds.

@infinity0
Copy link
Contributor

I'm worried that if this PR lands now, affecting the rust.metadata.bin files, there would be less incentive to change them in a backwards-incompatible way later - and also people might have to keep extra logic in their debuggers to read "rust 1.18/1.19 rlibs".

If no mappings are applied, that would not affect any error messages right? They would just contain paths that might not be actually present when transferred to another machine. If mappings are applied, then this non-existent path would simply be mapped to another non-existent path, which doesn't sound so bad. :p

@michaelwoerister
Copy link
Member Author

I'm worried that if this PR lands now, affecting the rust.metadata.bin files, there would be less incentive to change them in a backwards-incompatible way later - and also people might have to keep extra logic in their debuggers to read "rust 1.18/1.19 rlibs".

We already change the crate metadata format in a backwards incompatible way every few weeks. And the format is opaque, only the Rust compiler can make sense of it. So I don't think this would be a concern.

If no mappings are applied, that would not affect any error messages right? They would just contain paths that might not be actually present when transferred to another machine. If mappings are applied, then this non-existent path would simply be mapped to another non-existent path, which doesn't sound so bad. :p

What case are we talking about here?

@arielb1
Copy link
Contributor

arielb1 commented Apr 21, 2017

I'm in the "remap all paths, including paths in metadata (for reproducible builds) and path in error messages (for consistency)" club.

This commit adds the -Zdebug-prefix-map-from/-Zdebug-prefix-map-to
commandline option pair. This feature allows to replace prefixes
of any file paths that are emitted into debuginfo.
@shepmaster shepmaster added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 21, 2017
@michaelwoerister
Copy link
Member Author

OK, I have a proof of concept implementation that just indiscriminately remaps all file paths when a FileMap is allocated in the CodeMap. This should give rather consistent results, affecting error messages, metadata, debuginfo, and the file!() macro alike. If @rust-lang/compiler and @rust-lang/tools don't object to such a general file path remapping feature, I can go ahead and make a new PR.

@alexcrichton
Copy link
Member

@michaelwoerister would such a PR supersede this one or augment it?

It seems like a reasonable idea to me at least!

@alexcrichton
Copy link
Member

Otherwise this change looks good to me from a technical perspective at least.

@infinity0
Copy link
Contributor

@michaelwoerister Great! Thanks for putting up with my pesky additional constraints :)

@michaelwoerister
Copy link
Member Author

@michaelwoerister would such a PR supersede this one or augment it?

Supersede, I'd say.

@michaelwoerister
Copy link
Member Author

@infinity0 You're welcome! It's a good thing you caught the reproducibility problem.

@alexcrichton
Copy link
Member

Ok in that case I'm going to tag this as waiting-for-author, but @michaelwoerister if you'd like me to take another look please just ask!

@alexcrichton alexcrichton added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 24, 2017
@michaelwoerister
Copy link
Member Author

Closing in favor of #41508.

frewsxcv added a commit to frewsxcv/rust that referenced this pull request Apr 27, 2017
…ping, r=alexcrichton

Implement a file-path remapping feature in support of debuginfo and reproducible builds

This PR adds the `-Zremap-path-prefix-from`/`-Zremap-path-prefix-to` commandline option pair and is a more general implementation of rust-lang#41419. As opposed to the previous attempt, this implementation should enable reproducible builds regardless of the working directory of the compiler.

This implementation of the feature is more general in the sense that the re-mapping will affect *all* paths the compiler emits, including the ones in error messages.

r? @alexcrichton
bors added a commit that referenced this pull request Apr 28, 2017
…xcrichton

Implement a file-path remapping feature in support of debuginfo and reproducible builds

This PR adds the `-Zremap-path-prefix-from`/`-Zremap-path-prefix-to` commandline option pair and is a more general implementation of #41419. As opposed to the previous attempt, this implementation should enable reproducible builds regardless of the working directory of the compiler.

This implementation of the feature is more general in the sense that the re-mapping will affect *all* paths the compiler emits, including the ones in error messages.

r? @alexcrichton
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants