In JSON output, emit a directive after metadata is generated. #60006
@michaelwoerister You might be interested in this too.
So I started writing a bunch of comments about the scheduling and costs and whatnot, but then I remembered that metadata files aren't object files! When rustc is producing the dylib crate type (not cdylib, "dylib") it is actually an object file which gets linked in. For `--emit metadata`, however, the contents don't get threaded through LLVM.
Instead the contents of `foo.rmeta` are available immediately after `tcx.encode_metadata()`, called from here (wow we have some confusing naming, nothing is written there). That `metadata` object returned is the raw bytes of metadata, and is threaded through all the way to the linking phase where we finally emit it to the filesystem here.
I think that we may want to move that `emit_metadata` function much earlier into the compilation process? Basically just after the call to `write_metadata` I think we can write out the `rmeta` file (if necessary) and not have to touch the LLVM scheduler at all.
In theory the call to `encode_metadata()` could be moved arbitrarily sooner in the compilation process as well and then the final metadata value is just encoded through to where we optionally create an object file for dylib crate types.
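As a rough illustration of the ordering suggested above, here is a minimal sketch in which the `.rmeta` bytes are written to disk as soon as they exist, before any code generation runs. Everything in it is an assumption made for illustration: `EncodedMetadata`, `encode_metadata`, `emit_metadata_early` and `compile` are hypothetical stand-ins, not rustc's actual types or functions.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the raw metadata bytes produced by encoding.
struct EncodedMetadata {
    raw_data: Vec<u8>,
}

/// Hypothetical stand-in for `tcx.encode_metadata()`; the real call needs a `tcx`.
fn encode_metadata(crate_name: &str) -> EncodedMetadata {
    EncodedMetadata {
        raw_data: format!("rmeta for {}", crate_name).into_bytes(),
    }
}

/// Writes the `.rmeta` file immediately, without going through any codegen scheduler.
fn emit_metadata_early(
    out_dir: &Path,
    crate_name: &str,
    metadata: &EncodedMetadata,
) -> io::Result<PathBuf> {
    let path = out_dir.join(format!("lib{}.rmeta", crate_name));
    fs::write(&path, &metadata.raw_data)?;
    Ok(path)
}

/// Hypothetical driver: encode, write the file right away, then keep the bytes
/// around for the later phases (for example, embedding them in a dylib).
fn compile(out_dir: &Path, crate_name: &str) -> io::Result<(PathBuf, EncodedMetadata)> {
    let metadata = encode_metadata(crate_name);
    let path = emit_metadata_early(out_dir, crate_name, &metadata)?;
    // Code generation and linking would follow here.
    Ok((path, metadata))
}
```

The point of the sketch is only the sequencing: the file write depends on nothing but the encoded bytes, so it does not need to wait for the linking phase.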
// I couldn't find where the file writing occurs.
if let WorkItem::Optimize(
    ModuleCodegen { name, kind: ModuleKind::Metadata, .. }) = &llvm_work_item {
    cgcx.diag_emitter.emit_directive(format!("metadata done for {}", name));
I think we should emit a message for all artifacts, with a path to the artifact, and let Cargo pick out what it cares about.
Something like sccache, for example, might care more about other files.
And it's less tied to pipelining, just another general thing rustc does.
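A minimal sketch of what such a per-artifact message could look like, assuming a JSON line that carries a path and a kind. The field names `artifact` and `kind`, the `ArtifactKind` enum and both helper functions are hypothetical and chosen only for illustration; the thread does not specify a format for this suggestion.

```rust
/// Hypothetical kinds of output a compiler invocation might produce.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArtifactKind {
    Metadata,
    Object,
    Link,
}

impl ArtifactKind {
    fn as_str(self) -> &'static str {
        match self {
            ArtifactKind::Metadata => "metadata",
            ArtifactKind::Object => "object",
            ArtifactKind::Link => "link",
        }
    }
}

/// Escapes the characters that must not appear raw inside a JSON string.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Renders one notification line per artifact (assumed format, not rustc's).
fn artifact_notification(path: &str, kind: ArtifactKind) -> String {
    format!(
        "{{\"artifact\":\"{}\",\"kind\":\"{}\"}}",
        escape_json(path),
        kind.as_str()
    )
}
```

With one message per artifact, a consumer such as Cargo could filter on the kind it cares about, while another tool could watch for object files instead.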
☔ The latest upstream changes (presumably #60030) made this pull request unmergeable. Please resolve the merge conflicts.
I had a suspicion this might be true, thanks for clarifying! My next question is this: what is the metadata stuff that is happening during codegen? E.g. what are this and this for?
That... is a really good question! I think that may actually be legitimately dead code that can be deleted. AFAIK that just generates an empty object file which is always deleted. If a test starts failing though I could probably rationalize from the test why we need it :)
Ah so that's actually needed for the dylib crate type. For anything that doesn't produce a dylib I think it's dead code. If we're producing a dylib we compress the rmeta file, shove it into a section in the LLVM module, and then link it into the dylib itself. If we're not producing a dylib I think that's an empty LLVM module which produces an empty object file which we end up just deleting (it'd only get passed to the linker for a dylib anyway). So... I think in general those two blocks are largely dead code and/or historical artifacts! As I'm sure you've noticed, the backend here, especially the parallel part, could do with some cleanup at some point (ideally with all the new parallel query infrastructure!)
@alexcrichton: I tried removing the |
Heh that'd make sense :) I suspect though that |
…excrichton Don't generate unnecessary rmeta files. As per rust-lang#60006 (comment). r? @alexcrichton
AIUI, metadata generation happens at the start of code generation, which happens under Does that sound right? The complication is that there's a fair amount of complicated code under those methods, and disentangling the metadata parts is difficult. |
Force-pushed from 5c775b9 to 57d9ea0.
@alexcrichton: New draft code is up. It still has some |
I still have to get the tests working, too. I am currently battling through |
Does that sound right? The complication is that there's a fair amount of complicated code under those methods, and disentangling the metadata parts is difficult.
This sounds about right yeah, but I think that a better fix might be to decouple metadata generation from all this back-end-related code. For example the `dep-info` output type has a check at some point (very early on) of "if enabled do it right now and maybe exit afterwards". I suspect a similar check would suffice for metadata in the sense that the only thing necessary for generating metadata contents is `tcx`, and after that it looks like `tcx.encode_metadata()` is all that's necessary.
I suspect we may even be able to practically sequence this just after `tcx` creation for maximal parallelism. Something like just before this line or around there (maybe inside `analysis`? unsure) would be fine.
In any case, separating this out and disentangling it from everything else going on in the linking phase is probably the best bet to sequence this earlier. And this can of course always happen later, no need to finish it in this PR!
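The "if enabled do it right now and maybe exit afterwards" check described in this comment could be sketched as follows. This is only an illustration under stated assumptions: `OutputType`, `Next` and `maybe_emit_metadata` are hypothetical names, and the `emitted` vector stands in for actually encoding and writing the metadata.

```rust
use std::collections::HashSet;

/// Hypothetical subset of the output types a compiler invocation can request.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum OutputType {
    DepInfo,
    Metadata,
    Exe,
}

/// Whether the driver should carry on to the later phases.
#[derive(Debug, PartialEq, Eq)]
enum Next {
    Stop,
    Continue,
}

/// Emits metadata right away if it was requested, then reports whether
/// anything that still needs code generation remains.
fn maybe_emit_metadata(requested: &HashSet<OutputType>, emitted: &mut Vec<OutputType>) -> Next {
    if requested.contains(&OutputType::Metadata) {
        // Placeholder for encoding the metadata and writing the file.
        emitted.push(OutputType::Metadata);
    }
    let needs_codegen = requested
        .iter()
        .any(|t| !matches!(t, OutputType::Metadata | OutputType::DepInfo));
    if needs_codegen {
        Next::Continue
    } else {
        Next::Stop
    }
}
```

In this shape the metadata step never waits on the back end, and an invocation that asks for nothing beyond metadata can stop early.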
It might be better to do that in a follow-up PR, because it'll be fiddly. Landing this sooner will let you test the Cargo side.
Sounds fine by me! I'm up for whatever you find most convenient :)
Force-pushed from 57d9ea0 to 400b92f.
@alexcrichton: new code is up. It's now in a state that could be acceptable for landing: tests pass, and there are no
This currently unconditionally emits the directive, I think? I think we'll probably still want to have it off by default (something required to opt-in on the CLI), and I agree that for now it's probably best to put it behind To that end I'd recommend adding a new unstable flag |
Oh and for a test I think actually |
Force-pushed from 400b92f to 8515fb6.
@alexcrichton: I added |
Just a few questions about the test, but otherwise r=me when you're comfortable they've been answered!
Force-pushed from 20f8ac7 to f96c964.
I switched to using output normalization to account for the path and filename differences on different platforms. Let's try again... @bors r=alexcrichton
📌 Commit f96c96414aef2d57107096f8055236af0c9839ba has been approved by |
⌛ Testing commit f96c96414aef2d57107096f8055236af0c9839ba with merge e407efcd2f5bdbf225485e0913dc99cfdedab64b...
💔 Test failed - checks-travis
To implement pipelining, Cargo needs to know when metadata generation is finished. This commit adds code to do that. Unfortunately, metadata file writing currently occurs very late during compilation, so pipelining won't produce a speed-up. Moving metadata file writing earlier will be a follow-up.

The change involves splitting the existing `Emitter::emit` method in two: `Emitter::emit_diagnostic` and `Emitter::emit_directive`.

The JSON directives look like this:
```
{"directive":"metadata file written: liba.rmeta"}
```
The functionality is behind the `-Z emit-directives` option, and also requires `--error-format=json`.
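The split described in this commit message might look roughly like the sketch below. It is not the code from the PR: the simplified `Diagnostic` struct, the two emitter types, the `lines` buffers (used here instead of writing to stderr) and the default no-op body for `emit_directive` are all assumptions made for illustration. Only the method names `emit_diagnostic` and `emit_directive` and the directive's JSON shape come from the commit message.

```rust
/// Hypothetical, heavily simplified diagnostic.
struct Diagnostic {
    message: String,
}

/// Sketch of an emitter with separate entry points for diagnostics and directives.
trait Emitter {
    fn emit_diagnostic(&mut self, diag: &Diagnostic);

    /// Assumed default: emitters that produce human-readable output ignore directives.
    fn emit_directive(&mut self, _directive: &str) {}
}

/// Escapes quotes and backslashes so the text can sit inside a JSON string.
/// Control characters are not handled in this sketch.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Collects JSON lines in memory instead of writing them to stderr.
struct JsonEmitter {
    lines: Vec<String>,
}

impl Emitter for JsonEmitter {
    fn emit_diagnostic(&mut self, diag: &Diagnostic) {
        self.lines
            .push(format!("{{\"message\":\"{}\"}}", escape(&diag.message)));
    }

    fn emit_directive(&mut self, directive: &str) {
        self.lines
            .push(format!("{{\"directive\":\"{}\"}}", escape(directive)));
    }
}

/// Human-readable emitter that relies on the default no-op for directives.
struct HumanEmitter {
    lines: Vec<String>,
}

impl Emitter for HumanEmitter {
    fn emit_diagnostic(&mut self, diag: &Diagnostic) {
        self.lines.push(format!("error: {}", diag.message));
    }
}
```

Keeping directives on a separate method means only the JSON emitter has to know about them, which matches the requirement that the feature needs `--error-format=json`.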
Force-pushed from f96c964 to 7bcb0cf.
Once more with feeling! @bors r=alexcrichton
📌 Commit 7bcb0cf has been approved by |
In JSON output, emit a directive after metadata is generated. To implement pipelining, Cargo needs to know when metadata generation is finished. This is done via a new JSON "directive". Unfortunately, metadata file writing currently occurs very late during compilation, so pipelining won't produce a speed-up. Moving metadata file writing earlier will be a follow-up. r? @alexcrichton
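On the consuming side, a build tool would scan the compiler's JSON output for the directive and react once it appears. The sketch below assumes the directive text shown in the commit message in this thread (`{"directive":"metadata file written: liba.rmeta"}`) and uses plain string matching rather than a JSON parser, so it does not handle escaped characters. The function names are hypothetical, and this is not how Cargo implements it.

```rust
/// Prefix and suffix of the directive line, as shown in the commit message.
const PREFIX: &str = "{\"directive\":\"metadata file written: ";
const SUFFIX: &str = "\"}";

/// Returns the metadata file name if `line` is a "metadata file written" directive.
/// Naive string matching: a real consumer would parse the JSON properly.
fn metadata_file_from_line(line: &str) -> Option<&str> {
    line.trim().strip_prefix(PREFIX)?.strip_suffix(SUFFIX)
}

/// Scans compiler output line by line and returns the first metadata file announced.
fn first_metadata_file(output: &str) -> Option<&str> {
    output.lines().find_map(metadata_file_from_line)
}
```

Once `first_metadata_file` returns a name, a pipelining build tool could start compiling dependent crates without waiting for the rest of the compilation to finish.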
☀️ Test successful - checks-travis, status-appveyor
…ichton rustc: rename -Z emit-directives to -Z emit-artifact-notifications and simplify the output.

This is my take on #60006 / #60419 (see #60006 (comment)). I'm not too attached to the "notifications" part, it's pretty much bikeshed material. **EDIT**: for "artifact", @matklad pointed out Cargo already uses it (in #60464 (comment)).

The first two commits are fixes that could be landed independently, especially the `compiletest` one, which removes the need for any of the normalization added in #60006 to land the test.

The last commit enables the emission for all outputs, which was my main suggestion for #60006, mostly to show that it's minimal and not really a "scope creep" (as suggested in #60006 (comment)).

cc @alexcrichton @nnethercote