add feature gate to permit measuring memory usage of dyn upcasting #112355
Comments
@WaffleLapkin can you leave a comment with a status update? |
I have a WIP branch here, it needs a little bit of cleanup (like gating the stat print under |
I've opened a PR: #112400 (sadly, while cleaning it up I've found a few ICEs I've introduced, which I'm not immediately sure how to fix...) |
The implementation PR was merged and with the latest nightly you can try this out (I've added instructions to the issue description). Note, however, that I'm still not able to run a big experiment on top-N crater, or anything like that. |
Hi, Waffle asked if I could help out, so I'm giving this experiment a shot. I maintain a crater-like tool that makes a number of different design decisions (it desperately needs a real name, it's linked above). I think I'll upload a tarball in a day or so, with output from the latest version of all published crates per the instructions above. |
I just ran the option on some well-known crates and created a script to summarise the results. I may edit this post with more results in the future; also feel free to use my script. All the summaries have been done with

Summary for axum-all-features, axum, bevy, helix, ruffle

Upcasting cost: MEAN ± STDEV (MIN, MAX)

axum-all-features (with dependencies)
Total entries: 3226 (for 708 traits total)
Upcasting cost (for affected traits): ~24.3% ± 18.4 (min: 7.69, max: 76.9)

axum (with dependencies)
Total entries: 1617 (for 303 traits total)
Upcasting cost (for affected traits): ~16.1% ± 4.51 (min: 8.33, max: 20.0)

bevy (with dependencies)
Total entries: 7592 (for 1108 traits total)
Upcasting cost (for affected traits): ~21.2% ± 18.3 (min: 1.42, max: 80.0)

helix (with dependencies)
Total entries: 3706 (for 662 traits total)
Upcasting cost (for affected traits): ~29.5% ± 17.4 (min: 7.14, max: 76.9)

ruffle (with dependencies)
Total entries: 8291 (for 1234 traits total)
Upcasting cost (for affected traits): ~29.9% ± 18.2 (min: 1.42, max: 80.0)

External

medium-sized commercial game (with dependencies) #112355 (comment)
Total entries: 5486 (for 824 traits total)
Upcasting cost (for affected traits): ~21.2% ± 19.5 (min: 1.42, max: 80.0)
|
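For readers who want to reproduce this kind of MEAN ± STDEV (MIN, MAX) summary, here is a minimal Rust sketch. It is not the script mentioned in the comment. It assumes that "affected traits" means traits whose `upcasting_cost_percent` is greater than zero, and it uses the population standard deviation; both are assumptions, and the names `Summary` and `summarise_affected` are made up for illustration.

```rust
/// Summary statistics over per-trait upcasting costs (in percent).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Summary {
    mean: f64,
    stdev: f64,
    min: f64,
    max: f64,
}

/// Summarises the costs of "affected" traits.
/// Assumption: a trait is affected when its cost is strictly positive.
/// Returns `None` when no trait is affected.
fn summarise_affected(costs: &[f64]) -> Option<Summary> {
    let affected: Vec<f64> = costs.iter().copied().filter(|c| *c > 0.0).collect();
    if affected.is_empty() {
        return None;
    }
    let n = affected.len() as f64;
    let mean = affected.iter().sum::<f64>() / n;
    // Population variance (divide by n); the original script may differ.
    let variance = affected.iter().map(|c| (c - mean).powi(2)).sum::<f64>() / n;
    let min = affected.iter().copied().fold(f64::INFINITY, f64::min);
    let max = affected.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    Some(Summary { mean, stdev: variance.sqrt(), min, max })
}

fn main() {
    // Made-up example values, not taken from any crate.
    let costs = [0.0, 10.0, 20.0, 30.0];
    if let Some(s) = summarise_affected(&costs) {
        println!(
            "~{:.1}% ± {:.2} (min: {:.2}, max: {:.2})",
            s.mean, s.stdev, s.min, s.max
        );
    }
}
```

The input slice would come from the `upcasting_cost_percent` field of each `print-vtable-sizes` line; parsing that output is left out here to keep the sketch dependency-free.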
The file is too big for GitHub, so here's a Google Drive link: https://drive.google.com/file/d/1nURLpnrRODv8Essd268nhhqM_APFwrrX/view
|
Howdy! We're working on a medium-sized commercial game in Rust. I ran the test on our codebase using `jq -s -c 'sort_by(.upcasting_cost_percent|tonumber) | reverse[]' output.json > sorted.json`. Here are our results: https://gist.github.com/LPGhatguy/1a9624f0a30711f301460a777696bd5f The biggest concern I would have is the impact on |
I am not sure that this really captures the major cost of trait upcasting? For example, if I have

    trait A {
        fn foo(&self);
    }
    trait B {
        fn bar(&self);
    }
    trait C: A + B {}

and just uses

It probably makes more sense to add a feature flag that disables trait upcasting computation altogether and then just compare the number of vtables generated in the object files? |
I completely agree – the best way to measure would be to disable the feature and measure the end-to-end cost in the binary. |
Add an (perma-)unstable option to disable vtable vptr

This flag is intended for evaluation of trait upcasting space cost for embedded use cases. Compared to the approach in rust-lang#112355, this option provides a way to evaluate the end-to-end cost of trait upcasting.

Rationale: rust-lang#112355 (comment)

## How this flag should be used (after merge)

Build your project with and without the `-Zno-trait-vptr` flag. If you are using cargo, set `RUSTFLAGS="-Zno-trait-vptr"` in the environment. You probably also want to use `-Zbuild-std`, or the binary built may be broken. Save both binaries somewhere.

### Evaluate the space cost

The option has a direct and an indirect impact on vtable space usage. Directly, it gets rid of the trait vptr entry needed to store a pointer to a vtable of a supertrait. (IMO) this is usually a small saving. The larger saving usually comes indirectly, from eliminating the vtable of the supertrait (and its parent).

Both impacts only affect vtables (notably, the number of functions monomorphized should not change); however, where vtables reside can depend on your relocation model. If the relocation model is static, then the vtable is rodata (usually stored in Flash/ROM together with text in an embedded scenario). If the binary is relocatable, however, the vtable will live in `.data` (more specifically, `.data.rel.ro`), and this will need to reside in RAM (which may be a scarcer resource in some cases), together with dynamic relocation info living in a read-only segment.

For evaluation, you should run `size` on both binaries, with and without the flag. `size` outputs three columns, `text`, `data` and `bss`, plus the sum `dec` (and its hex version). As explained above, both `text` and `data` may change. `bss` shouldn't usually change.

It'll be useful to see:

* Percentage change in text + data (indicating required flash/ROM size)
* Percentage change in data + bss (indicating required RAM size)
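The two percentages listed above can be worked out by hand from the `size` output, but a small helper makes the arithmetic explicit. The following is only a sketch: the `Sizes` struct, the `percent_change` function and the numbers in `main` are hypothetical and are not part of the PR or of any tool. It treats the binary built with `-Zno-trait-vptr` as the baseline, so a positive result reads as the overhead of having upcasting.

```rust
/// Section sizes in bytes, as read from the `text`, `data` and `bss`
/// columns printed by `size`.
#[derive(Debug, Clone, Copy)]
struct Sizes {
    text: u64,
    data: u64,
    bss: u64,
}

impl Sizes {
    /// text + data: indicates the required flash/ROM size.
    fn flash(&self) -> u64 {
        self.text + self.data
    }

    /// data + bss: indicates the required RAM size.
    fn ram(&self) -> u64 {
        self.data + self.bss
    }
}

/// Percentage change going from `baseline` to `candidate`.
/// `baseline` must be non-zero for the result to be meaningful.
fn percent_change(baseline: u64, candidate: u64) -> f64 {
    (candidate as f64 - baseline as f64) * 100.0 / baseline as f64
}

fn main() {
    // Made-up numbers for illustration only.
    let without_vptr = Sizes { text: 100_000, data: 2_000, bss: 8_000 };
    let with_vptr = Sizes { text: 100_200, data: 2_000, bss: 8_000 };

    println!(
        "flash (text + data): {:+.2}%",
        percent_change(without_vptr.flash(), with_vptr.flash())
    );
    println!(
        "ram (data + bss): {:+.2}%",
        percent_change(without_vptr.ram(), with_vptr.ram())
    );
}
```

Swapping the two arguments of `percent_change` would instead express the saving from disabling the vptr entries; which direction to report is a matter of presentation.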
#114974 is merged, and now we have a way to measure end-to-end cost. There's a how-to-use instruction in the PR, and I'll reproduce it here: How to use
|
Here are a few projects that I evaluated:

https://github.com/tock/tock (embedded): 224 bytes increase in .text, 0.14% overhead. No .data overhead.

I evaluated a few smaller firmware/embedded projects, and I don't see any size difference at all, since there are no

A personal WIP driver project: 936 bytes increase in .text, 0.12% overhead. 0.11% overall overhead.

https://github.com/nbdd0121/r2vm (hosted, size doesn't really matter, investigated out of curiosity): 4800 bytes increase in .text, 0.14% overhead. .data 2400 bytes, 1.01% overhead. 0.2% overall overhead for text + data combined.

With the data available, my conclusion is that the size overhead is <0.2% in all cases where I think size really matters a lot, so IMO the overhead is acceptable (when weighed against the language/compiler complexity of making the upcasting feature opt-in rather than always enabled). My only remaining concern is libcore/liballoc accidentally introducing upcastable traits and forcing the overhead on everyone, but that concern is already mitigated by the `multiple_supertrait_upcastable` lint that's enabled in libcore/liballoc.

I would encourage other people interested in binary size to give the new flag a shot, re-run the evaluation using the end-to-end approach, and report numbers. @Urgau @LPGhatguy who posted some evaluation here and @oxalica who raised the initial vtable size concern.
|
In the rust-lang triage meeting today, we decided to close this issue because we felt we had gathered the data we needed. Thanks all. |
@WaffleLapkin is preparing a PR to permit measuring memory usage of dyn upcasting to help gather data for concerns raised in #65991. The lang team is generally in favor of gathering data but we don't want to wait infinitely long. Opening this issue to track this work.
Instructions on how to measure trait upcasting effects on your codebase
You need a recent nightly compiler — 2023-06-14 or newer.
If you are using `cargo`, run `cargo clean && RUSTFLAGS="-Zprint-vtable-sizes" cargo check`. Note that `cargo clean` is required; without it, caching will silence the output. If you are using `rustc` directly, pass `-Zprint-vtable-sizes` to it.

The output will contain lines like these: `print-vtable-sizes` followed by a JSON object describing vtable sizes. The JSON will contain the following fields:

* `crate_name`: name of the crate which defines the trait
* `trait_name`: path to the trait
* `entries`: number of entries in a vtable with the current algorithm (i.e. with upcasting)
* `entries_ignoring_upcasting`: number of entries in a vtable, as if we did not have trait upcasting
* `entries_for_upcasting`: number of entries in a vtable needed solely for upcasting (i.e. `entries - entries_ignoring_upcasting`)
* `upcasting_cost_percent`: cost of having upcasting in % relative to the number of entries without upcasting (i.e. `entries_for_upcasting / entries_ignoring_upcasting * 100%`)

Lines are sorted by `upcasting_cost_percent`, so that the biggest % change is first.

You can collect and analyze these stats however you want.
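To make the relationship between the fields concrete, here is a small Rust sketch that derives `entries_for_upcasting` and `upcasting_cost_percent` from the two raw counts, following the formulas given in the field descriptions. The `VtableSize` struct and its methods are illustrative names only, not part of rustc, and the counts in `main` are invented.

```rust
/// Raw vtable entry counts for one trait, mirroring the `entries` and
/// `entries_ignoring_upcasting` fields of the JSON output.
#[derive(Debug, Clone, Copy, PartialEq)]
struct VtableSize {
    entries: u64,
    entries_ignoring_upcasting: u64,
}

impl VtableSize {
    /// entries - entries_ignoring_upcasting
    /// (saturating, in case of inconsistent input).
    fn entries_for_upcasting(&self) -> u64 {
        self.entries.saturating_sub(self.entries_ignoring_upcasting)
    }

    /// entries_for_upcasting / entries_ignoring_upcasting * 100%
    /// Assumes `entries_ignoring_upcasting` is non-zero.
    fn upcasting_cost_percent(&self) -> f64 {
        self.entries_for_upcasting() as f64 * 100.0
            / self.entries_ignoring_upcasting as f64
    }
}

fn main() {
    // Invented counts: 5 entries without upcasting, 9 with it.
    let v = VtableSize { entries: 9, entries_ignoring_upcasting: 5 };
    println!(
        "{} extra entries, {:.1}% upcasting cost",
        v.entries_for_upcasting(),
        v.upcasting_cost_percent()
    );
}
```

Because the percentage is relative to a per-vtable entry count, a large value on a small vtable can still correspond to only a few extra entries.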
Some important notes:
`dyn` and/or vtables of which are never actually instantiated