precompute offsets ahead-of-time rather than on each dereference #141
Criterion results on a test related to nearcore-private-1#2:
compile time: [9.7873 ms 9.8477 ms 9.9119 ms]
change: [-5.2388% -4.2981% -3.2369%] (p = 0.00 < 0.05)
Performance has improved.
Manual testing on the exact test of nearcore-private-1#2 confirms the ~5% speedup:
Non-rayon goes down to 24.5-25ms from 25.5-26ms
Rayon goes down to 12ms from 12.5ms
I think we agreed to just kill rayon (near/nearcore#8948), could you send a quick PR to do that as a follow-up?
lib/vm/src/vmoffsets.rs
fn precompute(&mut self) {
    self.vmctx_signature_ids_begin = 0;
    self.vmctx_imported_functions_begin = self
This is so much more comprehensible than the previous code already, but I do wonder if there’s an opportunity to further simplify this code with some helpers.
For instance: instead of manually calling checked_add and checked_mul, have that in a utility function along the lines of fma (fused multiply-add).
That way we’d be looking at something like
self.vmctx_imported_functions_begin = offset_fma(
    self.num_signature_ids,
    u32::from(self.size_of_vmshared_signature_index()),
    self.vmctx_signature_ids_begin,
);
which seems like less noise overall. (names/argument ordering at your choice)
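For illustration, here is a minimal sketch of what such a helper could look like. It is not the code that was pushed to this PR: the struct `Offsets`, its field names, and the element sizes 4 and 16 are assumptions made up for the example, and only the name `offset_fma` and its argument order come from the suggestion above.

```rust
/// Hypothetical helper: computes `count * size + base` with overflow checks,
/// folding the `checked_mul` / `checked_add` pair into a single call.
fn offset_fma(count: u32, size: u32, base: u32) -> u32 {
    count
        .checked_mul(size)
        .and_then(|bytes| bytes.checked_add(base))
        .expect("offset computation overflowed u32")
}

/// Illustrative stand-in for a struct holding precomputed offsets.
/// Field names and layout are assumptions, not the real VMOffsets.
struct Offsets {
    num_signature_ids: u32,
    num_imported_functions: u32,
    signature_ids_begin: u32,
    imported_functions_begin: u32,
    imported_tables_begin: u32,
}

impl Offsets {
    fn new(num_signature_ids: u32, num_imported_functions: u32) -> Self {
        let mut offsets = Self {
            num_signature_ids,
            num_imported_functions,
            signature_ids_begin: 0,
            imported_functions_begin: 0,
            imported_tables_begin: 0,
        };
        // Compute every offset once, so later lookups are plain field reads.
        offsets.precompute();
        offsets
    }

    fn precompute(&mut self) {
        // Each region begins where the previous one ends.
        self.signature_ids_begin = 0;
        // Assumed element size of 4 bytes per signature id.
        self.imported_functions_begin =
            offset_fma(self.num_signature_ids, 4, self.signature_ids_begin);
        // Assumed element size of 16 bytes per imported function.
        self.imported_tables_begin =
            offset_fma(self.num_imported_functions, 16, self.imported_functions_begin);
    }
}
```

With this shape, each line of `precompute` reads as "count, element size, start of the previous region", and the overflow handling lives in one place.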
Great idea, thanks! I’ve just pushed a commit making this change :)
3292: Precompute offsets in VMOffsets r=ptitSeb a=ptitSeb
# Description
Small optimisation: Precompute Offsets in VMOffsets based on near/wasmer#141
For #3305
Co-authored-by: ptitSeb <sebastien.chev@gmail.com>
Criterion results on a test related to near/nearcore-private-1#2:
Manual testing on the exact test of near/nearcore-private-1#2 confirms the ~5% speedup:
Non-rayon goes down to 24.5-25ms from 25.5-26ms
Rayon goes down to 12ms from 12.5ms
This also feels more like a code cleanup than added complexity for optimization’s sake to me, but I would understand if other people saw the added invariants as added complexity.