-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use slices rather than vectors in ZopfliHash (std only) #27
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, avoiding Vec
allocations all over the place is indeed a good thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've given this PR a good deal of thought, and I can't think of any nicer ways to ensure that the stack doesn't blow up while avoiding additional indirections, or to refactor the code so that it doesn't use troublesome structs like these in the first place, without requiring nightly Rust.
Copy elision is something the compiler often does, but I've seen more cases where it currently didn't. Also, both LLVM and Rust evolve relatively quickly, so in the end I found it extremely fragile to rely on undocumented optimizations for proper behavior in a library that is meant to be portable. Considering how much of a maintenance burden codegen tests would be, I think the right trade-off here is to stay with unsafe code.
However, the good news of all this thought is that I found ways to extend this optimization to non-std targets, drop the extra dependency on once_cell
, and overall improve how obvious heap usage is now. I took the liberty of making these changes directly on this branch, and found no significant performance differences vs. commit e1f5045
.
Thanks for the PR, it was interesting to review! 😄
In some of my tests, most of the CPU time seems to be spent constructing ZopfliHash. This PR optimizes that process by making HashThing consist of slices rather than Vec's. Initializing or resetting a ZopfliHash now consists of one big memory copy, and multiple internal pointers inside the Vec's are replaced with just one pointer to the ZopfliHash itself.
Of course, this means ZopfliHash is so large that attempting to return one by value would overflow the stack. To solve that problem, this PR constructs the empty instance and stores it in a Lazy<Box<>>, and the new() method copies it into another Box<>. This currently requires unsafe code and doesn't work if
std
isn't enabled.