Avoid using default in HashBuffers::reset #147
7a5a136 to 5e8c3df
The compiler ought to be able to optimize that, yeah, but I'll have to check. I will be away for a week, so I may not be able to check properly until then. |
The person I mentioned sent me an update, with a screenshot too: So that's quite unfortunate. I am not sure if we have ways to initialize the structure on the heap directly in a guaranteed way without There's certainly been discussion on this before https://users.rust-lang.org/t/how-to-create-large-objects-directly-in-heap/26405. |
5e8c3df to 60ed8cb
Instead of simply changing the I don't know if we can do much better without |
60ed8cb to 4a4e63c
I guess three boxed slices is better. |
I asked them to try with my fork:
[patch.crates-io]
miniz_oxide = { git = "https://github.com/Lonami/miniz_oxide.git", branch = "reset-hashbuffers-inplace" }
and:
|
Let me know if I should reset the last commit. 85KB is quite a bit to allocate on the stack in case the optimization fails again. So while this makes the object one usize larger (to store the slice's length), I think it might be good to do it anyway for consistency. |
I don't think we can use the method in those commits without introducing unsafe or drastically worsening performance. I can't easily profile until next week as I'm not at home, but losing the length as part of the type will likely prevent the compiler from optimizing out several bounds checks in the performance-critical sections of the code. Is the person who is having stack overflow issues seeing them in release mode? |
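To illustrate the bounds-check concern raised above, here is a minimal sketch, not code from this crate. The constant `LZ_HASH_SIZE` and both function names are assumptions made up for the example. With a fixed-size array the length is part of the type, so a masked index is provably in range; with a slice the length is only known at run time.

```rust
// Hypothetical table size, chosen as a power of two so that masking works.
const LZ_HASH_SIZE: usize = 1 << 15;

/// The length is part of the type. `idx & (LZ_HASH_SIZE - 1)` can never
/// exceed `LZ_HASH_SIZE - 1`, so the optimizer can usually drop the check.
pub fn lookup_array(table: &[u16; LZ_HASH_SIZE], idx: usize) -> u16 {
    table[idx & (LZ_HASH_SIZE - 1)]
}

/// The length is a run-time value. The masked index still has to be
/// compared against `table.len()`, and the access panics if it is too large.
pub fn lookup_slice(table: &[u16], idx: usize) -> u16 {
    table[idx & (LZ_HASH_SIZE - 1)]
}
```

Both functions return the same value for a table of the full length; whether the check in the slice version matters in practice would have to be confirmed with the benchmarks.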
Anyhow, we used Box::default as a workaround because Box::new didn't optimize out the stack copy in the past, but that's probably quite outdated now with stuff like this change, and the current Rust std lib implementation of Box::default calls Box::new, so I think that needs to be re-checked, and maybe that would help here. (The reason it helped in the past was that Box::default previously used the old special box keyword directly on the struct's Default trait implementation.) |
Yep, it's quite unfortunate. But until Rust stabilizes a solution, I think we should choose either safety or performance. Maybe this change should be under But, aborting the process on overflow isn't really better than a performance hit.
Let's wait on the benchmarks. Perhaps we can add
From what I understood, yes. |
This approach can potentially be helpful, as it presumably allows constructing a boxed array without blowing the stack: rust-lang/rust#63291 (comment) |
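For context, one safe way to get a zeroed boxed array without first building the array on the stack is to allocate a `Vec` and convert it. This is only a sketch and not necessarily the approach from the linked comment; `TABLE_LEN` and `zeroed_boxed_table` are hypothetical names.

```rust
use std::convert::TryInto;

// Hypothetical length, for illustration only.
const TABLE_LEN: usize = 32_768;

/// Allocates the zeroed storage on the heap via `vec!`, then converts the
/// boxed slice into a boxed fixed-size array without copying the elements.
pub fn zeroed_boxed_table() -> Box<[u16; TABLE_LEN]> {
    let slice: Box<[u16]> = vec![0u16; TABLE_LEN].into_boxed_slice();
    // The conversion only fails if the lengths differ, which cannot
    // happen here because the vector was created with `TABLE_LEN` items.
    slice.try_into().expect("slice length equals TABLE_LEN")
}
```

The result keeps the length in the type, so code that indexes into it is unaffected by the change in how it is allocated.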
As noted, I will look into whether using new instead of default helps solve this and whether I can reproduce the stack overflow in release on Windows. If not, that method seems reasonable, as it seems to avoid the other issues and just requires bumping the minimum Rust version to 1.56.0. |
I haven't tried reproducing the problem. All I know is that the person who had the problem before does not have it after this PR. |
Some more info. The person was testing on a Windows server. It did not have the latest version of Unfortunately they didn't check the |
It would still be nice to have the library work in debug mode too... |
Yeah, it works fine in debug mode normally. It's more an issue with how Rust works than with this library in particular: compiling with no optimizations results in a lot of stack usage, so a few tens of kilobytes of stack usage in many different libraries can add up in a large project. I will have to look into whether it's an issue in release in older versions. |
My own library indirectly depends on miniz via `use flate2::write::GzDecoder;`. A user reported to me a stack overflow, which seems to come from the default implementation. I don't know if they're using Windows, but it might be similar to #21.

I don't actually know for sure if `reset` is the problem, but it seemed like it doesn't crash immediately, so it might be the `reset` as opposed to initializing the buffers the first time around.

I have kept the `#[inline]`, but I don't know if it still makes sense (hopefully the optimizer can emit a single `memset`; I'm having trouble running the benchmarks on Windows).
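As a rough illustration of the idea in the title, the sketch below contrasts resetting by assigning a freshly built value with zeroing the existing buffers in place. The struct `Buffers`, its field sizes, and the method names are stand-ins invented for this example; they are not the crate's actual definitions.

```rust
// Hypothetical buffer length, for illustration only.
const DICT_SIZE: usize = 32_768;

/// Simplified stand-in for a struct holding several large buffers.
pub struct Buffers {
    pub dict: [u8; DICT_SIZE],
    pub next: [u16; DICT_SIZE],
    pub hash: [u16; DICT_SIZE],
}

impl Buffers {
    /// Builds a whole new value and assigns it over `self`. Without
    /// optimizations the temporary may live on the stack before the copy.
    pub fn reset_via_temporary(&mut self) {
        *self = Buffers {
            dict: [0; DICT_SIZE],
            next: [0; DICT_SIZE],
            hash: [0; DICT_SIZE],
        };
    }

    /// Overwrites each existing buffer directly, so no large temporary
    /// value is needed regardless of optimization level.
    pub fn reset_in_place(&mut self) {
        self.dict.fill(0);
        self.next.fill(0);
        self.hash.fill(0);
    }
}
```

Both methods leave the struct in the same state; the difference is only in how much stack the unoptimized code may need along the way.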