-
Notifications
You must be signed in to change notification settings - Fork 515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Quadratic random from mangle_getLen
overflows causing failure to corrupt past 64 KiB
#360
Comments
Thanks for analyzing this. Yeah.. big fail on my part. Will fix soon! |
Hehe, no problem. I've had my fair share of silly mistakes! I've been porting some of the mutators in Rust and based on the input size (and what turns into x^4), I either use a Historically for fuzzing I've used I'm not sure how viable it is, but I've also used the following, which gives some exponential bias, but never can be more than 2x worse than uniform distribution. There's also no risk of overflow or div by zero if size > 0: let rand = if rand() % 2 == 0 {
// "exp" rand
rand() % (rand() % size + 1)
} else {
// Uniform rand
rand() % size
} Here's what the difference between honggfuzz rand and my rand is. The x axis is the byte offset, and the Y axis is the ratio of selection compared to uniform selection. Thus, y=0.5 means that value is selected half as frequently as it would be when uniformly generated , and y=20 means it's observed 20x more frequently than if it was a uniform distribution. They seem fairly similar, but there's definitely enough of a difference that it could matter. |
rand() uses futexes and that hurts performance a lot. there PRNGs that are faster, have better randomness and don't bottleneck. |
Oh yeah, when I say rand(), I mean a random function. I don't think anyone here is using libc rand(). The cost of nearly any rand implementation (xorshift64, xoroshiro, doesn't really matter) is "free" WRT the cost of the modulo, it shouldn't matter at all. |
well, guess what google afl is using :) |
Of course, I will fix it, though this approach to favoring lower offsets vs others is at best handwavy and based on my experience than any solid data at hand :) The idea from #360 (comment) seems interesting since "looks" like something we'd want. I will experiment a bit with it and implement this or something similar. I've looked through various exprand implementations in other projects, and they seem to be mostly implemented using floats, and since I don't want to spend too much time on analyzing whether float resolution will cause similar problems (e.g. skipping certain values because resolution becomes lower with higher values), I'd rather stick to some implementation based on integers. |
offsets for fuzzing. It didn't do proper job (as per #360), so it's been replaced by something simpler. Now it's still prefering smaller values, but it's a step function-like distribution now.
I've replaced it by something simpler - c950e75 - which still has useful properties from my subjective PoV, but less of a chance of misbehaving. Feel free to discuss the topic anyway in this thread, if you'd like - it's an interesting topic, tho I worry that we might not have more substantial data to evaluate such offset/mangling strategies, than just our experience with various file data formats. And, thanks for the report and for the analysis, which was very enlightening. |
Due to an integer overflow in
mangle_getLen
, which is the function largely used to pick the offsets to corrupt in the input file, there is a hard limit of this function returning 65536. Thus, for input files larger than 64 KiB, the fuzzer no longer will be able to corrupt past the 64 KiB boundary (except for when it expands/splices past it).For example
mangle_Bit
usesmangle_getOffSet
to get the index of the byte to corrupt. This goes intomangle_getLen
which will overflow if the input size exceeds 64 KiB, thus causing a failure to generate offsets past the 64 KiB boundary.Due to the nature of the overflow, the numerator overflows but the divisor does not (except when the input exceeds ~2.52 MiB, when even the divisor will overflow when cubed). This is ultimately due to the numerator potentially being the input length to the fourth power, which overflows a 64-bit int at a 64 KiB input. While the numerator overflows and the divisor does not, the window of indices provided by
mangle_getOffSet
reduces even more, resulting in the inability to corrupt even past the first 16 bytes of an input when the input size is 1 MiB.Here's the possible indices returned, and thus the indices which can be used to corrupt the input based on the input file size.
The text was updated successfully, but these errors were encountered: