
Quadratic random from mangle_getLen overflows causing failure to corrupt past 64 KiB #360

Closed
gamozolabs opened this issue Aug 4, 2020 · 7 comments


@gamozolabs

gamozolabs commented Aug 4, 2020

Due to an integer overflow in mangle_getLen, the function largely used to pick the offsets to corrupt in the input file, there is a hard limit of 65536 on its return value. Thus, for input files larger than 64 KiB, the fuzzer will no longer be able to corrupt past the 64 KiB boundary (except when it expands/splices past it).

For example, mangle_Bit uses mangle_getOffSet to get the index of the byte to corrupt. This goes through mangle_getLen, which overflows if the input size exceeds 64 KiB, and thus fails to generate offsets past the 64 KiB boundary.

Due to the nature of the overflow, the numerator overflows while the divisor does not (until the input exceeds ~2.52 MiB, at which point even the cubed divisor overflows). This is ultimately because the numerator can be the input length raised to the fourth power, which overflows a 64-bit integer once the input reaches 64 KiB. When the numerator overflows but the divisor does not, the window of indices provided by mangle_getOffSet shrinks even further; with a 1 MiB input, it cannot corrupt past even the first 16 bytes.
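To make the boundary concrete, here's a minimal sketch (my own illustration, not honggfuzz's actual code) of why a fourth power wraps 64-bit arithmetic exactly at 64 KiB, while a cubed divisor survives until roughly 2.52 MiB:

```rust
fn main() {
    // At exactly 64 KiB, len^4 == 2^64, which no longer fits in 64 bits;
    // C arithmetic wraps silently here, while checked_pow reports it.
    let len: u64 = 1 << 16; // 65536
    assert!(len.checked_pow(4).is_none());

    // One byte below the boundary, the fourth power still fits.
    assert!((len - 1).checked_pow(4).is_some());

    // A cubed value holds out longer: cbrt(2^64) ~= 2642245, i.e. inputs
    // up to ~2.52 MiB before the cube also overflows a u64.
    assert!(2_642_245u64.checked_pow(3).is_some());
    assert!(2_642_246u64.checked_pow(3).is_none());
}
```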

Here's the possible indices returned, and thus the indices which can be used to corrupt the input based on the input file size.

[Figure: possible corruption indices returned by mangle_getOffSet vs. input file size]

@robertswiecki
Collaborator

Thanks for analyzing this. Yeah.. big fail on my part. Will fix soon!

@gamozolabs
Author

gamozolabs commented Aug 4, 2020

Hehe, no problem. I've had my fair share of silly mistakes!

I've been porting some of the mutators in Rust and based on the input size (and what turns into x^4), I either use a u32, u64, or u128 for the exprand.

Historically for fuzzing I've used rand() % (rand() % size + 1), which has some exponential properties, but it doesn't have as nice a shape as the honggfuzz rand. The honggfuzz exprand converges to no worse than 1/2 the probability of a uniform selection, which is super nice.

I'm not sure how viable it is, but I've also used the following, which gives some exponential bias but can never be more than 2x worse than a uniform distribution. There's also no risk of overflow or division by zero as long as size > 0:

let rand = if rand() % 2 == 0 {
    // "exp" rand: biased toward low offsets
    rand() % (rand() % size + 1)
} else {
    // Uniform rand
    rand() % size
};

Here's the difference between the honggfuzz rand and my rand. The x axis is the byte offset, and the y axis is the ratio of selection frequency compared to uniform selection. Thus, y=0.5 means the value is selected half as often as it would be under a uniform distribution, and y=20 means it's observed 20x more frequently than uniform.

They seem fairly similar, but there's definitely enough of a difference that it could matter.

[Figure: selection-frequency ratio vs. byte offset for the honggfuzz rand and the mixed exp/uniform rand]
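Curves like these can be roughly reproduced numerically. Below is a self-contained sketch (PRNG choice and names are mine, purely illustrative) that Monte Carlos the mixed exp/uniform scheme and checks the two endpoint ratios against their analytic values: the first offset lands at about 0.5 + H(size)/2 times uniform (H being the harmonic number), and the last at about 0.5 + 1/(2*size) times uniform.

```rust
// Simple xorshift64 PRNG (Marsaglia); stands in for "rand()" above.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

fn main() {
    let size: u64 = 16;
    let trials: u64 = 1_000_000;
    let mut rng = XorShift64(0x1234_5678_9abc_def0);
    let mut counts = vec![0u64; size as usize];

    for _ in 0..trials {
        let offset = if rng.next() % 2 == 0 {
            // "exp" branch: biased toward low offsets
            rng.next() % (rng.next() % size + 1)
        } else {
            // uniform branch
            rng.next() % size
        };
        counts[offset as usize] += 1;
    }

    // Ratio of observed frequency vs. a uniform pick of each offset.
    let uniform = trials as f64 / size as f64;
    let first = counts[0] as f64 / uniform;
    let last = counts[size as usize - 1] as f64 / uniform;

    // Analytically for size = 16: first ~= 0.5 + H(16)/2 ~= 2.19,
    // last ~= 0.5 + 1/32 ~= 0.53 -- never below half of uniform.
    assert!(first > 1.9 && first < 2.5);
    assert!(last > 0.45 && last < 0.62);
}
```

The assertions show the property discussed above: the most-favored offset is only about 2x uniform, and the least-favored never drops below half of uniform.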

@vanhauser-thc
Contributor

rand() uses futexes, and that hurts performance a lot. There are PRNGs that are faster, have better randomness, and don't bottleneck.
In afl++ we use a xoroshiro variant, but random123 or a PCG variant are fine too.
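For reference, a lock-free generator of that family fits in a few lines. This is a sketch of the public-domain xoroshiro128+ algorithm (Blackman/Vigna), seeded through splitmix64; an illustration, not afl++'s actual code:

```rust
// Public-domain xoroshiro128+ (Blackman & Vigna), seeded with splitmix64.
struct Xoroshiro128Plus {
    s: [u64; 2],
}

impl Xoroshiro128Plus {
    fn new(seed: u64) -> Self {
        // splitmix64 expands one 64-bit seed into the 128-bit state.
        let mut z = seed;
        let mut sm = || {
            z = z.wrapping_add(0x9E37_79B9_7F4A_7C15);
            let mut x = z;
            x = (x ^ (x >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
            x = (x ^ (x >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
            x ^ (x >> 31)
        };
        Self { s: [sm(), sm()] }
    }

    fn next(&mut self) -> u64 {
        let s0 = self.s[0];
        let mut s1 = self.s[1];
        let result = s0.wrapping_add(s1);
        s1 ^= s0;
        self.s[0] = s0.rotate_left(55) ^ s1 ^ (s1 << 14);
        self.s[1] = s1.rotate_left(36);
        result
    }
}

fn main() {
    let mut rng = Xoroshiro128Plus::new(42);
    let a = rng.next();
    let b = rng.next();
    // A handful of ALU ops per draw, no shared state, no locking.
    assert_ne!(a, b);
}
```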

@gamozolabs
Author

gamozolabs commented Aug 4, 2020

Oh yeah, when I say rand(), I mean a random function. I don't think anyone here is using libc rand(). The cost of nearly any rand implementation (xorshift64, xoroshiro, doesn't really matter) is "free" WRT the cost of the modulo, it shouldn't matter at all.

@vanhauser-thc
Contributor

> I don't think anyone here is using libc rand().

well, guess what google afl is using :)

@robertswiecki
Collaborator

Of course I will fix it, though this approach of favoring lower offsets over others is at best handwavy, based more on my experience than on any solid data at hand :)

The idea from #360 (comment) seems interesting, since it "looks" like something we'd want. I will experiment a bit with it and implement it or something similar.

I've looked through various exprand implementations in other projects, and they mostly seem to be implemented using floats. Since I don't want to spend too much time analyzing whether float resolution will cause similar problems (e.g. skipping certain values, because resolution becomes lower at higher values), I'd rather stick to an implementation based on integers.

robertswiecki added a commit that referenced this issue Aug 4, 2020
offsets for fuzzing.

It didn't do a proper job (as per #360), so it's been replaced by something simpler. It still prefers smaller values, but the distribution is now step-function-like.
@robertswiecki
Collaborator

robertswiecki commented Aug 4, 2020

I've replaced it with something simpler - c950e75 - which still has properties that are useful from my subjective PoV, but less of a chance of misbehaving.

Feel free to discuss the topic in this thread anyway, if you'd like - it's an interesting topic, though I worry that we might not have more substantial data for evaluating such offset/mangling strategies than just our experience with various file data formats.

And, thanks for the report and for the analysis, which was very enlightening.
