Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DoS risk: small regex gives: panicked at 'tried to unwrap expr from HirFrame, got: Concat' #465

Closed
PaulGrandperrin opened this issue Apr 14, 2018 · 10 comments
Labels

Comments

@PaulGrandperrin
Copy link

Hi @BurntSushi ,
Sorry to bother you :-) another one!

regex::Regex::new("(?i)(?i)*?");
thread 'main' panicked at 'tried to unwrap expr from HirFrame, got: Concat', /root/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/regex-syntax-0.5.3/src/hir/translate.rs:203:18

playground

@robertswiecki : honggfuzz is really good! libFuzzer has been unable to find those even though I'm sure many people ran it for many hours/days on this code.

@robertswiecki
Copy link

Interesting! Seems like we (libfuzzer, afl, honggfuzz) might be using complementary strategies after all.

@PaulGrandperrin
Copy link
Author

I guess so. I'll try to help on the rust front to make it easier to fuzz with all three.

@BurntSushi
Copy link
Member

Thanks! This is another good one, and I thought I had a test case for this. I'll look into it, but won't be able to fix it until tomorrow.

Is it possible to batch these bugs? I wouldn't be surprised if you found more, and it'd make more sense to just throw as many as you can over the wall so I can fix them in one go.

libFuzzer has been unable to find those even though I'm sure many people ran it for many hours/days on this code.

To be clear, the regex crate got a rewrite of its regex-syntax crate a few months ago. So if you're the first to run a fuzzer against that rewrite (I haven't), then you'll be the first to pick at the low hanging fruit. :-)

@BurntSushi BurntSushi added the bug label Apr 15, 2018
@PaulGrandperrin
Copy link
Author

Is it possible to batch these bugs?

Yes, of course, it's just that I didn't know there was a recent rewrite so I didn't expect to find many bugs if any.

I ran the fuzzer for a couple of hours and it just found another one:

regex::Regex::new("(?i)?");
// thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', libcore/option.rs:335:21

I think that's it for the most obvious ones. But I'll improve a bit the process and continue to fuzz for a small while.

Do you have any recommendation of APIs that might need more scrutiny than others?

@BurntSushi
Copy link
Member

Do you have any recommendation of APIs that might need more scrutiny than others?

Traditionally, fuzzers have found bugs by comparing the outputs of the different execution engines, but it requires dealing with internal APIs. There are a few open issues detailing bugs in that space.

With respect to the parser rewrite though, no, I think Regex::new is exactly the thing you want to test. The other piece is printing the error message returned by Regex::new when given an invalid pattern, but it looks like your setup already accounts for this. If you test those two things, then that should cover most or all of the rewrite.

@BurntSushi
Copy link
Member

I will let this issue sit for a few days before diving into bug fixing mode. :-)

@PaulGrandperrin
Copy link
Author

Ok, I'll leave it to fuzz for a few days then.
Could you confirm that this code is enough to cover the code rewrite?

pub fn fuzz_regex(data: &[u8]) {
    if let Ok(data) = std::str::from_utf8(data) {
        let _ = regex::Regex::new(data);
    }
}

It seems to me that it's not mandatory to display the error (if any) because I didn't see any real logic in the Debug and Display implementation of regex::Error. I guess the new code printing the message is executed while constructing the regex::Error so if it panics, we'll see it.

I've also configured the fuzzer to generate input with a maximum size of 20bytes, does it looks like a good size to you?

@BurntSushi
Copy link
Member

Yup, that code looks good to me. And yes, the formatter is indeed run when the regex::Error is built.

20 bytes is probably good. At least, I can't think of any specific reason to go bigger.

You might also consider adding (?x) to the beginning of patterns, if that's easy to do. Alternatively, enable ignore_whitespace, which should accomplish the same thing. I bet that will uncover more bugs. :-)

@PaulGrandperrin
Copy link
Author

Great, I'll do that.

@PaulGrandperrin
Copy link
Author

Ok, I think you can go ahead and go into bug fixing mode whenever you want :-)
After almost 1 billion iterations, honggfuzz stopped being able to find new coverage.
I then tried libFuzzer, but it was unusable because it cannot continue fuzzing once it finds one crash.
And AFL was just too slow and wasn't able to deduplicate similar crashes so its output would have been unusable.
So it didn't find anything new and I'm stopping fuzzing.

killercup added a commit to killercup/targets that referenced this issue Apr 27, 2018
Run it with `cargo run target regex_syntax`

Finds the following inputs that yield panics, some of which are also
mentioned in [1]:

- `(?i)?i�`
- `(?m)?`
- `CBh~62�Y((?i))??i�
- `(?m)?90`
- `(?i)?`

[1]: rust-lang/regex#465
killercup added a commit to killercup/targets that referenced this issue Apr 27, 2018
Run it with `cargo run target regex_syntax`

Finds the following inputs that yield panics, some of which are also
mentioned in [1]:

- `b"(?m)?"`�
- `b"(?i)?i\x0e"`
- `b"CBh~62\x17Y((?i))??i\x0e"`�
- `b"(?m)?90"`
- `b"(?i)?"`

[1]: rust-lang/regex#465
killercup added a commit to killercup/targets that referenced this issue Apr 27, 2018
Run it with `cargo run target regex_syntax`

Finds the following inputs that yield panics, some of which are also
mentioned in [1]:

- `b"(?m)?"`
- `b"(?i)?i\x0e"`
- `b"CBh~62\x17Y((?i))??i\x0e"`
- `b"(?m)?90"`
- `b"(?i)?"`

[1]: rust-lang/regex#465
bors bot added a commit to rust-fuzz/targets that referenced this issue Apr 27, 2018
112: Add regex syntax target r=PaulGrandperrin a=killercup

Add regex syntax target

Run it with `cargo run target regex_syntax`

Finds the following inputs that yield panics, some of which are also
mentioned in [1]:

- `b"(?m)?"`
- `b"(?i)?i\x0e"`
- `b"CBh~62\x17Y((?i))??i\x0e"`
- `b"(?m)?90"`
- `b"(?i)?"`

[1]: rust-lang/regex#465

Co-authored-by: Pascal Hertleif <killercup@gmail.com>
BurntSushi added a commit that referenced this issue Mar 30, 2019
This fixes a bug where the HIR translator would panic on regexes such as
`(?i){1}` since it assumes that every repetition operator has a valid
sub-expression, and `(?i)` is not actually a sub-expression (but is more
like a directive instead).

Previously, we fixed this same bug for *uncounted* repetitions in commit
17764ff (for bug #465), but we did not fix it for counted repetitions.
We apply the same fix here.

Fixes #555
BurntSushi added a commit that referenced this issue Mar 30, 2019
This fixes a bug where the HIR translator would panic on regexes such as
`(?i){1}` since it assumes that every repetition operator has a valid
sub-expression, and `(?i)` is not actually a sub-expression (but is more
like a directive instead).

Previously, we fixed this same bug for *uncounted* repetitions in commit
17764ff (for bug #465), but we did not fix it for counted repetitions.
We apply the same fix here.

Fixes #555
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants