-
-
Notifications
You must be signed in to change notification settings - Fork 406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Parser Idempotency Fuzzer #2400
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2400 +/- ##
==========================================
- Coverage 38.74% 38.70% -0.05%
==========================================
Files 313 314 +1
Lines 23856 23883 +27
==========================================
Hits 9244 9244
- Misses 14612 14639 +27
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
pub struct FuzzData { | ||
pub context: Context, | ||
pub ast: StatementList, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if the usage of the structure could be considered straightforward, adding some documentation could be useful.
let mut syms_available = Vec::with_capacity(8); | ||
for c in 'a'..='h' { | ||
syms_available.push(context.interner_mut().get_or_intern(&*String::from(c))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to only have these symbols? Also, I see a TODO
below requesting arbitrary string literals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes -- this creates a fixed pool of symbols to use in the AST. The AST, when generated, just throws random symbol indices in; we need to fix them to well-known (non-keyword) items in the symbol interner. Additionally, we don't want to generate them dynamically from the fuzz data because this can introduce "noise" into our sample (basically, bytes used previously to make symbols now form part of the AST or vice versa, unaligning the AST-generating bytes). This is also the reason for the TODO -- string literals from the arbitrary data could introduce quite a bit of undesirable noise, but not doing so means we don't test string parsing/interning. I decided not to for this PR because it could introduce so much noise that it would render this unusable.
use std::error::Error; | ||
use std::io::Cursor; | ||
|
||
fn do_fuzz(mut data: FuzzData) -> Result<(), Box<dyn Error>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if the usage of the function could be considered straightforward, adding some documentation could be useful. A link to the libfuzzer documentation or to the cargo-fuzz documentation where this usage is explained would be nice.
The merge conflict is just because the |
49ab7a4
to
2df18bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be better to lift the fuzz
directory into the project root, then have a subdirectory for every type of fuzzer.
I'm not sure if this is possible for cargo-fuzz, but I'll see. I seem to remember it having some trouble executing from a virtual workspace root. |
You should be able to easily. For example, on |
Yup -- the issue is that you cannot init the fuzzer directory in the virtual workspace root, but you can move it later. Interesting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work!
Perhaps in the future we could put this into CI? It catches problems very fast. |
Yeah! Though, we'll need a proper CI platform for it. We can't run it on every PR because it could throw random errors at any time, A CI action that runs it every hour or so would be nice. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should rebase this. I recently pushed a changed that extracted the parser from the engine into a crate, and the API of the Parser
type changed slightly; it now requires only a &mut Interner
reference to work.
This change should also speedup the fuzzer, because it would skip having to initialize the builtins in order to parse.
80bd041
to
a5350f9
Compare
I can ask if this would be appropriate for OSS-Fuzz? |
Maybe! We would need to apply for it though. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ty for the awesome contribution and your perseverance having had to redo it. ❤️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No more nitpicking 😆 Great job!
bors r+ |
This comment was marked as outdated.
This comment was marked as outdated.
bors r+ |
This Pull Request offers a fuzzer which is capable of detecting faults in the parser and interner. It does so by ensuring that the parsed AST remains the same between a parsed source and the result of parsing the `to_interned_string` result of the first parsed source. It changes the following: - Adds a fuzzer for the parser and interner. Any issues I raise in association with this fuzzer will link back to this fuzzer. You may run the fuzzer using the following commands: ```bash $ cd boa_engine $ cargo +nightly fuzz run -s none parser-idempotency ``` Co-authored-by: Addison Crump <addison.crump@cispa.de>
Pull request successfully merged into main. Build succeeded: |
This Pull Request offers a basic VM fuzzer which relies on implied oracles (namely, "does it crash or timeout?"). It changes the following: - Adds an insns_remaining field to Context, denoting the number of instructions remaining to execute (only available when fuzzing) - Adds a JsNativeError variant, denoting when the number of instructions has been exceeded (only available when fuzzing) - Adds a VM fuzzer which looks for cases where Boa may crash on an input This offers no guarantees about correctness, only assertion violations. Depends on #2400. Any issues I raise in association with this fuzzer will link back to this fuzzer. You may run the fuzzer using the following commands: ```bash $ cd boa_engine $ cargo +nightly fuzz run -s none vm-implied ``` Co-authored-by: Addison Crump <addison.crump@cispa.de>
This Pull Request offers a fuzzer which is capable of detecting faults in the parser and interner. It does so by ensuring that the parsed AST remains the same between a parsed source and the result of parsing the
to_interned_string
result of the first parsed source.It changes the following:
Any issues I raise in association with this fuzzer will link back to this fuzzer.
You may run the fuzzer using the following commands:
$ cd boa_engine $ cargo +nightly fuzz run -s none parser-idempotency