Some inputs taking long time to parse #273
Comments
I tried the input
with the commonmark dingus and it reported a parse time of 1 ms. I then tried your program above. It completed in 1.682 ms. So...I can't reproduce this! Are you using the latest code from the repository? |
Hi, yes, it's the latest; I tested it against 97da298. I have attached the input file produced by jazzer.js, maybe that helps :)
const commonmark = require('./dist/commonmark.js')
const fs = require('fs')
const data = fs.readFileSync('./timeout-64699b4ddd13b0497d6ec6ffb5f21d485800abfc.txt').toString()
console.time("test")
new commonmark.Parser().parse(data)
console.timeEnd("test") |
Following are the system configs:
❯ node --version
v18.18.2
❯ npm --version
9.8.1
|
Tested it against master in a GitHub codespace; the following is what it produces.
@manunio ➜ /workspaces/commonmark.js (master) $ cat fuzz.js
const commonmark = require('./dist/commonmark.js')
const fs = require('fs')
const data = fs.readFileSync('./timeout-64699b4ddd13b0497d6ec6ffb5f21d485800abfc.txt').toString()
console.time("test")
new commonmark.Parser().parse(data)
console.timeEnd("test")
@manunio ➜ /workspaces/commonmark.js (master) $ node fuzz.js
test: 1:38.560 (m:ss.mmm)
@manunio ➜ /workspaces/commonmark.js (master) $ node --version
v20.8.1
@manunio ➜ /workspaces/commonmark.js (master) $ git --no-pager log --oneline | head -n 5
97da298 Track underscore bottom separately mod 3, like asterisk
df3ea1e Fix list tightness.
20b52e5 Fix "CommomMark" typo (#270)
9a16ff4 Declarations do not need a space, per the spec.
46538e5 Allow `<!doctype` to be case-insensitive. |
I couldn't reproduce it with this fuzz.js and timeout file either. Odd! Can you reproduce it on the online "try commonmark" dingus? I was using node v18.17.1, but it's hard to imagine that's the issue. |
I was not able to reproduce it with https://spec.commonmark.org/dingus/ .
- var reClosingCodeFence = /^(?:`{3,}|~{3,})(?= *$)/;
+ var reClosingCodeFence = /^(?:`{3,}|~{3,})(?=[ \t]*$)/;
Then I tried testing it against a local dingus build. It threw the following error in the browser console
at Line 121 in 97da298
|
OK, I can reproduce it now. I had not regenerated |
Bisected: the problem starts with commit 28c82d8 and the next one.
var HTMLCOMMENT = "<!-->|<!--->|<!--(?:[^-]+|-[^-]|--[^>])*-->" |
OK, I can see why this causes pathological behavior! |
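A likely mechanism, sketched for readers following along: the quoted pattern has `[^-]+` inside a starred group, so a run of n non-dash characters can be divided among loop iterations in 2^(n-1) ways, and a backtracking engine may try all of them when no closing `-->` follows. The snippet below times the quoted pattern against a variant without the inner `+`. The `flat` variant, the `timeMatch` helper, and the test inputs are assumptions made for illustration; this is not necessarily the fix that was committed, nor the exact shape of the fuzzer input.

```javascript
// HTMLCOMMENT exactly as quoted in the thread, anchored for the demo.
const HTMLCOMMENT = "<!-->|<!--->|<!--(?:[^-]+|-[^-]|--[^>])*-->";
const nested = new RegExp("^(?:" + HTMLCOMMENT + ")");

// Hypothetical variant (an assumption, not the committed fix): dropping the
// inner `+` accepts the same strings but leaves only one way to consume a run.
const flat = new RegExp("^(?:<!-->|<!--->|<!--(?:[^-]|-[^-]|--[^>])*-->)");

// Time a single match attempt, in milliseconds.
function timeMatch(re, input) {
  const start = process.hrtime.bigint();
  const matched = re.test(input);
  return { matched, ms: Number(process.hrtime.bigint() - start) / 1e6 };
}

// Small sizes only: the nested pattern's work roughly doubles per extra character.
for (const n of [12, 16, 20]) {
  const input = "<!--" + "a".repeat(n); // comment opener that never closes
  const a = timeMatch(nested, input);
  const b = timeMatch(flat, input);
  console.log(`n=${n} nested=${a.ms.toFixed(3)}ms flat=${b.ms.toFixed(3)}ms`);
}
```

Absolute timings depend on the machine and Node version; the point is the growth of the `nested` column as n increases, while `flat` should stay roughly constant.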
This should fix it. Thanks for noticing the issue! |
@jgm would you be interested in fuzzing this project at oss-fuzz? It's a free service run by Google to fuzz important open source projects. As an example, cmark is already being fuzzed at oss-fuzz. |
|
Fair enough, I'm also more inclined towards other types of bugs for client-side libs, like the bugs found in the yaml package (see below). The above-mentioned performance reports were the result of running it for a few minutes, and I believe that through oss-fuzz and proper seeds we will be able to discover other types of bugs. As for timeouts, we can fine-tune the input length so that it produces less noise. Edit: |
What would be required to put it on oss-fuzz? |
The only thing I need is an email address, so that maintainers can receive bug reports by email, which point to a private tracker (Google's Monorail). Edit: |
Please note that the email address will be part of the public config at oss-fuzz, for example: |
Additionally, I'll submit a PR here which adds the minimal fuzzing code (which will call Parser.parse()) and add |
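For illustration, a minimal sketch of what such a fuzz target could look like. jazzer.js calls an exported `fuzz` function with a `Buffer`; the `makeFuzzTarget` helper is a hypothetical name, and the commented wiring assumes the `./dist/commonmark.js` path used in the fuzz.js above. The actual PR may be structured differently.

```javascript
// Build a fuzz entry point around any parse(string) function.
// makeFuzzTarget is a hypothetical helper, not part of jazzer.js or commonmark.js.
function makeFuzzTarget(parse) {
  return function fuzz(data) {
    // jazzer.js hands the target raw bytes as a Buffer; the parser takes a string.
    parse(data.toString("utf8"));
  };
}

// Wiring for this repository (path assumed from the fuzz.js shown in the thread):
// const commonmark = require('./dist/commonmark.js');
// const parser = new commonmark.Parser();
// module.exports.fuzz = makeFuzzTarget((text) => parser.parse(text));
```

Keeping the parser injectable means the harness can be exercised with a stub before it is pointed at the real build.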
Hi @jgm, I found a similar "too much recursion" error. Should I post it here or create a new issue? |
I'm receiving the following output with cmark.
❯ cmark timeout-0c457d4f5c4c88bd5ef8a65765f75140d786edb0
Traceback (most recent call last):
  File "/home/maxx/.local/bin/cmark", line 8, in <module>
    sys.exit(main())
  File "/home/maxx/.local/lib/python3.10/site-packages/commonmark/cmark.py", line 34, in main
    for line in f:
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xda in position 81: invalid continuation byte |
cmark output indicates that the input is not UTF-8 encoded. |
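This may also explain why the Node scripts in this thread accept the same file without complaint: `Buffer#toString()` replaces invalid UTF-8 sequences with U+FFFD instead of raising an error. A small sketch of the difference, using made-up bytes rather than the attached file; `isValidUtf8` is a hypothetical helper.

```javascript
// 0xda starts a two-byte sequence, but 0x62 ("b") is not a continuation byte.
const bytes = Buffer.from([0x61, 0xda, 0x62]);

// Lossy decoding: the invalid byte is replaced with U+FFFD, no error is raised.
const lossy = bytes.toString();

// Strict decoding: TextDecoder with fatal: true throws on malformed input.
function isValidUtf8(buf) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(buf);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(JSON.stringify(lossy), isValidUtf8(bytes));
```

A check like `isValidUtf8` on the fuzzer output would show whether a given timeout file is byte-valid before comparing parsers on it.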
You can post it here, either way. |
const commonmark = require('./dist/commonmark.js');
const fs = require('fs');
console.time("test")
const data = fs.readFileSync('./timeout-0c457d4f5c4c88bd5ef8a65765f75140d786edb0.txt').toString()
new commonmark.Parser().parse(data)
console.timeEnd("test") |
@jgm Do you approve of me going ahead with oss-fuzz integration? |
Hi, while fuzzing using jazzer.js (for oss-fuzz), Parser().parse() was taking longer than expected to parse.