Refactor the escape() function to improve performance 10-20% #975
I know this sounds kinda silly, but can we stick to the present coding style? This almost looks like a different language. |
Okay, I changed the code style; the previous formatting came from VS Code's auto-formatting. Also, I replaced |
lib/marked.js
Outdated
  "'": '&#39;'
};

var escapeTestNoEncode = /(?:[<>"']|&(?!#?\w+;))/;
There's no need to use grouping to wrap the whole thing
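To illustrate the comment above: alternation already has the lowest precedence in a regular expression, so a non-capturing group around the entire pattern should not change what it matches. A quick check, using sample strings that are made up here for illustration and are not part of the marked test suite:

```javascript
// The pattern as it appears in the diff, wrapped in a non-capturing group.
var grouped = /(?:[<>"']|&(?!#?\w+;))/;
// The same pattern with the outer group dropped.
var ungrouped = /[<>"']|&(?!#?\w+;)/;

// Illustrative inputs only (assumption: not taken from the marked tests).
var samples = [
  'plain text',
  'a < b',
  'Tom & Jerry',
  '&amp; already escaped',
  '&#39;',
  "it's"
];

// Record what each pattern reports for each input.
var results = samples.map(function (s) {
  return { input: s, grouped: grouped.test(s), ungrouped: ungrouped.test(s) };
});

results.forEach(function (r) {
  console.log(JSON.stringify(r.input), r.grouped, r.ungrouped);
});
```

Both columns should agree for every input; existing entities such as `&amp;` are skipped by the negative lookahead in either form.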
Thank you, I fixed it.
Y'all are awesome! Thank you.
lib/marked.js
Outdated
@@ -1084,13 +1084,33 @@ Parser.prototype.tok = function() {
 * Helpers
 */

var escapeTest = /[&<>"']/;
These should be declared inside escape() IMO
No, marked should not have to recreate the same RegExp instance on every call to escape(). That reduces performance and increases memory usage.
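A small sketch of the point being made here: in ES5 and later, a regular expression literal written inside a function body is evaluated to a fresh RegExp object on each call, while a pattern hoisted out of the function is created once and shared. The function names below are hypothetical and only serve the demonstration; how much the extra allocation costs in practice depends on the engine.

```javascript
// Pattern hoisted to module scope: one RegExp object, created once.
var hoistedTest = /[&<>"']/;

function getHoisted() {
  return hoistedTest;
}

// Pattern written inside the function: the literal is evaluated on every call.
function getInline() {
  return /[&<>"']/;
}

// Compare object identity across two calls of each function.
var sameHoisted = getHoisted() === getHoisted();
var sameInline = getInline() === getInline();

console.log('hoisted instance reused:', sameHoisted);
console.log('inline instance reused:', sameInline);
```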
Ok, I made them static.
Actually, I've found there's no advantage in testing before replacing: you still have to scan the whole string at least once, either by testing or by replacing, so the first phase is pretty useless.

# with current changes:
$ node test --bench
marked completed in 8388ms.
marked (gfm) completed in 9380ms.
marked (pedantic) completed in 8315ms.
Could not bench robotskirt.
Could not bench showdown.
Could not bench markdown.js.

# without testing first:
$ node test --bench
marked completed in 8394ms.
marked (gfm) completed in 9286ms.
marked (pedantic) completed in 8045ms.
Could not bench robotskirt.
Could not bench showdown.
Could not bench markdown.js.

And you can spare some lines of code:

function escape(html, encode) {
  if (encode) {
    return html.replace(escape.replace, function (ch) {
      return escape.replacements[ch];
    });
  } else {
    return html.replace(escape.replaceNoEncode, function (ch) {
      return escape.replacements[ch];
    });
  }
}
escape.replace = /[&<>"']/g;
escape.replaceNoEncode = /[<>"']|&(?!#?\w+;)/g;
escape.replacements = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};
|
My first version:

function escape(html, encode) {
  if (encode) {
    return html.replace(escape.escapeReplace, function (ch) { return escape.replacements[ch] });
  }
  else {
    return html.replace(escape.escapeReplaceNoEncode, function (ch) { return escape.replacements[ch] });
  }
  return html;
}

I ran this command: node test -t

Three times:
My second version:

function escape(html, encode) {
  if (encode) {
    if (escape.escapeTest.test(html)) {
      return html.replace(escape.escapeReplace, function (ch) { return escape.replacements[ch] });
    }
  }
  else {
    if (escape.escapeTestNoEncode.test(html)) {
      return html.replace(escape.escapeReplaceNoEncode, function (ch) { return escape.replacements[ch] });
    }
  }
  return html;
}

Run three times:
|
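For anyone who wants to compare the two variants above outside the marked test runner, here is a minimal, self-contained sketch. It covers only the `encode` branch; the names `escapeDirect`, `escapeTestFirst` and `time`, the input strings, and the run count are all assumptions made for this example, and `Date.now()` timings are coarse, so treat the numbers as a rough indication only.

```javascript
var replacements = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};
var escapeTest = /[&<>"']/;
var escapeReplace = /[&<>"']/g;

function lookup(ch) {
  return replacements[ch];
}

// Variant 1: always call replace().
function escapeDirect(html) {
  return html.replace(escapeReplace, lookup);
}

// Variant 2: test first, replace only when something needs escaping.
function escapeTestFirst(html) {
  if (escapeTest.test(html)) {
    return html.replace(escapeReplace, lookup);
  }
  return html;
}

// Hypothetical timing helper: elapsed milliseconds for `runs` calls of `fn`.
function time(fn, input, runs) {
  var start = Date.now();
  for (var i = 0; i < runs; i++) {
    fn(input);
  }
  return Date.now() - start;
}

// One input with nothing to escape, one with many characters to escape.
var clean = new Array(1001).join('nothing to escape here ');
var dirty = new Array(1001).join('<a href="x">Tom & Jerry</a> ');

console.log('direct, clean input:', time(escapeDirect, clean, 1000), 'ms');
console.log('test first, clean input:', time(escapeTestFirst, clean, 1000), 'ms');
console.log('direct, dirty input:', time(escapeDirect, dirty, 1000), 'ms');
console.log('test first, dirty input:', time(escapeTestFirst, dirty, 1000), 'ms');
```

Both functions return the same string for any input, so the only difference to look for is the timing on inputs with and without escapable characters.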
Looks like this would make
|
In my benchmarks, remarkable is faster and more economical than |
Yeah. @worker8's independent benchmark sample has remarkable at the top as well. @KostyaTretyak: Just to make sure: can they compete with marked on large files (>2 MB), or can they not? I think if we do what's in #746, we will be able to see areas for optimization more easily. Right now we kinda have the large class thing happening. |
In |
Interesting. Of course, if they're (or we're) targeting web developers - most folks aren't going to need to go above that. Maybe marked is the "large file" parser. :) Thinking of something like LeanPub - parse an entire book in markdown. |
No, it is a favorite when files are smaller than 2MB. If the files are bigger, then
Not for the sake of advertising, just for you to see it clearly. Do the following:

git clone https://github.com/KostyaTretyak/marked-ts.git
cd marked-ts
npm install
npm run compile

And then you can:

npm run bench -- -l 1000

Where |
Thanks! That's an interesting trick...might be interesting for us to add to the CLI...if I'm understanding correctly: I can specify how large of a file. Kind of like lipsum https://lipsum.lipsum.com - generate Markdown of X size to run the bench against. |
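The "generate Markdown of X size" idea could look roughly like the sketch below. This is not an existing marked CLI option and makes no claim about how the marked-ts bench command works; `generateMarkdown` and the sample text are hypothetical, and size is counted in string characters, which equals bytes only for ASCII content like this sample.

```javascript
// Small sample document that is repeated to reach the requested size.
var sample = [
  '# Heading',
  '',
  'Some *emphasis*, a [link](http://example.com) and `inline code`.',
  '',
  '- item one',
  '- item two',
  '',
  ''
].join('\n');

// Hypothetical helper: returns Markdown of at least `minLength` characters,
// built from whole copies of `sample`.
function generateMarkdown(minLength) {
  var parts = [];
  var length = 0;
  while (length < minLength) {
    parts.push(sample);
    length += sample.length;
  }
  return parts.join('');
}

var input = generateMarkdown(1000);
console.log('generated', input.length, 'characters');
```

The result could then be fed to whichever parsers are being benchmarked, so that every run works on an input of known size.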
LGTM 👍
Is there a way to force push to invoke travis unit tests? |
@KostyaTretyak if you can rebase this PR we should be able to merge it. git fetch upstream && git rebase upstream/master && git push -f |
I rebased and tested locally, and everything worked fine. |
Nice! I ran benchmarks locally and this is the before and after: Before
After
|
@UziTech, done: git fetch upstream && git rebase upstream/master && git push -f |
@KostyaTretyak Thanks! Can you fix this lint error 😄 |
Excellent! 🎉
Awesome work @KostyaTretyak
No description provided.