fix(post): use non-greedy regular expressions #4161
Since this PR fixes an issue but you are not sure whether your regexp will actually solve it, why not add a related test case? https://github.com/hexojs/hexo/blob/master/test/scripts/hexo/post.js
// test for PR #4161
it('render() - adjacent tags', () => {
const content = [
'{% quote %}',
'content1',
'{% endquote %}',
'{% quote %}',
'content2',
'{% endquote %}'
].join('\n');
return post.render(null, {
content,
engine: 'swig'
}).then(data => {
data.content.trim().should.eql([
'<blockquote><p>content1</p>\n</blockquote>\n',
'<blockquote><p>content2</p>\n</blockquote>\n',
].join(''));
});
});
A test case could be designed like this.
Please add a related test case.
Thanks for the reminder. In fact, when Tommy351 wrote the relevant code five years ago, the test cases provided were incomplete: 683fd0a
What's more interesting is that, for nested tags with the same name, the regular expression before this change does not actually match the last {% endnote %}: https://www.regextester.com/15
/\{% *(.+?)(?: *| +.*)%\}[\s\S]+?\{% *end\1 *%\}/g
{% note danger %}
note text, note text, note text
{% note danger %}
note text, note text, note text
{% endnote %}
{% endnote %}
I am still reading the source code to determine how Nunjucks preprocessing works. I'd appreciate it if you would help.
I noticed this regexp is only used to escape content: https://github.com/hexojs/hexo/blob/master/lib/hexo/post.js#L48-L52
The whole post rendering process looks like this:
Line 253 in a97bd2f
Hexo has a built-in backtick code filter, which is executed at this point.
Line 259 in a97bd2f
Lines 282 to 283 in a97bd2f
Lines 288 to 289 in a97bd2f
@SukkaW Thank you for the explanation. The problem is in the second step
Lines 42 to 53 in a97bd2f
As I said earlier, Line 49 in a97bd2f
{% note danger %}
note text, note text, note text
{% note danger %}
note text, note text, note text
{% endnote %}
{% endnote %}
becomes
<!-- \uFFFC 0 -->
{% endnote %}
Line 50 in a97bd2f
And it becomes
<!-- \uFFFC 0 -->
<!-- \uFFFC 1 -->
Of course, this is not a bug; it's just a bit confusing. The test cases have been updated.
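The two escaping steps described in this comment can be reproduced with a small standalone sketch. It is not the code in lib/hexo/post.js: the helper name escapeAllTags and the single-tag pattern rSwigTag are assumptions made for illustration, and only the full-block pattern is the one quoted earlier in this thread.

```javascript
// Full-block pattern as quoted earlier in this conversation (before the change).
const rSwigFullBlock = /\{% *(.+?)(?: *| +.*)%\}[\s\S]+?\{% *end\1 *%\}/g;
// Hypothetical pattern for any single leftover tag (an assumption, not taken from Hexo).
const rSwigTag = /\{%[\s\S]*?%\}/g;

// Hypothetical helper: replace every match with a numbered placeholder
// and remember the original text so it could be restored later.
function escapeAllTags(str) {
  const cache = [];
  const stash = match => `<!-- \uFFFC ${cache.push(match) - 1} -->`;
  const escaped = str
    .replace(rSwigFullBlock, stash) // step 1: whole blocks
    .replace(rSwigTag, stash);      // step 2: whatever single tags remain
  return { escaped, cache };
}

const nested = [
  '{% note danger %}',
  'note text, note text, note text',
  '{% note danger %}',
  'note text, note text, note text',
  '{% endnote %}',
  '{% endnote %}'
].join('\n');

const { escaped, cache } = escapeAllTags(nested);
console.log(escaped);  // two placeholders, one per line
console.log(cache[1]); // the outer closing tag that step 1 left behind
```

In this sketch the first pass stops at the first closing tag, so the outer {% endnote %} is only picked up by the second pass, which is the slightly confusing behaviour described above.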
https://runkit.com/sukkaw/5e57b5ead3f4440013a77529 I have set up a demo to show how it works. I believe it is OK to merge this PR then.
What does it do?
Currently, the regular expression rSwigFullBlock used in the escapeAllSwigTags method in lib/hexo/post.js may incorrectly match Swig / Nunjucks blocks. For example, suppose the user writes the following in a Markdown document:
{% note danger %}note text, note text, note text{% endnote %}
## Title
{% note danger %}note text, note text, note text{% endnote %}
Then this regular expression matches all three lines, from {% note danger %} in the first line to the closing {% endnote %} in the third line. As a result, ## Title in the second line is processed only by Nunjucks and not by the Markdown renderer, so it appears as ## Title instead of <h2>Title</h2> in the output HTML.
By using a non-greedy regular expression, the first and third lines are matched separately, which gives the correct result.
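The difference can be checked outside Hexo with a minimal sketch. The greedy pattern below is the one quoted in the conversation. The lazy pattern is one possible non-greedy variant (the argument part `.*` changed to `.*?`), assumed here for illustration and not necessarily the exact expression in this PR's diff.

```javascript
// Pattern quoted in the conversation: the tag arguments are matched greedily.
const greedy = /\{% *(.+?)(?: *| +.*)%\}[\s\S]+?\{% *end\1 *%\}/g;
// Assumed non-greedy variant, for illustration only.
const lazy = /\{% *(.+?)(?: *| +.*?)%\}[\s\S]+?\{% *end\1 *%\}/g;

const doc = [
  '{% note danger %}note text, note text, note text{% endnote %}',
  '## Title',
  '{% note danger %}note text, note text, note text{% endnote %}'
].join('\n');

const greedyMatches = doc.match(greedy);
const lazyMatches = doc.match(lazy);

// Greedy: the arguments run to the last "%}" on line 1, so the block
// can only close on line 3 and the heading is swallowed.
console.log(greedyMatches.length, greedyMatches[0].includes('## Title'));
// Lazy: each one-line block is matched on its own.
console.log(lazyMatches.length, lazyMatches.some(m => m.includes('## Title')));
```

With the greedy arguments, the opening tag consumes the closing tag on the same line, which is why the problem only shows up when a block is written on a single line.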
I'm not an expert in regular expressions, so I can't guarantee whether this change will cause other bugs. Suggestions are welcome, thank you!
Related issues:
iissnan/hexo-theme-next#1674
iissnan/hexo-theme-next#1678
iissnan/theme-next-docs#163
theme-next/hexo-theme-next#839
How to test
Screenshots
Pull request tasks