Inconsistent behavior of multiple backticks #166
Comments
Markdown doesn't ensure output is secure in any way by design. It is allowing HTML so you don't need to craft it like that, just use |
I don't think escaping HTML is the job of markdown processing library. |
@samdark then why does it escape it if the payload isn't crafted? |
It's by design of markdown: https://daringfireball.net/projects/markdown/syntax#autoescape |
That's expected. You should process the result of the markdown conversion with something like http://htmlpurifier.org/ if you want to allow users to enter text. |
@samdark I shouldn't need to rely on a second library because one of them doesn't do its job properly |
Please re-read markdown specification. It says explicitly that it allows any HTML by design and it's unsafe by definition to allow users to enter markdown w/o further escaping/cleanup. |
It's not the job of the markdown parser to escape/cleanup output. |
The point of a parser is to render the data in a safe way. You're saying I should rely on a second library to render the data in a safe way, why should I have to do that when this parser is already here and should do that for me? |
No since it's a markdown parser and markdown wasn't meant to be safe.
Yes.
This parser converts markdown to HTML. Markdown, by definition, allows any HTML and that is unsafe if you allow users to enter data you render. There are valid use cases for non-filtered HTML. For example, you can allow full markdown for admin only and that's huge flexibility. You can do things like this by embedding HTML and CSS into blog post. |
@cebe I think the security topic should be emphasized in the README, pointing to HTMLPurifier. |
@samdark I think you should incorporate HTMLPurifier into the code itself if that is what would be needed to fix the XSS issues in it. That would probably save you guys a lot of headache, along with people like me who sometimes find bugs in the stuff they use. Thank you for your input; I'll wait for @cebe. |
Thanks for the detailed report. As pointed out by @samdark this is due to the nature of markdown and not a bug. You may remove HTML support by creating your own Markdown flavor class, which does not render the HTML, but the safest solution is to use HTML Purifier or similar tools. I added a section in the README about this: https://github.com/cebe/markdown/blob/master/README.md#security |
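To make the "render first, then sanitize" advice concrete, here is a minimal sketch of where a sanitizing pass sits in the pipeline. It is written in Python only so that it is self-contained; it is not HTML Purifier, not part of cebe/markdown, and not a substitute for a maintained sanitizer. The names `ALLOWED_TAGS`, `ALLOWED_ATTRS`, `SAFE_URL_PREFIXES`, `AllowlistSanitizer`, and `sanitize` are hypothetical and exist only for this illustration.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration; a real policy needs careful review.
ALLOWED_TAGS = {"p", "em", "strong", "code", "pre", "ul", "ol", "li", "blockquote", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:")


class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowlisted tags and attributes; drop everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, ()):
                continue
            # Drop links whose scheme is not explicitly allowed (e.g. javascript:).
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.parts.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is decoded by the parser, so escape it again on the way out.
        if not self.skip_depth:
            self.parts.append(escape(data, quote=False))


def sanitize(rendered_html: str) -> str:
    """Apply the allowlist to HTML that a markdown parser has already produced."""
    parser = AllowlistSanitizer()
    parser.feed(rendered_html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    print(sanitize("<p>Hello <script>alert(1);</script><strong>world</strong></p>"))
```

The point of the sketch is the separation of concerns discussed in this thread: the markdown parser converts, and a second pass decides what HTML is acceptable for untrusted input. Trusted authors (for example, admins) can skip the second pass and keep full HTML.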
@cebe This still does not explain why it mitigates simplistic attacks such as:
into:
If you are saying that the design of the parser allows this, then the simplistic attacks as stated would not work. |
Code tags are expected to display the raw text inside them, that means every HTML inside code tags is escaped properly. In your case above you had multiple code tags: |
So then if that is the case, there is a bug in your code with multiple backticks. |
Well, the rendering is technically correct according to the markdown spec. If you use multiple backticks you need to leave space before and after the code. But it seems other parsers are doing it differently.... https://johnmacfarlane.net/babelmark2/?normalize=1&text=something+%60%60%60%3Cscript%3E%60%60%60 |
related to #99 |
I got the bug tag, never been more happy in my life lol |
Any idea when we will have the 1.2.1? Thanks, |
I just stumbled across CVE-2018-1000874 referencing this issue and wanted to leave a note with my thoughts. I disagree with the CVE-2018-1000874 being assigned as an XSS vulnerability in
Dispute of Vulnerability CVE-2018-1000874
I agree with @samdark that it is not the responsibility of the markdown processor to perform any escaping, cleanup, or sanitization outside of anything that might be required by the markdown specification of each respective markdown flavor. Such additional escaping, cleanup, and sanitization would be no different than expecting
Asking for the markdown parser to also perform sanitization isn't necessarily an invalid request, but it is a feature request, not a vulnerability bug to fix. However, since good sanitization libraries already exist, are easy to integrate, and have no coupling with the act of parsing markdown, it seems unnecessary and unwise to build such functionality into the parser.
Backtick Parsing
I readily concede that the Traditional Markdown specifications do not unambiguously describe the case demonstrated in this issue with respect to multiple backticks, but I think the current behavior is incorrect, and I agree with @cebe's renaming of the issue from "Cross site scripting vulnerability" to "Inconsistent behavior of multiple backticks".
Traditional Markdown Specification
The relevant portion of the Traditional Markdown specification says:
It goes on to show an example that begins and ends with two backticks with a single backtick contained inside:
``There is a literal backtick (`) here.``
produces:
<p><code>There is a literal backtick (`) here.</code></p>
Interpretation
My interpretation is that "you can use multiple backticks" seems to allow any number N of backticks to be selected as the open/close delimiter and that you are not required to have a string of N-1 backticks contained within them for the N backticks to be considered open/close delimiters.
CommonMark Specification for Comparison
This interpretation is consistent with the CommonMark specification, which attempts to provide a "standard, unambiguous syntax specification for Markdown." From the Code spans section of the specification:
Result of Interpretation
The result of this interpretation is that on the respective Payload lines,
Payload: `<script>alert(1);</script>`
Payload: ``<script>alert(1);</script>``
Payload: ```<script>alert(1);</script>```
produces:
<p>Payload: <code>&lt;script&gt;alert(1);&lt;/script&gt;</code></p>
<p>Payload: <code>&lt;script&gt;alert(1);&lt;/script&gt;</code></p>
<p>Payload: <code>&lt;script&gt;alert(1);&lt;/script&gt;</code></p> |
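The interpretation described in the comment above (any run of N backticks opens a code span that is closed by the next run of exactly N backticks) can be sketched as follows. This is an illustration written in Python for self-containment, not the code of cebe/markdown or of any CommonMark implementation. It handles inline code spans only and passes all other text through untouched; `render_code_spans` and `BACKTICK_RUN` are hypothetical names.

```python
import html
import re

BACKTICK_RUN = re.compile(r"`+")


def render_code_spans(text: str) -> str:
    """Convert backtick-delimited code spans to <code> elements.

    An opening run of N backticks is matched with the next run of exactly
    N backticks. Content between them is HTML-escaped. Everything outside
    a code span is copied through unchanged.
    """
    out = []
    pos = 0
    while True:
        opener = BACKTICK_RUN.search(text, pos)
        if opener is None:
            out.append(text[pos:])
            break
        # Find the first later run with the same length as the opener.
        closer = None
        for candidate in BACKTICK_RUN.finditer(text, opener.end()):
            if len(candidate.group()) == len(opener.group()):
                closer = candidate
                break
        if closer is None:
            # No matching closer: treat the backticks as literal text.
            out.append(text[pos:opener.end()])
            pos = opener.end()
            continue
        out.append(text[pos:opener.start()])
        content = text[opener.end():closer.start()]
        # CommonMark strips one space from each side when both are present
        # and the content is not made up entirely of spaces.
        if len(content) >= 2 and content[0] == " " and content[-1] == " " and content.strip(" "):
            content = content[1:-1]
        out.append("<code>" + html.escape(content, quote=False) + "</code>")
        pos = closer.end()
    return "".join(out)


if __name__ == "__main__":
    for ticks in ("`", "``", "```"):
        print(render_code_spans(f"Payload: {ticks}<script>alert(1);</script>{ticks}"))
```

Under this rule, all three payload lines produce the same escaped `<code>` output, and the two-backtick example from the Traditional Markdown specification keeps its inner single backtick as literal content.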
Hi, thanks for the detailed post. I was not aware of a CVE being assigned to this issue and I agree that this is not a security issue. The documentation in the README explains this situation in detail: https://github.com/cebe/markdown#security-considerations- I have requested the CVE to be rejected. |
No problem. I also put in a request and just received this response from Mitre:
|
This is currently being reported by snyk. It does mention it's "fixed" by way of the README edit, but I guess because these commits aren't published as a version yet, it's still flagging it. Might this get a push to a 1.2.1.1 / 1.2.2 so we can clear this up too? |
I agree with the dispute. Anyone coming here because of the CVE: use a purification system before allowing anything from end users. They tend to do stupid stuff. |
I have contacted snyk and asked them to remove the report. |
FYI: The issue is not reported by snyk anymore. |
Issue
There is a reflected and/or stored XSS vulnerability (depending on whether the markdown is parsed from user input or from a user-uploaded file), triggered by a crafted use of backticks, in all of the following parsers:
GithubMarkdown
Markdown
MarkdownExtra
How?
The vulnerability occurs when a user crafts a malicious payload with characters before a payload wrapped in three backticks, thus bypassing the parser's escaping. For example, here is an image of the payloads crafted with single, double, and triple backticks:
And here is an image of the payloads rendered:
As you can see, when the payload is crafted correctly using three backticks, the parser will render it as a script. This can allow malicious individuals to render scripts within a .md file or within a text box on any platform that is using this as the markdown parser. An example of a script that ran:
Impact
Doing a quick search on GitHub for the code that enables your parser, new \cebe\markdown\, I get this many results:
The vulnerability can be either stored using an .md file (README for example), or reflected if the markdown parser is just parsing the user input text. Malicious attackers can use this method to steal sensitive user data. For example, to steal a user's cookies:
This can have a serious impact not only on the end users of the site, but on the reputation of the website as well.
Proof of Concept
User input
You can use the following code for a PoC on user entered text:
MD file
And you can use the following code for a PoC on text read from an MD file: