[BUG] Possible Denial of Service #667

Closed
Akashkarmakar787 opened this issue Jun 8, 2023 · 3 comments

Comments

Akashkarmakar787 commented Jun 8, 2023

What I tried to do

What I expected to happen

  • Resolve the statement without taking much time.

What actually happened

Versions of i18n, rails, and anything else you think is necessary

  • i18n <= 1.4.1 affected.

Bonus points for providing an application or a small code example which reproduces the issue.

[Screenshot 2023-06-08 at 11 34 32 AM]

Thanks! ❤️
kbrock added a commit to kbrock/i18n that referenced this issue Jun 13, 2023
From what I can see, this is done in linear time: 4*O(n).
This tokenizer change converts that to something a little quicker: 3*O(n).

It seems that dropping the capture group, and using something other than split, would be the big win.
Other than that, the changes were meager.
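For context, a standalone sketch (not the i18n source) of why the tokenizer carries a capture group at all: `String#split` only keeps the separators in its output when the pattern contains a group, and the tokenizer needs those interpolation tokens back.

```ruby
# Without a capture group, split discards the separator;
# with one, the matched token is kept in the result array.
p "a %{b} c".split(/%\{[^}]+\}/)   # => ["a ", " c"]
p "a %{b} c".split(/(%\{[^}]+\})/) # => ["a ", "%{b}", " c"]
```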

I used https://regex101.com/ (and pcre2) to evaluate the cost of the TOKENIZER,
and verified the results with CRuby 3.0.6.

```
/(%%\{[^\}]+\}|%\{[^\}]+\})/ =~ ('%{{' * 9999) + '}'

/(%%\{[^\}]+\}|%\{[^\}]+\})/ ==> 129,990 steps
/(%?%\{[^\}]+\})/            ==> 129,990 steps
/(%%?\{[^\}]+\})/            ==>  99,992 steps (simple savings of 25%) <===
/(%%?\{[^%}{]+\})/           ==>  89,993 steps (limiting variable contents has minimal gains)
```
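As a sanity check, here is a small sketch (patterns copied from the step counts above, not the actual i18n `TOKENIZER` constant) confirming that the simplified pattern tokenizes an ordinary string identically and still terminates promptly on the crafted input:

```ruby
# Original alternation vs. the simplified pattern from the table above.
OLD_TOKENIZER = /(%%\{[^\}]+\}|%\{[^\}]+\})/
NEW_TOKENIZER = /(%%?\{[^\}]+\})/

normal  = "Hello %{name}, literal %%{name}!"
crafted = ("%{{" * 9999) + "}"

# Both patterns produce the same tokens for a normal interpolation string.
p normal.split(OLD_TOKENIZER)  # => ["Hello ", "%{name}", ", literal ", "%%{name}", "!"]
p normal.split(OLD_TOKENIZER) == normal.split(NEW_TOKENIZER)  # => true

# Both finish quickly on the crafted input: no nested quantifiers,
# so no catastrophic backtracking.
p crafted.match?(NEW_TOKENIZER)  # => true
```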

Also of note are the null/simple cases:

```
/x/ =~ ('%{{' * 9999) + '}'
/x/                          ==>  29,998 steps
/(x)/                        ==>  59,996 steps
/%{x/                        ==>  49,998 steps
/(%%?{x)/                    ==>  89,993 steps
```

And a plain string of the same length doesn't fare much worse than the specially crafted one:

```
/x/ =~ ('abb' * 9999) + 'c'

/x/                          ==>  29,999
/(%%?{x)/                    ==>  59,998
/(%%?\{[^\}]+\})/            ==>  59,998
/(%%\{[^\}]+\}|%\{[^\}]+\})/ ==>  89,997
```
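To see the linear scaling directly, a rough sketch (an assumption of this sketch: wall-clock time as a stand-in for regex101's step counts) that times the simplified pattern against crafted inputs of increasing size. A linear-time match should take roughly 10x longer on a 10x larger input, not 100x:

```ruby
require "benchmark"

# Simplified tokenizer pattern from the step counts above.
TOKENIZER = /(%%?\{[^\}]+\})/

[10_000, 100_000].each do |n|
  crafted = ("%{{" * n) + "}"
  secs = Benchmark.realtime { crafted.split(TOKENIZER) }
  printf("n = %-7d %.5fs\n", n, secs)
end
```

On the crafted input the whole string is one token, so `split` returns the leading empty string plus the token itself; the interesting part is only how the time grows with `n`.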

per ruby-i18n#667
radar closed this as completed Jun 21, 2023

radar commented Jun 21, 2023

I believe these will be fixed by #668 and #669.


Akash-Karmakar-e3082 commented Jun 21, 2023

@radar @kbrock Can we have a CVE for the issue?


kbrock commented Jul 13, 2023

@Akash-Karmakar-e3082 you passed in a 40k-character string, and it was processed character by character.

It performed about 4 checks per character, so on the order of 40k * 4 checks in total.
There was no significant backtracking, so it did not exhibit the O(n^2) slowdown described in the article you referenced.

If you have reason to believe this will take exponentially longer, please share how you came to that conclusion.

Also, can you share a quick example of why someone would localize content entered by a user? (An admin does not count: if the admin is nefarious, they may as well just format the disk and be done.)
