Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hashes for Slots in ScriptableObject should be seeded #290

Closed
botic opened this issue Dec 18, 2016 · 10 comments
Closed

Hashes for Slots in ScriptableObject should be seeded #290

botic opened this issue Dec 18, 2016 · 10 comments

Comments

@botic
Copy link

botic commented Dec 18, 2016

As far as I can interpret the slot allocation in ScriptableObject it uses String.hashCode() to generate a hash number, which then is used to determine a slot for the value. This makes it easy to exploit with an algorithmically attack.

Take the following example in RingoJS:

function loadJSON(file) {
   const start = Date.now();
   const data = require("./" + file);
   console.log("File:", file, "Time: " + (Date.now() - start) + " ms");
}

loadJSON(require("system").args[1]);

For a 20 MB huge JSON with randomized keys it takes under a second to load. An attacker could use colliding hash codes since it's easy to generate them with the knowledge that slots are allocated with String.hashCode(). Just an example sequence of colliding hashes: AaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAa, AaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaBB, AaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaBBAa, ……… Then the example above takes 20 minutes on my machine. And just imagine an application calling JSON.parse() without any bounds for the request's body.

To fix this problem Java changed its hash-based data structure's implementation with Java 7: https://vaskoz.wordpress.com/2013/04/06/java-7-hashing-drastically-better-than-java-6/

You can find the full code here: https://gist.github.com/botic/d409564dca727eea8086fb6a1cb00623

Further infos / links:

@davidmann4
Copy link

maybe useful lib if you want to find some collisions; (I guess you need to modify the hashing algorithm)
https://github.com/hastebrot/V8-Hash-Collision-Generator

@gbrail
Copy link
Collaborator

gbrail commented Dec 19, 2016

Thanks -- that's an important issue. There seem to be very few people able to work actively on Rhino right now. This looks like a great issue for a new author who might be looking for a way to contribute to a new project.

@botic
Copy link
Author

botic commented Jan 4, 2017

I started with including a public domain Java-aware (chars != bytes) port of MurmurHash3: https://github.com/botic/rhino/tree/issue290-seededhashes

Various versions with more or less adaptions of the implementations can be found all over Github. Just some of them:

Feedback / other ideas welcome, since this is the first time I work directly on Rhino ;)

@whitlockjc
Copy link
Contributor

@botic What's stopping you from creating a PR from this? I think your approach is sound. I was just about to do this work myself when I saw you had done it already.

@gbrail
Copy link
Collaborator

gbrail commented Aug 15, 2017 via email

@botic
Copy link
Author

botic commented Aug 16, 2017

Sorry to say, but my time is extremely limited right now. I see no time to do this in the next months due my parental leave and other projects. We implemented some workarounds in RingoJS and the web packages, which avoid the issue "good enough" for the moment. Afai-remember: some unit tests went red after my change and I could not spot the source of the problems.

@gbrail
Copy link
Collaborator

gbrail commented Aug 30, 2017

Here's an update on this issue. I've done two experiments so far:

-- Used the contribution from @botic to use MurmurHash on every String object rather than the built-in hashCode() method on java.lang.String.
-- Replaced the hand-rolled hash table in the ScriptableObject class with java.lang.LinkedHashMap. (This will help because when a hash bucket has too many collisions it switches that bucket from a linked list to a balanced tree, so the worst case performance for a hash-flooding attack is much less bad.)

Here are the results of the "v8 benchmarks" in these cases. (If you are new to Rhino then you will be shocked at how bad these are in comparison to V8 or even Nashorn):

master: 88.8
MurmurHash: 60.6
java.lang.Hash: 63.2
In other words, so far we don't have a way yet to fix this problem that does not have a large performance impact.

One simple proposal would be to simply permute the hash that's generated in ScriptableObject.java by some sort of a random seed. For instance we could do the following:

  1. At class initialization time (aka once per JVM) we would generate a random 32-bit seed, possibly using SecureRandom.
  2. Use this seed to modify the output from the built-in hashCode() function. This should make it more difficult to guess a hash collision because different instances of Rhino on different VMs will hash things differently.

I'm going to try this. I'm looking for feedback on the various pros and cons from a security perspective.

@gbrail
Copy link
Collaborator

gbrail commented Sep 22, 2017

Pull request #328 addresses this problem by dividing up the hash table that ScriptableObject uses into two implementations: One with very low overhead based on the existing code, and one that uses java.util.HashMap for its collision resistance. ScriptableObject changes from one to the other when the object reaches a threshold. (Currently 2500 properties based on some testing I did.)

@botic -- thank you very much for bringing this to our attention. I used part of the test file you sent me to create a regression test. Please let me know if you have any issues with this.

@gbrail
Copy link
Collaborator

gbrail commented Nov 20, 2017

Pushed to master.

@gbrail gbrail closed this as completed Nov 20, 2017
@botic
Copy link
Author

botic commented Nov 21, 2017

Thanks, the approach sounds reasonable. Since most objects don't have 2.500 or more properties, attacks would run into the "trap".

@gbrail no, perfect that it helped!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants