-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support for snapshot cheatcodes #427
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
incredibly straightforward 👌
Co-authored-by: karmacoma <karma@coma.lol>
src/halmos/cheatcodes.py
Outdated
""" | ||
Generates a snapshot ID by hashing the current state (code, storage, and balance). | ||
""" | ||
m = hashlib.sha3_256() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we don't need the hash to be cryptographically secure for this, we'd be better off using a high quality regular hash function like xxh3_64 from https://pypi.org/project/xxhash/, about 10x faster:
# built-in sha3_256, not keccak
$ python -m timeit --setup="from hashlib import sha3_256" "sha3_256(b'').digest()"
500000 loops, best of 5: 892 nsec per loop
# xxh64, requires `pip install xxhash`
$ python -m timeit --setup="from xxhash import xxh64_digest" "xxh64_digest(b'')"
2000000 loops, best of 5: 101 nsec per loop
# xxh3_64, requires `pip install xxhash`
$ python -m timeit --setup="from xxhash import xxh3_64_digest" "xxh3_64_digest(b'')"
5000000 loops, best of 5: 98.3 nsec per loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for suggestion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wdyt? 7d08e7a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for consistency I think it would be better to avoid xxh3_128 and use xxh3_64 only. Plus that way we have an extra 64 bit reserved for some future use!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was considering 128 bits for storages to be more conservative. (Storages are more frequently updated than code and balances.) With 64 bits, the hash collision probability exceeds 10^-12 when generating 6,000 hashes (which is large, but not unrealistic when running halmos in cloud several days.) With 128 bits, however, it would take 20 trillion hashes to reach the 10^-12 collision probability.
That said, 64 bits might still be good enough for our use case. But, in that case, I'm not sure what's the benefit of using xxhash vs the builtin hash()
. The builtin hash() can take arbitrary objects as input, and also doesn't seem slower than xxhash:
$ python -m timeit --setup="from xxhash import xxh3_64_digest" "xxh3_64_digest(b'')"
5000000 loops, best of 5: 54.2 nsec per loop
$ python -m timeit "hash(b'')"
20000000 loops, best of 5: 12.6 nsec per loop
One concern with the builtin hash() is that it returns 32-bit hashes on 32-bit machines, but I guess 32-bit machines are rare these days, so it might not be a real concern? Another concern is that it's not deterministic across runs, which could be an issue if we want to support resume features. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me merge this as is, and we can discuss hash functions separately.
b424c17
to
7d08e7a
Compare
support for state and storage snapshots:
note: two structurally identical symbolic storages/states are given the same snapshot id, but those that are semantically equivalent may not be assigned the same snapshot id. (see SnapshotTest for examples.)