improve deposit processing performance #4082

Merged · 3 commits into unstable from dev/etan/pf-deposits · Sep 7, 2022

Conversation

etan-status (Contributor)

When there are a lot of deposits, we decompress the public key into a crypto cache. To avoid having those caches grow unreasonably big, make sure to operate on the decompressed pubkey instead.

@@ -537,6 +537,15 @@ func getImmutableValidatorData*(validator: Validator): ImmutableValidatorData2 =
     pubkey: cookedKey.get(),
     withdrawal_credentials: validator.withdrawal_credentials)

+func getImmutableValidatorData*(
+    deposit: DepositData): Opt[ImmutableValidatorData2] =
+  let cookedKey = deposit.pubkey.load()
Member
this is the operation that is slow ;)

etan-status (Author)

Yes, but if not done here, it additionally ends up in the blsVerify cache, which is unbounded and never pruned.
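To illustrate the concern, here is an editor's sketch in Python (hypothetical stand-ins, not nimbus-eth2 code): a blsVerify-style wrapper keyed on the compressed bytes memoizes every key it ever sees, while operating on an already-decompressed ("cooked") key bypasses that cache entirely.

```python
_decompress_cache: dict = {}  # grows without bound; nothing ever prunes it

def decompress(compressed_pk: bytes):
    return ("cooked", compressed_pk)   # stand-in for the expensive sqrt-based load

def verify_with(cooked_pk, msg: bytes, sig: bytes) -> bool:
    return True                        # stand-in for the actual signature check

def bls_verify(compressed_pk: bytes, msg: bytes, sig: bytes) -> bool:
    # Every distinct deposit pubkey verified this way stays cached forever.
    if compressed_pk not in _decompress_cache:
        _decompress_cache[compressed_pk] = decompress(compressed_pk)
    return verify_with(_decompress_cache[compressed_pk], msg, sig)

def bls_verify_cooked(cooked_pk, msg: bytes, sig: bytes) -> bool:
    # Operating on the decompressed key directly never touches the cache.
    return verify_with(cooked_pk, msg, sig)
```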

-  let cookedKey = validator.pubkey.load() # Loading the pubkey is slow!
-  doAssert cookedKey.isSome,
-    "Cannot parse validator key: " & toHex(validator.pubkey)
+  let cookedKey = validator.pubkey.loadValid() # `Validator` has valid key
etan-status (Author)

this effectively skips blst_p1_affine_in_g1 just to assert :)

Member

Validator instances may come from an untrusted source (like the REST API or an invalid checkpoint sync state) - as long as this function is public, there's no guarantee that the pubkey inside is "valid enough"

Member

it's an odd function in general - I think it's worth rethinking this flow and seeing if there's low-hanging room for improvement
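For context, a rough editor's sketch of the load/loadValid distinction being debated (Python stand-ins, not the nim-blscurve API; the timings come from the benchmark figures below):

```python
def sqrt_decompress(compressed: bytes):
    # Stand-in for decompression: recover (x, y) via a modular sqrt (~11 us).
    return ("point", compressed)

def in_g1_subgroup(pt) -> bool:
    # Stand-in for blst_p1_affine_in_g1, the dominant extra cost of full checks.
    return True

def load(compressed: bytes):
    # Untrusted input: decompress, then verify subgroup membership.
    pt = sqrt_decompress(compressed)
    return pt if in_g1_subgroup(pt) else None

def load_valid(compressed: bytes):
    # Key already fully validated once (e.g. at deposit time): skip the check.
    return sqrt_decompress(compressed)
```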

mratsim (Contributor) commented Sep 7, 2022

To give some figures on the pubkey deserialization speed.

Source: mratsim/constantine#183


Pubkey deserialization without checks: 11 µs
Pubkey deserialization with checks: 45 µs (ratio ~4x)

This is the same speed as in Constantine, which lets me extrapolate the uncompressed deserialization speed (benchmark screenshots in mratsim/constantine#183).

Compression explanation

A BLS12-381 G1 (pubkey) point has coordinates (x, y) satisfying y² = x³ + 4. When you compress, you store only the x coordinate, and because sqrt(x³ + 4) has 2 solutions, you record which y you mean (+y or -y) in a single bit. Thankfully a BLS12-381 coordinate only needs 381 bits, and since we can only store whole bytes (48 bytes = 384 bits), we have 3 spare bits for metadata.
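As a worked illustration of that sqrt step (an editor's Python sketch; p is the standard BLS12-381 base-field prime, and the sign-bit convention assumed here is the common "y is the lexicographically larger root" encoding):

```python
# Since p % 4 == 3, a modular square root is one big exponentiation:
# sqrt(a) = a^((p+1)/4) mod p -- this exponentiation is the expensive part.
p = 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab

def decompress_y(x: int, y_is_larger_root: bool) -> int:
    """Recover y from x on y^2 = x^3 + 4, picking the root named by the sign bit."""
    rhs = (pow(x, 3, p) + 4) % p
    y = pow(rhs, (p + 1) // 4, p)          # the ~11 us sqrt
    if pow(y, 2, p) != rhs:
        raise ValueError("x is not the x-coordinate of a curve point")
    if (y > p - y) != y_is_larger_root:    # flip to the root the bit selected
        y = p - y
    return y
```

(A full checked load additionally validates subgroup membership, per the blst_p1_affine_in_g1 discussion above, which is where most of the extra "with checks" cost comes from.)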

Difference between compressed and uncompressed speed

So the differences between compressed and uncompressed representation are:

  • The size: compressed is 2x smaller.
  • The sqrt operation, which costs ~11 µs in Constantine, and I assume about the same in BLST (I actually have optimizations that save ~15% compared to BLST, but in this particular use case they are irrelevant).

Conclusion

99.9% of the deserialization cost without checks is due to the compressed representation, and we can make it virtually instant with uncompressed pubkeys.

arnetheduck (Member)

> we can make it virtually instant with uncompressed pubkeys

unfortunately we don't have uncompressed deposit pubkeys - so the only thing I know we can do with deposits is to batch-verify all deposits in a block (unless there's a way to batch-load keys as well?)

oh, and as etan points out, we could avoid decompressing the keys twice, but that's probably a non-trivial flow to introduce

arnetheduck (Member)

hm, maybe we can have a variant pubkey type that holds either compressed or uncompressed key

mratsim (Contributor) commented Sep 7, 2022

> (unless there's a way to batch-load keys as well?)

We can load them in parallel, but there is no mathematical / number-theoretic construct that speeds up batch-loading the way batch-verification does.

> hm, maybe we can have a variant pubkey type that holds either compressed or uncompressed key

In terms of memory, the variant takes as much space as the bigger type + 1 byte.
Also, while in the previous case we were aligned, the runtime variant tag makes the key unaligned; on x86 unaligned loads aren't a problem, but I'm unsure about ARM.

Anyway, the question becomes: are there cases where we would load a compressed pubkey without using it right away, so that we want to delay paying that cost?
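One way to read the variant-type proposal (an editor's sketch with a hypothetical API, not something this PR implements): keep the compressed bytes and decompress lazily on first use, caching the cooked form so the sqrt cost is paid at most once, and only if the key is actually used.

```python
from dataclasses import dataclass, field
from typing import Optional

def decompress(compressed: bytes):
    return ("cooked", compressed)          # stand-in for the expensive load

@dataclass
class LazyPubkey:
    compressed: bytes                      # 48 bytes, as stored in Validator today
    _cooked: Optional[object] = field(default=None, repr=False)

    def load(self):
        if self._cooked is None:           # first use pays the sqrt cost
            self._cooked = decompress(self.compressed)
        return self._cooked                # subsequent uses are free
```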

etan-status (Author)

The flow here is: (1) deposit processing, during which the pubkey is fully loaded and checked; (2) creating a Validator from it, which once more contains the compressed form; (3) when the entire block checks out, loading the pubkeys once more as part of saving to the DB, to have efficient DB access from then on.

In this PR, (1) is modified so it doesn't pollute the last-resort crypto key cache, and (3) is modified to decompress without repeating the entire verification process.
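Schematically, an editor's paraphrase of those three steps in Python pseudocode (all helper names hypothetical; load/load_valid as in the earlier sketch):

```python
def process_deposit(state, deposit) -> None:
    cooked = load(deposit.pubkey)          # (1) full load + checks; after this PR
    if cooked is None:                     #     kept out of the last-resort cache
        return                             # invalid key: deposit is ignored
    if verify_deposit_signature(cooked, deposit):
        state.validators.append(make_validator(deposit))   # (2) compressed again

def save_block(db, state) -> None:
    for v in state.validators:
        # (3) after this PR: decompress only (loadValid), without repeating
        #     the full verification already done in step (1)
        db.key_cache[v.pubkey] = load_valid(v.pubkey)
```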

arnetheduck enabled auto-merge (squash) · September 7, 2022 17:46
mratsim (Contributor) commented Sep 7, 2022

> (1) is modified to not pollute the last-resort crypto key cache

If someone becomes a validator, they will attest once every ~6 min anyway, so the key will be cached and we only delay the inevitable - or did I miss something?

etan-status (Author)

There is a separate crypto key cache used when calling blsVerify with a ValidatorPubKey. We also have a separate cache of CookedKey in beacon_chain_db that already holds the validator keys. The changes here avoid using the crypto key cache in addition to the DB cache.

arnetheduck enabled auto-merge (squash) · September 7, 2022 18:49
arnetheduck merged commit 0191225 into unstable · Sep 7, 2022
arnetheduck deleted the dev/etan/pf-deposits branch · September 7, 2022 18:49