Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect all Apple M* CPUs and enable the wide multiplier assembly implementations #1901

Merged
merged 1 commit into from
Oct 2, 2024

Conversation

andrewhop
Copy link
Contributor

Description of changes:

Looking at the benchmarks on https://github.com/ctz/graviola/ I noticed AWS-LC was slower than expected. Those benchmarks were performed on an M2 Mac and previously AWS-LC was only checking for M1 CPUs. This change opts all Apple M* CPUs in the alt/wide multiplier implementations. A future Apple M CPU might benefit from the non-alt implementation, but for now it seems like all M CPUs benefit from the alt implementation and this is a sane default.

Call-outs:

I think this is safe because both implementations use the same instructions, it's just a reordering. So if a future M CPU doesn't support the wide multiplier it will still work it will just be slower than if it was using the non-alt implementation.

Testing:

Tested on an M1 and M3 that the alt implementation is picked and is faster. M3 before:

Did 29631 EVP ECDH P-224 operations in 1018316us (29098.0 ops/sec)
Did 30000 EVP ECDH P-256 operations in 1020237us (29404.9 ops/sec)
Did 5840 EVP ECDH P-384 operations in 1000133us (5839.2 ops/sec)
Did 3630 EVP ECDH P-521 operations in 1096572us (3310.3 ops/sec)
Did 5430 EVP ECDH secp256k1 operations in 1000506us (5427.3 ops/sec)
Did 41000 EVP ECDH X25519 operations in 1009621us (40609.3 ops/sec)

M3 After

Did 30233 EVP ECDH P-224 operations in 1013635us (29826.3 ops/sec)
Did 31000 EVP ECDH P-256 operations in 1022186us (30327.2 ops/sec)
Did 7227 EVP ECDH P-384 operations in 1076336us (6714.4 ops/sec)
Did 4690 EVP ECDH P-521 operations in 1042402us (4499.2 ops/sec)
Did 5710 EVP ECDH secp256k1 operations in 1051837us (5428.6 ops/sec)
Did 52000 EVP ECDH X25519 operations in 1008149us (51579.7 ops/sec)

As expected P-224 and P-256 are unaffected, while 384/521/X25519 which do use s2n-bignum are much faster.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

@andrewhop andrewhop requested a review from a team as a code owner October 2, 2024 18:22
@jakemas jakemas self-requested a review October 2, 2024 18:32
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.48%. Comparing base (ad93747) to head (17028e8).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1901      +/-   ##
==========================================
+ Coverage   78.45%   78.48%   +0.02%     
==========================================
  Files         585      585              
  Lines       99516    99517       +1     
  Branches    14244    14243       -1     
==========================================
+ Hits        78080    78105      +25     
+ Misses      20796    20777      -19     
+ Partials      640      635       -5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ctz
Copy link

ctz commented Oct 2, 2024

Nice! I had assumed this was coming from shipping the fastest implementation on Graviton; I didn't realise you were switching between them at runtime :)

@andrewhop andrewhop merged commit ca5f197 into aws:main Oct 2, 2024
107 of 110 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants