Improve client init time by switching to regex-lite #3269

Merged
6 commits merged into main from jdisanti-client-init-regex
Nov 30, 2023

Conversation

jdisanti
Collaborator

@jdisanti jdisanti commented Nov 28, 2023

Each client initialization was taking between 1 and 2 milliseconds, regardless of whether a client had been constructed before. For example, if a customer wants five clients with different credentials providers, that could add up to 10 milliseconds spent in Client::from_conf. Approximately 98% of this time was spent compiling regular expressions for the endpoint partition resolver.

This change switches everything over to the regex-lite crate, which compiles regexes faster and shouldn't have much of an impact on matching performance for our specific use cases (small strings, only evaluated at client initialization).

The use of regex was removed entirely from aws-sigv4, since it was overkill for what it was being used for.


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jdisanti
Collaborator Author

jdisanti commented Nov 28, 2023

Whether the caching added in efd5032 is justified is open for debate. The following are client creation times on macOS with and without it.

With caching:

s3::Client::new - 215.5µs
s3::Client::new - 11.25µs
s3::Client::new - 8.791µs
s3::Client::new - 8.208µs
s3::Client::new - 8.708µs

Without caching:

s3::Client::new - 172.292µs
s3::Client::new - 27.875µs
s3::Client::new - 24.208µs
s3::Client::new - 23.583µs
s3::Client::new - 23.042µs

The first client construction with caching is consistently about 50µs slower than without.
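
For readers who want to reproduce this kind of comparison, here is a minimal sketch of a timing loop built on `std::time::Instant`. It is not the harness used for the numbers above; `time_constructions` and its `construct` closure are hypothetical names, with the closure standing in for a call such as `s3::Client::new`.

```rust
use std::time::{Duration, Instant};

/// Times `runs` consecutive calls to `construct`, printing one line per call.
/// `construct` is a hypothetical stand-in for a client constructor.
fn time_constructions<T>(
    label: &str,
    runs: usize,
    mut construct: impl FnMut() -> T,
) -> Vec<Duration> {
    let mut timings = Vec::with_capacity(runs);
    for _ in 0..runs {
        let start = Instant::now();
        let value = construct();
        let elapsed = start.elapsed();
        // Drop after reading the clock so destructor cost is not measured.
        drop(value);
        // `Duration`'s Debug output picks a readable unit (ns, µs, ms, s).
        println!("{label} - {elapsed:?}");
        timings.push(elapsed);
    }
    timings
}
```

Calling `time_constructions("s3::Client::new", 5, || ...)` with a real constructor in the closure prints five lines in the same shape as the listings above. The first iteration includes any one-time initialization, which is why it is worth reporting each run separately rather than only an average.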

@jdisanti
Collaborator Author

jdisanti commented Nov 28, 2023

Overall, this is an improvement over what's currently released to crates.io (again, measured on macOS):

crates.io aws_config::load_defaults - 150.846792ms
crates.io s3::Client::new - 1.531208ms
crates.io s3::Client::new - 1.51575ms
crates.io s3::Client::new - 1.346666ms
crates.io s3::Client::new - 1.342542ms
crates.io s3::Client::new - 1.278125ms
crates.io list_buckets - 279.871625ms
crates.io list_buckets 2 - 66.796834ms

aws_config::load_defaults - 134.817875ms
s3::Client::new - 231.375µs
s3::Client::new - 23µs
s3::Client::new - 29.375µs
s3::Client::new - 11.125µs
s3::Client::new - 8.834µs
list_buckets - 262.579666ms
list_buckets 2 - 48.314125ms

The remaining large time chunks are being spent in default HTTP client initialization, which will be scrutinized in a separate PR.

Same measurements in Amazon Linux 2 x64 on a c5.2xlarge EC2 instance:

crates.io aws_config::load_defaults - 4.334069ms
crates.io s3::Client::new - 2.911449ms
crates.io s3::Client::new - 2.599377ms
crates.io s3::Client::new - 2.339871ms
crates.io s3::Client::new - 2.344298ms
crates.io s3::Client::new - 2.363885ms
crates.io list_buckets - 96.714985ms
crates.io list_buckets 2 - 10.899774ms

aws_config::load_defaults - 1.311494ms
s3::Client::new - 99.413µs
s3::Client::new - 43.055µs
s3::Client::new - 26.739µs
s3::Client::new - 19.159µs
s3::Client::new - 20.428µs
list_buckets - 100.001779ms
list_buckets 2 - 14.689284ms

A new generated diff is ready to view.

  • AWS SDK (ignoring whitespace)
  • No codegen difference in the Client Test
  • No codegen difference in the Server Test
  • No codegen difference in the Server Test Python
  • No codegen difference in the Server Test Typescript

A new doc preview is ready to view.

@jdisanti
Collaborator Author

After more profiling, the remainder of the overhead on Linux is almost entirely the TLS handshake on the first request. On Mac, loading trusted certs is slow. I think if we were to make the default HTTP client lazy, we could consider only doing it for Mac.

@jdisanti
Collaborator Author

Ran the previous release benchmark to make sure this didn't regress anything significantly:

compare/previous/S3 ListObjectsV2
                        time:   [34.145 µs 34.345 µs 34.527 µs]
compare/main/S3 ListObjectsV2
                        time:   [35.767 µs 35.963 µs 36.171 µs]

It seems to have increased request overhead by roughly 1.6 µs (about 5%), going by the median estimates above.

@rcoh
Collaborator

rcoh commented Nov 29, 2023

do you want to target main or the release branch?

Comment on lines +60 to +67
"DEFAULT_PARTITION_RESOLVER" to RuntimeType.forInlineFun("DEFAULT_PARTITION_RESOLVER", EndpointStdLib) {
    rustTemplate(
        """
        // Loading the partition JSON is expensive since it involves many regex compilations,
        // so cache the result so that it only need to be paid for the first constructed client.
        pub(crate) static DEFAULT_PARTITION_RESOLVER: #{Lazy}<#{PartitionResolver}> =
            #{Lazy}::new(|| #{PartitionResolver}::new_from_json(b$json).expect("valid JSON"));
        """,
Collaborator

we do eventually need to support runtime-loading of partitions.json but since this is all private, that seems fine
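
To illustrate the caching pattern in the template above, here is a minimal, dependency-free sketch. It swaps in the standard library's `std::sync::OnceLock` for the `Lazy` type used by the generated code, and the `PartitionResolver` struct, its `build` and `dns_suffix` methods, the partition entries, and `BUILD_COUNT` are all hypothetical stand-ins rather than the SDK's actual types.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the expensive build ran (demonstration only).
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the SDK's partition resolver.
struct PartitionResolver {
    dns_suffix_by_partition: HashMap<&'static str, &'static str>,
}

impl PartitionResolver {
    /// Stands in for the expensive step (JSON parsing plus regex compilation).
    fn build() -> Self {
        BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
        let mut map = HashMap::new();
        // Illustrative entries only, not the real partitions.json contents.
        map.insert("aws", "amazonaws.com");
        map.insert("aws-cn", "amazonaws.com.cn");
        Self { dns_suffix_by_partition: map }
    }

    fn dns_suffix(&self, partition: &str) -> Option<&'static str> {
        self.dns_suffix_by_partition.get(partition).copied()
    }
}

/// Returns the process-wide resolver, building it on first use only.
fn default_partition_resolver() -> &'static PartitionResolver {
    static RESOLVER: OnceLock<PartitionResolver> = OnceLock::new();
    RESOLVER.get_or_init(PartitionResolver::build)
}
```

Every caller of `default_partition_resolver()` after the first gets the already-built value, so the construction cost is paid once per process. The trade-off matches the one discussed in the thread: the first access carries the synchronization and initialization cost, and later accesses are cheap.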

@@ -299,6 +299,7 @@ data class RuntimeType(val path: String, val dependency: RustDependency? = null)
val PercentEncoding = CargoDependency.PercentEncoding.toType()
val PrettyAssertions = CargoDependency.PrettyAssertions.toType()
val Regex = CargoDependency.Regex.toType()
val RegexLite = CargoDependency.RegexLite.toType()
Collaborator

actually unused, afaict

Comment on lines +444 to +449
text.chars()
    // Filter out consecutive spaces
    .zip(text.chars().skip(1).chain(std::iter::once('!')))
    .filter(|(a, b)| *a != ' ' || *b != ' ')
    .map(|(a, _)| a)
    .collect(),
Collaborator

clever algorithm!
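
For anyone reading the excerpt without its surrounding function, here is a self-contained sketch of the same iterator chain. The function name `collapse_consecutive_spaces` is an assumption made for illustration; the excerpt does not show what the enclosing function is called.

```rust
/// Collapses each run of consecutive ASCII spaces into a single space
/// without using a regex.
fn collapse_consecutive_spaces(text: &str) -> String {
    text.chars()
        // Pair every character with its successor. The final character is
        // paired with a non-space sentinel ('!') so it is never dropped.
        .zip(text.chars().skip(1).chain(std::iter::once('!')))
        // Drop a space only when the next character is also a space, which
        // keeps exactly the last space of each run.
        .filter(|(a, b)| *a != ' ' || *b != ' ')
        .map(|(a, _)| a)
        .collect()
}
```

Note that this collapses runs but does not trim: `"  a  "` becomes `" a "`, so leading and trailing whitespace would need to be handled separately if that matters to the caller.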

@jdisanti jdisanti force-pushed the jdisanti-client-init-regex branch from a364d09 to a365675 Compare November 30, 2023 00:23
@jdisanti jdisanti added this pull request to the merge queue Nov 30, 2023
Merged via the queue into main with commit 5b93fd2 Nov 30, 2023
41 checks passed
@jdisanti jdisanti deleted the jdisanti-client-init-regex branch November 30, 2023 20:03