feat: allow pluggable tower layers in connector service stack #2496

jlizen · 2024-12-12T04:56:00Z

Background

Closes: #2490

This PR adds support for injecting arbitrary layers into the Connector's tower service stack, provided their corresponding futures have Send + Sync + 'static bounds (as well as the layer itself, in the blocking client's case).

I included a working example showing my specific use case that this unblocks: delegating the connection future to a secondary, low priority executor. (This has to do with perf issues with long-polling TLS futures, ref: rustls/tokio-rustls#94)
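As a rough illustration of the delegation idea only, here is a synchronous, std-only analogue: work is handed to a dedicated worker thread over a channel and the caller waits for the reply. This is not the example shipped in the PR (which concerns async connection futures and a secondary executor); `BackgroundExecutor`, `handshake`, and `connect_in_background` are hypothetical names, and thread priority is not modeled.

```rust
use std::sync::mpsc;
use std::thread;

// A unit of work that can be shipped to the worker thread.
type Job = Box<dyn FnOnce() + Send + 'static>;

// Hypothetical stand-in for a secondary executor: one worker thread
// draining a queue of jobs.
struct BackgroundExecutor {
    jobs: mpsc::Sender<Job>,
}

impl BackgroundExecutor {
    fn new() -> Self {
        let (jobs, queue) = mpsc::channel::<Job>();
        // The worker exits once every sender has been dropped.
        thread::spawn(move || {
            for job in queue {
                job();
            }
        });
        BackgroundExecutor { jobs }
    }

    // Run `work` on the worker and return a receiver for its result.
    fn run<T, F>(&self, work: F) -> mpsc::Receiver<T>
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (reply, result) = mpsc::channel();
        let job: Job = Box::new(move || {
            // Ignore the error if the caller stopped waiting.
            let _ = reply.send(work());
        });
        self.jobs
            .send(job)
            .unwrap_or_else(|_| panic!("worker thread exited"));
        result
    }
}

// Placeholder for the expensive part of connecting (e.g. a TLS handshake).
fn handshake(host: &str) -> String {
    format!("tls session with {}", host)
}

// The caller's thread only waits on the channel; the work runs elsewhere.
fn connect_in_background(exec: &BackgroundExecutor, host: &'static str) -> String {
    exec.run(move || handshake(host))
        .recv()
        .expect("worker dropped the reply channel")
}

fn main() {
    let exec = BackgroundExecutor::new();
    println!("{}", connect_in_background(&exec, "example.com"));
}
```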

This required bumping the MSRV to 1.64 to get access to: tower-rs/tower#777

Sample usage:

use std::time::Duration;

let client = reqwest::Client::builder()
    // applied as the outermost layer, so it starts before a semaphore permit is requested
    .connect_timeout(Duration::from_millis(100))
    // sits beneath the concurrency limit, so it starts only after a permit is acquired
    .connector_layer(tower::timeout::TimeoutLayer::new(Duration::from_millis(50)))
    .connector_layer(tower::limit::concurrency::ConcurrencyLimitLayer::new(2))
    .build()
    .unwrap();

Performance ramifications

I took care to ensure that there is no meaningful overhead when the caller does not use any custom layers: we just use the base service on its own. We could probably optimize a tiny bit beyond the existing state by splitting the base service into one with a timeout and one without, to avoid that branch on every connection, but I didn't bother. My focus was mostly on avoiding indirection and allocation.

For cases with custom layers, I took the middle ground of using generics on the input side, but type erasing when constructing the tower stack to avoid badly polluting the rest of the library with generics. We are able to avoid muddying the input API with generics (which would be a breaking change) by initializing our builder with the no-op tower_layer::Identity layer.

The custom layer approach does involve an extra allocation/v-table, since we are using BoxCloneSyncService on the very outside of the stack to type-erase our layer stack. The generic input bounds let us avoid an additional indirection layer per custom layer, at least.

My feeling is, if the caller cares to avoid that final allocation, they probably want to construct a lower level client anyway. Theoretically we could flatten the current Box::pin() we do for the base service, into the outer Box::pin(), but it would take changes to the Connect trait in hyper-util.
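To make the "generic on the input side, type-erased once at the end" shape concrete, here is a minimal std-only sketch. `Service`, `Layer`, `Identity`, and `Stack` are simplified synchronous stand-ins modeled on the tower types of the same names, not tower's real signatures; `Base`, `Tag`, `Tagged`, and `Builder` are hypothetical. Following the sample usage above, a layer added later wraps the ones added earlier.

```rust
// Simplified, synchronous stand-in for tower's Service trait.
trait Service {
    fn call(&self, req: &str) -> String;
}

// Stand-in for tower's Layer trait: wraps one service in another.
trait Layer<S> {
    type Service;
    fn layer(&self, inner: S) -> Self::Service;
}

// No-op layer: lets the builder start out without a "real" type parameter.
struct Identity;
impl<S> Layer<S> for Identity {
    type Service = S;
    fn layer(&self, inner: S) -> S {
        inner
    }
}

// Two layers composed statically; no allocation or vtable involved.
struct Stack<Inner, Outer> {
    inner: Inner,
    outer: Outer,
}
impl<S, Inner, Outer> Layer<S> for Stack<Inner, Outer>
where
    Inner: Layer<S>,
    Outer: Layer<Inner::Service>,
{
    type Service = Outer::Service;
    fn layer(&self, service: S) -> Self::Service {
        self.outer.layer(self.inner.layer(service))
    }
}

// Hypothetical base connector.
struct Base;
impl Service for Base {
    fn call(&self, req: &str) -> String {
        format!("connect({})", req)
    }
}

// Hypothetical layer that labels whatever it wraps.
struct Tag(&'static str);
struct Tagged<S> {
    name: &'static str,
    inner: S,
}
impl<S> Layer<S> for Tag {
    type Service = Tagged<S>;
    fn layer(&self, inner: S) -> Tagged<S> {
        Tagged { name: self.0, inner }
    }
}
impl<S: Service> Service for Tagged<S> {
    fn call(&self, req: &str) -> String {
        format!("{}({})", self.name, self.inner.call(req))
    }
}

// The builder stays generic over the accumulated layer stack...
struct Builder<L> {
    layers: L,
}
impl Builder<Identity> {
    fn new() -> Self {
        Builder { layers: Identity }
    }
}
impl<L> Builder<L> {
    // Each new layer wraps the layers added before it.
    fn layer<N>(self, new: N) -> Builder<Stack<L, N>> {
        Builder { layers: Stack { inner: self.layers, outer: new } }
    }

    // ...and erases the type exactly once, on the very outside.
    fn build(self) -> Box<dyn Service>
    where
        L: Layer<Base>,
        L::Service: Service + 'static,
    {
        Box::new(self.layers.layer(Base))
    }
}

fn demo() -> String {
    let service = Builder::new()
        .layer(Tag("timeout"))
        .layer(Tag("limit"))
        .build();
    service.call("example.com")
}

fn main() {
    println!("{}", demo());
}
```

In this model the single `Box` in `build()` plays the role that BoxCloneSyncService plays in the description above.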

Relationship to connect_timeout() config

If the caller specifies any custom layers, we implicitly hoist builder.connect_timeout() settings to an outermost TimeoutLayer. I feel that this is what the average caller would expect. If they need to move a timeout somewhere else in the stack, they can just compose the TimeoutLayer directly in their stack. I have doc comments showing this case directly.

Meanwhile, in the case of no custom layers, we just keep the old behavior of evaluating the timeout directly inside the base tower service future. We do it that way since using a separate timeout layer forces an extra Box::pin, as the tokio Timeout future is !Unpin.

I would have preferred to always use the timeout layer approach, but we're currently unconditionally Box::pin-ing inside the base service due to bounds imposed by the underlying Connect trait in hyper-util. Didn't want to go deeper there.
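The placement rule described in this section can be summarized as a small decision function. This is only a model of the behavior as described here, not the PR's code; `Stage` and `plan_stack` are hypothetical names, and the plan is listed outermost first.

```rust
use std::time::Duration;

// One entry in the connector stack, listed outermost first.
#[derive(Debug, PartialEq)]
enum Stage {
    // connect_timeout() hoisted into its own outermost layer.
    TimeoutLayer(Duration),
    // A caller-supplied layer, identified by a label for this sketch.
    Custom(&'static str),
    // The base connector, optionally enforcing the timeout inline.
    Base { inline_timeout: Option<Duration> },
}

// `custom` is in the order the caller added the layers; layers added
// later wrap the ones added earlier.
fn plan_stack(connect_timeout: Option<Duration>, custom: &[&'static str]) -> Vec<Stage> {
    if custom.is_empty() {
        // No custom layers: keep the timeout inside the base service future.
        return vec![Stage::Base { inline_timeout: connect_timeout }];
    }
    let mut plan = Vec::new();
    if let Some(timeout) = connect_timeout {
        // Custom layers present: the timeout becomes the outermost layer.
        plan.push(Stage::TimeoutLayer(timeout));
    }
    plan.extend(custom.iter().rev().copied().map(Stage::Custom));
    plan.push(Stage::Base { inline_timeout: None });
    plan
}

fn main() {
    let timeout = Some(Duration::from_millis(100));
    println!("{:?}", plan_stack(timeout, &[]));
    println!("{:?}", plan_stack(timeout, &["timeout_50ms", "concurrency_limit"]));
}
```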

Added dependency on `tower`

I want to call out that this adds a dependency on tower, with features util and timeout. Previously we only had tower_service. I need this for:

tower::ServiceBuilder
tower::Layer
tower::layer::util::Identity
tower::layer::util::Stack
tower::timeout::TimeoutLayer

Also some additional dev dependencies to show some sample usages in tests/examples.

I could see an argument for putting this behind a feature flag. My feeling was, tower pluggability of reqwest is only going to grow, and it probably doesn't make sense to keep as a feature gate long term since it will (perhaps) become a core feature as the tower middleware ecosystem grows.

But, glad to throw this functionality behind a feature flag, up to the maintainers. I probably would switch to type erasing at every passed in tower layer at that point, since managing all the generic bounds w/r/t conditional compiling sounds miserable.

Testing

The tests should be fairly resilient. There are integration tests probing behavior of the connector with timeout, concurrency limit, and no-op layers, including both blocking and non-blocking clients.

I wrote a custom layer in the /tests/ directory that injects arbitrary delays. Previously the only support for connect delays we had was all-or-nothing. This was handy for testing things like concurrency limits on the client usage without doing more complex server construction.
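For readers unfamiliar with the idea, a delay wrapper can be sketched in a few lines. The version below is a blocking, std-only analogue and not the async tower layer from the tests directory; `Connect`, `Base`, `Delay`, and `timed_connect` are hypothetical names.

```rust
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical, synchronous stand-in for a connector service.
trait Connect {
    fn connect(&self, host: &str) -> String;
}

struct Base;
impl Connect for Base {
    fn connect(&self, host: &str) -> String {
        format!("connected to {}", host)
    }
}

// Wraps any connector and stalls for `delay` before delegating to it.
struct Delay<S> {
    inner: S,
    delay: Duration,
}
impl<S: Connect> Connect for Delay<S> {
    fn connect(&self, host: &str) -> String {
        thread::sleep(self.delay);
        self.inner.connect(host)
    }
}

// Connects through the delay wrapper and reports how long it took.
fn timed_connect(delay: Duration) -> (String, Duration) {
    let service = Delay { inner: Base, delay };
    let start = Instant::now();
    let outcome = service.connect("example.com");
    (outcome, start.elapsed())
}

fn main() {
    let (outcome, elapsed) = timed_connect(Duration::from_millis(20));
    println!("{} after {:?}", outcome, elapsed);
}
```

Because the delay is set per wrapper instance, a test can slow down only the connections it cares about, which is what makes it useful for exercising timeouts and concurrency limits.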

I did test my new example locally; it resolves properly via the background channel TLS handshake.

I would have preferred to unit test the composed stacks more directly inside async_impl::connect, but I didn't see a convenient way to construct a throwaway client inside that module. It seemed like most cases where we construct a client are in the integration tests. Please correct me if I'm missing something :)

src/blocking/client.rs

src/async_impl/client.rs

jlizen · 2024-12-12T16:16:01Z

Some side questions (I can also open an issue in tower if preferred):

Accepting arbitrary layers without a lot of indirection was pretty annoying. Would it be useful to show an example of how to construct the stack with generics in the tower examples? Or is that too niche? This is mostly relevant for libraries; applications will generally have concrete types.
The 'spawn a service future via a delegated executor' service used in my example is a bit of boilerplate. We have the 'buffer' layer, but it processes tasks sequentially rather than spawning them. Would this be useful to contribute as a tower layer, probably adjusted to be a bit less opinionated and accepting just a runtime handle or something? And behind a feature flag, ofc. Or is that too niche?
The 'delay' layer I wrote for the tests is very simple but perhaps useful for others doing testing? Is that worth adding to tower (behind a feature flag)? Or is it too trivial?

src/async_impl/client.rs

Cargo.toml

seanmonstar

Excellent PR! The write-up is very clear, I appreciate the clear example and all the tests. And the proposed API feels good too.

I have one concern commented inline.

src/async_impl/client.rs

jlizen · 2024-12-16T21:46:00Z

New commit up that type erases throughout the tower stack. This means we no longer are storing the input layers as concrete generics, so ClientBuilder is back to no generic bounds.

Input interface / usability

There still is a generic in the connector_layer() input bounds. It should be easily inferred by whatever layer is being passed in.

Error messages could get slightly confusing in case the caller tries to pass in a non-Clone + Send + Sync + 'static layer or one that is opinionated about its inner service's future / output. But, that shouldn't generally be a problem with any off the shelf middleware. In case somebody is using a custom layer, I think they can probably troubleshoot.

Type exposure

This allowed us to move the ConnectorService visibility back to pub(crate).

We still do expose Conn, our service's response type. Right now all the implementation details of that struct are hidden. We might eventually want to expose more of a handle to the underlying connection in that response value, for use on the backswing. Seems like it could open up a lot of nice functionality around pool management.

Glad to tuck Conn away more deeply though if you have any suggestions.

Performance

There continues to be no perf impact on the simple case (no custom layers, maybe a connect_timeout).

For the custom layers case, the type erasing boxes are around:

1 - the base connector service
N - each custom input layer (previously 1 total w/ generics)
1 - the combination of all default outer layers (optional .connect_timeout's TimeoutLayer, and MapErr for external -> internal type conversions)

This results in a minimum of 1 extra allocation compared to the generic case, due to boxing the inner service. There is also an extra allocation per custom layer beyond the first, which the caller can avoid by pre-composing their stack into a single layer before passing it in.

I didn't add any notes on this or adjust the examples to pre-compose layers. Performance impact should be virtually indistinguishable for most cases, and we prefer to index on ease of use.
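For anyone who does care about the per-layer box, the pre-composition trick looks roughly like the sketch below. It is a std-only model: `Service`, `Layer`, and `Stack` are simplified synchronous stand-ins for the tower types, `Builder`, `Tag`, `Tagged`, and `Base` are hypothetical, and the `boxes` counter tracks only the base service and the custom layers (the box around the default outer layers is left out).

```rust
// Simplified, synchronous stand-ins for tower's Service and Layer traits.
trait Service {
    fn call(&self, req: &str) -> String;
}
trait Layer<S> {
    type Service;
    fn layer(&self, inner: S) -> Self::Service;
}

// A type-erased service; it must itself be a Service so layers can wrap it.
type BoxService = Box<dyn Service>;
impl Service for BoxService {
    fn call(&self, req: &str) -> String {
        (**self).call(req)
    }
}

struct Base;
impl Service for Base {
    fn call(&self, req: &str) -> String {
        format!("connect({})", req)
    }
}

// Hypothetical layer that labels whatever it wraps.
struct Tag(&'static str);
struct Tagged<S> {
    name: &'static str,
    inner: S,
}
impl<S> Layer<S> for Tag {
    type Service = Tagged<S>;
    fn layer(&self, inner: S) -> Tagged<S> {
        Tagged { name: self.0, inner }
    }
}
impl<S: Service> Service for Tagged<S> {
    fn call(&self, req: &str) -> String {
        format!("{}({})", self.name, self.inner.call(req))
    }
}

// Static composition of two layers: Stack(inner, outer).
struct Stack<A, B>(A, B);
impl<S, A, B> Layer<S> for Stack<A, B>
where
    A: Layer<S>,
    B: Layer<A::Service>,
{
    type Service = B::Service;
    fn layer(&self, service: S) -> Self::Service {
        self.1.layer(self.0.layer(service))
    }
}

// Non-generic builder: every connector_layer() call re-boxes the stack.
struct Builder {
    service: BoxService,
    boxes: usize,
}
impl Builder {
    fn new() -> Self {
        Builder { service: Box::new(Base), boxes: 1 }
    }
    fn connector_layer<L>(self, layer: L) -> Self
    where
        L: Layer<BoxService>,
        L::Service: Service + 'static,
    {
        Builder {
            service: Box::new(layer.layer(self.service)),
            boxes: self.boxes + 1,
        }
    }
}

// Two separate calls: one box per custom layer.
fn separate() -> (String, usize) {
    let b = Builder::new()
        .connector_layer(Tag("timeout"))
        .connector_layer(Tag("limit"));
    (b.service.call("example.com"), b.boxes)
}

// Same layers pre-composed into one: a single box for both.
fn precomposed() -> (String, usize) {
    let b = Builder::new().connector_layer(Stack(Tag("timeout"), Tag("limit")));
    (b.service.call("example.com"), b.boxes)
}

fn main() {
    println!("{:?}", separate());
    println!("{:?}", precomposed());
}
```

Both variants produce the same call chain; only the number of boxes differs.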

jlizen force-pushed the master branch from bcc62fb to 47b87d0 on December 12, 2024 16:11

jlizen force-pushed the master branch 2 times, most recently from 9c3cbe7 to e9e0eaa on December 12, 2024 16:46

arielb1 reviewed Dec 12, 2024

src/async_impl/client.rs

weihanglo reviewed Dec 12, 2024

Cargo.toml

jlizen force-pushed the master branch 14 times, most recently from 8ec98c8 to 027abf4 on December 12, 2024 21:39

feat: allow pluggable tower layers in connector service stack

a4f058e

jlizen force-pushed the master branch from 027abf4 to a4f058e on December 12, 2024 23:48

seanmonstar requested changes Dec 13, 2024

src/async_impl/client.rs

connector layers: hide all generics behind boxing

3b6a3c8

jlizen force-pushed the master branch from 1b55cda to 3b6a3c8 on December 16, 2024 21:35

jlizen requested a review from seanmonstar December 17, 2024 21:18
