Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: allow pluggable tower layers in connector service stack #2496

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

jlizen
Copy link
Contributor

@jlizen jlizen commented Dec 12, 2024

Background

Closes: #2490

This PR adds support for injecting arbitrary layers into the Connector's tower service stack, provided their corresponding futures have Send + Sync + 'static bounds (as well as the layer itself, in the blocking client's case).

I included a working example showing my specific use case that this unblocks: delegating the connection future to a secondary, low priority executor. (This has to do with perf issues with long-polling TLS futures, ref: rustls/tokio-rustls#94)

This required bumping the MSRV to 1.64 to get access to: tower-rs/tower#777

Sample usage:

let client = reqwest::Client::builder()
                  // resolved to outermost layer, so before the semaphore permit is attempted
                  .connect_timeout(Duration::from_millis(100)) 
                  // underneath the concurrency check, so only after a semaphore permit is acquired
                  .connector_layer(tower::timeout::TimeoutLayer::new(Duration::from_millis(50)))
                  .connector_layer(tower::limit::concurrency::ConcurrencyLimitLayer::new(2))
                  .build()
                  .unwrap();

Performance ramifications

I took care to ensure that there is no meaningful overhead in the case of the caller not using any custom layers. We just use the base service on its own, basically. We actually could probably optimize it a tiny bit further beyond the existing state by splitting the base service into one with a timeout and one without, to avoid that branching every connection, but I didn't bother. My focus was just on avoiding indirection/allocation, mostly.

For cases with custom layers, I took the middle ground of using generics on the input side, but type erasing when constructing the tower stack to avoid badly polluting the rest of the library with generics. We are able to avoid muddying the input API with generics (which would be a breaking change) by initializing our builder with the non-op tower_layer::Identity layer.

The custom layer approach approach does involve an extra allocation/v-table, since we are using BoxCloneSyncService on the very outside of the stack to type-erase our layer stack. The generic input bounds let us avoid an additional indirection layer per custom layer, at least.

My feeling is, if the caller cares to avoid that final allocation, they probably want to construct a lower level client anyway. Theoretically we could flatten the current Box::pin() we do for the base service, into the outer Box::pin(), but it would take changes to the Connect trait in hyper-util.

Relationship to connect_timeout() config

If the caller specifies any custom layers, we implicitly hoist builder.connect_timeout() settings to an outermost TimeoutLayer. I feel that this is what the average caller would expect. If they need to move a timeout somewhere else in the stack, they can just compose the TimeoutLayer directly in their stack. I have doc comments showing this case directly.

Meanwhile, in the case of no custom layers, we just keep the old behavior of evaluating the timeout directly inside the base tower service future. We do it that way since using a separate timeout layer forces an extra Box::pin as the tokio Timeout future is Unpin.

I would have preferred to always use the timeout layer approach, but we're currently unconditionally Box::pin-ing inside the base service due to bounds imposed by the underlying Connect trait in hyper-util. Didn't want to go deeper there.

Added dependency on tower

I want to call out that this adds a dependency on tower, with features util and timeout. Previously we only had tower_service. I need this for:

tower::ServiceBuilder,
tower::Layer
tower::layer::util::Identity
tower::layer::util::Stack
tower::timeout::TimeoutLayer

Also some additional dev dependencies to show some sample usages in tests/examples.

I could see an argument for putting this behind a feature flag. My feeling was, tower pluggability of reqwest is only going to grow, and it probably doesn't make sense to keep as a feature gate long term since it will (perhaps) become a core feature as the tower middleware ecosystem grows.

But, glad to throw this functionality behind a feature flag, up to the maintainers. I probably would switch to type erasing at every passed in tower layer at that point, since managing all the generic bounds w/r/t conditional compiling sounds miserable.

Testing

The tests should be fairly resilient. There are integration tests probing behavior of the connector with timeout, concurrency limit, and non-op layers, including both blocking and non-blocking clients.

I wrote a custom layer in the /tests/ directory that injects arbitrary delays. Previously the only support for connect delays we had was all-or-nothing. This was handy for testing things like concurrency limits on the client usage without doing more complex server construction.

I did test my new example locally, it resolves properly via the background channel tls handshake.

I would have preferred to unit test the composed stacks more directly inside async_impl::connect, but I didn't see a convenient way to construct a throwaway client inside that module. It seemed like most cases where we construct a client are in the integration tests. Please correct me if I'm missing something :)

src/blocking/client.rs Outdated Show resolved Hide resolved
src/async_impl/client.rs Outdated Show resolved Hide resolved
@jlizen
Copy link
Contributor Author

jlizen commented Dec 12, 2024

Some side questions (I can also open an issue in tower if preferred):

  • Accepting arbitrary layers without a lot of indirection was pretty annoying. Would it be useful to show an example of how to construct the stack with generics in the tower examples? Or too niche? This mostly is relevant for libraries, applications will generally have concrete types.
  • The 'spawn a service future via a delegated executor' service used in my example is a bit of boilerplate. We have the 'buffer' layer, but it processes tasks sequentially rather than spawning them. Would this be useful to contribute as a tower layer, probably adjusted to be a bit less opinionated and accepting just a runtime handle or something? And behind a feature flag, ofc. Or too niche?
  • The 'delay' layer I wrote for the tests is very simple but perhaps useful for others doing testing? Is that worth adding to tower (behind a future flag)? Or too trivial?

@jlizen jlizen force-pushed the master branch 2 times, most recently from 9c3cbe7 to e9e0eaa Compare December 12, 2024 16:46
src/async_impl/client.rs Outdated Show resolved Hide resolved
Cargo.toml Outdated Show resolved Hide resolved
@jlizen jlizen force-pushed the master branch 14 times, most recently from 8ec98c8 to 027abf4 Compare December 12, 2024 21:39
Copy link
Owner

@seanmonstar seanmonstar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent PR! The write-up is very clear, I appreciate the clear example and all the tests. And the proposed API feels good too.

I have one concern commented inline.

src/async_impl/client.rs Outdated Show resolved Hide resolved
@jlizen
Copy link
Contributor Author

jlizen commented Dec 16, 2024

New commit up that type erases throughout the tower stack. This means we no longer are storing the input layers as concrete generics, so ClientBuilder is back to no generic bounds.

Input interface / usability

There still is a generic in the connector_layer() input bounds. It should be easily inferred by whatever layer is being passed in.

Error messages could get slightly confusing in case the caller tries to pass in a non-Clone + Send + Sync + 'static layer or one that is opinionated about its inner service's future / output. But, that shouldn't generally be a problem with any off the shelf middleware. In case somebody is using a custom layer, I think they can probably troubleshoot.

Type exposure

This allowed us to move the ConnectorService visibility back to pub(crate).

We still do expose Conn, our service's response type. Right now all the implementation details of that struct are hidden. I think maybe we might eventually want to expose more of a handle in that response value to the underlying connection, for use on the backswing. Seems like it could open up a lot of nice functionality around pool management.

Glad to tuck Conn away more deeply though if you have any suggestions.

Performance

There continues to be no perf impact on the simple case (no custom layers, maybe a connect_timeout).

For the custom layers case, the type erasing boxes are around:

  • 1 - the base connector service
  • N - each custom input layer (previously 1 total w/ generics)
  • 1 - the combination of all default outer layers (optional .connect_timeout's TimeoutLayer, and MapErr for external -> internal type conversions)

This results in a minimum of 1 extra allocation compared to the generic case, due to boxing the inner service. There is also an extra allocation per custom layer beyond the first, which the caller can avoid by pre-composing their stack into a single layer before passing it in.

I didn't add any notes on this or adjust the examples to pre-compose layers. Performance impact should be virtually indistinguishable for most cases, and we prefer to index on ease of use.

@jlizen jlizen requested a review from seanmonstar December 17, 2024 21:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature request] Allow composable tower stack for Connector service
4 participants