From 06ea250563d99142fa4502c570de4020faa3a1d1 Mon Sep 17 00:00:00 2001 From: Yuki Saito <awsaito@amazon.com> Date: Tue, 17 Jan 2023 11:34:02 -0600 Subject: [PATCH 01/18] Add RFC: providing fallback credentials on timeout --- design/src/SUMMARY.md | 1 + design/src/rfcs/overview.md | 2 + ...oviding_fallback_credentials_on_timeout.md | 251 ++++++++++++++++++ 3 files changed, 254 insertions(+) create mode 100644 design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md diff --git a/design/src/SUMMARY.md b/design/src/SUMMARY.md index 2b79d89d61..381b27987a 100644 --- a/design/src/SUMMARY.md +++ b/design/src/SUMMARY.md @@ -53,6 +53,7 @@ - [RFC-0028: SDK Credential Cache Type Safety](./rfcs/rfc0028_sdk_credential_cache_type_safety.md) - [RFC-0029: Finding New Home for Credential Types](./rfcs/rfc0029_new_home_for_cred_types.md) - [RFC-0030: Serialization And Deserialization](./rfcs/rfc0030_serialization_and_deserialization.md) + - [RFC-0031: Providing Fallback Credentials on Timeout](./rfcs/rfc0031_providing_fallback_credentials_on_timeout.md) - [Contributing](./contributing/overview.md) - [Writing and debugging a low-level feature that relies on HTTP](./contributing/writing_and_debugging_a_low-level_feature_that_relies_on_HTTP.md) diff --git a/design/src/rfcs/overview.md b/design/src/rfcs/overview.md index 000286b82f..4109cd56fb 100644 --- a/design/src/rfcs/overview.md +++ b/design/src/rfcs/overview.md @@ -39,3 +39,5 @@ - [RFC-0027: Endpoints 2.0](./rfc0027_endpoints_20.md) - [RFC-0028: SDK Credential Cache Type Safety](./rfc0028_sdk_credential_cache_type_safety.md) - [RFC-0029: Finding New Home for Credential Types](./rfc0029_new_home_for_cred_types.md) +- [RFC-0030: Serialization And Deserialization](./rfc0030_serialization_and_deserialization.md) +- [RFC-0031: Providing Fallback Credentials on Timeout](./rfc0031_providing_fallback_credentials_on_timeout.md) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md new file mode 100644 index 0000000000..1591839478 --- /dev/null +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -0,0 +1,251 @@ +RFC: Providing fallback credentials on external timeout +======================================================= + +> Status: RFC +> +> Applies to: client + +For a summarized list of proposed changes, see the [Changes Checklist](#changes-checklist) section. + +This RFC proposes a fallback mechanism for credentials providers on external timeout (see the [Terminology](#terminology) section), allowing them to continue serving (possibly stale) credentials for the sake of overall reliability of the intended service; the IMDS credentials provider is an example that must fulfill such a requirement to support static stability. + +Terminology +----------- + +- External timeout: A timeout within which a call to the `provide_credentials` method must complete. `provide_credentials` returns a future and if it does not complete within this timeout, it gets dropped on the floor and there is no output retrieved from the future. +- Internal timeout: A timeout imposed on functions/methods internally called by `provide_credentials` that, when failing to meet any of them, renders `provide_credentials` return an `CredentialsError`. For instance, internal timeout includes connection timeout, TLS negotiation timeout, and HTTP request timeout, if `provide_credentials` needs to fetch credentials via HTTP(S). This is a kind of timeout `provide_credentials` can react and handle it accordingly, i.e. returning fallback credentials or an `CredentialsError`. + +Assumption +---------- + +This RFC is concerned only with external timeout, as the cost of designing APIs poorly is much higher than that for internal timeout. The former will affect a public trait implemented by all credentials providers whereas the latter can be handled locally by individual credentials providers without affecting one another. + +Problem +------- + +We have mentioned static stability. Supporting it calls for the following functional requirement, among others: +- REQ 1: Once a credentials provider has served credentials, it should continue serving them in the event of a timeout (whether internal or external) while obtaining refreshed credentials. + +Today, we have the following trait method to obtain credentials: +```rust +fn provide_credentials<'a>(&'a self) -> future::ProvideCredentials<'a> +where + Self: 'a, +``` +This method can be used in async contexts, which means that nothing stops it from being raced against a timeout future, as demonstrated by the following code snippet from `LazyCredentialsCache`: + +```rust +let timeout_future = self.sleeper.sleep(self.load_timeout); // by default self.load_timeout is 5 seconds. +// --snip-- +let future = Timeout::new(provider.provide_credentials(), timeout_future); +let result = cache + .get_or_load(|| { + async move { + let credentials = future.await.map_err(|_err| { + CredentialsError::provider_timed_out(load_timeout) + })??; + // --snip-- + } + }).await; +// --snip-- +``` +This creates an external timeout for `provide_credentials`. If `timeout_future` wins the race, a future for `provide_credentials` gets dropped, `timeout_future` returns an error, and the error is mapped to `CredentialsError::ProviderTimedOut` and returned. This makes it impossible for the variable `provider` above to serve credentials as stated in REQ 1. + +A slightly more complex use case involves `CredentialsProviderChain`. It is a manifestation of the chain of responsibility pattern and keeps calling the `provide_credentials` method on each credentials provider down the chain until credentials are returned by one of them. In addition to REQ 1, we have the following functional requirement with respect to `CredentialsProviderChain`: +- REQ 2: Once a credentials provider in the chain has returned credentials, it should continue serving them even in the event of a timeout (whether internal or external) without falling back to another credentials provider. + +Referring back to the code snippet above, we analyze two relevant cases (and suppose provider 2 below must meet REQ 1 and REQ 2 in each case): + +**Case 1:** Provider 2 successfully loaded credentials but later failed to do so because an external timeout kicked in. + +<p align="center"> +<img width="750" alt="chain-provider-ext-timeout-1" src="https://user-images.githubusercontent.com/15333866/212421638-d08e4821-2dbe-497f-82c5-c78aab8acbe9.png"> +</p> + +The figure above illustrates an example. This `CredentialsProviderChain` consists of three credentials providers. When `CredentialsProviderChain::provide_credentials` is called, provider 1's `provide_credentials` is called but does not find credentials so passes the torch to provider 2, which in turn successfully loads credentials and returns them. The next time the method is called, provider 1 does not find credentials but neither does provider 2 this time, because an external timeout by `timeout_future` given to the whole chain kicked in and the future is dropped while provider 2's `provide_credentials` was running. Given the functional requirements, provider 2 should return the previously available credentials but today the code snippet from `LazyCredentialsCache` returns a `CredentialsError::ProviderTimedOut` instead. + +**Case 2:** Provider 2 successfully loaded credentials but later was not reached because its preceding provider was still running when an external timeout kicked in. + +<p align="center"> +<img width="750" alt="chain-provider-ext-timeout-2" src="https://user-images.githubusercontent.com/15333866/212421712-8c6eab11-a0c1-4229-8ba3-67b0bb6056e7.png"> +</p> + +The figure above illustrates an example with the same setting as the previous figure. Again, when `CredentialsProviderChain::provide_credentials` is called the first time, provider 1 does not find credentials but provider 2 does. The next time the method is called, provider 1 is still executing `provide_credentials` and then an external time by `timeout_future` kicked in. Consequently, the execution of `CredentialsProviderChain::provide_credentials` has been terminated. Given the functional requirements, provider 2 should return the previously available credentials but today the code snippet from `LazyCredentialsCache` returns `CredentialsError::ProviderTimedOut` instead. + + +Proposal +-------- +To address the problem in the previous section, we propose to add a new method to the `ProvideCredentials` trait called `on_timeout` that looks something like this: +```rust +mod future { + // --snip-- + + pub struct OnTimeout<'a>(NowOrLater<Option<Credentials>, BoxFuture<'a, Option<Credentials>>>); + + // impls for OnTimeout similar to those for the ProvideCredentials future newtype +} + +pub trait ProvideCredentials: Send + Sync + std::fmt::Debug { + // --snip-- + + fn on_timeout<'a>(&'a self) -> future::OnTimeout<'a> { + future::OnTimeout::ready(None) + } +} +``` +This method allows credentials providers to have a fallback mechanism on an external timeout and to serve credentials to users if needed. It comes with a default implementation to return a future that immediately yields `None`. Credentials providers that need a different behavior can choose to override the method. + +The user experience for the code snippet in question will look like this once this proposal is implemented: +```rust +let timeout_future = self.sleeper.sleep(self.load_timeout); // by default self.load_timeout is 5 seconds. +// --snip-- +let future = Timeout::new(provider.provide_credentials(), timeout_future); +let result = cache + .get_or_load(|| { + async move { + let credentials= match future.await { + Ok(creds) => creds?, + Err(_err) => match provider.on_timeout().await { // can provide fallback credentials + Some(creds) => creds, + None => return Err(CredentialsError::provider_timed_out(load_timeout)), + } + }; + // --snip-- + } + }).await; +// --snip-- +``` + + +How to actually implement this RFC +---------------------------------- + +Almost all credentials providers do not have to implement their own `on_timeout` except for `CredentialsProviderChain` (`ImdsCredentialsProvider` also needs to implement `on_timeout` when we are adding static stability support to it but that is outside the scope of this RFC). + +Considering the two cases we analyzed above, implementing `CredentialsProviderChain::on_timeout` is not so straightforward. Keeping track of whose turn in the chain it is to call `provide_credentials` when an external timeout has ocurred is a challening task. Even if we figured it out, that would still not satisfy `Case 2` above, because it was provider 1 that was actively running when the external timeout kicked in, but the chain should return credentials from provider 2, not from provider 1. + +With all that taken into account, we can consider starting with the following simple approach: +```rust +impl ProvideCredentials for CredentialsProviderChain { + // --snip-- + + fn on_timeout<'a>(&'a self) -> future::OnTimeout<'a> { + future::OnTimeout::new(async move { + for (_, provider) in &self.providers { + match provider.on_timeout().await { + creds @ Some(_) => return creds, + None => {} + } + } + None + }) + } +} +``` +`CredentialsProviderChain::on_timeout` will invoke each provider's `on_timeout` method until credentials are returned by one of them. It ensures that the updated code snippet for `LazyCredentialsCache` can return credentials from provider 2 in both `Case 1` and `Case 2`. Even if `timeout_future` wins the race, the execution subsequently calls `provider.on_timeout().await` to obtain fallback credentials from provider 2, assuming provider 2's `on_timeout` is implemented to return credentials retrieved on the first call to `CredentialsProviderChain::provide_credentials`. + +The downside of this simple approach is that the behavior is not clear if more than one credentials provider in the chain can return credentials from their `on_timeout`. Note, however, that it is the exception rather than the norm for a provider's `on_timeout` to return fallback credentials, at least at the time of writing (01/13/2023). The fact that it returns fallback credentials means that the provider successfully loaded credentials at least once, and it usually continues serving credentials on subsequent calls to `provide_credentials`. + +Should we have more than one provider in the chain that can potentially return fallback credentials from `on_timeout`, we could configure the behavior of `CredentialsProviderChain` managing in what order and how each `on_timeout` should be executed; it can be possible to introduce a new API without breaking changes. That being said, we first need to investigate and understand the use case behind it to design the right API. + +Alternative +----------- + +In this section, we will describe our alternative approach that we ended up dismissing as unworkable. + +Instead of `on_timeout`, we considered the following method to be added to the `ProvideCredentials` trait: +```rust +pub trait ProvideCredentials: Send + Sync + std::fmt::Debug { + // --snip-- + + /// Returns a future that provides credentials within the given `timeout`. + /// + /// The default implementation races `provide_credentials` against + /// a timeout future created from `timeout`. + fn provide_credentials_with_timeout<'a>( + &'a self, + sleeper: Arc<dyn AsyncSleep>, + timeout: Duration, + ) -> future::ProvideCredentials<'a> + where + Self: 'a, + { + let timeout_future = sleeper.sleep(timeout); + let future = Timeout::new(self.provide_credentials(), timeout_future); + future::ProvideCredentials::new(async move { + let credentials = future + .await + .map_err(|_err| CredentialsError::provider_timed_out(timeout))?; + credentials + }) + } +``` +`provide_credentials_with_timeout` encapsulated the timeout race and allowed users to specify how long the external timeout for `provide_credentials` would be. The code snippet from `LazyCredentialsCache` then looked like +```rust +let sleeper = Arc::clone(&self.sleeper); +let load_timeout = self.load_timeout; // by default self.load_timeout is 5 seconds. +// --snip-- +let result = cache + .get_or_load(|| { + async move { + let credentials = provider + .provide_credentials_with_timeout(sleeper, load_timeout) + .await?; + // --snip-- + } + }).await; +// --snip-- +``` +However, implementing `CredentialsProviderChain::provide_credentials_with_timeout` quickly ran into the following problem: +```rust +impl ProvideCredentials for CredentialsProviderChain { + // --snip-- + + fn provide_credentials_with_timeout<'a>( + &'a self, + sleeper: Arc<dyn AsyncSleep>, + timeout: Duration, + ) -> future::ProvideCredentials<'a> + where + Self: 'a, + { + future::ProvideCredentials::new(self.credentials_with_timeout(sleeper, timeout)) + } +} + +impl CredentialsProviderChain { + // --snip-- + + async fn credentials_with_timeout( + &self, + sleeper: Arc<dyn AsyncSleep>, + timeout: Duration, + ) -> provider::Result { + for (_, provider) in &self.providers { + match provider + .provide_credentials_with_timeout(Arc::clone(&sleeper), /* how do we calculate timeout for each provider ? */) + .await + { + Ok(credentials) => { + return Ok(credentials); + } + Err(CredentialsError::ProviderTimedOut(_)) => { + // --snip-- + } + Err(err) => { + // --snip-- + } + } + } + Err(CredentialsError::provider_timed_out(timeout)) + } +``` +There are mainly two problems with this approach. The first problem is that as shown above, there is no sensible way to calculate a timeout for each provider in the chain. The second problem is that exposing a parameter like `timeout` at a public trait's level is giving too much control to users; delegating overall timeout to the individual provider means each provider has to get it right. + +Changes checklist +----------------- + +- [ ] Add `on_timeout` method to the `ProvideCredentials` trait with the default implementation +- [ ] Implement the `OnTimeout` future newtype +- [ ] Implement `CredentialsProviderChain::on_timeout` +- [ ] Add unit tests for `Case 1` and `Case 2` From 77c2b4f0d86c13c08bafd3086902523b5e55bd42 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 15:43:02 -0600 Subject: [PATCH 02/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: Zelda Hessler <zhessler@amazon.com> --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 1591839478..97bef9e7aa 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -7,7 +7,7 @@ RFC: Providing fallback credentials on external timeout For a summarized list of proposed changes, see the [Changes Checklist](#changes-checklist) section. -This RFC proposes a fallback mechanism for credentials providers on external timeout (see the [Terminology](#terminology) section), allowing them to continue serving (possibly stale) credentials for the sake of overall reliability of the intended service; the IMDS credentials provider is an example that must fulfill such a requirement to support static stability. +This RFC proposes a fallback mechanism for credentials providers on external timeout (see the [Terminology](#terminology) section), allowing them to continue serving (possibly stale) credentials for the sake of overall reliability of the intended service; The IMDS credentials provider is an example that must fulfill such a requirement to support static stability. Terminology ----------- From 8904a5e70dfdbd4412f809669e2698026142a25a Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 15:43:41 -0600 Subject: [PATCH 03/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: Zelda Hessler <zhessler@amazon.com> --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 97bef9e7aa..9b0b9d39b4 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -12,7 +12,7 @@ This RFC proposes a fallback mechanism for credentials providers on external tim Terminology ----------- -- External timeout: A timeout within which a call to the `provide_credentials` method must complete. `provide_credentials` returns a future and if it does not complete within this timeout, it gets dropped on the floor and there is no output retrieved from the future. +- External timeout: The name of the timeout that occurs when a duration elapses before an async call to `provide_credentials` returns. In this case, `provide_credentials` returns no credentials. - Internal timeout: A timeout imposed on functions/methods internally called by `provide_credentials` that, when failing to meet any of them, renders `provide_credentials` return an `CredentialsError`. For instance, internal timeout includes connection timeout, TLS negotiation timeout, and HTTP request timeout, if `provide_credentials` needs to fetch credentials via HTTP(S). This is a kind of timeout `provide_credentials` can react and handle it accordingly, i.e. returning fallback credentials or an `CredentialsError`. Assumption From d5657d34380fdab66a179aefab087305e98758f8 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 15:44:43 -0600 Subject: [PATCH 04/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: Zelda Hessler <zhessler@amazon.com> --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 9b0b9d39b4..339311db4b 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -13,7 +13,7 @@ Terminology ----------- - External timeout: The name of the timeout that occurs when a duration elapses before an async call to `provide_credentials` returns. In this case, `provide_credentials` returns no credentials. -- Internal timeout: A timeout imposed on functions/methods internally called by `provide_credentials` that, when failing to meet any of them, renders `provide_credentials` return an `CredentialsError`. For instance, internal timeout includes connection timeout, TLS negotiation timeout, and HTTP request timeout, if `provide_credentials` needs to fetch credentials via HTTP(S). This is a kind of timeout `provide_credentials` can react and handle it accordingly, i.e. returning fallback credentials or an `CredentialsError`. +- Internal timeout: The name of the timeout that occurs when a duration elapses before an async call to some function, inside the implementation of `provide_credentials`, returns. Examples include connection timeouts, TLS negotiation timeouts, and HTTP request timeouts. Implementations of `provide_credentials` may handle these failures at their own discretion e.g. by returning _(possibly expired)_ credentials or a `CredentialsError`. Assumption ---------- From e4c571c3cce5ec64c19cafa46ee5d1723f7cd846 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 15:47:13 -0600 Subject: [PATCH 05/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: Zelda Hessler <zhessler@amazon.com> --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 339311db4b..130e9a67c7 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -18,7 +18,7 @@ Terminology Assumption ---------- -This RFC is concerned only with external timeout, as the cost of designing APIs poorly is much higher than that for internal timeout. The former will affect a public trait implemented by all credentials providers whereas the latter can be handled locally by individual credentials providers without affecting one another. +This RFC is concerned only with external timeouts, as the cost of poor API design is much higher than in this case than for internal timeouts. The former will affect a public trait implemented by all credentials providers whereas the latter can be handled locally by individual credentials providers without affecting one another. Problem ------- From 9323dad6893e3ab47b9adb34ddfbfedc19effdda Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 15:47:24 -0600 Subject: [PATCH 06/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: Zelda Hessler <zhessler@amazon.com> --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 130e9a67c7..ef8e00a784 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -32,7 +32,7 @@ fn provide_credentials<'a>(&'a self) -> future::ProvideCredentials<'a> where Self: 'a, ``` -This method can be used in async contexts, which means that nothing stops it from being raced against a timeout future, as demonstrated by the following code snippet from `LazyCredentialsCache`: +This method returns a future, which can be raced against a timeout future as demonstrated by the following code snippet from `LazyCredentialsCache`: ```rust let timeout_future = self.sleeper.sleep(self.load_timeout); // by default self.load_timeout is 5 seconds. From 3834c0e3988f61cb26755867568c9eee7b221ec1 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 15:50:42 -0600 Subject: [PATCH 07/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: Zelda Hessler <zhessler@amazon.com> --- ...0031_providing_fallback_credentials_on_timeout.md | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index ef8e00a784..8a934b88bb 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -39,13 +39,11 @@ let timeout_future = self.sleeper.sleep(self.load_timeout); // by default self.l // --snip-- let future = Timeout::new(provider.provide_credentials(), timeout_future); let result = cache - .get_or_load(|| { - async move { - let credentials = future.await.map_err(|_err| { - CredentialsError::provider_timed_out(load_timeout) - })??; - // --snip-- - } + .get_or_load(|| async move { + let credentials = future.await.map_err(|_err| { + CredentialsError::provider_timed_out(load_timeout) + })??; + // --snip-- }).await; // --snip-- ``` From 9be64b2b4599990d240f4e62dfd525be993f2b75 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 15:50:59 -0600 Subject: [PATCH 08/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: Zelda Hessler <zhessler@amazon.com> --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 8a934b88bb..26cc633fc4 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -68,7 +68,7 @@ The figure above illustrates an example. This `CredentialsProviderChain` consist <img width="750" alt="chain-provider-ext-timeout-2" src="https://user-images.githubusercontent.com/15333866/212421712-8c6eab11-a0c1-4229-8ba3-67b0bb6056e7.png"> </p> -The figure above illustrates an example with the same setting as the previous figure. Again, when `CredentialsProviderChain::provide_credentials` is called the first time, provider 1 does not find credentials but provider 2 does. The next time the method is called, provider 1 is still executing `provide_credentials` and then an external time by `timeout_future` kicked in. Consequently, the execution of `CredentialsProviderChain::provide_credentials` has been terminated. Given the functional requirements, provider 2 should return the previously available credentials but today the code snippet from `LazyCredentialsCache` returns `CredentialsError::ProviderTimedOut` instead. +The figure above illustrates an example with the same setting as the previous figure. Again, when `CredentialsProviderChain::provide_credentials` is called the first time, provider 1 does not find credentials but provider 2 does. The next time the method is called, provider 1 is still executing `provide_credentials` and then an external timeout by `timeout_future` kicked in. Consequently, the execution of `CredentialsProviderChain::provide_credentials` has been terminated. Given the functional requirements, provider 2 should return the previously available credentials but today the code snippet from `LazyCredentialsCache` returns `CredentialsError::ProviderTimedOut` instead. Proposal From e5db721a18f0c1547806e1edacb0fd7438f7b37f Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 15:52:32 -0600 Subject: [PATCH 09/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: Zelda Hessler <zhessler@amazon.com> --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 26cc633fc4..b3c2e8e338 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -120,7 +120,7 @@ How to actually implement this RFC Almost all credentials providers do not have to implement their own `on_timeout` except for `CredentialsProviderChain` (`ImdsCredentialsProvider` also needs to implement `on_timeout` when we are adding static stability support to it but that is outside the scope of this RFC). -Considering the two cases we analyzed above, implementing `CredentialsProviderChain::on_timeout` is not so straightforward. Keeping track of whose turn in the chain it is to call `provide_credentials` when an external timeout has ocurred is a challening task. Even if we figured it out, that would still not satisfy `Case 2` above, because it was provider 1 that was actively running when the external timeout kicked in, but the chain should return credentials from provider 2, not from provider 1. +Considering the two cases we analyzed above, implementing `CredentialsProviderChain::on_timeout` is not so straightforward. Keeping track of whose turn in the chain it is to call `provide_credentials` when an external timeout has occurred is a challenging task. Even if we figured it out, that would still not satisfy `Case 2` above, because it was provider 1 that was actively running when the external timeout kicked in, but the chain should return credentials from provider 2, not from provider 1. With all that taken into account, we can consider starting with the following simple approach: ```rust From 7f557655e5a0a592582f7e23f303fdfaa3aa82c2 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 16:00:24 -0600 Subject: [PATCH 10/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: Zelda Hessler <zhessler@amazon.com> --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index b3c2e8e338..9f97186b2f 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -149,7 +149,7 @@ Should we have more than one provider in the chain that can potentially return f Alternative ----------- -In this section, we will describe our alternative approach that we ended up dismissing as unworkable. +In this section, we will describe an alternative approach that we ended up dismissing as unworkable. Instead of `on_timeout`, we considered the following method to be added to the `ProvideCredentials` trait: ```rust From c79eee5d904b0ca165aefa73b806691201bb4f59 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 16:00:40 -0600 Subject: [PATCH 11/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: Zelda Hessler <zhessler@amazon.com> --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 9f97186b2f..f14ff39e2b 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -122,7 +122,7 @@ Almost all credentials providers do not have to implement their own `on_timeout` Considering the two cases we analyzed above, implementing `CredentialsProviderChain::on_timeout` is not so straightforward. Keeping track of whose turn in the chain it is to call `provide_credentials` when an external timeout has occurred is a challenging task. Even if we figured it out, that would still not satisfy `Case 2` above, because it was provider 1 that was actively running when the external timeout kicked in, but the chain should return credentials from provider 2, not from provider 1. -With all that taken into account, we can consider starting with the following simple approach: +With that in mind, consider instead the following approach: ```rust impl ProvideCredentials for CredentialsProviderChain { // --snip-- From e79eca1cb12101fe9ffbf7e402062f2a806e2eb7 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Tue, 17 Jan 2023 18:42:15 -0600 Subject: [PATCH 12/18] Update design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md Co-authored-by: John DiSanti <jdisanti@amazon.com> --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index f14ff39e2b..7715646fd9 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -101,7 +101,7 @@ let future = Timeout::new(provider.provide_credentials(), timeout_future); let result = cache .get_or_load(|| { async move { - let credentials= match future.await { + let credentials = match future.await { Ok(creds) => creds?, Err(_err) => match provider.on_timeout().await { // can provide fallback credentials Some(creds) => creds, From d413e58e57dc0fd7bcd68583414bb7aa250c684c Mon Sep 17 00:00:00 2001 From: Yuki Saito <awsaito@amazon.com> Date: Fri, 20 Jan 2023 12:38:51 -0600 Subject: [PATCH 13/18] Incorporate review feedback into RFC This commit addresses the review feedback: https://github.com/awslabs/smithy-rs/pull/2218#discussion_r1072657848 https://github.com/awslabs/smithy-rs/pull/2218#discussion_r1072685567 https://github.com/awslabs/smithy-rs/pull/2218#discussion_r1072693150 In addition, we have renamed the proposed method `on_timeout` to `fallback_on_interrupt` to make it more descriptive. --- ...oviding_fallback_credentials_on_timeout.md | 122 ++++++++++++++---- 1 file changed, 96 insertions(+), 26 deletions(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 7715646fd9..ab5faf1f7a 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -1,24 +1,24 @@ RFC: Providing fallback credentials on external timeout ======================================================= - > Status: RFC > > Applies to: client For a summarized list of proposed changes, see the [Changes Checklist](#changes-checklist) section. -This RFC proposes a fallback mechanism for credentials providers on external timeout (see the [Terminology](#terminology) section), allowing them to continue serving (possibly stale) credentials for the sake of overall reliability of the intended service; The IMDS credentials provider is an example that must fulfill such a requirement to support static stability. +This RFC proposes a fallback mechanism for credentials providers on external timeout (see the [Terminology](#terminology) section), allowing them to continue serving (possibly expired) credentials for the sake of overall reliability of the intended service; The IMDS credentials provider is an example that must fulfill such a requirement to support static stability. Terminology ----------- - External timeout: The name of the timeout that occurs when a duration elapses before an async call to `provide_credentials` returns. In this case, `provide_credentials` returns no credentials. - Internal timeout: The name of the timeout that occurs when a duration elapses before an async call to some function, inside the implementation of `provide_credentials`, returns. Examples include connection timeouts, TLS negotiation timeouts, and HTTP request timeouts. Implementations of `provide_credentials` may handle these failures at their own discretion e.g. by returning _(possibly expired)_ credentials or a `CredentialsError`. +- Static stability: Continued availability of a service in the face of impaired dependencies. Assumption ---------- -This RFC is concerned only with external timeouts, as the cost of poor API design is much higher than in this case than for internal timeouts. The former will affect a public trait implemented by all credentials providers whereas the latter can be handled locally by individual credentials providers without affecting one another. +This RFC is concerned only with external timeouts, as the cost of poor API design is much higher in this case than for internal timeouts. The former will affect a public trait implemented by all credentials providers whereas the latter can be handled locally by individual credentials providers without affecting one another. Problem ------- @@ -49,7 +49,7 @@ let result = cache ``` This creates an external timeout for `provide_credentials`. If `timeout_future` wins the race, a future for `provide_credentials` gets dropped, `timeout_future` returns an error, and the error is mapped to `CredentialsError::ProviderTimedOut` and returned. This makes it impossible for the variable `provider` above to serve credentials as stated in REQ 1. -A slightly more complex use case involves `CredentialsProviderChain`. It is a manifestation of the chain of responsibility pattern and keeps calling the `provide_credentials` method on each credentials provider down the chain until credentials are returned by one of them. In addition to REQ 1, we have the following functional requirement with respect to `CredentialsProviderChain`: +A more complex use case involves `CredentialsProviderChain`. It is a manifestation of the chain of responsibility pattern and keeps calling the `provide_credentials` method on each credentials provider down the chain until credentials are returned by one of them. In addition to REQ 1, we have the following functional requirement with respect to `CredentialsProviderChain`: - REQ 2: Once a credentials provider in the chain has returned credentials, it should continue serving them even in the event of a timeout (whether internal or external) without falling back to another credentials provider. Referring back to the code snippet above, we analyze two relevant cases (and suppose provider 2 below must meet REQ 1 and REQ 2 in each case): @@ -73,25 +73,40 @@ The figure above illustrates an example with the same setting as the previous fi Proposal -------- -To address the problem in the previous section, we propose to add a new method to the `ProvideCredentials` trait called `on_timeout` that looks something like this: +To address the problem in the previous section, we propose to add a new method to the `ProvideCredentials` trait called `fallback_on_interrupt` that looks something like this: ```rust -mod future { +pub trait ProvideCredentials: Send + Sync + std::fmt::Debug { // --snip-- - pub struct OnTimeout<'a>(NowOrLater<Option<Credentials>, BoxFuture<'a, Option<Credentials>>>); - - // impls for OnTimeout similar to those for the ProvideCredentials future newtype + fn fallback_on_interrupt(&self) -> future::FallbackOnInterrupt { + future::FallbackOnInterrupt::ready(None) + } } -pub trait ProvideCredentials: Send + Sync + std::fmt::Debug { +mod future { // --snip-- - fn on_timeout<'a>(&'a self) -> future::OnTimeout<'a> { - future::OnTimeout::ready(None) + // Mimics `FutureExt::now_or_never` using `NowOrLater` with `OnlyReady`. + pub struct FallbackOnInterrupt(NowOrLater<Option<Credentials>, OnlyReady>); + + impl FallbackOnInterrupt { + pub fn ready(credentials: Option<Credentials>) -> Self { + FallbackOnInterrupt(NowOrLater::ready(credentials)) + } + } + + impl Future for FallbackOnInterrupt { + type Output = Option<Credentials>; + + fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> { + Pin::new(&mut self.0).poll(cx) + } } } ``` -This method allows credentials providers to have a fallback mechanism on an external timeout and to serve credentials to users if needed. It comes with a default implementation to return a future that immediately yields `None`. Credentials providers that need a different behavior can choose to override the method. +This method allows credentials providers to have a fallback mechanism on an external timeout and to serve credentials to users if needed. Since it is a fallback plan, the execution should not take too long. The concept of "now or never" fits well and we can use `NowOrLater` by supplying the `OnlyReady` type to realize that. We considerd this as a synchronous primitive but we did not need an async critical section for this operation so an asynchronous primitive should be a reasonable design decision here. + +The default implementation returns a future that immediately yields `None`. Credentials providers that need a different behavior can choose to override the method. The user experience for the code snippet in question will look like this once this proposal is implemented: ```rust @@ -103,7 +118,7 @@ let result = cache async move { let credentials = match future.await { Ok(creds) => creds?, - Err(_err) => match provider.on_timeout().await { // can provide fallback credentials + Err(_err) => match provider.fallback_on_interrupt().await { // can provide fallback credentials Some(creds) => creds, None => return Err(CredentialsError::provider_timed_out(load_timeout)), } @@ -118,19 +133,19 @@ let result = cache How to actually implement this RFC ---------------------------------- -Almost all credentials providers do not have to implement their own `on_timeout` except for `CredentialsProviderChain` (`ImdsCredentialsProvider` also needs to implement `on_timeout` when we are adding static stability support to it but that is outside the scope of this RFC). +Almost all credentials providers do not have to implement their own `fallback_on_interrupt` except for `CredentialsProviderChain` (`ImdsCredentialsProvider` also needs to implement `fallback_on_interrupt` when we are adding static stability support to it but that is outside the scope of this RFC). -Considering the two cases we analyzed above, implementing `CredentialsProviderChain::on_timeout` is not so straightforward. Keeping track of whose turn in the chain it is to call `provide_credentials` when an external timeout has occurred is a challenging task. Even if we figured it out, that would still not satisfy `Case 2` above, because it was provider 1 that was actively running when the external timeout kicked in, but the chain should return credentials from provider 2, not from provider 1. +Considering the two cases we analyzed above, implementing `CredentialsProviderChain::fallback_on_interrupt` is not so straightforward. Keeping track of whose turn in the chain it is to call `provide_credentials` when an external timeout has occurred is a challenging task. Even if we figured it out, that would still not satisfy `Case 2` above, because it was provider 1 that was actively running when the external timeout kicked in, but the chain should return credentials from provider 2, not from provider 1. With that in mind, consider instead the following approach: ```rust impl ProvideCredentials for CredentialsProviderChain { // --snip-- - fn on_timeout<'a>(&'a self) -> future::OnTimeout<'a> { - future::OnTimeout::new(async move { + fn fallback_on_interrupt(&self) -> future::FallbackOnInterrupt { + future::FallbackOnInterrupt::ready(async move { for (_, provider) in &self.providers { - match provider.on_timeout().await { + match provider.fallback_on_interrupt().await { creds @ Some(_) => return creds, None => {} } @@ -140,18 +155,18 @@ impl ProvideCredentials for CredentialsProviderChain { } } ``` -`CredentialsProviderChain::on_timeout` will invoke each provider's `on_timeout` method until credentials are returned by one of them. It ensures that the updated code snippet for `LazyCredentialsCache` can return credentials from provider 2 in both `Case 1` and `Case 2`. Even if `timeout_future` wins the race, the execution subsequently calls `provider.on_timeout().await` to obtain fallback credentials from provider 2, assuming provider 2's `on_timeout` is implemented to return credentials retrieved on the first call to `CredentialsProviderChain::provide_credentials`. +`CredentialsProviderChain::fallback_on_interrupt` will invoke each provider's `fallback_on_interrupt` method until credentials are returned by one of them. It ensures that the updated code snippet for `LazyCredentialsCache` can return credentials from provider 2 in both `Case 1` and `Case 2`. Even if `timeout_future` wins the race, the execution subsequently calls `provider.fallback_on_interrupt().await` to obtain fallback credentials from provider 2, assuming provider 2's `fallback_on_interrupt` is implemented to return fallback credentials accordingly. -The downside of this simple approach is that the behavior is not clear if more than one credentials provider in the chain can return credentials from their `on_timeout`. Note, however, that it is the exception rather than the norm for a provider's `on_timeout` to return fallback credentials, at least at the time of writing (01/13/2023). The fact that it returns fallback credentials means that the provider successfully loaded credentials at least once, and it usually continues serving credentials on subsequent calls to `provide_credentials`. +The downside of this simple approach is that the behavior is not clear if more than one credentials provider in the chain can return credentials from their `fallback_on_interrupt`. Note, however, that it is the exception rather than the norm for a provider's `fallback_on_interrupt` to return fallback credentials, at least at the time of writing (01/13/2023). The fact that it returns fallback credentials means that the provider successfully loaded credentials at least once, and it usually continues serving credentials on subsequent calls to `provide_credentials`. -Should we have more than one provider in the chain that can potentially return fallback credentials from `on_timeout`, we could configure the behavior of `CredentialsProviderChain` managing in what order and how each `on_timeout` should be executed; it can be possible to introduce a new API without breaking changes. That being said, we first need to investigate and understand the use case behind it to design the right API. +Should we have more than one provider in the chain that can potentially return fallback credentials from `fallback_on_interrupt`, we could configure the behavior of `CredentialsProviderChain` managing in what order and how each `fallback_on_interrupt` should be executed. See the [Appendix](#appendix) section for more details. The use case described there is an extreme edge case, but it's worth exploring what options are available to us with the proposed design. Alternative ----------- In this section, we will describe an alternative approach that we ended up dismissing as unworkable. -Instead of `on_timeout`, we considered the following method to be added to the `ProvideCredentials` trait: +Instead of `fallback_on_interrupt`, we considered the following method to be added to the `ProvideCredentials` trait: ```rust pub trait ProvideCredentials: Send + Sync + std::fmt::Debug { // --snip-- @@ -243,7 +258,62 @@ There are mainly two problems with this approach. The first problem is that as s Changes checklist ----------------- -- [ ] Add `on_timeout` method to the `ProvideCredentials` trait with the default implementation -- [ ] Implement the `OnTimeout` future newtype -- [ ] Implement `CredentialsProviderChain::on_timeout` +- [ ] Add `fallback_on_interrupt` method to the `ProvideCredentials` trait with the default implementation +- [ ] Implement the `FallbackOnInterrupt` future newtype +- [ ] Implement `CredentialsProviderChain::fallback_on_interrupt` +- [ ] Implement `DefaultCredentialsChain::fallback_on_interrupt` - [ ] Add unit tests for `Case 1` and `Case 2` + +Appendix +-------- +We will describe how to customize the behavior for `CredentialsProviderChain::fallback_on_interrupt`. We are only demonstrating how much the proposed design can be extended and currently do not have concrete use cases to implement using what we present in this section. + +As described in the [Proposal](#proposal) section, `CredentialsProviderChain::fallback_on_interrupt` traverses the chain from the head to the tail and returns the first fallback credentials found. This precedence policy works most of the time, but when we have more than one provider in the chain that can potentially return fallback credentials, it could break in the following edge case (we are still basing our discussion on the code snippet from `LazyCredentialsCache` but forget REQ 1 and REQ 2 for the sake of simplicity). + +<p align="center"> +<img width="800" alt="fallback_on_interrupt_appendix excalidraw" src="https://user-images.githubusercontent.com/15333866/213618808-d19892d8-5c83-4039-9940-280dcd2a8cf1.png"> +</p> + +During the first call to `CredentialsProviderChain::provide_credentials`, provider 1 fails to load credentials, maybe due to an internal timeout, and then provider 2 succeeds in loading its credentials (call them credentials 2) and internally stores them for `Provider2::fallback_on_interrupt` to return them subsequently. During the second call, provider 1 succeeds in loading credentials (call them credentials 1) and internally stores them for `Provider1::fallback_on_interrupt` to return them subsequently. Suppose, however, that credentials 1's expiry is earlier than credentials 2's expiry. Finally, during the third call, `CredentialsProviderChain::provide_credentials` did not complete due to an external timeout. `CredentialsProviderChain::fallback_on_interrupt` then returns credentials 1, when it should return credentials 2 whose expiry is later, because of the precedence policy. + +This a case where `CredentialsProviderChain::fallback_on_interrupt` requires the recency policy for fallback credentials found in provider 1 and provider 2, not the precedence policy. The following figure shows how we can set up such a chain: + +<p align="center"> +<img width="832" alt="heterogeneous_policies_for_fallback_on_interrupt" src="https://user-images.githubusercontent.com/15333866/213755875-ac6fddbc-0f1b-4437-af16-6e0dbe08ae04.png"> +</p> + +The outermost chain is a `CredentialsProviderChain` and follows the precedence policy for `fallback_on_interrupt`. It contains a sub-chain that, in turn, contains provider 1 and provider 2. This sub-chain implements its own `fallback_on_interrupt` to realize the recency policy for fallback credentials found in provider 1 and provider 2. Conceptually, we have +``` +pub struct FallbackRecencyChain { + provider_chain: CredentialsProviderChain, +} + +impl ProvideCredentials for FallbackRecencyChain { + fn provide_credentials<'a>(&'a self) -> future::ProvideCredentials<'a> + where + Self: 'a, + { + // Can follow the precedence policy for loading credentials + // if it chooses to do so. + } + + fn fallback_on_interrupt(&self) -> future::FallbackOnInterrupt { + // Iterate over `self.provider_chain` and return + // fallback credentials whose expiry is the most recent. + } +} +``` +We can then compose the entire chain like so: +``` +let provider_1 = /* ... */ +let provider_2 = /* ... */ +let provider_3 = /* ... */ +let sub_chain = CredentialsProviderChain::first_try("Provider1", provider_1) + .or_else("Provider2", provider_2); +let recency_chain = /* Create a FallbackRecencyChain with sub_chain */ +let final_chain = CredentialsProviderChain::first_try("fallback_recency", recency_chain) + .or_else("Provider3", provider_3); +``` +The `fallback_on_interrupt` method on `final_chain` still traverses from the head to the tail, but once it hits `recency_chain`, `fallback_on_interrupt` on `recency_chain` respects the expiry of fallback credentials found in its inner providers. + +What we have presented in this section can be generalized thanks to chain composability. We could have different sub-chains, each implementing its own policy for `fallback_on_interrupt`. From 5bb7f01d759d78b335037b7efb38be57fee67364 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Fri, 20 Jan 2023 13:07:58 -0600 Subject: [PATCH 14/18] Update rfc0031_providing_fallback_credentials_on_timeout.md --- .../rfcs/rfc0031_providing_fallback_credentials_on_timeout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index ab5faf1f7a..d0726104c8 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -279,7 +279,7 @@ During the first call to `CredentialsProviderChain::provide_credentials`, provid This a case where `CredentialsProviderChain::fallback_on_interrupt` requires the recency policy for fallback credentials found in provider 1 and provider 2, not the precedence policy. The following figure shows how we can set up such a chain: <p align="center"> -<img width="832" alt="heterogeneous_policies_for_fallback_on_interrupt" src="https://user-images.githubusercontent.com/15333866/213755875-ac6fddbc-0f1b-4437-af16-6e0dbe08ae04.png"> +<img width="700" alt="heterogeneous_policies_for_fallback_on_interrupt" src="https://user-images.githubusercontent.com/15333866/213755875-ac6fddbc-0f1b-4437-af16-6e0dbe08ae04.png"> </p> The outermost chain is a `CredentialsProviderChain` and follows the precedence policy for `fallback_on_interrupt`. It contains a sub-chain that, in turn, contains provider 1 and provider 2. This sub-chain implements its own `fallback_on_interrupt` to realize the recency policy for fallback credentials found in provider 1 and provider 2. Conceptually, we have From 7b62d473b25c0ebaf8fb3880a8987e4cda28fd23 Mon Sep 17 00:00:00 2001 From: Yuki Saito <awsaito@amazon.com> Date: Fri, 20 Jan 2023 22:07:49 -0600 Subject: [PATCH 15/18] Update RFC This commit updates RFC and leaves to discussion how `fallback_on_interrupt` should be implemented, i.e., either as a synchronous primitive or an asynchronous primitive. --- ...oviding_fallback_credentials_on_timeout.md | 67 ++++++++++--------- 1 file changed, 37 insertions(+), 30 deletions(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index d0726104c8..26601aac0b 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -73,40 +73,49 @@ The figure above illustrates an example with the same setting as the previous fi Proposal -------- -To address the problem in the previous section, we propose to add a new method to the `ProvideCredentials` trait called `fallback_on_interrupt` that looks something like this: +To address the problem in the previous section, we propose to add a new method to the `ProvideCredentials` trait called `fallback_on_interrupt`. This method allows credentials providers to have a fallback mechanism on an external timeout and to serve credentials to users if needed. There are two options as to how it is implemented, either as a synchronous primitive or as an asynchronous primitive. + +#### Option A: Synchronous primitive ```rust pub trait ProvideCredentials: Send + Sync + std::fmt::Debug { // --snip-- - fn fallback_on_interrupt(&self) -> future::FallbackOnInterrupt { - future::FallbackOnInterrupt::ready(None) + fn fallback_on_interrupt(&self) -> Option<Credentials> { + None } } +``` +- :+1: Users can be guided to use only synchronous primitives when implementing `fallback_on_interrupt`. +- :-1: It cannot support cases where fallback credentials are asynchronously retrieved. +#### Option B: Asynchronous primitive +```rust mod future { // --snip-- - // Mimics `FutureExt::now_or_never` using `NowOrLater` with `OnlyReady`. - pub struct FallbackOnInterrupt(NowOrLater<Option<Credentials>, OnlyReady>); + // This cannot use `OnlyReady` in place of `BoxFuture` because + // when a chain of credentials providers implements its own + // `fallback_on_interrupt`, it needs to await fallback credentials + // in its inner providers. Thus, `BoxFuture` is required. + pub struct FallbackOnInterrupt<'a>(NowOrLater<Option<Credentials>, BoxFuture<'a, Option<Credentials>>>); - impl FallbackOnInterrupt { - pub fn ready(credentials: Option<Credentials>) -> Self { - FallbackOnInterrupt(NowOrLater::ready(credentials)) - } - } + // impls for FallbackOnInterrupt similar to those for the ProvideCredentials future newtype +} - impl Future for FallbackOnInterrupt { - type Output = Option<Credentials>; +pub trait ProvideCredentials: Send + Sync + std::fmt::Debug { + // --snip-- - fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> { - Pin::new(&mut self.0).poll(cx) - } + fn fallback_on_interrupt<'a>(&'a self) -> future::FallbackOnInterrupt<'a> { + future::FallbackOnInterrupt::ready(None) } } ``` -This method allows credentials providers to have a fallback mechanism on an external timeout and to serve credentials to users if needed. Since it is a fallback plan, the execution should not take too long. The concept of "now or never" fits well and we can use `NowOrLater` by supplying the `OnlyReady` type to realize that. We considerd this as a synchronous primitive but we did not need an async critical section for this operation so an asynchronous primitive should be a reasonable design decision here. +- :+1: It is async from the beginning, so less likely to introduce a breaking change. +- :-1: We may have to consider yet another timeout for `fallback_on_interrupt` itself. + +However it is implemented, the execution should not take too long because it is a fallback plan (we will document it to ensure the users are made aware). It comes with the default implementation, but credentials providers that need a different behavior can choose to override the method. -The default implementation returns a future that immediately yields `None`. Credentials providers that need a different behavior can choose to override the method. +_For the remainder of the RFC, we will choose Option A for the sake of discussion._ The user experience for the code snippet in question will look like this once this proposal is implemented: ```rust @@ -118,7 +127,7 @@ let result = cache async move { let credentials = match future.await { Ok(creds) => creds?, - Err(_err) => match provider.fallback_on_interrupt().await { // can provide fallback credentials + Err(_err) => match provider.fallback_on_interrupt() { // can provide fallback credentials Some(creds) => creds, None => return Err(CredentialsError::provider_timed_out(load_timeout)), } @@ -142,20 +151,18 @@ With that in mind, consider instead the following approach: impl ProvideCredentials for CredentialsProviderChain { // --snip-- - fn fallback_on_interrupt(&self) -> future::FallbackOnInterrupt { - future::FallbackOnInterrupt::ready(async move { - for (_, provider) in &self.providers { - match provider.fallback_on_interrupt().await { - creds @ Some(_) => return creds, - None => {} - } + fn fallback_on_interrupt(&self) -> Option<Credentials> { + for (_, provider) in &self.providers { + match provider.fallback_on_interrupt() { + creds @ Some(_) => return creds, + None => {} } - None - }) + } + None } } ``` -`CredentialsProviderChain::fallback_on_interrupt` will invoke each provider's `fallback_on_interrupt` method until credentials are returned by one of them. It ensures that the updated code snippet for `LazyCredentialsCache` can return credentials from provider 2 in both `Case 1` and `Case 2`. Even if `timeout_future` wins the race, the execution subsequently calls `provider.fallback_on_interrupt().await` to obtain fallback credentials from provider 2, assuming provider 2's `fallback_on_interrupt` is implemented to return fallback credentials accordingly. +`CredentialsProviderChain::fallback_on_interrupt` will invoke each provider's `fallback_on_interrupt` method until credentials are returned by one of them. It ensures that the updated code snippet for `LazyCredentialsCache` can return credentials from provider 2 in both `Case 1` and `Case 2`. Even if `timeout_future` wins the race, the execution subsequently calls `provider.fallback_on_interrupt()` to obtain fallback credentials from provider 2, assuming provider 2's `fallback_on_interrupt` is implemented to return fallback credentials accordingly. The downside of this simple approach is that the behavior is not clear if more than one credentials provider in the chain can return credentials from their `fallback_on_interrupt`. Note, however, that it is the exception rather than the norm for a provider's `fallback_on_interrupt` to return fallback credentials, at least at the time of writing (01/13/2023). The fact that it returns fallback credentials means that the provider successfully loaded credentials at least once, and it usually continues serving credentials on subsequent calls to `provide_credentials`. @@ -258,8 +265,8 @@ There are mainly two problems with this approach. The first problem is that as s Changes checklist ----------------- +- [ ] Decide whether `fallback_on_interrupt` is implemented as a synchronous primitive or an asynchronous primitive - [ ] Add `fallback_on_interrupt` method to the `ProvideCredentials` trait with the default implementation -- [ ] Implement the `FallbackOnInterrupt` future newtype - [ ] Implement `CredentialsProviderChain::fallback_on_interrupt` - [ ] Implement `DefaultCredentialsChain::fallback_on_interrupt` - [ ] Add unit tests for `Case 1` and `Case 2` @@ -297,7 +304,7 @@ impl ProvideCredentials for FallbackRecencyChain { // if it chooses to do so. } - fn fallback_on_interrupt(&self) -> future::FallbackOnInterrupt { + fn fallback_on_interrupt(&self) -> Option<Credentials> { // Iterate over `self.provider_chain` and return // fallback credentials whose expiry is the most recent. } From 7311f87a8654e704e4df09430245e91daf813e7b Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Sun, 22 Jan 2023 13:22:09 -0600 Subject: [PATCH 16/18] Update rfc0031_providing_fallback_credentials_on_timeout.md --- ...oviding_fallback_credentials_on_timeout.md | 32 ++++++++++++------- 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 26601aac0b..3f627ac694 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -87,6 +87,7 @@ pub trait ProvideCredentials: Send + Sync + std::fmt::Debug { ``` - :+1: Users can be guided to use only synchronous primitives when implementing `fallback_on_interrupt`. - :-1: It cannot support cases where fallback credentials are asynchronously retrieved. +- :-1: It may turn into a blocking operation if it takes longer than it should. #### Option B: Asynchronous primitive ```rust @@ -113,11 +114,9 @@ pub trait ProvideCredentials: Send + Sync + std::fmt::Debug { - :+1: It is async from the beginning, so less likely to introduce a breaking change. - :-1: We may have to consider yet another timeout for `fallback_on_interrupt` itself. -However it is implemented, the execution should not take too long because it is a fallback plan (we will document it to ensure the users are made aware). It comes with the default implementation, but credentials providers that need a different behavior can choose to override the method. +Option A cannot be reversible in the future if we are to support the use case for asynchronously retrieving the fallback credentials, whereas option B allows us to continue supporting both ready and pending futures when retrieving the fallback credentials. Should we require a timeout for `fallback_on_interrupt`, it will be set as a separate configuration for `LazyCredentialsCache` just like `load_timeout` (a timeout future with the configured value can then be raced against a future for `fallback_on_interrupt`). It is possible to introduce that configulation later without breaking existing customers, _assuming_ they do not execute a long-running operation in `fallback_on_interrupt`. -_For the remainder of the RFC, we will choose Option A for the sake of discussion._ - -The user experience for the code snippet in question will look like this once this proposal is implemented: +For thease reasons, we will choose option B over option A. The user experience for the code snippet in question will look like this once this proposal is implemented: ```rust let timeout_future = self.sleeper.sleep(self.load_timeout); // by default self.load_timeout is 5 seconds. // --snip-- @@ -127,7 +126,7 @@ let result = cache async move { let credentials = match future.await { Ok(creds) => creds?, - Err(_err) => match provider.fallback_on_interrupt() { // can provide fallback credentials + Err(_err) => match provider.fallback_on_interrupt().await { // can provide fallback credentials Some(creds) => creds, None => return Err(CredentialsError::provider_timed_out(load_timeout)), } @@ -148,12 +147,11 @@ Considering the two cases we analyzed above, implementing `CredentialsProviderCh With that in mind, consider instead the following approach: ```rust -impl ProvideCredentials for CredentialsProviderChain { +impl CredentialsProviderChain { // --snip-- - - fn fallback_on_interrupt(&self) -> Option<Credentials> { + async fn fallback_credentials(&self) -> Option<Credentials> { for (_, provider) in &self.providers { - match provider.fallback_on_interrupt() { + match provider.fallback_on_interrupt().await { creds @ Some(_) => return creds, None => {} } @@ -161,8 +159,16 @@ impl ProvideCredentials for CredentialsProviderChain { None } } + +impl ProvideCredentials for CredentialsProviderChain { + // --snip-- + + fn fallback_on_interrupt(&self) -> future::FallbackOnInterrupt<'_> { + future::FallbackOnInterrupt::new(self.fallback_credentials()) + } +} ``` -`CredentialsProviderChain::fallback_on_interrupt` will invoke each provider's `fallback_on_interrupt` method until credentials are returned by one of them. It ensures that the updated code snippet for `LazyCredentialsCache` can return credentials from provider 2 in both `Case 1` and `Case 2`. Even if `timeout_future` wins the race, the execution subsequently calls `provider.fallback_on_interrupt()` to obtain fallback credentials from provider 2, assuming provider 2's `fallback_on_interrupt` is implemented to return fallback credentials accordingly. +`CredentialsProviderChain::fallback_on_interrupt` will invoke each provider's `fallback_on_interrupt` method until credentials are returned by one of them. It ensures that the updated code snippet for `LazyCredentialsCache` can return credentials from provider 2 in both `Case 1` and `Case 2`. Even if `timeout_future` wins the race, the execution subsequently calls `provider.fallback_on_interrupt().await` to obtain fallback credentials from provider 2, assuming provider 2's `fallback_on_interrupt` is implemented to return fallback credentials accordingly. The downside of this simple approach is that the behavior is not clear if more than one credentials provider in the chain can return credentials from their `fallback_on_interrupt`. Note, however, that it is the exception rather than the norm for a provider's `fallback_on_interrupt` to return fallback credentials, at least at the time of writing (01/13/2023). The fact that it returns fallback credentials means that the provider successfully loaded credentials at least once, and it usually continues serving credentials on subsequent calls to `provide_credentials`. @@ -265,7 +271,6 @@ There are mainly two problems with this approach. The first problem is that as s Changes checklist ----------------- -- [ ] Decide whether `fallback_on_interrupt` is implemented as a synchronous primitive or an asynchronous primitive - [ ] Add `fallback_on_interrupt` method to the `ProvideCredentials` trait with the default implementation - [ ] Implement `CredentialsProviderChain::fallback_on_interrupt` - [ ] Implement `DefaultCredentialsChain::fallback_on_interrupt` @@ -304,7 +309,7 @@ impl ProvideCredentials for FallbackRecencyChain { // if it chooses to do so. } - fn fallback_on_interrupt(&self) -> Option<Credentials> { + fn fallback_on_interrupt(&self) -> future::FallbackOnInterrupt<'_> { // Iterate over `self.provider_chain` and return // fallback credentials whose expiry is the most recent. } @@ -315,9 +320,12 @@ We can then compose the entire chain like so: let provider_1 = /* ... */ let provider_2 = /* ... */ let provider_3 = /* ... */ + let sub_chain = CredentialsProviderChain::first_try("Provider1", provider_1) .or_else("Provider2", provider_2); + let recency_chain = /* Create a FallbackRecencyChain with sub_chain */ + let final_chain = CredentialsProviderChain::first_try("fallback_recency", recency_chain) .or_else("Provider3", provider_3); ``` From 7aca08bf7d619b4ac18fc9ed63ee7dc6bff4c5b1 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Mon, 23 Jan 2023 13:02:28 -0600 Subject: [PATCH 17/18] Update rfc0031_providing_fallback_credentials_on_timeout.md --- ...oviding_fallback_credentials_on_timeout.md | 25 +++++++------------ 1 file changed, 9 insertions(+), 16 deletions(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 3f627ac694..437b40600a 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -114,9 +114,9 @@ pub trait ProvideCredentials: Send + Sync + std::fmt::Debug { - :+1: It is async from the beginning, so less likely to introduce a breaking change. - :-1: We may have to consider yet another timeout for `fallback_on_interrupt` itself. -Option A cannot be reversible in the future if we are to support the use case for asynchronously retrieving the fallback credentials, whereas option B allows us to continue supporting both ready and pending futures when retrieving the fallback credentials. Should we require a timeout for `fallback_on_interrupt`, it will be set as a separate configuration for `LazyCredentialsCache` just like `load_timeout` (a timeout future with the configured value can then be raced against a future for `fallback_on_interrupt`). It is possible to introduce that configulation later without breaking existing customers, _assuming_ they do not execute a long-running operation in `fallback_on_interrupt`. +Option A cannot be reversible in the future if we are to support the use case for asynchronously retrieving the fallback credentials, whereas option B allows us to continue supporting both ready and pending futures when retrieving the fallback credentials. However, `fallback_on_interrupt` is supposed to return credentials that have been set aside in case `provide_credentials` is timed out. To express that intent, we choose option A and document that users should NOT go fetch new credentials in `fallback_on_interrupt`. -For thease reasons, we will choose option B over option A. The user experience for the code snippet in question will look like this once this proposal is implemented: +The user experience for the code snippet in question will look like this once this proposal is implemented: ```rust let timeout_future = self.sleeper.sleep(self.load_timeout); // by default self.load_timeout is 5 seconds. // --snip-- @@ -126,7 +126,7 @@ let result = cache async move { let credentials = match future.await { Ok(creds) => creds?, - Err(_err) => match provider.fallback_on_interrupt().await { // can provide fallback credentials + Err(_err) => match provider.fallback_on_interrupt() { // can provide fallback credentials Some(creds) => creds, None => return Err(CredentialsError::provider_timed_out(load_timeout)), } @@ -147,11 +147,12 @@ Considering the two cases we analyzed above, implementing `CredentialsProviderCh With that in mind, consider instead the following approach: ```rust -impl CredentialsProviderChain { +impl ProvideCredentials for CredentialsProviderChain { // --snip-- - async fn fallback_credentials(&self) -> Option<Credentials> { + + fn fallback_on_interrupt(&self) -> Option<Credentials> { { for (_, provider) in &self.providers { - match provider.fallback_on_interrupt().await { + match provider.fallback_on_interrupt() { creds @ Some(_) => return creds, None => {} } @@ -159,16 +160,8 @@ impl CredentialsProviderChain { None } } - -impl ProvideCredentials for CredentialsProviderChain { - // --snip-- - - fn fallback_on_interrupt(&self) -> future::FallbackOnInterrupt<'_> { - future::FallbackOnInterrupt::new(self.fallback_credentials()) - } -} ``` -`CredentialsProviderChain::fallback_on_interrupt` will invoke each provider's `fallback_on_interrupt` method until credentials are returned by one of them. It ensures that the updated code snippet for `LazyCredentialsCache` can return credentials from provider 2 in both `Case 1` and `Case 2`. Even if `timeout_future` wins the race, the execution subsequently calls `provider.fallback_on_interrupt().await` to obtain fallback credentials from provider 2, assuming provider 2's `fallback_on_interrupt` is implemented to return fallback credentials accordingly. +`CredentialsProviderChain::fallback_on_interrupt` will invoke each provider's `fallback_on_interrupt` method until credentials are returned by one of them. It ensures that the updated code snippet for `LazyCredentialsCache` can return credentials from provider 2 in both `Case 1` and `Case 2`. Even if `timeout_future` wins the race, the execution subsequently calls `provider.fallback_on_interrupt()` to obtain fallback credentials from provider 2, assuming provider 2's `fallback_on_interrupt` is implemented to return fallback credentials accordingly. The downside of this simple approach is that the behavior is not clear if more than one credentials provider in the chain can return credentials from their `fallback_on_interrupt`. Note, however, that it is the exception rather than the norm for a provider's `fallback_on_interrupt` to return fallback credentials, at least at the time of writing (01/13/2023). The fact that it returns fallback credentials means that the provider successfully loaded credentials at least once, and it usually continues serving credentials on subsequent calls to `provide_credentials`. @@ -309,7 +302,7 @@ impl ProvideCredentials for FallbackRecencyChain { // if it chooses to do so. } - fn fallback_on_interrupt(&self) -> future::FallbackOnInterrupt<'_> { + fn fallback_on_interrupt(&self) -> Option<Credentials> { // Iterate over `self.provider_chain` and return // fallback credentials whose expiry is the most recent. } From 9608f253f35bb4c22c31df52ca755749dc68fe58 Mon Sep 17 00:00:00 2001 From: ysaito1001 <gperson22@gmail.com> Date: Thu, 26 Jan 2023 13:18:16 -0600 Subject: [PATCH 18/18] Update rfc0031_providing_fallback_credentials_on_timeout.md --- .../rfc0031_providing_fallback_credentials_on_timeout.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md index 437b40600a..98ccfad3d9 100644 --- a/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md +++ b/design/src/rfcs/rfc0031_providing_fallback_credentials_on_timeout.md @@ -1,6 +1,6 @@ RFC: Providing fallback credentials on external timeout ======================================================= -> Status: RFC +> Status: Implemented in [smithy-rs#2246](https://github.com/awslabs/smithy-rs/pull/2246) > > Applies to: client @@ -165,7 +165,8 @@ impl ProvideCredentials for CredentialsProviderChain { The downside of this simple approach is that the behavior is not clear if more than one credentials provider in the chain can return credentials from their `fallback_on_interrupt`. Note, however, that it is the exception rather than the norm for a provider's `fallback_on_interrupt` to return fallback credentials, at least at the time of writing (01/13/2023). The fact that it returns fallback credentials means that the provider successfully loaded credentials at least once, and it usually continues serving credentials on subsequent calls to `provide_credentials`. -Should we have more than one provider in the chain that can potentially return fallback credentials from `fallback_on_interrupt`, we could configure the behavior of `CredentialsProviderChain` managing in what order and how each `fallback_on_interrupt` should be executed. See the [Appendix](#appendix) section for more details. The use case described there is an extreme edge case, but it's worth exploring what options are available to us with the proposed design. +Should we have more than one provider in the chain that can potentially return fallback credentials from `fallback_on_interrupt`, we could configure the behavior of `CredentialsProviderChain` managing in what order and how each `fallback_on_interrupt` should be executed. See the [Possible enhancement +](#possible-enhancement) section for more details. The use case described there is an extreme edge case, but it's worth exploring what options are available to us with the proposed design. Alternative ----------- @@ -269,8 +270,8 @@ Changes checklist - [ ] Implement `DefaultCredentialsChain::fallback_on_interrupt` - [ ] Add unit tests for `Case 1` and `Case 2` -Appendix --------- +Possible enhancement +-------------------- We will describe how to customize the behavior for `CredentialsProviderChain::fallback_on_interrupt`. We are only demonstrating how much the proposed design can be extended and currently do not have concrete use cases to implement using what we present in this section. As described in the [Proposal](#proposal) section, `CredentialsProviderChain::fallback_on_interrupt` traverses the chain from the head to the tail and returns the first fallback credentials found. This precedence policy works most of the time, but when we have more than one provider in the chain that can potentially return fallback credentials, it could break in the following edge case (we are still basing our discussion on the code snippet from `LazyCredentialsCache` but forget REQ 1 and REQ 2 for the sake of simplicity).