-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test infrastructure: inject time everywhere #4180
Conversation
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…hanged mocks. Signed-off-by: Joshua Marantz <jmarantz@google.com>
…(at least one with my own new assert). Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…rkerFactory::createWorker, Signed-off-by: Joshua Marantz <jmarantz@google.com>
…nt an assert and made it not fire. Signed-off-by: Joshua Marantz <jmarantz@google.com>
…eeds is a MonotonicTimeSource, so no need for the added test complexity. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
ptal -- tsan looks like a download error. |
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…ock-time. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
I added a branch from this PR where I add a format-check to discourage new source code from instantiating a real-time source. I'll issue a PR for that after this gets merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this on. Overall, I like the approach of dependency injection, some interface level questions before doing the main pass.
include/envoy/api/api.h
Outdated
@@ -22,7 +23,7 @@ class Api { | |||
* Allocate a dispatcher. | |||
* @return Event::DispatcherPtr which is owned by the caller. | |||
*/ | |||
virtual Event::DispatcherPtr allocateDispatcher() PURE; | |||
virtual Event::DispatcherPtr allocateDispatcher(TimeSource& time_source) PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: @param
description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, i guess :)
include/envoy/common/time.h
Outdated
@@ -37,6 +42,55 @@ class MonotonicTimeSource { | |||
/** | |||
* @return the current monotonic time. | |||
*/ | |||
virtual MonotonicTime currentTime() PURE; | |||
virtual MonotonicTime currentTime() const PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can imagine some implementation that would have a non-const semantics, e.g. make some fixup adjustment to maintain monotonicity on each call, but probably fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was questioning this decision myself and now I'm reversing it. Similarly, a simulated system-time (non-monotonic) may want to increment time a tiny bit on each call to ensure that there is entropy for random seeds.
Not sure how I feel about that; we could also rely on impls that need to keep explicit state doing so via mutable. WDYT?
include/envoy/common/time.h
Outdated
* | ||
* TODO(#4160): currently this is just a container for SystemTimeSource and | ||
* MonotonicTimeSource but we should clean that up and just have this as the | ||
* base class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I would also like to see less implementation stuff inside the interface, but this is OK for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be addressed as part of the TODO. Adding commentary to that effect.
include/envoy/common/time.h
Outdated
MonotonicTimeSource& monotonic() { return *monotonic_; } | ||
|
||
private: | ||
SystemTimeSource* system_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: prefer SystemTimeSource&
etc. as per Envoy style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't; need to be able to assign to support some mocking in the tests. Adding comment.
Note also that this uses Envoy convention at the interfaces, and there are other instances in Envoy that use pointer member vars internally to enable shallow mutation.
* TODO(#4160) the implementations currently manage timer events that | ||
* ignore the time-source, and thus can't be mocked or faked. So it's | ||
* difficult to mock time in an integration test without mocking out | ||
* the dispatcher. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the only benefit here then of providing the timeSource()
to allow for convenient threading of this object across any object that needs time and already has a dispatcher reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the only benefit. I think the dispatcher needs to understand time to support timers in a manner consistent with that. I'm just not biting that off in this PR. This interface makes the injection significantly easier in this PR, and will be functionally needed in the future for the dispatcher to work properly with mocked or simulated time.
/** | ||
* @return TimeSource& a reference to the time source. | ||
*/ | ||
virtual TimeSource& timeSource() PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't this be inferred from dispatcher()
above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be, but I feel like it's reasonable from an API perspective that the FilterConfig should provide direct access to its time source, even if the impl delegates that as an implementation detail.
/** | ||
* @return the time source used for the server. | ||
*/ | ||
virtual TimeSource& timeSource() PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that time is an essential interface property of a server, independent of whether it delegates ownership of the TimeSource itself to the dispatcher.
/** | ||
* @return TimeSource& the time-source used with the cluster manager. | ||
*/ | ||
virtual TimeSource& timeSource() PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is time really an essential property of ClusterManager
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Offhand, I am not an expert in the class; this is the first I've touched it, IIRC.
However, it appears based on its API to care about load-balancing, which I think typically has a time-component implied, no?
From a practical perspective, I need to find a time-source from places that already have a ClusterManager plumbed in and not much else. E.g. AsyncClientImpl
IMO you don't need a lot of excuses to want to know about time. Even a pure algorithm may want to employ timeouts, etc. Something I'm not proud of here is that PerfAnnotation in this PR still needs its own prod time-source, but I can't solve all problems in this PR, and I'm not sure we have a TimeSource available everywhere I might want to measure perf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AsyncClientImpl
should have a dispatcher where it can infer TimeSource
. Are there other cases where we need this?
In general, I'm of the persuasion that I agree with what you write above, but that doesn't necessitate making TimeSource
a part of the interface. ClusterManagerImpl
can take TimeSource
as a constructor arg, it doesn't have to be in the business of exporting this out to the rest of the system.
Can you see if it's practical to avoid adding to the interface here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is practical but will enlarge the CL significantly from what I can tell. Can I suggest a compromise where I add a TODO to the interfaces where it doesn't make (IMO) obvious sense (Dispatcher, arguably Server) to remove them when we've got everything plumbed directly?
I think the scary one is filters; I didn't want to change how every filter gets instantiated, preferring instead to provide access to the TimeSource via FilterConfig. Happy to remove that, but it will require changing every filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per IM I have a follow-up in progress to remove this: jmarantz@b9a9cee
Rats; I wound up committing the requested changes to a branch. Time for some git magic... |
Signed-off-by: Joshua Marantz <jmarantz@google.com>
I think that worked; ptal. |
…4226) The management of the timer_ variable in the test class was a little funky, and looked leak-prone. This PR cleans this up a little,. Without this PR, #4180 memory-leaks on this test, I think due to overwriting timer_. This is all toward resolution of #4160 Risk Level: Low. Testing: //test/common/config:grpc_mux_impl_test Docs Changes: n/a Release Notes: n/a Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…to use 'SimulatedTestTime' which hasn't been built yet. Signed-off-by: Joshua Marantz <jmarantz@google.com>
…if they are not copyable. Signed-off-by: Joshua Marantz <jmarantz@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmarantz thanks again for working on this. This PR is at the point where any reasonable reviewer is likely to make mistakes and conflate signal/noise, so can you just go through the comments I have and make the changes and we can merge? Also, please keep merging on each push to ensure we don't end up with a merge race.
/** | ||
* @return TimeSource& the time-source used with the cluster manager. | ||
*/ | ||
virtual TimeSource& timeSource() PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AsyncClientImpl
should have a dispatcher where it can infer TimeSource
. Are there other cases where we need this?
In general, I'm of the persuasion that I agree with what you write above, but that doesn't necessitate making TimeSource
a part of the interface. ClusterManagerImpl
can take TimeSource
as a constructor arg, it doesn't have to be in the business of exporting this out to the rest of the system.
Can you see if it's practical to avoid adding to the interface here?
PerfAnnotationContext* context_; | ||
MonotonicTime start_time_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: why reordering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initializing context is needed (from a lazy-init static) prior to computing the start-time.
@@ -4,14 +4,15 @@ | |||
|
|||
namespace Envoy { | |||
|
|||
TokenBucketImpl::TokenBucketImpl(uint64_t max_tokens, double fill_rate, | |||
MonotonicTimeSource& time_source) | |||
TokenBucketImpl::TokenBucketImpl(uint64_t max_tokens, MonotonicTimeSource& monotonic_time_source, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: why reordering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because fill_rate is optional, but time-source is not.
@@ -16,10 +16,10 @@ class TokenBucketImpl : public TokenBucket { | |||
* @param max_tokens supplies the maximun number of tokens in the bucket. | |||
* @param fill_rate supplies the number of tokens that will return to the bucket on each second. | |||
* The default is 1. | |||
* @param time_source supplies the time source. The default is ProdMonotonicTimeSource. | |||
* @param time_source supplies the time source. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Params are now out of order..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -25,6 +25,8 @@ class AsyncClientImpl final : public AsyncClient { | |||
AsyncStream* start(const Protobuf::MethodDescriptor& service_method, | |||
AsyncStreamCallbacks& callbacks) override; | |||
|
|||
TimeSource& timeSource() { return cm_.timeSource(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, you don't have the dispatcher directly here, but see https://github.com/envoyproxy/envoy/blob/master/source/common/grpc/async_client_impl.cc#L70
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that seems plausible. Actually that's effectively pulling it from cluster_manager anyway, where it looks like it goes through a process of doing a search that can throw an exception if it fails.
I'd rather just get the time-source directly.
@@ -3,6 +3,8 @@ | |||
#include <string> | |||
#include <vector> | |||
|
|||
#include "envoy/common/time.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a corresponding BUiLD file change to deps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the zipkin rule already had an explicit dependency to time_interface.
@@ -94,6 +95,7 @@ class FilesystemSubscriptionTestHarness : public SubscriptionTestHarness { | |||
|
|||
const std::string path_; | |||
std::string version_; | |||
RealTestTime test_time_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Real" doesn't seem very precise for a naming convention here. How about DangerousDeprecatedTestTime
? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my sed script is busy working on this; will be in the upcoming commit. This is a sufficiently nasty name to dissuade people writing new tests from using it, I hope :)
I'm not 100% sure that we should never use real-time in tests; there may be some corner cases where you really want that, such as timing out an algorithm by polling real-time. But I'm happy to switch the name for now, and once most of them are removed, we can consider renaming back.
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome.
Description: This is one step toward addressing #4160 though this strategy may need to be backed out and re-worked. See the bug comments for more detail on the plan. The main value of this PR is to get rid of the Prod*TimeSource::instance_ statics, so we can inject time into tests. This PR doesn't actually inject time into tests differently; it just makes time injectable.
Apologies about the gigantic PR. The follow-ups can be more incremental.
Risk Level: medium: this should be non-functional, but a lot of lines of code changed. There is a fairly significant risk of merge issues too, so when merging, after this, we should be careful.
Testing: //test/...
Docs Changes: n/a
Release Notes: n/a