Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test infrastructure: inject time everywhere #4180

Merged
merged 23 commits into from
Aug 24, 2018

Conversation

jmarantz
Copy link
Contributor

@jmarantz jmarantz commented Aug 16, 2018

Description: This is one step toward addressing #4160 though this strategy may need to be backed out and re-worked. See the bug comments for more detail on the plan. The main value of this PR is to get rid of the Prod*TimeSource::instance_ statics, so we can inject time into tests. This PR doesn't actually inject time into tests differently; it just makes time injectable.

Apologies about the gigantic PR. The follow-ups can be more incremental.

Risk Level: medium: this should be non-functional, but a lot of lines of code changed. There is a fairly significant risk of merge issues too, so when merging, after this, we should be careful.
Testing: //test/...
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…hanged mocks.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…(at least one with my own new assert).

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…rkerFactory::createWorker,

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…nt an assert and made it not fire.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…eeds is a MonotonicTimeSource, so no need for the added test complexity.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz jmarantz changed the title WIP: inject time everywhere test infrastructure: inject time everywhere Aug 17, 2018
@jmarantz
Copy link
Contributor Author

ptal -- tsan looks like a download error.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…ock-time.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Contributor Author

I added a branch from this PR where I add a format-check to discourage new source code from instantiating a real-time source.

jmarantz@c1f0dba

I'll issue a PR for that after this gets merged.

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on. Overall, I like the approach of dependency injection, some interface level questions before doing the main pass.

@@ -22,7 +23,7 @@ class Api {
* Allocate a dispatcher.
* @return Event::DispatcherPtr which is owned by the caller.
*/
virtual Event::DispatcherPtr allocateDispatcher() PURE;
virtual Event::DispatcherPtr allocateDispatcher(TimeSource& time_source) PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: @param description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, i guess :)

@@ -37,6 +42,55 @@ class MonotonicTimeSource {
/**
* @return the current monotonic time.
*/
virtual MonotonicTime currentTime() PURE;
virtual MonotonicTime currentTime() const PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine some implementation that would have a non-const semantics, e.g. make some fixup adjustment to maintain monotonicity on each call, but probably fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was questioning this decision myself and now I'm reversing it. Similarly, a simulated system-time (non-monotonic) may want to increment time a tiny bit on each call to ensure that there is entropy for random seeds.

Not sure how I feel about that; we could also rely on impls that need to keep explicit state doing so via mutable. WDYT?

*
* TODO(#4160): currently this is just a container for SystemTimeSource and
* MonotonicTimeSource but we should clean that up and just have this as the
* base class.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I would also like to see less implementation stuff inside the interface, but this is OK for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be addressed as part of the TODO. Adding commentary to that effect.

MonotonicTimeSource& monotonic() { return *monotonic_; }

private:
SystemTimeSource* system_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: prefer SystemTimeSource& etc. as per Envoy style.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't; need to be able to assign to support some mocking in the tests. Adding comment.

Note also that this uses Envoy convention at the interfaces, and there are other instances in Envoy that use pointer member vars internally to enable shallow mutation.

* TODO(#4160) the implementations currently manage timer events that
* ignore the time-source, and thus can't be mocked or faked. So it's
* difficult to mock time in an integration test without mocking out
* the dispatcher.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the only benefit here then of providing the timeSource() to allow for convenient threading of this object across any object that needs time and already has a dispatcher reference?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the only benefit. I think the dispatcher needs to understand time to support timers in a manner consistent with that. I'm just not biting that off in this PR. This interface makes the injection significantly easier in this PR, and will be functionally needed in the future for the dispatcher to work properly with mocked or simulated time.

/**
* @return TimeSource& a reference to the time source.
*/
virtual TimeSource& timeSource() PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't this be inferred from dispatcher() above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be, but I feel like it's reasonable from an API perspective that the FilterConfig should provide direct access to its time source, even if the impl delegates that as an implementation detail.

/**
* @return the time source used for the server.
*/
virtual TimeSource& timeSource() PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that time is an essential interface property of a server, independent of whether it delegates ownership of the TimeSource itself to the dispatcher.

/**
* @return TimeSource& the time-source used with the cluster manager.
*/
virtual TimeSource& timeSource() PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is time really an essential property of ClusterManager?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Offhand, I am not an expert in the class; this is the first I've touched it, IIRC.

However, it appears based on its API to care about load-balancing, which I think typically has a time-component implied, no?

From a practical perspective, I need to find a time-source from places that already have a ClusterManager plumbed in and not much else. E.g. AsyncClientImpl

IMO you don't need a lot of excuses to want to know about time. Even a pure algorithm may want to employ timeouts, etc. Something I'm not proud of here is that PerfAnnotation in this PR still needs its own prod time-source, but I can't solve all problems in this PR, and I'm not sure we have a TimeSource available everywhere I might want to measure perf.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AsyncClientImpl should have a dispatcher where it can infer TimeSource. Are there other cases where we need this?

In general, I'm of the persuasion that I agree with what you write above, but that doesn't necessitate making TimeSource a part of the interface. ClusterManagerImpl can take TimeSource as a constructor arg, it doesn't have to be in the business of exporting this out to the rest of the system.

Can you see if it's practical to avoid adding to the interface here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is practical but will enlarge the CL significantly from what I can tell. Can I suggest a compromise where I add a TODO to the interfaces where it doesn't make (IMO) obvious sense (Dispatcher, arguably Server) to remove them when we've got everything plumbed directly?

I think the scary one is filters; I didn't want to change how every filter gets instantiated, preferring instead to provide access to the TimeSource via FilterConfig. Happy to remove that, but it will require changing every filter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per IM I have a follow-up in progress to remove this: jmarantz@b9a9cee

@jmarantz
Copy link
Contributor Author

Rats; I wound up committing the requested changes to a branch. Time for some git magic...

Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Contributor Author

I think that worked; ptal.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
htuch pushed a commit that referenced this pull request Aug 23, 2018
…4226)

The management of the timer_ variable in the test class was a little funky, and looked leak-prone. This PR cleans this up a little,. Without this PR, #4180 memory-leaks on this test, I think due to overwriting timer_. This is all toward resolution of #4160

Risk Level: Low.
Testing: //test/common/config:grpc_mux_impl_test
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…to use

'SimulatedTestTime' which hasn't been built yet.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…if they are not copyable.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmarantz thanks again for working on this. This PR is at the point where any reasonable reviewer is likely to make mistakes and conflate signal/noise, so can you just go through the comments I have and make the changes and we can merge? Also, please keep merging on each push to ensure we don't end up with a merge race.

/**
* @return TimeSource& the time-source used with the cluster manager.
*/
virtual TimeSource& timeSource() PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AsyncClientImpl should have a dispatcher where it can infer TimeSource. Are there other cases where we need this?

In general, I'm of the persuasion that I agree with what you write above, but that doesn't necessitate making TimeSource a part of the interface. ClusterManagerImpl can take TimeSource as a constructor arg, it doesn't have to be in the business of exporting this out to the rest of the system.

Can you see if it's practical to avoid adding to the interface here?

PerfAnnotationContext* context_;
MonotonicTime start_time_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: why reordering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initializing context is needed (from a lazy-init static) prior to computing the start-time.

@@ -4,14 +4,15 @@

namespace Envoy {

TokenBucketImpl::TokenBucketImpl(uint64_t max_tokens, double fill_rate,
MonotonicTimeSource& time_source)
TokenBucketImpl::TokenBucketImpl(uint64_t max_tokens, MonotonicTimeSource& monotonic_time_source,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: why reordering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because fill_rate is optional, but time-source is not.

@@ -16,10 +16,10 @@ class TokenBucketImpl : public TokenBucket {
* @param max_tokens supplies the maximun number of tokens in the bucket.
* @param fill_rate supplies the number of tokens that will return to the bucket on each second.
* The default is 1.
* @param time_source supplies the time source. The default is ProdMonotonicTimeSource.
* @param time_source supplies the time source.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Params are now out of order..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -25,6 +25,8 @@ class AsyncClientImpl final : public AsyncClient {
AsyncStream* start(const Protobuf::MethodDescriptor& service_method,
AsyncStreamCallbacks& callbacks) override;

TimeSource& timeSource() { return cm_.timeSource(); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you don't have the dispatcher directly here, but see https://github.com/envoyproxy/envoy/blob/master/source/common/grpc/async_client_impl.cc#L70

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that seems plausible. Actually that's effectively pulling it from cluster_manager anyway, where it looks like it goes through a process of doing a search that can throw an exception if it fails.

I'd rather just get the time-source directly.

@@ -3,6 +3,8 @@
#include <string>
#include <vector>

#include "envoy/common/time.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a corresponding BUiLD file change to deps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the zipkin rule already had an explicit dependency to time_interface.

@@ -94,6 +95,7 @@ class FilesystemSubscriptionTestHarness : public SubscriptionTestHarness {

const std::string path_;
std::string version_;
RealTestTime test_time_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Real" doesn't seem very precise for a naming convention here. How about DangerousDeprecatedTestTime? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my sed script is busy working on this; will be in the upcoming commit. This is a sufficiently nasty name to dissuade people writing new tests from using it, I hope :)

I'm not 100% sure that we should never use real-time in tests; there may be some corner cases where you really want that, such as timing out an algorithm by polling real-time. But I'm happy to switch the name for now, and once most of them are removed, we can consider renaming back.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome.

@htuch htuch merged commit 9281a72 into envoyproxy:master Aug 24, 2018
@jmarantz jmarantz deleted the inject-time-source branch August 24, 2018 19:05
htuch pushed a commit that referenced this pull request Aug 28, 2018
Follow-up to #4180 -- removes ClusterManager::timeSource() as an API. Another step toward resolving #4160

Risk Level: low
Testing: //test/...
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants