Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: full delta xDS (including ADS) support #7293

Merged
merged 113 commits into from
Oct 1, 2019

Conversation

fredlas
Copy link
Contributor

@fredlas fredlas commented Jun 15, 2019

Delta ADS/xDS support. Added NewGrpcMuxImpl, a GrpcMux implementation that supports delta (and is intended to soon take over for state-of-the-world). DeltaSubscriptionImpl (which can soon be renamed to/replace GrpcSubscriptionImpl) now uses that new class. NewGrpcMuxImpl is the first user of the recently added WatchMap.

Risk Level: medium (huge new thing that people aren't using yet)
Testing: added delta param variant to ads_integration_test
Docs Changes: added source/common/config/README.md and accompanying diagram, added to root/configuration/overview.rst and root/intro/arch_overview/operations/dynamic_configuration.rst.
Release Notes: added support for delta xDS (including ADS) delivery
Fixes #4991

fredlas added 30 commits March 26, 2019 18:10
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
@fredlas
Copy link
Contributor Author

fredlas commented Sep 27, 2019

Thanks, done.

Copy link
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost good to go, comments around stats / test coverage.

const std::string& version_info) {
callbacks_.onConfigUpdate(resources, version_info);
stats_.update_success_.inc();
stats_.update_attempt_.inc();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this increment be before calling callbacks_.onConfigUpdate(resources, version_info);, which may throw exception? (same for other similar place)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. This exact ordering is ultimately carried over from the old code... and for a second I thought you had found a bug in it!

What actually happens is the throw that can happen within that onConfigUpdate will get caught by the caller of this onConfigUpdate, who will then switch over into onConfigUpdateFailed, which has its own update_attempt.inc().

I think it's safe to call this unacceptably subtle behavior. It's not possible to make it as completely clean as I would like given the current logic, but I have at least changed it so that we're no longer relying on a throw->catch->otherFunction to update the stat. However, looking more closely, I actually don't think the current logic gives the most accurate value for this stat, so I left TODOs.

updateWatch(type_url, watch, {});
auto entry = subscriptions_.find(type_url);
if (entry == subscriptions_.end()) {
ENVOY_LOG(error, "removeWatch() called for non-existent subscription {}.", type_url);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No test coverage here, how could this happen? Shouldn't be just an ASSERT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, it should not happen, and keeping it from never happening only needs the logic within this class to be right. So, I have made this and most other instances in this file into asserts.

message->system_version_info());
auto sub = subscriptions_.find(message->type_url());
if (sub == subscriptions_.end()) {
ENVOY_LOG(warn,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No coverage here either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the one instance I will leave as not an assert, since it's about the server's behavior.

Added a test, in new_grpc_mux_impl_test (temporary, will be merged back into grpc_mux_impl_test upon unification).

@@ -0,0 +1,238 @@
#include "common/config/new_grpc_mux_impl.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. The ratelimit-related ones are handled in grpc_mux_impl_test, so the unification PR should take care of those.

…ent sub test

Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Copy link
Contributor

@itayd itayd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please ignore, testing

Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
@fredlas
Copy link
Contributor Author

fredlas commented Sep 30, 2019

Well, the tests all passed, but clang_tidy had a couple of issues; should be fixed now. Actually, it had many issues, all in server.cc, but IIUC the issues that are pre-existing in parts of the code I'm not modifying won't be blockers.

lizan
lizan previously approved these changes Sep 30, 2019
Copy link
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks! Any further comment @lambdai @htuch?

Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
@lambdai
Copy link
Contributor

lambdai commented Oct 1, 2019

LGTM!

Copy link
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Changes in api/ is one line doc only, so I'm going to merge this. We can iterate further from there.

@lizan lizan merged commit abb14cd into envoyproxy:master Oct 1, 2019
@fredlas
Copy link
Contributor Author

fredlas commented Oct 1, 2019

Thanks very much! It is great to have it in. I will try to get the delta+SotW unification in swiftly, so we will have nice clean, modular gRPC xDS code, more amenable to future changes and additions.

danzh2010 pushed a commit to danzh2010/envoy that referenced this pull request Oct 4, 2019
Delta ADS/xDS support. Added NewGrpcMuxImpl, a GrpcMux implementation that supports delta (and is intended to soon take over for state-of-the-world). DeltaSubscriptionImpl (which can soon be renamed to/replace GrpcSubscriptionImpl) now uses that new class. NewGrpcMuxImpl is the first user of the recently added WatchMap.

Risk Level: medium (huge new thing that people aren't using yet)
Testing: added delta param variant to ads_integration_test
Docs Changes: added source/common/config/README.md and accompanying diagram, added to root/configuration/overview.rst and root/intro/arch_overview/operations/dynamic_configuration.rst.
Release Notes: added support for delta xDS (including ADS) delivery
Fixes envoyproxy#4991

Signed-off-by: Fred Douglas <fredlas@google.com>
@hzxuzhonghu
Copy link
Contributor

@fredlas Does the api doc need update?

DELTA_GRPC is not yet entirely implemented! Initially, only CDS is available. Do not use for other xDSes. TODO(fredlas) update/remove this warning when appropriate.

https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/core/config_source.proto#enum-core-apiconfigsource-apitype

@fredlas
Copy link
Contributor Author

fredlas commented Oct 9, 2019

@fredlas Does the api doc need update?

DELTA_GRPC is not yet entirely implemented! Initially, only CDS is available. Do not use for other xDSes. TODO(fredlas) update/remove this warning when appropriate.

https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/core/config_source.proto#enum-core-apiconfigsource-apitype

Looks like it, thanks for pointing it out! I'll add that to #8478

@wbpcode
Copy link
Member

wbpcode commented Oct 12, 2019

I have read the relevant documentation and code, but I still don't know how the lazy loading and other functions are implemented. I didn't find the code to update the 'watch' at runtime. Or is this just an infrastructure, and specific features such as lazy loading are not implemented? @fredlas

@fredlas
Copy link
Contributor Author

fredlas commented Oct 14, 2019

Correct, lazy loading is for later. That was actually the motivation behind renaming all of this to "delta", to emphasize the aspect it actually includes, while leaving "incremental" to describe the eventual inclusion of lazy loading. (Which will also need to reach way into Envoy, through the various xDS users, all the way to the data plane, and generally be an even bigger change than this one, I think).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement incremental xDS in Envoy
10 participants