Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connect: reconcile how upstream configuration works with discovery chains #6225

Merged
merged 3 commits into from
Aug 2, 2019

Conversation

rboyer
Copy link
Member

@rboyer rboyer commented Jul 26, 2019

The following upstream config fields for connect sidecars sanely
integrate into discovery chain resolution:

  • Destination Namespace/Datacenter: Compilation occurs locally but using
    different default values for namespaces and datacenters. The xDS
    clusters that are created are named as they normally would be.

  • Mesh Gateway Mode (single upstream): If set this value overrides any
    value computed for any resolver for the entire discovery chain. The xDS
    clusters that are created may be named differently (see below).

  • Mesh Gateway Mode (whole sidecar): If set this value overrides any
    value computed for any resolver for the entire discovery chain. If this
    is specifically overridden for a single upstream this value is ignored
    in that case. The xDS clusters that are created may be named differently
    (see below).

  • Protocol (in opaque config): If set this value overrides the value
    computed when evaluating the entire discovery chain. If the normal chain
    would be TCP or if this override is set to TCP then the result is that
    we explicitly disable L7 Routing and Splitting. The xDS clusters that
    are created may be named differently (see below).

  • Connect Timeout (in opaque config): If set this value overrides the
    value for any resolver in the entire discovery chain. The xDS clusters
    that are created may be named differently (see below).

If any of the above overrides affect the actual result of compiling the
discovery chain (i.e. "tcp" becomes "grpc" instead of being a no-op
override to "tcp") then the relevant parameters are hashed and provided
to the xDS layer as a prefix for use in naming the Clusters. This is to
ensure that if one Upstream discovery chain has no overrides and
tangentially needs a cluster named "api.default.XXX", and another
Upstream does have overrides for "api.default.XXX" that they won't
cross-pollinate against the operator's wishes.

Fixes #6159

@rboyer rboyer added this to the 1.6.0-final milestone Jul 26, 2019
@rboyer rboyer requested a review from a team July 26, 2019 18:33
@rboyer rboyer self-assigned this Jul 26, 2019
@@ -305,6 +304,203 @@ func TestState_WatchesAndUpdates(t *testing.T) {
stages []verificationStage
}

newConnectProxyCase := func(meshGatewayProxyConfigValue structs.MeshGatewayMode) testCase {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I lifted the definition of the former "connect-proxy" case into a generator function so I could reuse the whole setup for different proxy-devel values of the mesh gateway mode without copypasta.

leafWatchID: genVerifyLeafWatch("web", "dc1"),
intentionsWatchID: genVerifyIntentionWatch("web", "dc1"),
"upstream:prepared_query:query": genVerifyPreparedQueryWatch("query", "dc1"),
"discovery-chain:api": genVerifyDiscoveryChainWatch(&structs.DiscoveryChainRequest{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything goes through discovery chain now except for prepared queries.

agent/xds/sni.go Outdated
@@ -45,3 +45,10 @@ func QuerySNI(service string, datacenter string, cfgSnap *proxycfg.ConfigSnapsho
func TargetSNI(target structs.DiscoveryTarget, cfgSnap *proxycfg.ConfigSnapshot) string {
return ServiceSNI(target.Service, target.ServiceSubset, target.Namespace, target.Datacenter, cfgSnap)
}

func CustomizeSNI(sni string, chain *structs.CompiledDiscoveryChain) string {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only used when computing the envoy names for clusters.

"resources": [
{
"@type": "type.googleapis.com/envoy.api.v2.Cluster",
"name": "78ebd528.db.default.dc1.internal.11111111-2222-3333-4444-555555555555.consul",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main subtle difference now.

@@ -90,10 +90,10 @@ function init_workdir {
cp consul-base-cfg/* workdir/${DC}/consul/

# Add any overrides if there are any (no op if not)
find ${CASE_DIR} -name '*.hcl' -maxdepth 1 -type f -exec cp -f {} workdir/${DC}/consul \;
find ${CASE_DIR} -maxdepth 1 -name '*.hcl' -type f -exec cp -f {} workdir/${DC}/consul \;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed an unrelated warning I was seeing executing find with args in the wrong order.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats bsd vs gnu coreutils for you.

@rboyer rboyer force-pushed the validate-upstreams branch from 1145897 to 82c4252 Compare July 29, 2019 14:23
@rboyer rboyer force-pushed the upstreams-in-chain branch from 1d9f4b2 to 1e0f33b Compare July 29, 2019 14:23
@rboyer rboyer modified the milestones: 1.6.0-final, 1.6.0-beta4 Jul 29, 2019
@rboyer rboyer changed the base branch from validate-upstreams to release/1-6 August 1, 2019 18:27
…ains

The following upstream config fields for connect sidecars sanely
integrate into discovery chain resolution:

- Destination Namespace/Datacenter: Compilation occurs locally but using
different default values for namespaces and datacenters. The xDS
clusters that are created are named as they normally would be.

- Mesh Gateway Mode (single upstream): If set this value overrides any
value computed for any resolver for the entire discovery chain. The xDS
clusters that are created may be named differently (see below).

- Mesh Gateway Mode (whole sidecar): If set this value overrides any
value computed for any resolver for the entire discovery chain. If this
is specifically overridden for a single upstream this value is ignored
in that case. The xDS clusters that are created may be named differently
(see below).

- Protocol (in opaque config): If set this value overrides the value
computed when evaluating the entire discovery chain. If the normal chain
would be TCP or if this override is set to TCP then the result is that
we explicitly disable L7 Routing and Splitting. The xDS clusters that
are created may be named differently (see below).

- Connect Timeout (in opaque config): If set this value overrides the
value for any resolver in the entire discovery chain. The xDS clusters
that are created may be named differently (see below).

If any of the above overrides affect the actual result of compiling the
discovery chain (i.e. "tcp" becomes "grpc" instead of being a no-op
override to "tcp") then the relevant parameters are hashed and provided
to the xDS layer as a prefix for use in naming the Clusters. This is to
ensure that if one Upstream discovery chain has no overrides and
tangentially needs a cluster named "api.default.XXX", and another
Upstream does have overrides for "api.default.XXX" that they won't
cross-pollinate against the operator's wishes.

Fixes #6159
@rboyer
Copy link
Member Author

rboyer commented Aug 1, 2019

In a discussion @banks raised the point that OverrideProtocol and OverrideMeshGateway make some sense getting passed all the way into the compiler, but that OverrideConnectTimeout could be handled during the CDS pass of xDS.

The one place this breaks down is when a cluster gets shared between two upstreams in the same sidecar proxy:

  • upstream1: name=db (split: [db, old-db])
  • upstream2: name=old-db, connectTimeoutOverride=33 seconds

In this configuration upstream1 needs clusters named db and old-db both using the values computed by the standard discovery chain process. upstream2 needs just old-db but it would need to change the value of the connect timeout used.

The solution in this PR would hash the connect timeout override value and prepend the old-db cluster for use by upstream2

Without that it is undefined which old-db cluster ends up being sent to envoy (and may flap).

I'm going to keep this as-is for now and open an issue (#6262) to look into ways of achieving the same thing perhaps in a way that doesn't necessarily need to take a trip through the compiler.

agent/xds/clusters.go Outdated Show resolved Hide resolved
@@ -90,10 +90,10 @@ function init_workdir {
cp consul-base-cfg/* workdir/${DC}/consul/

# Add any overrides if there are any (no op if not)
find ${CASE_DIR} -name '*.hcl' -maxdepth 1 -type f -exec cp -f {} workdir/${DC}/consul \;
find ${CASE_DIR} -maxdepth 1 -name '*.hcl' -type f -exec cp -f {} workdir/${DC}/consul \;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats bsd vs gnu coreutils for you.

@rboyer rboyer merged commit 6393edb into release/1-6 Aug 2, 2019
@rboyer rboyer deleted the upstreams-in-chain branch August 2, 2019 03:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants