
clientv3: fix secure endpoint failover, refactor with gRPC 1.22 upgrade #10911

Merged 12 commits on Jul 26, 2019

Conversation

gyuho
Contributor

@gyuho gyuho commented Jul 19, 2019

Fix kubernetes/kubernetes#72102.

Manually tested, and works:

Server 1 certificate:

X509v3 extensions:
    X509v3 Key Usage: critical
        Digital Signature, Key Encipherment
    X509v3 Extended Key Usage:
        TLS Web Server Authentication, TLS Web Client Authentication
    X509v3 Basic Constraints: critical
        CA:FALSE
    X509v3 Subject Alternative Name:
        DNS:localhost, IP Address:127.0.0.1, IP Address:192.168.223.156

Server 2 certificate:

X509v3 extensions:
    X509v3 Key Usage: critical
        Digital Signature, Key Encipherment
    X509v3 Extended Key Usage:
        TLS Web Server Authentication, TLS Web Client Authentication
    X509v3 Basic Constraints: critical
        CA:FALSE
    X509v3 Subject Alternative Name:
        DNS:localhost, IP Address:127.0.0.1, IP Address:192.168.121.180

Server 3 certificate:

X509v3 extensions:
    X509v3 Key Usage: critical
        Digital Signature, Key Encipherment
    X509v3 Extended Key Usage:
        TLS Web Server Authentication, TLS Web Client Authentication
    X509v3 Basic Constraints: critical
        CA:FALSE
    X509v3 Subject Alternative Name:
        DNS:localhost, IP Address:127.0.0.1, IP Address:192.168.150.136
  1. set up 3 instances 192.168.223.156, 192.168.121.180, 192.168.150.136
  2. start etcd cluster with TLS enabled
  3. shut down the first node 192.168.223.156
  4. send client requests with --endpoints 192.168.223.156:2379,192.168.121.180:2379,192.168.150.136:2379
  5. client requests may fail on the first endpoint 192.168.223.156, but the client balancer should fail over to the other endpoints
$ ETCD_CLIENT_DEBUG=1 ETCDCTL_API=3 /usr/local/bin/etcdctl   --endpoints 192.168.223.156:2379,192.168.121.180:2379,192.168.150.136:2379   --cacert ${HOME}/certs/etcd-root-ca.pem   --cert ${HOME}/certs/s1.pem   --key ${HOME}/certs/s1-key.pem   put foo bar

{"level":"info","ts":1563550687.373153,"caller":"balancer/balancer.go:79","msg":"built balancer","balancer-id":"bvncglqjjf4a","policy":"etcd-client-roundrobin-balanced","resolver-target":"endpoint://client-7ba31ebc-ac8e-4fdf-b973-d03190a7b920/192.168.223.156:2379"}
{"level":"info","ts":1563550687.3732452,"caller":"balancer/balancer.go:131","msg":"resolved","balancer-id":"bvncglqjjf4a","addresses":["192.168.121.180:2379","192.168.150.136:2379","192.168.223.156:2379"]}
{"level":"info","ts":1563550687.373312,"caller":"balancer/balancer.go:189","msg":"state changed","balancer-id":"bvncglqjjf4a","connected":false,"subconn":"0xc000350aa0","address":"192.168.223.156:2379","old-state":"IDLE","new-state":"CONNECTING"}
{"level":"info","ts":1563550687.3733385,"caller":"balancer/balancer.go:189","msg":"state changed","balancer-id":"bvncglqjjf4a","connected":false,"subconn":"0xc000350ac0","address":"192.168.121.180:2379","old-state":"IDLE","new-state":"CONNECTING"}
{"level":"info","ts":1563550687.3733535,"caller":"balancer/balancer.go:189","msg":"state changed","balancer-id":"bvncglqjjf4a","connected":false,"subconn":"0xc000350ae0","address":"192.168.150.136:2379","old-state":"IDLE","new-state":"CONNECTING"}
{"level":"info","ts":1563550687.3736932,"caller":"balancer/balancer.go:189","msg":"state changed","balancer-id":"bvncglqjjf4a","connected":false,"subconn":"0xc000350aa0","address":"192.168.223.156:2379","old-state":"CONNECTING","new-state":"TRANSIENT_FAILURE"}
{"level":"info","ts":1563550687.3815699,"caller":"balancer/balancer.go:189","msg":"state changed","balancer-id":"bvncglqjjf4a","connected":true,"subconn":"0xc000350ac0","address":"192.168.121.180:2379","old-state":"CONNECTING","new-state":"READY"}
{"level":"info","ts":1563550687.3816028,"caller":"balancer/balancer.go:257","msg":"generated picker","balancer-id":"bvncglqjjf4a","policy":"etcd-client-roundrobin-balanced","subconn-ready":["192.168.121.180:2379 (0xc000350ac0)"],"subconn-size":1}
{"level":"info","ts":1563550687.3828685,"caller":"balancer/balancer.go:189","msg":"state changed","balancer-id":"bvncglqjjf4a","connected":true,"subconn":"0xc000350ae0","address":"192.168.150.136:2379","old-state":"CONNECTING","new-state":"READY"}
{"level":"info","ts":1563550687.3828938,"caller":"balancer/balancer.go:257","msg":"generated picker","balancer-id":"bvncglqjjf4a","policy":"etcd-client-roundrobin-balanced","subconn-ready":["192.168.121.180:2379 (0xc000350ac0)","192.168.150.136:2379 (0xc000350ae0)"],"subconn-size":2}
OK

Without this patch, it fails:

{"level":"info","ts":1563549545.0493703,"caller":"balancer/balancer.go:193","msg":"state changed","balancer-id":"bvnc1znj92bp","connected":false,"subconn":"0xc00038ad00","address":"192.168.121.180:2379","old-state":"CONNECTING","new-state":"TRANSIENT_FAILURE"}
{"level":"warn","ts":"2019-07-19T15:19:07.203Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-1ac3c2d5-25e2-45af-ac0c-3b71b11dce83/192.168.223.156:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: authentication handshake failed: x509: certificate is valid for 127.0.0.1, 192.168.121.180, not 192.168.223.156\""}

Action Items

  • Replace authTokenCredential with custom credential bundle
  • Improve logging
  • CI testing
  • Enable client balancer testing in functional tests? Functional tests are running with secure endpoints. This could have been caught.

/cc @jpbetz @xiang90

@gyuho gyuho added the WIP label Jul 19, 2019
@gyuho
Contributor Author

gyuho commented Jul 19, 2019

This is the initial proposal. I will improve the logging and docs around this change.
CI tests may fail. Will look into this.

@hexfusion @jingyih @wenjiaswe @spzala Can you take a look as well?

This should be the last major change before 3.4 code freeze.

@dims
Contributor

dims commented Jul 19, 2019

@gyuho is the crux of the change this bit: dd73d08 db61ee1? Or is there another change in gRPC that is needed along with dd73d08 db61ee1?

@jingyih jingyih added this to the etcd-v3.4 milestone Jul 19, 2019
@gyuho
Contributor Author

gyuho commented Jul 19, 2019

@dims dd73d08 db61ee1 is the only change needed in etcd. Other than upgrading gRPC in etcd, we don't need any change in gRPC itself. The newer gRPC is needed because older versions do not support credential bundles.

@dims
Contributor

dims commented Jul 19, 2019

excellent. thanks @gyuho

@gyuho gyuho changed the title clientv3: fix secure endpoint failover, upgrade gRPC to 1.22 clientv3: fix secure endpoint failover, refactor with gRPC 1.22 upgrade Jul 22, 2019
@gyuho gyuho force-pushed the balancer branch 2 times, most recently from 874f1bf to 010e6d7 Compare July 23, 2019 15:37
@gyuho gyuho removed the WIP label Jul 23, 2019
@gyuho gyuho requested a review from jpbetz July 23, 2019 17:55
@gyuho gyuho self-assigned this Jul 23, 2019
@jingyih
Contributor

jingyih commented Jul 23, 2019

Thanks for implementing the fix. Do we have a test to verify the original bug was fixed?

@gyuho
Contributor Author

gyuho commented Jul 23, 2019

@jingyih

See

Manually tested, and works:

We can only test it manually, because the fix is in our TLS SAN field authentication logic :)

I manually confirmed that this fixes the original bug.

@gyuho gyuho force-pushed the balancer branch 2 times, most recently from 2e3d59b to 1b4a401 Compare July 25, 2019 17:32
return nil, nil
}

// transportCredential implements "grpccredentials.PerRPCCredentials" interface.
nit: "grpccredentials.TransportCredentials"

@@ -71,6 +71,19 @@ func newTransportCredential(cfg *tls.Config) *transportCredential {
}

func (tc *transportCredential) ClientHandshake(ctx context.Context, authority string, rawConn net.Conn) (net.Conn, grpccredentials.AuthInfo, error) {
target := rawConn.RemoteAddr().String()
if authority != target {
do we need to check authority is an IP address rather than a DNS name?

@xiang90
Contributor

xiang90 commented Jul 25, 2019

the core idea looks good to me.

@gyuho
Contributor Author

gyuho commented Jul 25, 2019

@xiang90 Good point. I made a change to overwrite the authority only when the target is an IP. Manually tested with SRV records:

diff --git a/tests/docker-dns-srv/certs/server-ca-csr.json b/tests/docker-dns-srv/certs/server-ca-csr.json
index 661de3799..f3a23c656 100644
--- a/tests/docker-dns-srv/certs/server-ca-csr.json
+++ b/tests/docker-dns-srv/certs/server-ca-csr.json
@@ -19,8 +19,6 @@
     "m4.etcd.local",
     "m5.etcd.local",
     "m6.etcd.local",
-    "etcd.local",
-    "127.0.0.1",
-    "localhost"
+    "etcd.local"
   ]
 }

Before (if we always overwrite the authority):

{"level":"warn","ts":"2019-07-25T21:14:38.454Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-0d92363c-2ca8-42a9-ae36-ddb1700a016e/m5.etcd.local.:23791","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = "transport: authentication handshake failed: x509: cannot validate certificate for 127.0.0.1 because it doesn't contain any IP SANs""}

After (if we overwrite only when it's an IP):

No error

@gyuho gyuho force-pushed the balancer branch 3 times, most recently from 8c64b9f to 3b39c57 Compare July 25, 2019 22:12
@etcd-io etcd-io deleted a comment from codecov-io Jul 25, 2019
@codecov-io

codecov-io commented Jul 25, 2019

Codecov Report

Merging #10911 into master will decrease coverage by 0.23%.
The diff coverage is 81.76%.


@@            Coverage Diff             @@
##           master   #10911      +/-   ##
==========================================
- Coverage    63.9%   63.66%   -0.24%     
==========================================
  Files         400      400              
  Lines       37688    37415     -273     
==========================================
- Hits        24085    23821     -264     
- Misses      11966    11976      +10     
+ Partials     1637     1618      -19
Impacted Files Coverage Δ
embed/serve.go 37.38% <0%> (ø) ⬆️
clientv3/balancer/picker/roundrobin_balanced.go 100% <100%> (ø) ⬆️
clientv3/balancer/utils.go 100% <100%> (ø) ⬆️
clientv3/balancer/picker/err.go 100% <100%> (ø) ⬆️
clientv3/balancer/connectivity/connectivity.go 100% <100%> (ø)
etcdserver/api/v3rpc/grpc.go 100% <100%> (ø) ⬆️
clientv3/balancer/picker/picker.go 50% <50%> (ø)
clientv3/client.go 75.95% <70.58%> (-2.58%) ⬇️
clientv3/credentials/credentials.go 77.27% <77.27%> (ø)
clientv3/balancer/balancer.go 85.07% <90.47%> (+0.94%) ⬆️
... and 35 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 50babc1...a7b8034.

@jpbetz jpbetz left a comment


Did a first pass review. I'll do a second pass tomorrow to make sure I fully understand some of the details.

oldAggrState := bb.currentState
bb.currentState = bb.csEvltr.recordTransition(old, s)
oldAggrState := bb.connectivityRecorder.GetCurrentState()
bb.connectivityRecorder.RecordTransition(old, s)

// Regenerate picker when one of the following happens:

Update the comment to reflect that we're calling updatedPicker now instead of regeneratePicker?

rc.mu.RLock()
state = rc.cur
rc.mu.RUnlock()
return state

nit: Can we just do:

rc.mu.RLock()
defer rc.mu.RUnlock()
return rc.cur

?

rc.numConnecting += updateVal
case connectivity.TransientFailure:
rc.numTransientFailure += updateVal
}

nit: Log a warn or something if none of the cases match?

// Error is error picker policy.
Error Policy = iota

// RoundrobinBalanced balance loads over multiple endpoints

balances

// Picker implements gRPC picker.
// Leave empty if "Policy" field is not custom.
// TODO: currently custom policy is not supported.
// Picker picker.Picker

Remove the comments about picker configuration and the Custom picker enum in this PR and instead open an issue about it? It adds quite a bit of noise to this PR; it might be better to just leave it out of the code until we decide to implement something. Same for custom balancers.

}

func (tc *transportCredential) Clone() grpccredentials.TransportCredentials {
return tc.gtc.Clone()

return &transportCredential{
  gtc: tc.gtc.Clone(),
}

? (I realize we're not using it for anything, but just to future proof..)

// perRPCCredential implements "grpccredentials.PerRPCCredentials" interface.
type perRPCCredential struct {
authToken string
authTokenMu *sync.RWMutex

pointer needed? just do authTokenMu sync.RWMutex to eliminate authTokenMu: new(sync.RWMutex) below?

}
}

func (tc *transportCredential) ClientHandshake(ctx context.Context, authority string, rawConn net.Conn) (net.Conn, grpccredentials.AuthInfo, error) {

This looks right to me.

@wenjiaswe Would you double check this function as well? Docs for ClientHandshake here: https://godoc.org/google.golang.org/grpc/credentials#TransportCredentials


I checked the function, lgtm, thanks!

@jpbetz
Contributor

jpbetz commented Jul 26, 2019

@xiang90 @gyuho Will this fix be backported to the release-3.3 branch (hopefully a v3.3.14)?

I don't think 3.3 had the problem this fixes. This is a fix we need as a result of the transition to the gRPC load balancer that was introduced on master for 3.4.

@gyuho gyuho merged commit 89d4002 into etcd-io:master Jul 26, 2019
@jingyih
Contributor

jingyih commented Jul 26, 2019

The original issue from Kubernetes repo was reported with etcd version 3.2.24. etcd 3.2 and 3.3 do not use the grpc load balancer, but still have the same issue which needs to be fixed (primarily by 13e26a4.)

I think we should consider backporting (part of) this.

@dims
Contributor

dims commented Jul 26, 2019

@jingyih correct. And as I said before in #10911 (comment), we tried patching up the ClientHandshake in gRPC directly (the vendored copy in k/k) to confirm that the patch works in manual testing.

@gyuho
Contributor Author

gyuho commented Jul 27, 2019

@dims @jpbetz We did something similar for etcd 3.2 (ref. kubernetes/kubernetes#57480) to backport client balancer fixes from 3.3 to 3.2. We can look into backporting this after 3.4 release.

@frittentheke

@gyuho massive thanks for fixing this.
Regarding fixing this for older versions, what would keep Kubernetes, or rather kubeadm, from installing etcd 3.4 directly instead of a point release of 3.2 or 3.3 containing this kind of bugfix?

@dims
Contributor

dims commented Jul 29, 2019

@frittentheke the fix is in the etcd client code that is vendored into k/k and hence part of the API server, so just updating the etcd "server" will not help in any way to mitigate the problem.

@frittentheke

frittentheke commented Jul 29, 2019

@dims urgh, yeah, my bad.

But since this is solely a client issue, there is no need for any backporting, is there?
The only issue that might need raising with them is to pull in a more recent version of the etcd client lib for the next point releases of all still-supported K8s versions, right?

@dims
Contributor

dims commented Jul 29, 2019

@frittentheke yes, the "etcd client lib" with the fix in this PR; note that this drags in a new gRPC as well. This combination has not landed in k/k master, so there is no testing at all at this point :) Also see my note in the k/k issue: kubernetes/kubernetes#72102 (comment)

@pariminaresh

pariminaresh commented Aug 22, 2019

We are experiencing the same issue: the API server crashes every time the first etcd server goes down. We were using etcd v3.3.13 and upgraded to v3.3.15 after coming across this PR and kubernetes/kubernetes#72102.

But I still see that the apiserver is crashing. After spending some time making sure I'm using the proper versions, I decided to ask for help from other folks here. I believe I might be missing something; please correct me if so. Thanks!

Below are the details:

  1. 3 node cluster : 10.66.0.162,10.66.0.166,10.66.0.168
  2. shutdown 10.66.0.166 & running test on 10.66.0.162
1# etcdctl -version
etcdctl version: 3.3.15
API version: 2

2# etcd -version
etcd Version: 3.3.15
Git SHA: 94745a4ee
Go Version: go1.12.9
Go OS/Arch: linux/amd64

3# ETCDCTL_API=3 etcdctl --endpoints=10.66.0.166:2379,10.66.0.162:2379,10.66.0.168:2379 --cacert /etc/kubernetes/pki/etcd/ca.pem --cert /etc/kubernetes/pki/etcd/client.pem put --key /etc/kubernetes/pki/etcd/client-key.pem foo bar
{"level":"warn","ts":"2019-08-22T21:37:36.613Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-3fc8c37e-a22f-4452-a372-9c98664d645e/10.66.0.166:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: authentication handshake failed: x509: certificate is valid for 10.66.0.162, not 10.66.0.166\""}

4# ETCDCTL_API=3 etcdctl --endpoints=10.66.0.162:2379,10.66.0.166:2379,10.66.0.168:2379 --cacert /etc/kubernetes/pki/etcd/ca.pem --cert /etc/kubernetes/pki/etcd/client.pem put --key /etc/kubernetes/pki/etcd/client-key.pem foo bar
OK

As you can see, I expected an 'OK' response in the 3rd step as well. Do I need to re-generate the certificates in any specific way? The same certs work fine when the first server is up. Please point me in the right direction.

@zhangxinZKCD

I use a domain name to issue a certificate, but the error still exists!

{"level":"warn","ts":"2019-09-17T19:23:17.183+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-8728de64-f8ea-4354-8151-4365e42b3acd/e2341-1.etcd.dpool.com.cn:2341","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for e2341-4.etcd.dpool.com.cn, not e2341-1.etcd.dpool.com.cn""}

jsok added a commit to jsok/vault that referenced this pull request Sep 23, 2019
Contains an important fix in clientv3 that allows vault to
successfully fail over to another etcdv3 endpoint in the event that the
current active connection becomes unavailable.

See also:
 * etcd-io/etcd#9949
 * etcd-io/etcd#10911
 * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3314-2019-08-16

Fixes hashicorp#4961
caseydavenport added a commit to caseydavenport/libcalico-go that referenced this pull request Dec 17, 2019
Specifically, we are looking for this fix etcd-io/etcd#10911

Which was included in v3.3.14 according to the changelog: https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3314-2019-08-16
Labels
None yet