-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reload tls config #5419
Reload tls config #5419
Conversation
42cc35b
to
16c1c80
Compare
@@ -3559,6 +3550,10 @@ func (a *Agent) ReloadConfig(newCfg *config.RuntimeConfig) error { | |||
// the checks and service registrations. | |||
a.loadTokens(newCfg) | |||
|
|||
if err := a.tlsConfigurator.Update(newCfg.ToTLSUtilConfig()); err != nil { | |||
return fmt.Errorf("Failed reloading tls configuration: %s", err) | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the cornerstones of this PR, the config is being updated here! And every tls.Config
created afterwards will have the updates.
VerifyIncoming: c.VerifyIncoming, | ||
VerifyIncomingRPC: c.VerifyIncomingRPC, | ||
VerifyIncomingHTTPS: c.VerifyIncomingHTTPS, | ||
VerifyOutgoing: c.VerifyOutgoing, | ||
VerifyServerHostname: c.VerifyServerHostname, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ups, how could I forget that?!
} | ||
return c, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NewConfigurator
uses Update
to set the config because Update
also performs all the checks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the time it passed the checks in Update
there is nothing left that could go wrong with generating a tls.Config
.
tlsConfig.ClientCAs = c.cas | ||
tlsConfig.RootCAs = c.cas | ||
|
||
tlsConfig.MinVersion = TLSLookup[c.base.TLSMinVersion] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is possible because TLSLookup
also contains ""
with golang's default. And because check
makes sure the version correctly matches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That PR comment would probably make a good code comment for future readers!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
config := c.commonTLSConfig(c.base.VerifyIncomingRPC) | ||
config.GetConfigForClient = func(*tls.ClientHelloInfo) (*tls.Config, error) { | ||
return c.IncomingRPCConfig(), nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using GetConfigForClient
is another cornerstone of this PR, it will query for a new configuration for each new client. It enables reloading tls config for the RPC server that accepts connections in consul.
tlsutil/config.go
Outdated
config := c.commonTLSConfig(c.base.VerifyIncomingHTTPS) | ||
config.GetConfigForClient = func(hello *tls.ClientHelloInfo) (*tls.Config, error) { | ||
config := c.IncomingHTTPSConfig() | ||
config.NextProtos = hello.SupportedProtos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a test that uses a http2 client and for whatever reason, using GetConfigForClient
made that test fail. Until I put in the supported protocols.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this cause issues if a client requests protos ["foo", "http"]? Wouldn't copying the values indicate to the Go TLS library that "foo" is the preferred protocol.?
tlsutil/config.go
Outdated
tlsConfig, err := c.OutgoingRPCConfig() | ||
if err != nil { | ||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this is another important line because creating the actual configuration is postponed until later. Otherwise it outgoing connections would never pick up changes in the configuration.
tlsutil/config.go
Outdated
verifyServerHostname := c.base.VerifyServerHostname | ||
verifyOutgoing := c.base.VerifyOutgoing | ||
domain := c.base.Domain | ||
c.RUnlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Carefully read everything we need from the config and guard that with a lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks awesome @i0rek! 🎉
Locking issue needs a fix but it should be trivial.
I think the caveats/limitations are reasonable for now but agree we should document them.
One thing I didn't see in the tests here (but did skim so could be wrong) - do we have tests that cover actual verification logic for certs?
Like attempt to use the TLS config to actuall establish TLS connects as both the client and server with both good and bad certs/hostnames etc and check the desired outcome?
That seems like a pretty important thing to have confidence that the security properties we are assuming actually hold through the refactor etc.
defer c.Unlock() | ||
return c.checks[id] | ||
c.RLock() | ||
config := c.OutgoingRPCConfig() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm OutgoingRPCConfig
also takes an RLock
on the same lock.
Multiple Read locks are allowed so this won't obvious deadlock and tests will pass, but there is a subtle bug here due to Go's RWMutex implementation that I happened to read about the other day.
This means that it is always unsafe to call RLock on a RWMutex that the same goroutine already has read locked.
I don't know if -race
is smart enough to spot that but we should avoid it here explicitly - maybe with an internal outgoingRPCConfigLocked
implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch! -race
is not smart enough. :(
I am restructuring the code a little so that all reading happens from independent small functions that can be locked without having to think about nested locking.
@banks All the existing tests that check with a server and a client are still there and I also added one: Lines 57 to 226 in cab155a
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great and is going to make quite a few of our users very happy.
Just a couple requests for comments (as I started wondering how things worked and eventually tracked it down to dead code only there for easing testing) and a couple of other question.
@@ -89,7 +89,11 @@ type Client struct { | |||
// NewClient is used to construct a new Consul client from the | |||
// configuration, potentially returning an error | |||
func NewClient(config *Config) (*Client, error) { | |||
return NewClientLogger(config, nil, tlsutil.NewConfigurator(config.ToTLSUtilConfig())) | |||
c, err := tlsutil.NewConfigurator(config.ToTLSUtilConfig(), nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be useful to note here that a normal Consul Agent doesn't use this method and instead will pass in its own TLS configurator.
At first I was thinking that the agent and server/client would have different configurators (and thus reloading would not work) but realized that you have it passing them to NewClientLogger and NewServerLogger. Adding a comment here to mention whats going on would probably be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
@@ -253,7 +253,11 @@ type Server struct { | |||
} | |||
|
|||
func NewServer(config *Config) (*Server, error) { | |||
return NewServerLogger(config, nil, new(token.Store), tlsutil.NewConfigurator(config.ToTLSUtilConfig())) | |||
c, err := tlsutil.NewConfigurator(config.ToTLSUtilConfig(), nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing here. A comment about this configurator not being used for a normal Consul agent would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
tlsutil/config.go
Outdated
config := c.commonTLSConfig(c.base.VerifyIncomingHTTPS) | ||
config.GetConfigForClient = func(hello *tls.ClientHelloInfo) (*tls.Config, error) { | ||
config := c.IncomingHTTPSConfig() | ||
config.NextProtos = hello.SupportedProtos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this cause issues if a client requests protos ["foo", "http"]? Wouldn't copying the values indicate to the Go TLS library that "foo" is the preferred protocol.?
@mkeeler thank you for the review, I addressed all your points. |
@i0rek Might want to check on those travis tests before merging. Looks like the tlsutil tests are failing to build.
|
This PR introduces reloading tls configuration. The interesting thing about this is that it is not about reloading the configuration, which is easy. But about making sure the reload actually has an effect.
verify_incoming
,verify_outgoing
,verify_server_hostname