Reload tls config #5419

Merged
merged 31 commits into from
Mar 13, 2019

Member

@hanshasselberg hanshasselberg commented Mar 4, 2019

This PR introduces reloading of the TLS configuration. The interesting part is not reloading the configuration itself, which is easy, but making sure the reload actually takes effect.

  • describe in which cases it works
    • reload certificates
    • reload CAs
    • reload verify_incoming, verify_outgoing, verify_server_hostname
  • describe limitations
    • it is not possible to turn TLS on or off via a reload! TLS has to be enabled already before you reload
  • document accordingly on website

@hanshasselberg hanshasselberg changed the base branch from no_server_name_rpc to master March 6, 2019 12:02
@hanshasselberg hanshasselberg marked this pull request as ready for review March 6, 2019 12:26
@@ -3559,6 +3550,10 @@ func (a *Agent) ReloadConfig(newCfg *config.RuntimeConfig) error {
 	// the checks and service registrations.
 	a.loadTokens(newCfg)
+
+	if err := a.tlsConfigurator.Update(newCfg.ToTLSUtilConfig()); err != nil {
+		return fmt.Errorf("Failed reloading tls configuration: %s", err)
+	}
Member Author

@hanshasselberg hanshasselberg Mar 6, 2019

This is one of the cornerstones of this PR: the config is updated here, and every tls.Config created afterwards will reflect the updates.
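A minimal sketch of the idea described above, not the PR's actual code: the `Config` and `Configurator` types here are trimmed-down, hypothetical stand-ins. `Update` swaps the stored config under a write lock, and every `tls.Config` built afterwards is derived from the new values.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"sync"
)

// Config is a hypothetical, trimmed-down stand-in for the tlsutil config.
type Config struct {
	ServerName string
}

// Configurator holds the current Config and hands out fresh tls.Configs.
type Configurator struct {
	sync.RWMutex
	base Config
}

// Update replaces the stored config; a reload would call this.
func (c *Configurator) Update(cfg Config) error {
	c.Lock()
	defer c.Unlock()
	c.base = cfg
	return nil
}

// OutgoingConfig builds a new tls.Config from whatever is stored right now.
func (c *Configurator) OutgoingConfig() *tls.Config {
	c.RLock()
	defer c.RUnlock()
	return &tls.Config{ServerName: c.base.ServerName}
}

func main() {
	c := &Configurator{}
	before := c.OutgoingConfig()
	_ = c.Update(Config{ServerName: "server.dc1.consul"})
	after := c.OutgoingConfig()
	// The config built before the reload keeps the old value;
	// the one built afterwards sees the new one.
	fmt.Printf("before=%q after=%q\n", before.ServerName, after.ServerName)
}
```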

VerifyIncoming: c.VerifyIncoming,
VerifyIncomingRPC: c.VerifyIncomingRPC,
VerifyIncomingHTTPS: c.VerifyIncomingHTTPS,
VerifyOutgoing: c.VerifyOutgoing,
VerifyServerHostname: c.VerifyServerHostname,
Member Author

Oops, how could I forget that?!

}
return c, nil
}
Member Author

NewConfigurator uses Update to set the config because Update also performs all the checks!

Member Author

By the time the config has passed the checks in Update, there is nothing left that could go wrong when generating a tls.Config.
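The two comments above describe a "validate once, then trust" structure. The following is a sketch under assumptions: the single-argument `NewConfigurator`, the `check` rule, and the `Config` fields are simplified and hypothetical, chosen only to show the constructor delegating to `Update` so that both paths run the same validation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Config is a hypothetical, reduced config for illustration.
type Config struct {
	VerifyOutgoing bool
	CAFile         string
}

type Configurator struct {
	sync.RWMutex
	base Config
}

// check is an illustrative validation rule, not the real rule set.
func check(cfg Config) error {
	if cfg.VerifyOutgoing && cfg.CAFile == "" {
		return errors.New("VerifyOutgoing set, but no CA file configured")
	}
	return nil
}

// Update validates first and only then stores the config, so anything
// stored in c.base is known to be usable.
func (c *Configurator) Update(cfg Config) error {
	if err := check(cfg); err != nil {
		return err
	}
	c.Lock()
	defer c.Unlock()
	c.base = cfg
	return nil
}

// NewConfigurator reuses Update so construction and reload share the checks.
func NewConfigurator(cfg Config) (*Configurator, error) {
	c := &Configurator{}
	if err := c.Update(cfg); err != nil {
		return nil, err
	}
	return c, nil
}

func main() {
	_, err := NewConfigurator(Config{VerifyOutgoing: true})
	fmt.Println("invalid config rejected:", err != nil)

	c, err := NewConfigurator(Config{VerifyOutgoing: true, CAFile: "ca.pem"})
	fmt.Println("valid config accepted:", c != nil && err == nil)
}
```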

tlsConfig.ClientCAs = c.cas
tlsConfig.RootCAs = c.cas

tlsConfig.MinVersion = TLSLookup[c.base.TLSMinVersion]
Member Author

@hanshasselberg hanshasselberg Mar 6, 2019

This is possible because TLSLookup also contains an entry for "" that maps to Go's default, and because check makes sure the configured version matches an entry.

Member

That PR comment would probably make a good code comment for future readers!

Member Author

✔️

config := c.commonTLSConfig(c.base.VerifyIncomingRPC)
config.GetConfigForClient = func(*tls.ClientHelloInfo) (*tls.Config, error) {
return c.IncomingRPCConfig(), nil
}
Member Author

@hanshasselberg hanshasselberg Mar 6, 2019

Using GetConfigForClient is another cornerstone of this PR: it queries for a fresh configuration for each new client. This is what enables reloading the TLS config for the RPC server that accepts connections in Consul.

config := c.commonTLSConfig(c.base.VerifyIncomingHTTPS)
config.GetConfigForClient = func(hello *tls.ClientHelloInfo) (*tls.Config, error) {
config := c.IncomingHTTPSConfig()
config.NextProtos = hello.SupportedProtos
Member Author

We have a test that uses an HTTP/2 client, and for whatever reason using GetConfigForClient made that test fail until I put in the supported protocols.

Member

Could this cause issues if a client requests protos ["foo", "http"]? Wouldn't copying the values indicate to the Go TLS library that "foo" is the preferred protocol?
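One possible way to address this concern, shown only as a sketch and not as what the PR ended up doing: instead of copying the client's list verbatim, keep only the protocols the server actually supports, in the server's preference order. The `serverProtos` list and the `filterProtos` helper are hypothetical.

```go
package main

import "fmt"

// serverProtos is a hypothetical list of what this server can speak,
// in order of preference.
var serverProtos = []string{"h2", "http/1.1"}

// filterProtos keeps only offered protocols that the server supports,
// ordered by server preference rather than client preference.
func filterProtos(offered []string) []string {
	var out []string
	for _, s := range serverProtos {
		for _, o := range offered {
			if s == o {
				out = append(out, s)
				break
			}
		}
	}
	return out
}

func main() {
	// A client offering an unknown protocol first no longer influences
	// what the server advertises.
	fmt.Println(filterProtos([]string{"foo", "http/1.1"})) // prints [http/1.1]
}
```

Inside a GetConfigForClient callback this would be used as `config.NextProtos = filterProtos(hello.SupportedProtos)`.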

tlsConfig, err := c.OutgoingRPCConfig()
if err != nil {
return nil, err
}
Member Author

And this is another important line, because creating the actual configuration is postponed until later. Otherwise outgoing connections would never pick up changes in the configuration.
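The difference between capturing the configuration once and building it per call can be sketched as follows. All names are hypothetical and a plain string stands in for the `tls.Config`; the point is only that the lazy wrapper reads the state at call time.

```go
package main

import (
	"fmt"
	"sync"
)

// Configurator is a hypothetical stand-in; serverName represents the
// outgoing TLS settings.
type Configurator struct {
	sync.RWMutex
	serverName string
}

func (c *Configurator) Update(name string) {
	c.Lock()
	defer c.Unlock()
	c.serverName = name
}

func (c *Configurator) outgoingServerName() string {
	c.RLock()
	defer c.RUnlock()
	return c.serverName
}

// eagerWrapper reads the setting when the wrapper is created, so the
// returned function never notices later reloads.
func (c *Configurator) eagerWrapper() func() string {
	name := c.outgoingServerName()
	return func() string { return name }
}

// lazyWrapper postpones the read until a connection is being wrapped,
// so every call sees the latest setting.
func (c *Configurator) lazyWrapper() func() string {
	return func() string { return c.outgoingServerName() }
}

func main() {
	c := &Configurator{serverName: "old"}
	eager, lazy := c.eagerWrapper(), c.lazyWrapper()

	c.Update("new") // simulated reload

	fmt.Println("eager:", eager()) // still the value captured at creation
	fmt.Println("lazy: ", lazy())  // the value after the reload
}
```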

verifyServerHostname := c.base.VerifyServerHostname
verifyOutgoing := c.base.VerifyOutgoing
domain := c.base.Domain
c.RUnlock()
Member Author

Carefully read everything we need from the config and guard that with a lock.

@hanshasselberg hanshasselberg requested a review from a team March 6, 2019 12:45
Member

@banks banks left a comment

This looks awesome @i0rek! 🎉

Locking issue needs a fix but it should be trivial.

I think the caveats/limitations are reasonable for now but agree we should document them.

One thing I didn't see in the tests here (but did skim so could be wrong) - do we have tests that cover actual verification logic for certs?

For example, tests that use the TLS config to actually establish TLS connections as both the client and the server, with both good and bad certs/hostnames etc., and check for the desired outcome?

That seems like a pretty important thing to have confidence that the security properties we are assuming actually hold through the refactor etc.

defer c.Unlock()
return c.checks[id]
c.RLock()
config := c.OutgoingRPCConfig()
Member

Hmm OutgoingRPCConfig also takes an RLock on the same lock.

Multiple read locks are allowed, so this won't obviously deadlock and tests will pass, but there is a subtle bug here due to Go's RWMutex implementation that I happened to read about the other day.

See https://stackoverflow.com/questions/30547916/goroutine-blocks-when-calling-rwmutex-rlock-twice-after-an-rwmutex-unlock

This means that it is always unsafe to call RLock on a RWMutex that the same goroutine already has read locked.

I don't know if -race is smart enough to spot that but we should avoid it here explicitly - maybe with an internal outgoingRPCConfigLocked implementation?
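The suggested "Locked" variant could look roughly like the sketch below; the type, fields, and method names are hypothetical. The background is that a pending `Lock` blocks new readers, so a goroutine calling `RLock` a second time while a writer is waiting can deadlock; the sync package documentation prohibits recursive read locking for this reason. The pattern avoids this by taking the lock exactly once in exported methods and calling only lock-free helpers from there.

```go
package main

import (
	"fmt"
	"sync"
)

// Configurator is a hypothetical, reduced version for illustration.
type Configurator struct {
	sync.RWMutex
	verifyOutgoing bool
	domain         string
}

// outgoingConfigLocked assumes the caller already holds the lock and
// therefore must not lock again.
func (c *Configurator) outgoingConfigLocked() string {
	return fmt.Sprintf("verify=%t domain=%s", c.verifyOutgoing, c.domain)
}

// OutgoingConfig is the exported entry point; it locks exactly once.
func (c *Configurator) OutgoingConfig() string {
	c.RLock()
	defer c.RUnlock()
	return c.outgoingConfigLocked()
}

// Describe needs a consistent view of several values. It takes the read
// lock once and calls only the *Locked helper, never OutgoingConfig.
func (c *Configurator) Describe() string {
	c.RLock()
	defer c.RUnlock()
	return "wrapper using " + c.outgoingConfigLocked()
}

func main() {
	c := &Configurator{verifyOutgoing: true, domain: "consul"}
	fmt.Println(c.OutgoingConfig())
	fmt.Println(c.Describe())
}
```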

Member Author

@hanshasselberg hanshasselberg Mar 6, 2019

Great catch! -race is not smart enough. :(
I am restructuring the code a little so that all reading happens from independent small functions that can be locked without having to think about nested locking.

@hanshasselberg
Member Author

@banks All the existing tests that check with a server and a client are still there and I also added one:

func TestConfigurator_outgoingWrapper_OK(t *testing.T) {
	config := Config{
		CAFile:               "../test/hostname/CertAuth.crt",
		CertFile:             "../test/hostname/Alice.crt",
		KeyFile:              "../test/hostname/Alice.key",
		VerifyServerHostname: true,
		VerifyOutgoing:       true,
		Domain:               "consul",
	}
	client, errc := startTLSServer(&config)
	if client == nil {
		t.Fatalf("startTLSServer err: %v", <-errc)
	}
	c, err := NewConfigurator(config, nil)
	require.NoError(t, err)
	wrap := c.OutgoingRPCWrapper()
	require.NotNil(t, wrap)
	tlsClient, err := wrap("dc1", client)
	require.NoError(t, err)
	defer tlsClient.Close()
	err = tlsClient.(*tls.Conn).Handshake()
	require.NoError(t, err)
	err = <-errc
	require.NoError(t, err)
}

func TestConfigurator_outgoingWrapper_noverify_OK(t *testing.T) {
	config := Config{
		CAFile:   "../test/hostname/CertAuth.crt",
		CertFile: "../test/hostname/Alice.crt",
		KeyFile:  "../test/hostname/Alice.key",
		Domain:   "consul",
	}
	client, errc := startTLSServer(&config)
	if client == nil {
		t.Fatalf("startTLSServer err: %v", <-errc)
	}
	c, err := NewConfigurator(config, nil)
	require.NoError(t, err)
	wrap := c.OutgoingRPCWrapper()
	require.NotNil(t, wrap)
	tlsClient, err := wrap("dc1", client)
	require.NoError(t, err)
	defer tlsClient.Close()
	err = tlsClient.(*tls.Conn).Handshake()
	require.NoError(t, err)
	err = <-errc
	require.NoError(t, err)
}

func TestConfigurator_outgoingWrapper_BadDC(t *testing.T) {
	config := Config{
		CAFile:               "../test/hostname/CertAuth.crt",
		CertFile:             "../test/hostname/Alice.crt",
		KeyFile:              "../test/hostname/Alice.key",
		VerifyServerHostname: true,
		VerifyOutgoing:       true,
		Domain:               "consul",
	}
	client, errc := startTLSServer(&config)
	if client == nil {
		t.Fatalf("startTLSServer err: %v", <-errc)
	}
	c, err := NewConfigurator(config, nil)
	require.NoError(t, err)
	wrap := c.OutgoingRPCWrapper()
	tlsClient, err := wrap("dc2", client)
	require.NoError(t, err)
	err = tlsClient.(*tls.Conn).Handshake()
	_, ok := err.(x509.HostnameError)
	require.True(t, ok)
	tlsClient.Close()
	<-errc
}

func TestConfigurator_outgoingWrapper_BadCert(t *testing.T) {
	config := Config{
		CAFile:               "../test/ca/root.cer",
		CertFile:             "../test/key/ourdomain.cer",
		KeyFile:              "../test/key/ourdomain.key",
		VerifyServerHostname: true,
		VerifyOutgoing:       true,
		Domain:               "consul",
	}
	client, errc := startTLSServer(&config)
	if client == nil {
		t.Fatalf("startTLSServer err: %v", <-errc)
	}
	c, err := NewConfigurator(config, nil)
	require.NoError(t, err)
	wrap := c.OutgoingRPCWrapper()
	tlsClient, err := wrap("dc1", client)
	require.NoError(t, err)
	err = tlsClient.(*tls.Conn).Handshake()
	if _, ok := err.(x509.HostnameError); !ok {
		t.Fatalf("should get hostname err: %v", err)
	}
	tlsClient.Close()
	<-errc
}

func TestConfigurator_wrapTLS_OK(t *testing.T) {
	config := Config{
		CAFile:         "../test/ca/root.cer",
		CertFile:       "../test/key/ourdomain.cer",
		KeyFile:        "../test/key/ourdomain.key",
		VerifyOutgoing: true,
	}
	client, errc := startTLSServer(&config)
	if client == nil {
		t.Fatalf("startTLSServer err: %v", <-errc)
	}
	c, err := NewConfigurator(config, nil)
	require.NoError(t, err)
	tlsClient, err := c.wrapTLSClient("dc1", client)
	require.NoError(t, err)
	tlsClient.Close()
	err = <-errc
	require.NoError(t, err)
}

func TestConfigurator_wrapTLS_BadCert(t *testing.T) {
	serverConfig := &Config{
		CertFile: "../test/key/ssl-cert-snakeoil.pem",
		KeyFile:  "../test/key/ssl-cert-snakeoil.key",
	}
	client, errc := startTLSServer(serverConfig)
	if client == nil {
		t.Fatalf("startTLSServer err: %v", <-errc)
	}
	clientConfig := Config{
		CAFile:         "../test/ca/root.cer",
		VerifyOutgoing: true,
	}
	c, err := NewConfigurator(clientConfig, nil)
	require.NoError(t, err)
	tlsClient, err := c.wrapTLSClient("dc1", client)
	require.Error(t, err)
	require.Nil(t, tlsClient)
	err = <-errc
	require.NoError(t, err)
}

@hanshasselberg hanshasselberg requested a review from banks March 8, 2019 14:44
Member

@mkeeler mkeeler left a comment

This looks great and is going to make quite a few of our users very happy.

Just a couple of requests for comments (as I started wondering how things worked and eventually tracked it down to dead code that is only there to ease testing) and a couple of other questions.

@@ -89,7 +89,11 @@ type Client struct {
 // NewClient is used to construct a new Consul client from the
 // configuration, potentially returning an error
 func NewClient(config *Config) (*Client, error) {
-	return NewClientLogger(config, nil, tlsutil.NewConfigurator(config.ToTLSUtilConfig()))
+	c, err := tlsutil.NewConfigurator(config.ToTLSUtilConfig(), nil)
Member

Might be useful to note here that a normal Consul Agent doesn't use this method and instead will pass in its own TLS configurator.

At first I was thinking that the agent and server/client would have different configurators (and thus reloading would not work), but then realized that you pass them to NewClientLogger and NewServerLogger. Adding a comment here to mention what's going on would probably be good.

Member Author

✔️

@@ -253,7 +253,11 @@ type Server struct {
 }

 func NewServer(config *Config) (*Server, error) {
-	return NewServerLogger(config, nil, new(token.Store), tlsutil.NewConfigurator(config.ToTLSUtilConfig()))
+	c, err := tlsutil.NewConfigurator(config.ToTLSUtilConfig(), nil)
Member

Same thing here. A comment about this configurator not being used for a normal Consul agent would be helpful.

Member Author

✔️

@hanshasselberg
Member Author

@mkeeler thank you for the review, I addressed all your points.

@mkeeler
Member

mkeeler commented Mar 12, 2019

@i0rek Might want to check on those travis tests before merging. Looks like the tlsutil tests are failing to build.

tlsutil/config_test.go:671:19: undefined: nextProtos
