Skip to content

Commit

Permalink
fix: strip monotonic clock reading in cert check
Browse files Browse the repository at this point in the history
In Go, when two time.Time objects have a monotonic clock value, calls to
time.After will use the monotonic readings to compare the two values.
However, when the CPU goes to sleep and later awakes, the monotonic
clock will stop on some systems. As a result, the time comparison will
produce inaccurate results. See the Go documentation on time for a
discussion of the details: https://pkg.go.dev/time#hdr-Monotonic_Clocks.

In an earlier commit [1], we added code to ensure the ephemeral
certificate was not expired. If it was, we blocked the connection
attempt on a certificate refresh. This solved one bug where a laptop
went to sleep, later awoke, and was unable to connect given the existing
certificate had expired. However, the change in #686 fixed this bug, but
only for clients connecting with built-in authentication.

[1] #686

When the Go Connector is configured to use built-in authentication, it
compares the ephemeral certificate's expiration (a time.Time object
produced from JSON deserialization and therefore lacking a monotonic clock
reading) with time.Now(). Because only one of the two time values has a
monotonic reading (i.e. time.Now()), the wall clock time is used to
compare the two vales and calls to time.After produce valid results.

When the Go Connector is configured to use auto IAM authentication, it
updates the certificate's expiration to be the earlier of the
certificate's expiration or the OAuth2 token's expiration. In the
majority of cases, the OAuth2 token will expire earlier and so its
expiration will be set to the certificate's expiration. Further, the
OAuth2 token's expiration is a time.Time value *with* a monotonic clock
reading. As a result, calls to time.After will be using time values that
both have a monotonic clock reading, and so are susceptible to
producing the wrong result when a machine has been asleep for some
time.

For example, with debug logging enabled in the Proxy, we observed the
following logs after suspending the VM running the Proxy for more
than an hour, resuming the VM, and then attempting to connect:

Accepted connection from 127.0.0.1:56782
Now = 2024-03-10T04:34:37Z, Current cert expiration = 2024-03-10T00:38:48Z
Cert is valid = true
Dialing <INSTANCE_IP_ADDR>
connection aborted - error reading from instance: remote error: tls: bad certificate

Even though the cert expiration is clearly before "now," the Go
Connector incorrectly concludes the certificate is still valid. This is
on account of the erroneous monotonic clock value in time.Now that is
then used in calls to time.After.

This commit resolves this last bug by stripping the monotonic clock
value from both times before doing the comparison. In addition to
time.Round(0), time.UTC() will also strip the monotonic reading. This
commit uses time.UTC().

Past, related issues:

GoogleCloudPlatform/cloud-sql-proxy#1788
GoogleCloudPlatform/cloud-sql-proxy#1199
#402
  • Loading branch information
enocom committed Mar 10, 2024
1 parent aecc030 commit aace409
Showing 1 changed file with 18 additions and 11 deletions.
29 changes: 18 additions & 11 deletions dialer.go
Original file line number Diff line number Diff line change
Expand Up @@ -268,7 +268,7 @@ func (d *Dialer) Dial(ctx context.Context, icn string, opts ...DialOption) (conn
// The TLS handshake will not fail on an expired client certificate. It's
// not until the first read where the client cert error will be surfaced.
// So check that the certificate is valid before proceeding.
if invalidClientCert(cn, d.logger, tlsConfig) {
if !validClientCert(cn, d.logger, tlsConfig) {
d.logger.Debugf("[%v] Refreshing certificate now", cn.String())
i.ForceRefresh()
// Block on refreshed connection info
Expand Down Expand Up @@ -332,27 +332,34 @@ func (d *Dialer) Dial(ctx context.Context, icn string, opts ...DialOption) (conn
}), nil
}

func invalidClientCert(cn instance.ConnName, l debug.Logger, c *tls.Config) bool {
// validClientCert checks that the ephemeral client certificate retrieved from
// the cache is unexpired. The time comparisons strip the monotonic clock value
// to ensure an accurate result, even after laptop sleep.
func validClientCert(cn instance.ConnName, l debug.Logger, c *tls.Config) bool {
// The following conditions should be impossible (no certs, nil leaf), but
// just in case there's an unknown edge case, check assumptions before
// proceeding.
if len(c.Certificates) == 0 {
return true
return false
}
if c.Certificates[0].Leaf == nil {
return true
return false
}
now := time.Now()
notAfter := c.Certificates[0].Leaf.NotAfter
invalid := now.After(notAfter)
// Use UTC() to strip monotonic clock value to guard against inaccurate
// comparisons, especially after laptop sleep.
// See the comments on the monotonic clock in the Go documentation for
// details: https://pkg.go.dev/time#hdr-Monotonic_Clocks
now := time.Now().UTC()
expiration := c.Certificates[0].Leaf.NotAfter.UTC()
valid := expiration.After(now)
l.Debugf(
"[%v] Now = %v, Current cert expiration = %v",
cn.String(),
now.UTC().Format(time.RFC3339),
notAfter.UTC().Format(time.RFC3339),
now.Format(time.RFC3339),
expiration.Format(time.RFC3339),
)
l.Debugf("[%v] Cert is valid = %v", cn.String(), !invalid)
return invalid
l.Debugf("[%v] Cert is valid = %v", cn.String(), valid)
return valid
}

// EngineVersion returns the engine type and version for the instance
Expand Down

0 comments on commit aace409

Please sign in to comment.