Fix config naming and add tests for new CA configs

hashicorp · Jan 18, 2019 · 411b239 · 411b239
1 parent fea202d
commit 411b239
Show file tree

Hide file tree

Showing 4 changed files with 24 additions and 16 deletions.
diff --git a/agent/cache-types/connect_ca_leaf.go b/agent/cache-types/connect_ca_leaf.go
@@ -21,13 +21,13 @@ const ConnectCALeafName = "connect-ca-leaf"
 
 // caChangeJitterWindow is the time over which we spread each round of retries
 // when attempting to get a new certificate following a root rotation. It's
-// selected to be a trade-off between not making rotation unneccesarily slow on
+// selected to be a trade-off between not making rotation unnecessarily slow on
 // a tiny cluster while not hammering the servers on a huge cluster
 // unnecessarily hard. Servers rate limit to protect themselves from the
 // expensive crypto work, but in practice have 10k+ RPCs all in the same second
 // will cause a major disruption even on large servers due to downloading the
 // payloads, parsing msgpack etc. Instead we pick a window that for now is fixed
-// but later might be either user configurable (not nice since it because
+// but later might be either user configurable (not nice since it would become
 // another hard-to-tune value) or set dynamically by the server based on it's
 // knowledge of how many certs need to be rotated. Currently the server doesn't
 // know that so we pick something that is reasonable. We err on the side of

diff --git a/agent/config/runtime_test.go b/agent/config/runtime_test.go
@@ -3004,8 +3004,10 @@ func TestFullConfig(t *testing.T) {
 			"connect": {
 				"ca_provider": "consul",
 				"ca_config": {
-					"RotationPeriod": "90h",
-					"LeafCertTTL": "1h"
+					"rotation_period": "90h",
+					"leaf_cert_ttl": "1h",
+					"csr_max_per_second": 100,
+					"csr_max_concurrent": 2
 				},
 				"enabled": true,
 				"proxy_defaults": {
@@ -3552,6 +3554,10 @@ func TestFullConfig(t *testing.T) {
 				ca_config {
 					rotation_period = "90h"
 					leaf_cert_ttl = "1h"
+					# hack float since json parses numbers as float and we have to
+					# assert against the same thing
+					csr_max_per_second = 100.0
+					csr_max_concurrent = 2.0
 				}
 				enabled = true
 				proxy_defaults {
@@ -4204,8 +4210,10 @@ func TestFullConfig(t *testing.T) {
 		ConnectSidecarMaxPort:   9999,
 		ConnectCAProvider:       "consul",
 		ConnectCAConfig: map[string]interface{}{
-			"RotationPeriod": "90h",
-			"LeafCertTTL":    "1h",
+			"RotationPeriod":   "90h",
+			"LeafCertTTL":      "1h",
+			"CSRMaxPerSecond":  float64(100),
+			"CSRMaxConcurrent": float64(2),
 		},
 		ConnectProxyAllowManagedRoot:            false,
 		ConnectProxyAllowManagedAPIRegistration: false,

diff --git a/website/source/docs/agent/options.html.md b/website/source/docs/agent/options.html.md
@@ -878,11 +878,11 @@ default will automatically work with some tooling.
           service instances before the time it takes to rotate is impacted. For
           larger deployments we recommend increasing this based on the expected
           number of server instances and server resources, or use
-          `csr_max_concurrency` instead if servers have more than one core.
+          `csr_max_concurrent` instead if servers have more than one core.
           Setting this to zero disables rate limiting. Added in 1.4.1.
 
-        * <a name="ca_csr_max_concurrency"></a><a
-          href="#ca_csr_max_concurrency">`csr_max_concurrency`</a> Sets a limit
+        * <a name="ca_csr_max_concurrent"></a><a
+          href="#ca_csr_max_concurrent">`csr_max_concurrent`</a> Sets a limit
           on how many Certificate Signing Requests will be processed
           concurrently. Defaults to 0 (disabled). This is useful when you have
           more than one or two cores available to the server. For example on an

diff --git a/website/source/docs/guides/performance.html.md b/website/source/docs/guides/performance.html.md
@@ -180,17 +180,17 @@ cluster. Typically these operations are fast on modern hardware, however when
 the CA is changed or it's key rotated, the leader will face an influx of
 requests for new certificates for every service instance running.
 
-While the client agents smear these over 30 seconds to avoid an immediate
-thundering herd, they don't have enough information to tune that smearing based
-on the number of certificates in use in the cluster so picking longer smearing
-results in artificially slow rotations for small clusters.
+While the client agents distribute these randomly over 30 seconds to avoid an
+immediate thundering herd, they don't have enough information to tune that
+period based on the number of certificates in use in the cluster so picking
+longer smearing results in artificially slow rotations for small clusters.
 
 Smearing requests over 30s is sufficient to bring RPC load to a reasonable level
 in all but the very largest clusters, but the extra CPU load from cryptographic
 operations could impact the server's normal work. To limit that, Consul since
 1.4.1 exposes two ways to limit the impact Certificate signing has on the leader
 [`csr_max_per_second`](/docs/agent/options.html#ca_csr_max_per_second) and
-[`csr_max_concurrency`](/docs/agent/options.html#csr_max_concurrency).
+[`csr_max_concurrent`](/docs/agent/options.html#ca_csr_max_concurrent).
 
 By default we set a limit of 50 per second which is reasonable on modest
 hardware but may be too low and impact rotation times if more than 1500 service
@@ -204,12 +204,12 @@ process?
 
 For larger production deployments, we generally recommend multiple CPU cores for
 servers to handle the normal workload. With four or more cores available, it's
-simpler to limit signing CPU impact with `csr_max_concurrency` rather than tune
+simpler to limit signing CPU impact with `csr_max_concurrent` rather than tune
 the rate limit. This effectively sets how many CPU cores can be monopolized by
 certificate signing work (although it doesn't pin that work to specific cores).
 In this case `csr_max_per_second` should be disabled (set to `0`).
 
-For example if you have an 8 core server, setting `csr_max_concurrency` to `1`
+For example if you have an 8 core server, setting `csr_max_concurrent` to `1`
 would allow you to process CSRs as fast as a single core can (which is likely
 sufficient for the very large clusters), without consuming all available
 CPU cores and impacting normal server work or stability.