[api] expand node affinity; support multiple reqs
Previously we supported node affinity by either 1) using the default
heuristic of the isolation group name being the zone, or 2) specifying
a single key and a set of values to match an m3db pod to a node.

This prevents users from expressing more complex requirements such as
"nodes in zone A with instance type 'large'".

This change allows users to specify multiple node affinity terms, or
none at all if they have no need for node affinity. This is a breaking
change: users must now specify the zone key in their cluster configs to
get the same zone affinity as before. This will be called out in the
release notes.
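
As a rough illustration of the migration this implies, the fragment below contrasts an isolation group that relied on the old name-as-zone default with one that states the zone explicitly. The zone name `us-east1-b` is taken from the old docs example; the two YAML documents are separated by `---` only so the fragment parses, and are a sketch rather than a complete cluster spec.

```yaml
# Before this commit: the group name doubled as the zone value
# (default heuristic), so no affinity fields were needed.
isolationGroups:
  - name: us-east1-b
    numInstances: 1
---
# After this commit: the zone key must be given explicitly to keep
# the same zone affinity as before.
isolationGroups:
  - name: us-east1-b
    numInstances: 1
    nodeAffinityTerms:
      - key: failure-domain.beta.kubernetes.io/zone
        values:
          - us-east1-b
```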

Eventually we'll add the ability to specify entire affinity terms per
isolation-group, but those can be extremely verbose and this still
provides a nice shortcut for most use cases.
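
A sketch of what the shortcut form might look like for the "zone A with instance type 'large'" case. The `beta.kubernetes.io/instance-type` label key and the value `large` are assumptions chosen for illustration and do not come from this commit; the second group shows the case of omitting node affinity entirely.

```yaml
isolationGroups:
  - name: group1
    numInstances: 1
    nodeAffinityTerms:
      # Terms are ANDed together: a node must satisfy every term.
      - key: failure-domain.beta.kubernetes.io/zone
        values:
          - <zone-a>
      # Assumed label key and value; within one term, any listed
      # value is enough to match.
      - key: beta.kubernetes.io/instance-type
        values:
          - large
  # No nodeAffinityTerms: this group has no node affinity.
  - name: group2
    numInstances: 1
```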
schallert committed Apr 29, 2019
1 parent 0289858 commit 5a30711
Showing 20 changed files with 380 additions and 415 deletions.
36 changes: 28 additions & 8 deletions README.md
@@ -80,18 +80,38 @@ spec:
- http://etcd-1.etcd:2379
- http://etcd-2.etcd:2379
isolationGroups:
- name: <zone-x>
numInstances: 1
- name: <zone-y>
numInstances: 1
- name: <zone-z>
numInstances: 1
- name: group1
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-a>
- name: group2
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-b>
- name: group3
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-c>
podIdentityConfig:
sources:
- PodUID
sources: []
namespaces:
- name: metrics-10s:2d
preset: 10s:2d
dataDirVolumeClaimTemplate:
metadata:
name: m3db-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
```

### Resizing a Cluster
15 changes: 13 additions & 2 deletions docs/api.md
@@ -10,6 +10,7 @@ This document enumerates the Custom Resource Definitions used by the M3DB Operat
* [M3DBCluster](#m3dbcluster)
* [M3DBClusterList](#m3dbclusterlist)
* [M3DBStatus](#m3dbstatus)
* [NodeAffinityTerm](#nodeaffinityterm)
* [IndexOptions](#indexoptions)
* [Namespace](#namespace)
* [NamespaceOptions](#namespaceoptions)
@@ -63,8 +64,7 @@ IsolationGroup defines the name of zone as well attributes for the zone configur
| Field | Description | Scheme | Required |
| ----- | ----------- | ------ | -------- |
| name | Name is the value that will be used in StatefulSet labels, pod labels, and M3DB placement \"isolationGroup\" fields. | string | true |
| nodeAffinityKey | NodeAffinityKey is the node label that will be used in corresponding StatefulSet match expression to assign pods to nodes. Defaults to \"failure-domain.beta.kubernetes.io/zone\". | string | false |
| nodeAffinityValues | NodeSelectorValues is the node label value that will be used to assign pods to nodes. Defaults to the isolation group's name, but can be overridden to allow multiple IsolationGroups to be assigned to the same zone. | []string | false |
| nodeAffinityTerms | NodeAffinityTerms is an array of NodeAffinityTerm requirements, which are ANDed together to indicate what nodes an isolation group can be assigned to. | [][NodeAffinityTerm](#nodeaffinityterm) | false |
| numInstances | NumInstances defines the number of instances. | int32 | true |
| storageClassName | StorageClassName is the name of the StorageClass to use for this isolation group. This allows ensuring that PVs will be created in the same zone as the pinned statefulset on Kubernetes < 1.12 (when topology aware volume scheduling was introduced). Only has effect if the clusters `dataDirVolumeClaimTemplate` is non-nil. If set, the volume claim template will have its storageClassName field overridden per-isolationgroup. If unset the storageClassName of the volumeClaimTemplate will be used. | string | false |

@@ -107,6 +107,17 @@ M3DBStatus contains the current state the M3DB cluster along with a human readab

[Back to TOC](#table-of-contents)

## NodeAffinityTerm

NodeAffinityTerm represents a node label and a set of label values, any of which can be matched to assign a pod to a node.

| Field | Description | Scheme | Required |
| ----- | ----------- | ------ | -------- |
| key | Key is the label of the node. | string | true |
| values | Values is an array of values, any of which a node can have for a pod to be assigned to it. | []string | true |

[Back to TOC](#table-of-contents)

## IndexOptions

IndexOptions defines parameters for indexing.
48 changes: 36 additions & 12 deletions docs/getting_started/create_cluster.md
@@ -47,12 +47,24 @@ spec:
- http://etcd-1.etcd:2379
- http://etcd-2.etcd:2379
isolationGroups:
- name: us-east1-b
numInstances: 1
- name: us-east1-c
numInstances: 1
- name: us-east1-d
numInstances: 1
- name: group1
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-a>
- name: group2
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-b>
- name: group3
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-c>
podIdentityConfig:
sources:
- PodUID
@@ -114,12 +126,24 @@ spec:
replicationFactor: 3
numberOfShards: 256
isolationGroups:
- name: us-east1-b
numInstances: 1
- name: us-east1-c
numInstances: 1
- name: us-east1-d
numInstances: 1
- name: group1
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-a>
- name: group2
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-b>
- name: group3
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-c>
etcdEndpoints:
- http://etcd-0.etcd:2379
- http://etcd-1.etcd:2379