
Add local K8S remote cluster support #2543

Merged: 40 commits into elastic:master on Mar 2, 2020

Conversation

@barkbay (Contributor) commented on Feb 11, 2020

Fixes #2203

What this PR does:

  • It adds the remoteClusters field (as usual, the naming can be discussed) to the Elasticsearch schema:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: cluster-1
  namespace: ns1
spec:
  version: 7.5.2
  remoteClusters:
  - name: ns2-cluster-2
    elasticsearchRef:
      name: cluster-2
      namespace: ns2
    #config:
    #  transport.ping_schedule: 30s
    #  transport.compress: true
    #  skip_unavailable: true
  • Adds a new controller to copy certificate authorities between clusters. Each remote certificate is copied into the namespace of the peer cluster.
  • The Elasticsearch controller then concatenates all the remote certificates into a new Secret called <cluster-name>-es-remote-ca

Here is an example of the new Secrets:

$ k get es,secrets -n ns1                    
NAME                                                   HEALTH   NODES   VERSION   PHASE   AGE
elasticsearch.elasticsearch.k8s.elastic.co/cluster-1   green    3       7.5.2     Ready   2m

NAME                                          TYPE                                  DATA   AGE
[...]
secret/cluster-1-es-remote-ca                 Opaque                                1      2m21s
secret/cluster-1-ns2-cluster-2-es-remote-ca   Opaque                                1      2m18s

$ k get es,secrets -n ns2
NAME                                                   HEALTH   NODES   VERSION   PHASE   AGE
elasticsearch.elasticsearch.k8s.elastic.co/cluster-2   green    3       7.5.2     Ready   1m

NAME                                          TYPE                                  DATA   AGE
[...]
secret/cluster-2-es-remote-ca                 Opaque                                1      81s
secret/cluster-2-ns1-cluster-1-es-remote-ca   Opaque                                1      80s
  • The Elasticsearch controller also makes the API calls needed to set up the remote clusters in the cluster settings. The last applied configuration is stored as an annotation on the Elasticsearch resource to avoid unnecessary calls (see the sketch after this list):
apiVersion: v1
items:
- apiVersion: elasticsearch.k8s.elastic.co/v1
  kind: Elasticsearch
  metadata:
    annotations:
      elasticsearch.k8s.elastic.co/remote-clusters: '[{"name":"ns2-cluster-2","configHash":"3795207740"}]'
  • This PR also creates a transport service:
$ k get svc -n ns1
NAME                     TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)          AGE
cluster-1-es-default     ClusterIP      None          <none>           <none>           10m
cluster-1-es-http        LoadBalancer   10.28.XX.XX   35.195.XX.XX     9200:32076/TCP   10m
cluster-1-es-transport   ClusterIP      None          <none>           9300/TCP         10m
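
To make the last two bullets more concrete, here is a minimal Go sketch of the idea: hash the desired remote cluster settings, skip the call when the hash stored in the elasticsearch.k8s.elastic.co/remote-clusters annotation is unchanged, and otherwise apply the seeds through the Elasticsearch _cluster/settings API. This is purely illustrative: the operator goes through its own Elasticsearch client, the actual hash function may differ, and authentication/TLS handling is omitted.

package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"hash/fnv"
	"net/http"
)

// remoteCluster mirrors one entry of the
// elasticsearch.k8s.elastic.co/remote-clusters annotation.
type remoteCluster struct {
	Name       string `json:"name"`
	ConfigHash string `json:"configHash"`
}

// hashConfig returns a stable hash of the settings for one remote cluster
// (encoding/json sorts map keys, so the output is deterministic).
func hashConfig(settings map[string]interface{}) (string, error) {
	raw, err := json.Marshal(settings)
	if err != nil {
		return "", err
	}
	h := fnv.New32a()
	_, _ = h.Write(raw)
	return fmt.Sprint(h.Sum32()), nil
}

// updateRemoteCluster applies the seeds for one remote cluster alias, unless
// the hash stored from the previous reconciliation shows nothing changed.
func updateRemoteCluster(ctx context.Context, esURL, alias string, seeds []string, previous []remoteCluster) error {
	settings := map[string]interface{}{
		"cluster": map[string]interface{}{
			"remote": map[string]interface{}{
				alias: map[string]interface{}{"seeds": seeds},
			},
		},
	}
	hash, err := hashConfig(settings)
	if err != nil {
		return err
	}
	for _, p := range previous {
		if p.Name == alias && p.ConfigHash == hash {
			return nil // configuration unchanged: no API call needed
		}
	}
	body, _ := json.Marshal(map[string]interface{}{"persistent": settings})
	req, err := http.NewRequestWithContext(ctx, http.MethodPut, esURL+"/_cluster/settings", bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("cluster settings update failed: %s", resp.Status)
	}
	return nil
}

func main() {
	previous := []remoteCluster{{Name: "ns2-cluster-2", ConfigHash: "3795207740"}}
	// Seeds point at the transport service created by this PR.
	err := updateRemoteCluster(context.Background(),
		"https://cluster-1-es-http.ns1.svc:9200",
		"ns2-cluster-2",
		[]string{"cluster-2-es-transport.ns2.svc:9300"},
		previous)
	fmt.Println(err)
}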

Side notes:

  • I have added some spans for performance monitoring but I still need to check if they are relevant.
  • Don't forget to start a trial or use an Enterprise license if you want to run some tests.

@barkbay added the >feature (Adds or discusses adding a feature to the product) and v1.1.0 labels on Feb 12, 2020
@barkbay (Contributor, author) commented on Feb 17, 2020

jenkins test this please

@barkbay marked this pull request as ready for review on February 17, 2020, 07:26
@barkbay (Contributor, author) commented on Feb 18, 2020

jenkins test this please

@pebrc (Collaborator) left a comment

Nice work! I just took a quick look.


// RemoteClusters defines some remote Elasticsearch clusters.
type RemoteClusters struct {
K8sLocal []K8sLocalRemoteCluster `json:"k8sLocal,omitempty"`
@pebrc (Collaborator) commented

I know we discussed this before but I am still not 💯 on the naming here, mostly gut feeling/aesthetics reasons, so apologies if I don't have a better suggestion right now.

@sebgl (Contributor) replied

I'm trying to find alternatives. If we look at eg. VolumeSources, there's a flat list of optional sources expressed in a single struct. Maybe we could remove one layer here?

type RemoteCluster struct {
  Config RemoteClusterConfig `json:"config,omitempty"`

  ElasticsearchRef *commonv1.ObjectSelector
  ElasticCloudRef *ElasticCloudRef
  NetworkRef *NetworkRef
}
spec:
  remoteClusters:
  - elasticsearchRef:
      name: foo
      namespace: bar
    config:
        transport.ping_schedule: 30s
  - elasticsearchRef:
      name: bar
      namespace: baz
    config:
        transport.ping_schedule: 10s
  - elasticCloudRef:
       clusterId: 1234
    config:
        transport.ping_schedule: 10s
  - networkRef:
        seeds: 1.2.3.4:9300
    config:
        transport.ping_schedule: 10s

ElasticsearchRef is implicitly the "local" ref. If later on we want to extend it across k8s cluster boundaries, we can either extend the existing ElasticsearchRef with more fields (e.g. elasticsearchref.k8sCluster) or introduce a new type (federatedElasticsearchRef).

Resolved (outdated) review threads on: pkg/apis/common/v1/common.go, pkg/apis/elasticsearch/v1/elasticsearch_types.go, pkg/controller/elasticsearch/services/services.go (×2)
Name: TransportServiceName(es.Name),
Labels: label.NewLabels(nsn),
},
Spec: corev1.ServiceSpec{
@pebrc (Collaborator) commented

Should we allow users to control the spec like we do for the HTTP service? For remoting outside of the k8s cluster users probably want this to be of type LoadBalancer for example.

@barkbay (Contributor, author) replied

It makes sense. That said, this PR is already pretty large and targets clusters within the same k8s cluster. I would do that in a dedicated PR; I can create an issue to track this effort.

@pebrc (Collaborator) replied

👍
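
For context on what a user-configurable transport service could look like, the likely approach would mirror the existing HTTP service customization discussed above: expose a service template in the spec that the operator merges into the generated Service. The Go sketch below is purely hypothetical (none of these type names come from this PR); it only illustrates the shape such a follow-up might take.

package v1

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// TransportConfig is a hypothetical field for the Elasticsearch spec that would
// let users customize the <cluster>-es-transport Service, e.g. to make it a
// LoadBalancer when remoting across Kubernetes clusters.
type TransportConfig struct {
	Service ServiceTemplate `json:"service,omitempty"`
}

// ServiceTemplate carries user-provided metadata and spec overrides that the
// operator would merge into the Service it generates.
type ServiceTemplate struct {
	ObjectMeta metav1.ObjectMeta  `json:"metadata,omitempty"`
	Spec       corev1.ServiceSpec `json:"spec,omitempty"`
}

With such a field, a user could for example set spec.transport.service.spec.type: LoadBalancer in the manifest.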

@sebgl (Contributor) left a comment

This is great work!

There is a lot of code so I have a lot of comments, sorry 😄

One concern I have is that I feel lost when reading the association controller reconciliation code. It's hard to figure out what's local, what's remote, how the local CA also needs to be remote, etc. I don't know if we can make that clearer? Maybe through smaller functions and more comments about the high-level things we want to do.

Resolved (outdated) review thread on: config/crds/all-crds.yaml

const (
// RemoteClusterNamespaceLabelName used to represent the namespace of the RemoteCluster in a TrustRelationship.
RemoteClusterNamespaceLabelName = "remotecluster.k8s.elastic.co/namespace"
@sebgl (Contributor) commented

This should probably have elasticsearch.k8s.elastic.co in the name.
Can we do elasticsearch.k8s.elastic.co/remote-cluster: {"namespace": "ns", "name": "cluster-name"}?

@barkbay (Contributor, author) replied

I'm not sure we can put JSON in a label: label values are limited to 63 alphanumeric characters plus '-', '_' and '.', so a JSON document would not be a valid value (and I'm also not sure it's easy to create a label selector against it).
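
For context, with flat labels a controller can find the copied CA Secrets of a given remote cluster with a plain label selector. A minimal controller-runtime sketch, assuming a name label alongside the namespace label shown in the diff above (the name label key is an assumption, not taken from this PR):

package remoteca

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

const (
	// RemoteClusterNamespaceLabelName is the label key shown in this PR's diff.
	RemoteClusterNamespaceLabelName = "remotecluster.k8s.elastic.co/namespace"
	// RemoteClusterNameLabelName is assumed here for illustration only.
	RemoteClusterNameLabelName = "remotecluster.k8s.elastic.co/name"
)

// secretsForRemoteCluster lists the copied CA secrets that reference a given
// remote cluster, using plain label equality; this is why flat string labels
// are easier to work with than a JSON value.
func secretsForRemoteCluster(ctx context.Context, c client.Client, ns, name string) (*corev1.SecretList, error) {
	var secrets corev1.SecretList
	err := c.List(ctx, &secrets,
		client.MatchingLabels{
			RemoteClusterNamespaceLabelName: ns,
			RemoteClusterNameLabelName:      name,
		},
	)
	return &secrets, err
}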

Resolved (outdated) review thread on: pkg/controller/elasticsearch/services/services.go
@barkbay (Contributor, author) commented on Feb 25, 2020

I have updated the PR description with the new spec format, ICYMI here is an example:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: cluster-1
  namespace: ns1
spec:
  version: 7.5.2
  remoteClusters:
  - name: ns2-cluster-2
    elasticsearchRef:
      name: cluster-2
      namespace: ns2
  - elasticsearchRef:
      name: cluster-3
      namespace: ns3
    name: ns3-cluster-3
  nodeSets:
    - name: default
      count: 3
      config:
        node.store.allow_mmap: false

@barkbay (Contributor, author) commented on Feb 26, 2020

jenkins test this please

@barkbay (Contributor, author) commented on Feb 26, 2020

@sebgl I split out the remote CA reconciler and added some comments. I also tried to address most of your review comments.

@barkbay requested review from pebrc and sebgl on February 26, 2020, 09:04
@pebrc (Collaborator) left a comment

LGTM, this is really impressive work. I think I am 👍 on merging and, if necessary, following up with two PRs:

  • making the transport service user-configurable
  • switching to file-based configuration for remote clusters, if we decide to go that way

Leaving the final ✅ to @sebgl who did the more in-depth review

Resolved (outdated) review thread on: pkg/apis/elasticsearch/v1/elasticsearch_types.go
)

// Add creates a new RemoteCa Controller and adds it to the manager with default RBAC.
func Add(mgr manager.Manager, accessReviewer rbac.AccessReviewer, params operator.Parameters) error {
@pebrc (Collaborator) commented

Any particular reason why you did not move everything from elasticsearch/remotecluster/remoteca into this package?

@barkbay (Contributor, author) replied

At first it was only in elasticsearch/remotecluster/remoteca. The reason is that this controller only makes sense in the context of Elasticsearch.

I moved part of it to make it "visible". I'm not sure I would go further, because this controller depends on the Elasticsearch one, but I'm happy to move everything if you think it makes sense.

@sebgl (Contributor) commented on Feb 27, 2020

Thanks for making the changes; I understand the code much better now regarding what's remote vs. local and why we reconcile 2 secrets per association.

Most of my comments above are nitpicks, except this one that I'd like to understand: I think we are reconciling the same secret multiple times in different reconciliations targeting different clusters, and we may not have to?

@sebgl (Contributor) left a comment

LGTM. Great work!

@barkbay merged commit eefc8af into elastic:master on Mar 2, 2020
@barkbay (Contributor, author) commented on Mar 2, 2020

@pebrc @sebgl It was a large PR, thank you both for your thorough code review! ❤️

Labels: >feature (Adds or discusses adding a feature to the product), v1.1.0
Projects: none yet
Development: successfully merging this pull request may close issue #2203, "Support for remote ES clusters within a single k8s cluster"
3 participants