Allow the operator to be configured from a file #3570

Merged
merged 10 commits into from
Aug 14, 2020

charith-elastic
Contributor

Adds the ability to configure the operator from a file. This is a backward-compatible change that retains the existing configuration methods using flags and environment variables. Users can provide the configuration file using the --config flag. If there are any conflicts, the order of precedence for resolving them is:

  • Flag
  • Environment variable
  • File
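
The precedence rule above can be sketched in a few lines of Go. This is a minimal, standard-library-only illustration, not the Viper-based code in this PR: the `lookup` type and the `fromMap` and `resolve` helpers are hypothetical names, and the values are made up. Only the key names come from the flags visible in this PR.

```go
package main

import "fmt"

// lookup reports the value a single configuration source holds for key.
type lookup func(key string) (string, bool)

// fromMap adapts a map to a lookup. It stands in for parsed flags,
// the process environment, or a parsed configuration file.
func fromMap(m map[string]string) lookup {
	return func(key string) (string, bool) {
		v, ok := m[key]
		return v, ok
	}
}

// resolve returns the value from the first source that defines key.
// Callers pass sources ordered from highest to lowest precedence.
func resolve(key string, sources ...lookup) (string, bool) {
	for _, src := range sources {
		if v, ok := src(key); ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	// Hypothetical values; only the key names appear in the PR.
	flags := fromMap(map[string]string{"container-registry": "flag.example"})
	env := fromMap(map[string]string{
		"container-registry": "env.example",
		"debug-http-listen":  "localhost:6060",
	})
	file := fromMap(map[string]string{
		"container-registry":   "file.example",
		"debug-http-listen":    "localhost:7070",
		"disable-config-watch": "true",
	})

	keys := []string{"container-registry", "debug-http-listen", "disable-config-watch"}
	for _, key := range keys {
		// Order of arguments encodes the precedence: flag, env, file.
		v, _ := resolve(key, flags, env, file)
		fmt.Printf("%s=%s\n", key, v)
	}
}
```

In this sketch the flag value wins for `container-registry`, the environment value wins for `debug-http-listen`, and `disable-config-watch` falls through to the file because no higher source sets it.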

This PR also includes:

  • Update to the all-in-one.yaml to use a ConfigMap for operator configuration
  • Update to the operator configuration documentation
  • A minor refactoring of the command code

Fixes #3401

@charith-elastic added the >enhancement (Enhancement of existing functionality) and v1.3.0 labels on Jul 31, 2020
Collaborator

@pebrc pebrc left a comment

This looks good! I have only one question regarding the override behaviour, which I believe is mostly relevant for OLM.

If you use a combination of all or some of the methods listed above, the descending order of precedence in case of a conflict is as follows:

- Flag
- Environment variable
Collaborator

It is probably hard to change the precedence with viper, right? For OLM deployments the only customization option is via environment variables as far as I can tell from this https://github.com/operator-framework/operator-lifecycle-manager/blob/master/doc/design/subscription-config.md and it would be good if they took precedence, but I guess we could also just avoid flags in the OLM manifest?

Contributor Author

Unfortunately, Viper does not allow configuring the order of precedence. OLM is an interesting problem. As far as I can tell, they don't yet support config maps to be bundled with the CSV. However, they do seem to allow the user to mount volumes using the Subscription object. I haven't tested whether that actually works but assuming that it does, then I propose the following:

  • We bundle a default config file in the operator container image at a well known path (say /conf/eck.yaml).
  • In the CSV deployment section, we only set the --config flag which will point to the default config file above.

With this approach, in addition to environment variables, the user has the option of using the Subscription to mount a config map at /conf that will shadow the default file bundled with the operator. That ought to take care of the configuration problem for OLM. WDYT?
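
From the operator's side, the proposal above could look roughly like the following sketch. It is an assumption-laden illustration using only the standard library `flag` package (the PR itself uses Viper), and `parseConfigPath` is a hypothetical helper. The point is that the operator only reads whatever is at the path given by `--config`, so it cannot tell a bundled default from a config map mounted over `/conf`.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// parseConfigPath extracts the value of --config from args.
// It returns an empty path when the flag is not set.
func parseConfigPath(args []string) (string, error) {
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output quiet in this sketch
	path := fs.String("config", "", "path to the operator configuration file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *path, nil
}

func main() {
	// Per the proposal, the CSV deployment would pass only this flag.
	// /conf/eck.yaml is the example path suggested in the discussion.
	path, err := parseConfigPath([]string{"--config", "/conf/eck.yaml"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}

	// The file is either the default baked into the image or the one
	// from a config map mounted at /conf; the code path is identical.
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Printf("cannot read %s: %v\n", path, err)
		return
	}
	fmt.Printf("read %d bytes from %s\n", len(data), path)
}
```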

Collaborator

I think that sounds interesting. We could do that in a separate PR if you want.

Contributor Author

I created #3591 to track it.

}

return nil
}
Collaborator

nice improvement over the original 👍

@charith-elastic
Contributor Author

As discussed during the team meeting, there is currently no effective way to cleanly stop the goroutines that are spawned in the code. Therefore, instead of trying to restart everything in-process, the operator will simply exit and rely on an external mechanism (StatefulSet/Pod controller) to restart it.

Contributor

@barkbay barkbay left a comment

Left some comments about concurrent modifications and op notification.

@barkbay
Contributor

barkbay commented Aug 12, 2020

Regarding fsnotify, we used it in the past but replaced it with a simpler mechanism. I'm wondering if the same concerns may apply here.

charith-elastic and others added 3 commits August 12, 2020 09:48
Co-authored-by: Michael Morello <michael.morello@gmail.com>
@charith-elastic
Contributor Author

The watch mechanism in Viper seems to be fairly well implemented (including accounting for config map updates) so I think I would go with fsnotify for the time being. It's not the end of the world if a certain config update is not detected in time -- therefore it's a fairly low risk to take. If it turns out that there are fundamental problems with it, we can revisit and consider an alternative approach.

@charith-elastic
Contributor Author

Jenkins test this please

Contributor

@thbkrkr thbkrkr left a comment

LGTM.

I left some debatable nitpicks related to the flag descriptions. Feel free to pick what you think is relevant, as I'm not a native English speaker.

Co-authored-by: Thibault Richard <thbkrkr@users.noreply.github.com>
ContainerRegistryFlag = "container-registry"
DebugHTTPListenFlag = "debug-http-listen"
DisableConfigWatch = "disable-config-watch"
Contributor

nit: naming: config vs configuration. We are already using the latter in WebhookConfigurationNameFlag = "webhook-configuration-name"

Contributor Author

Thanks for pointing out the inconsistency. I don't think expanding config to configuration achieves much (there's no ambiguity) other than making the flags comically long. Since the webhook flag hasn't been released yet, I think I will rename that instead. Even Kubernetes uses ConfigMap instead of ConfigurationMap 😄.

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Values.operator.name }}
Contributor

> k get cm -n elastic-system 
NAME                    DATA   AGE
elastic-operator        1      5m
elastic-operator-uuid   1      19s

Should we call the ConfigMap elastic-operator-configuration to make its purpose more explicit?

Contributor Author

I have seen a fair number of applications simply use the name of the application, and I think that convention makes sense. It is a config map after all, so the purpose is obvious without the additional -configuration suffix. If there are strong opinions against that, I'd be happy to reconsider.

Contributor

"It is a config map after all so the purpose is obvious"

I'm not sure I would agree with that. We already have 2 ConfigMaps which are not really related to the configuration, and we might have a third one for the election mechanism:

k get cm -n elastic-system
NAME                      DATA   AGE
elastic-licensing         4      18h
elastic-operator-leader   0      18h
elastic-operator-uuid     1      18h

I would be curious what other people's thoughts are, but I don't want to block this PR.

Contributor Author

I see your point. I am probably building up the convention in my mind because I have been conditioned by it but that probably doesn't apply to everybody. I'll merge this for now and if it turns out to be confusing, we still have time to change it before the next release goes out.

Simply because I have it in hand, Istio config maps look like the following.

istio                                  2      63d
istio-ca-root-cert                     1      63d
istio-leader                           0      63d
istio-namespace-controller-election    0      63d
istio-security                         1      63d
istio-sidecar-injector                 2      63d
istio-validation-controller-election   0      63d
kiali                                  1      58d
prometheus                             1      63d

I am not suggesting that this is a canonical example. However, I think it's quite obvious what each config map does. I reckon that ours would look similar.

Contributor

@barkbay barkbay left a comment

LGTM

@charith-elastic charith-elastic merged commit c57a043 into elastic:master Aug 14, 2020
@charith-elastic charith-elastic deleted the operator-config-file branch August 14, 2020 07:37
Successfully merging this pull request may close these issues.

Rely on a configMap for operator-level settings
4 participants