Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automate TLS secrets generation for Kubernetes family infrastructures #220

Merged
merged 15 commits into from
Apr 28, 2020

Conversation

mmorhun
Copy link
Contributor

@mmorhun mmorhun commented Apr 15, 2020

Signed-off-by: Mykola Morhun mmorhun@redhat.com

What this PR does

This PR adds checks into Che Operator reconcile loop for required for Che deployment TLS secrets on Kubernetes infrastructure, and if they are absent, it starts job to generate them.

Which issues this PR fixes or references

eclipse-che/che#16546

How to test

  • Via chectl:
    • Build operator image, push it
    • Disable TLS secrets check and then run:
checlt server:start --platform=minikube --installer=operator --self-signed-cert --tls --che-operator-image=<operator-image>
  • Via local-debug.sh:
    Copy default CR, and change following fields:
spec.server.tlsSupport: true
spec.server.selfSignedCert: true
spec.k8s.ingressDomain: <cluster-domain>

deploy/operator-local.yaml Outdated Show resolved Hide resolved
@@ -265,6 +265,14 @@ func (r *ReconcileChe) Reconcile(request reconcile.Request) (reconcile.Result, e
}
}

// Handle Che TLS certificates on Kubernetes like infrastructures
if instance.Spec.Server.TlsSupport && instance.Spec.Server.SelfSignedCert && !isOpenShift {
shouldReturn, reconsileResult, err := HandleCheTLSSecrets(instance, r)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we return something like status structure?
Something like this
https://github.com/eclipse/che-operator/blob/master/pkg/deploy/deployment.go#L42

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically we can, but I am against it. We need to return 3 values from the method and we can do it using embedded in the golang mechanism. Why do we need to introduce new abstraction then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be consistent in the implementations.

Copy link
Contributor

@AndrienkoAleksandr AndrienkoAleksandr Apr 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 About consistent implementation

@@ -132,6 +133,29 @@ func (r *ReconcileChe) CreateNewRoleBinding(instance *orgv1.CheCluster, roleBind
return nil
}

// CreateNewJob deploys new instance of given job
func (r *ReconcileChe) CreateNewJob(instance *orgv1.CheCluster, job *batchv1.Job) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create a syncJobToCluster function with the same logic as for pvc, ingress, route ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to keep ability to create an object (in our case job) configuration and then separately flush it to cluster. Because we want to have universal methods, but we never know for sure which aspects of it we may want to change in the future. Changing the sync* method signature is not good as it will affect other consumers. Updating object in runtime is worse or even impossible.
I may move this method into job.go and rename it to match sync* pattern, but I don't want to loose the flexibility here.

var shouldReturn = true

// HandleCheTLSSecrets handles TLS secrets required for Che deployment
func HandleCheTLSSecrets(checluster *orgv1.CheCluster, r *ReconcileChe) (bool, reconcile.Result, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, please see my answer above. Moreover, inside internal function we behave the same as in reconcile loop.

@AndrienkoAleksandr
Copy link
Contributor

Interesting, some day maybe we will have tls generation from the box https://github.com/operator-framework/operator-sdk/blob/master/doc/proposals/tls-utilities.md

@mmorhun
Copy link
Contributor Author

mmorhun commented Apr 16, 2020

@AndrienkoAleksandr I've already seen that. Unfortunately current functionality they provide has only basic stuff which doesn't suite our needs. But thanks for commenting.

pkg/deploy/job.go Outdated Show resolved Hide resolved
// NewJob creates new job configuration by giben parameters.
func NewJob(checluster *orgv1.CheCluster, name string, namespace string, image string, serviceAccountName string, env map[string]string, backoffLimit int32) *batchv1.Job {
labels := GetLabels(checluster, util.GetValue(checluster.Spec.Server.CheFlavor, DefaultCheFlavor))
labels["component"] = "che-create-tls-secret-job"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep `che component

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not possible due to conflicts with Che master. All Che related pods have lebel app=che including this job.

}

// NewJob creates new job configuration by giben parameters.
func NewJob(checluster *orgv1.CheCluster, name string, namespace string, image string, serviceAccountName string, env map[string]string, backoffLimit int32) *batchv1.Job {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can get namespace from checluster, not needed to pass it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't configure ttlSecondsAfterFinished,
backoffLimit, why do we need to configure it? let's use some default value for instance 5.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for case when we need to start a job in different than Che namespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We indeed don't configure ttlSecondsAfterFinished so I may leave it with default value. As for backoffLimit, to me it may vary for different jobs, so I left it configurable. I would like to use some default values, but golang doesn't provide such ability.

Copy link
Contributor

@AndrienkoAleksandr AndrienkoAleksandr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested, works Good.

@tolusha
Copy link
Contributor

tolusha commented Apr 23, 2020

time="2020-04-23T08:25:54+03:00" level=info msg="Creating a new object: Job, name: che-tls-job"
time="2020-04-23T08:25:54+03:00" level=info msg="Creating a new object: Job, name: che-tls-job"
time="2020-04-23T08:25:54+03:00" level=error msg="Failed to create che-tls-job Job: jobs.batch \"che-tls-job\" already exists"
time="2020-04-23T08:25:54+03:00" level=error msg="Error creating Che TLS job \"che-tls-job\": jobs.batch \"che-tls-job\" already exists"
time="2020-04-23T08:25:55+03:00" level=info msg="Creating a new object: Job, name: che-tls-job"
time="2020-04-23T08:25:55+03:00" level=error msg="Failed to create che-tls-job Job: jobs.batch \"che-tls-job\" already exists"
time="2020-04-23T08:25:55+03:00" level=error msg="Error creating Che TLS job \"che-tls-job\": jobs.batch \"che-tls-job\" already exists"
time="2020-04-23T08:25:56+03:00" level=info msg="Creating a new object: Job, name: che-tls-job"
time="2020-04-23T08:25:56+03:00" level=error msg="Failed to create che-tls-job Job: jobs.batch \"che-tls-job\" already exists"
time="2020-04-23T08:25:56+03:00" level=error msg="Error creating Che TLS job \"che-tls-job\": jobs.batch \"che-tls-job\" already exists"
time="2020-04-23T08:25:57+03:00" level=info msg="Creating a new object: Job, name: che-tls-job"
time="2020-04-23T08:25:57+03:00" level=error msg="Failed to create che-tls-job Job: jobs.batch \"che-tls-job\" already exists"
time="2020-04-23T08:25:57+03:00" level=error msg="Error creating Che TLS job \"che-tls-job\": jobs.batch \"che-tls-job\" already exists"

@mmorhun
Copy link
Contributor Author

mmorhun commented Apr 23, 2020

@tolusha thanks. It was hard to reproduce the issue. So actually the only way I can see it is to:

  1. Stop Che operator
  2. Delete Che secrets and all deployments, but leave namespace with the job
  3. Start operator again

This looks like artificial for me, but better to fix it. I can do it, when you let me know your thought about proposed approach.

mmorhun and others added 14 commits April 27, 2020 12:23
Signed-off-by: Mykola Morhun <mmorhun@redhat.com>
…oCluster.

Signed-off-by: Mykola Morhun <mmorhun@redhat.com>
Signed-off-by: Mykola Morhun <mmorhun@redhat.com>
Signed-off-by: Anatoliy Bazko <abazko@redhat.com>
Signed-off-by: Anatoliy Bazko <abazko@redhat.com>
Signed-off-by: Anatoliy Bazko <abazko@redhat.com>
Signed-off-by: Anatoliy Bazko <abazko@redhat.com>
Signed-off-by: Anatoliy Bazko <abazko@redhat.com>
Signed-off-by: Anatoliy Bazko <abazko@redhat.com>
Signed-off-by: Anatoliy Bazko <abazko@redhat.com>
Signed-off-by: Anatoliy Bazko <abazko@redhat.com>
Signed-off-by: Anatoliy Bazko <abazko@redhat.com>
Signed-off-by: Mykola Morhun <mmorhun@redhat.com>
Signed-off-by: Mykola Morhun <mmorhun@redhat.com>
@mmorhun mmorhun force-pushed the che-16546 branch 2 times, most recently from af1c15f to e3625c0 Compare April 28, 2020 12:30
Signed-off-by: Mykola Morhun <mmorhun@redhat.com>
@mmorhun mmorhun merged commit bc47b7b into master Apr 28, 2020
@mmorhun mmorhun deleted the che-16546 branch April 28, 2020 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants