controller refresh certificate #3489

Merged
merged 9 commits into from
Jan 17, 2024

Conversation

ashutosji
Contributor

What type of PR is this?

Uncomment only one /kind <> line, press enter to put that in a new line, and remove leading whitespace from that line:

/kind breaking
/kind bug
/kind cleanup
/kind documentation
/kind feature
/kind hotfix
/kind release

What this PR does / Why we need it:

Which issue(s) this PR fixes: #3133

Closes #3133

Special notes for your reviewer:


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ashutosji
Once this PR has been reviewed and has the lgtm label, please assign markmandel for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment on lines 283 to 296
newHTTPServer := &http.Server{
	Addr:    ":8081",
	Handler: httpsServer.Mux,
}
newHTTPServer.TLSConfig = &tls.Config{
	Certificates: []tls.Certificate{*newCert},
}

// Update the TLS configuration
go func() {
	if err := newHTTPServer.ListenAndServeTLS("", ""); err != nil {
		logger.WithError(err).Error("Failed to update TLS configuration")
	}
}()
Contributor Author

Hi @markmandel and @zmerlynn,
The goal is to serve the reloaded certificates from the existing server, but I did not find a way to update httpsServer.TLSConfig. I thought of creating a new server with the reloaded certificates, but that doesn't seem correct to me and is technically wrong, since we use a custom server built on this package: https://github.com/googleforgames/agones/blob/main/pkg/util/https/server.go. Could you suggest the right approach?

Member

I found a great StackOverflow post on this, with links to several examples:
https://stackoverflow.com/questions/37473201/is-there-a-way-to-update-the-tls-certificates-in-a-net-http-server-without-any-d

TL;DR - https://pkg.go.dev/crypto/tls#Config exposes a GetCertificate(...) function that is called whenever a certificate is required. Returning the current certificate from that function is the appropriate way to go.

Contributor Author

Sure, thanks. I will check this.

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: c1338ef2-b911-4a77-bd13-881ef76c8b7c

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3489/head:pr_3489 && git checkout pr_3489
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.36.0-dev-e32bb21-amd64

@ashutosji ashutosji force-pushed the controller-refresh-certificate branch from e32bb21 to e58b9d5 Compare November 15, 2023 18:17
@agones-bot
Collaborator

Build Failed 😱

Build Id: bcc4805e-be14-4fca-9b71-cf3571d3c7cd

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@ashutosji ashutosji force-pushed the controller-refresh-certificate branch from e58b9d5 to 97c811a Compare November 21, 2023 15:28
@agones-bot
Collaborator

Build Failed 😱

Build Id: 49dcef0d-03d2-48b6-b43e-00ff8f4d8e26

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Collaborator

Build Failed 😱

Build Id: da863859-46c1-4a80-b6d7-705a5c911137

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@ashutosji ashutosji force-pushed the controller-refresh-certificate branch from 3b9f1c7 to d7f1a17 Compare December 28, 2023 13:43
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: c254a1f5-317a-4e92-b651-9b05ff5501a5

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3489/head:pr_3489 && git checkout pr_3489
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.38.0-dev-d7f1a17-amd64

Collaborator

@gongmax gongmax left a comment

I still think we should implement the pattern described in https://stackoverflow.com/questions/37473201/is-there-a-way-to-update-the-tls-certificates-in-a-net-http-server-without-any-d: keep the certs in a structure that is refreshed by a goroutine watching for certificate changes (i.e. a watch on the tlsDir). The tls.Config.GetCertificate field can then be set to a func that simply returns whatever is in the structure rather than loading the cert again.
My major concern with the current implementation is that I'm not sure when GetCertificate is called and whether it could miss some corner case, which would not be an issue with the option that uses a goroutine to watch for certificate changes.

@ashutosji ashutosji force-pushed the controller-refresh-certificate branch from d7f1a17 to 6c38905 Compare January 3, 2024 10:10
@agones-bot
Collaborator

Build Failed 😱

Build Id: 2854c057-8936-4ce7-a81f-a341b6cc4c9a

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@ashutosji
Contributor Author

I still think we should implement the pattern described in https://stackoverflow.com/questions/37473201/is-there-a-way-to-update-the-tls-certificates-in-a-net-http-server-without-any-d: keep the certs in a structure that is refreshed by a goroutine watching for certificate changes (i.e. a watch on the tlsDir). The tls.Config.GetCertificate field can then be set to a func that simply returns whatever is in the structure rather than loading the cert again. My major concern with the current implementation is that I'm not sure when GetCertificate is called and whether it could miss some corner case, which would not be an issue with the option that uses a goroutine to watch for certificate changes.

You are right! Rather than loading the file every time, I can store the current certificate in the structure and have the GetCertificate function simply return it.

However, there are two issues that I am facing:

  1. I again started getting the same error: Error from server (InternalError): error when creating "https://raw.githubusercontent.com/googleforgames/agones/release-1.37.0/examples/simple-game-server/gameserver.yaml": Internal error occurred: failed calling webhook "mutations.agones.dev": failed to call webhook: Post "https://agones-controller-service.agones-system.svc:443/mutate?timeout=10s": tls: failed to verify certificate: x509: certificate signed by unknown authority
  2. I am having trouble handling the certificate watch in the unit test here: https://github.com/googleforgames/agones/blob/main/pkg/util/https/server_test.go#L41

@agones-bot
Collaborator

Build Failed 😱

Build Id: 5faf7f2e-7cff-4292-867c-c52f5872a702

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

Collaborator

@gongmax gongmax left a comment

I think line 123 should be changed to err := s.tls.ListenAndServeTLS("", ""), similar to https://github.com/googleforgames/agones/blob/main/cmd/allocator/main.go#L351

}

// Start a goroutine to watch for certificate changes
go s.watchForCertificateChanges()
Collaborator

This doesn't need to be a goroutine; just s.watchForCertificateChanges() should be fine.

Contributor Author

Sure. I made the WatchForCertificateChanges() function public so that it is accessible from extensions/controller.

s.logger.WithError(err).Fatal("could not create watcher for TLS certs")
}

defer cancelTLS()
Collaborator

This will close the watcher after this func returns. I think you need to return this cancelTLS() function to the caller (actually up to the main.go of controller/extension) and defer it there, so the watcher can keep watching the events.

Contributor Author

Thank you so much.
Yes, the watcher was getting closed after the func returned. I have modified the WatchForCertificateChanges() function and called it in extensions. It's working!

@ashutosji ashutosji force-pushed the controller-refresh-certificate branch from fbca193 to fcde21b Compare January 11, 2024 16:11
@agones-bot
Collaborator

Build Failed 😱

Build Id: 88eaf97a-dd31-4f07-931f-f5a2aec7152b

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@ashutosji ashutosji force-pushed the controller-refresh-certificate branch from fcde21b to fd4eb4f Compare January 16, 2024 10:03
@agones-bot
Collaborator

Build Failed 😱

Build Id: dc3b4cc3-16dc-435d-9b26-c5381238287d

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: d1f89fab-f8ea-420e-bd2b-1df02c6216b7

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3489/head:pr_3489 && git checkout pr_3489
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.38.0-dev-1c99845-amd64

Comment on lines 42 to 43
Certs *cryptotls.Certificate
CertMu sync.Mutex
Collaborator

nit: these fields don't have to be exported (beginning with upper case) since they are only used in this package, right? Better to change them to unexported for stricter restriction. The same goes for the CertServer in Server.

Contributor Author

Yeah! Thanks for pointing that out. They should be unexported.

@gongmax gongmax marked this pull request as ready for review January 16, 2024 21:21
Collaborator

@gongmax gongmax left a comment

LGTM, only have one minor comment.

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: fb9ea940-891d-40e3-ba6d-879c77605b66

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3489/head:pr_3489 && git checkout pr_3489
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.38.0-dev-c4f2ab7-amd64

@gongmax gongmax merged commit 752a52d into googleforgames:main Jan 17, 2024
4 checks passed
Development

Successfully merging this pull request may close these issues.

Refreshing controller cert if its changed
4 participants