fix: Graceful shutdown for the API server (#18642) #20981

Merged

Conversation

andrii-korotkov-verkada
Contributor

@andrii-korotkov-verkada andrii-korotkov-verkada commented Nov 28, 2024

Closes #18642
fix #18576

Implements graceful shutdown for the API server. Without it, the Argo CD API server eventually returns 502 during a rolling update. With it, the health check returns 503 while the server is terminating.

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • The title of the PR states what changed and the related issues number (used for the release note).
  • The title of the PR conforms to the Toolchain Guide
  • I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • Does this PR require documentation updates?
  • I've updated documentation as required by this PR.
  • I have signed off all my commits as required by DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My build is green (troubleshooting builds).
  • My new feature complies with the feature status guidelines.
  • I have added a brief description of why this PR is necessary and/or what this PR solves.
  • Optional. My organization is added to USERS.md.
  • Optional. For bug fixes, I've indicated what older releases this fix should be cherry-picked into (this may or may not happen depending on risk/complexity).

@andrii-korotkov-verkada andrii-korotkov-verkada requested a review from a team as a code owner November 28, 2024 06:25

bunnyshell bot commented Nov 28, 2024

❌ Preview Environment deleted from Bunnyshell

Available commands (reply to this comment):

  • 🚀 /bns:deploy to deploy the environment

codecov bot commented Nov 28, 2024

Codecov Report

Attention: Patch coverage is 75.42373% with 29 lines in your changes missing coverage. Please review.

Project coverage is 55.24%. Comparing base (bd5d76f) to head (4a37d02).
Report is 14 commits behind head on master.

Files with missing lines                      Patch %   Lines
server/server.go                              85.14%    12 Missing and 3 partials ⚠️
cmd/argocd-server/commands/argocd_server.go   0.00%     8 Missing ⚠️
util/cache/redis.go                           20.00%    3 Missing and 1 partial ⚠️
util/session/state.go                         0.00%     2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #20981      +/-   ##
==========================================
+ Coverage   55.02%   55.24%   +0.21%     
==========================================
  Files         324      324              
  Lines       55472    55572     +100     
==========================================
+ Hits        30522    30699     +177     
+ Misses      22329    22257      -72     
+ Partials     2621     2616       -5     


@andrii-korotkov-verkada andrii-korotkov-verkada force-pushed the 18642-server-graceful-shutdown branch 11 times, most recently from 8280700 to a8ca06a Compare November 28, 2024 15:53
@andrii-korotkov-verkada andrii-korotkov-verkada marked this pull request as draft November 28, 2024 16:00
@andrii-korotkov-verkada andrii-korotkov-verkada force-pushed the 18642-server-graceful-shutdown branch 8 times, most recently from badea03 to 240c2d5 Compare November 28, 2024 17:34
Collaborator

@leoluz leoluz left a comment

Please check my comments

@@ -44,6 +44,7 @@ message Settings {
bool appsInAnyNamespaceEnabled = 24;
bool impersonationEnabled = 25;
string installationID = 26;
repeated string additionalUrls = 27 [(gogoproto.customname) = "AdditionalURLs"];
Collaborator

Not sure why this is needed. If there is no strong reason to add an additional attribute to the settings, please remove this.

Contributor Author

See the comment below. I also don't see a problem with adding the field, since URL is already there, as are a number of other fields. Why do you want to avoid adding it?

Collaborator

PRs should be concise and fix one main problem. This provides flexibility when cherry-picking the work into different release branches. While additionalUrls is a good addition to the API, it doesn't belong in this PR: it changes the API and is not directly related to the problem we are trying to address.
Please open a separate PR to add support for additionalUrls.

// Should be healthy.
checkHealth(t, true)
// Should trigger API server restart.
fixture.SetParamInSettingConfigMap("additionalUrls", "- http://test")
Collaborator

It seems that you are only using this additionalUrls to restart the API server? If so, there are other fields in the configmap that can be used for the same purpose.

Contributor Author

Modifying additionalUrls is the safest way to trigger a reload that I could find. Feel free to suggest an alternative. I was hesitant to change URL, fearing it could break things completely. I also wanted to test that the server is indeed restarted with the new parameters in place (otherwise the test may succeed even if the restart didn't happen).

Contributor Author

That's what it restarts on

argo-cd/server/server.go

Lines 674 to 734 in 02d6866

for {
	newSettings := <-updateCh
	a.settings = newSettings
	newDexCfgBytes, err := dexutil.GenerateDexConfigYAML(a.settings, a.DexTLSConfig == nil || a.DexTLSConfig.DisableTLS)
	errorsutil.CheckError(err)
	if string(newDexCfgBytes) != string(prevDexCfgBytes) {
		log.Infof("dex config modified. restarting")
		break
	}
	if checkOIDCConfigChange(prevOIDCConfig, a.settings) {
		log.Infof("oidc config modified. restarting")
		break
	}
	if prevURL != a.settings.URL {
		log.Infof("url modified. restarting")
		break
	}
	if !reflect.DeepEqual(prevAdditionalURLs, a.settings.AdditionalURLs) {
		log.Infof("additionalURLs modified. restarting")
		break
	}
	if prevGitHubSecret != a.settings.WebhookGitHubSecret {
		log.Infof("github secret modified. restarting")
		break
	}
	if prevGitLabSecret != a.settings.WebhookGitLabSecret {
		log.Infof("gitlab secret modified. restarting")
		break
	}
	if prevBitbucketUUID != a.settings.WebhookBitbucketUUID {
		log.Infof("bitbucket uuid modified. restarting")
		break
	}
	if prevBitbucketServerSecret != a.settings.WebhookBitbucketServerSecret {
		log.Infof("bitbucket server secret modified. restarting")
		break
	}
	if prevGogsSecret != a.settings.WebhookGogsSecret {
		log.Infof("gogs secret modified. restarting")
		break
	}
	if !reflect.DeepEqual(prevExtConfig, a.settings.ExtensionConfig) {
		prevExtConfig = a.settings.ExtensionConfig
		log.Infof("extensions configs modified. Updating proxy registry...")
		err := a.extensionManager.UpdateExtensionRegistry(a.settings)
		if err != nil {
			log.Errorf("error updating extensions configs: %s", err)
		} else {
			log.Info("extensions configs updated successfully")
		}
	}
	if !a.ArgoCDServerOpts.Insecure {
		var newCert, newCertKey string
		if a.settings.Certificate != nil {
			newCert, newCertKey = tlsutil.EncodeX509KeyPairString(*a.settings.Certificate)
		}
		if newCert != prevCert || newCertKey != prevCertKey {
			log.Infof("tls certificate modified. reloading certificate")
			// No need to break out of this loop since TlsConfig.GetCertificate will automagically reload the cert.
		}
	}
In short:

  • URL
  • AdditionalURLs
  • Various secrets/certificates
  • Dex/OIDC config
  • Extensions

Of these, additionalUrls seems to be the only one without potential side effects. As for the others, I'm not even sure what effects they can have, but modifying them to some test values doesn't seem promising.
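The compare-and-restart logic discussed above can be reduced to a small sketch. This is an illustration, not the server's code: the `settings` struct, `restartReason`, and `watch` are hypothetical names, and the struct carries only a few of the fields the real loop inspects.

```go
package main

import (
	"fmt"
	"reflect"
)

// settings is a hypothetical, trimmed-down stand-in for the server settings.
type settings struct {
	URL            string
	AdditionalURLs []string
	WebhookSecret  string
}

// restartReason returns a non-empty reason when the difference between
// prev and next should trigger a restart, and "" otherwise.
func restartReason(prev, next settings) string {
	switch {
	case prev.URL != next.URL:
		return "url modified"
	case !reflect.DeepEqual(prev.AdditionalURLs, next.AdditionalURLs):
		return "additionalURLs modified"
	case prev.WebhookSecret != next.WebhookSecret:
		return "webhook secret modified"
	}
	return ""
}

// watch consumes settings updates until one of them requires a restart
// and returns the reason; it returns "" if the channel closes first.
func watch(prev settings, updates <-chan settings) string {
	for next := range updates {
		if reason := restartReason(prev, next); reason != "" {
			return reason
		}
	}
	return ""
}

func main() {
	prev := settings{URL: "https://argocd.example.com"}

	updates := make(chan settings, 2)
	updates <- prev // unchanged settings: no restart
	updates <- settings{URL: prev.URL, AdditionalURLs: []string{"http://test"}}
	close(updates)

	fmt.Println(watch(prev, updates)) // prints "additionalURLs modified"
}
```

Note that `reflect.DeepEqual` treats a nil slice and an empty slice as different, so a sketch like this would report a change when a list goes from unset to empty.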

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
Collaborator

@leoluz leoluz left a comment

LGTM
Thank you!

@leoluz leoluz merged commit 730363f into argoproj:master Dec 3, 2024
27 checks passed
@andrii-korotkov-verkada andrii-korotkov-verkada deleted the 18642-server-graceful-shutdown branch December 3, 2024 20:34
adriananeci pushed a commit to adriananeci/argo-cd that referenced this pull request Dec 4, 2024
…20981)

* fix: Graceful shutdown for the API server (argoproj#18642)

Closes argoproj#18642

Implements a graceful shutdown for the API server. Without this, ArgoCD API server will eventually return 502 during rolling update. However, healthcheck would return 503 if the server is terminating.

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
Co-authored-by: Leonardo Luz Almeida <leonardo_almeida@intuit.com>
Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>

* Init server only once, but keep re-initializing listeners

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Check error for SetParamInSettingConfigMap as needed after fresh master

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Prevent a data race

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Remove unused variable, don't pass lock when not necessary

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Try overriding URL instead of additional URLs

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Use a more specific url

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

---------

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
Co-authored-by: Leonardo Luz Almeida <leonardo_almeida@intuit.com>
Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
Signed-off-by: Adrian Aneci <aneci@adobe.com>
gcp-cherry-pick-bot bot pushed a commit that referenced this pull request Dec 9, 2024
@OpenGuidou
Contributor

/cherry-pick release-2.13

gcp-cherry-pick-bot bot pushed a commit that referenced this pull request Dec 9, 2024
crenshaw-dev added a commit to crenshaw-dev/argo-cd that referenced this pull request Dec 17, 2024
…rgoproj#20981)"

This reverts commit 730363f.

Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
pasha-codefresh added a commit to pasha-codefresh/argo-cd that referenced this pull request Dec 17, 2024
andrii-korotkov-verkada added a commit to andrii-korotkov-verkada/argo-cd that referenced this pull request Dec 17, 2024
leoluz added a commit that referenced this pull request Dec 17, 2024
* fix: Graceful shutdown for the API server (#18642) (#20981)

* fix: Graceful shutdown for the API server (#18642)

Closes #18642

Implements a graceful shutdown for the API server. Without this, ArgoCD API server will eventually return 502 during rolling update. However, healthcheck would return 503 if the server is terminating.

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
Co-authored-by: Leonardo Luz Almeida <leonardo_almeida@intuit.com>
Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>

* Init server only once, but keep re-initializing listeners

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Check error for SetParamInSettingConfigMap as needed after fresh master

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Prevent a data race

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Remove unused variable, don't pass lock when not necessary

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Try overriding URL instead of additional URLs

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Use a more specific url

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

---------

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
Co-authored-by: Leonardo Luz Almeida <leonardo_almeida@intuit.com>
Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>

* Use a custom signal for graceful restart

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

* Re-run tests

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>

---------

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
Co-authored-by: Leonardo Luz Almeida <leonardo_almeida@intuit.com>
Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
gcp-cherry-pick-bot bot pushed a commit that referenced this pull request Dec 17, 2024
pasha-codefresh pushed a commit that referenced this pull request Dec 18, 2024
* fix: Graceful shutdown for the API server (#18642) (#20981)

* fix: Graceful shutdown for the API server (#18642)

Closes #18642

Implements a graceful shutdown for the API server. Without this, ArgoCD API server will eventually return 502 during rolling update. However, healthcheck would return 503 if the server is terminating.

* Init server only once, but keep re-initializing listeners

* Check error for SetParamInSettingConfigMap as needed after fresh master

* Prevent a data race

* Remove unused variable, don't pass lock when not necessary

* Try overriding URL instead of additional URLs

* Use a more specific url

---------

* Use a custom signal for graceful restart

* Re-run tests

---------

Signed-off-by: Andrii Korotkov <andrii.korotkov@verkada.com>
Co-authored-by: Andrii Korotkov <137232734+andrii-korotkov-verkada@users.noreply.github.com>
Co-authored-by: Leonardo Luz Almeida <leonardo_almeida@intuit.com>
Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

  • Implement graceful shutdown in all Argo CD components
  • ArgoCD server doesn't pick up the new OIDC secret
3 participants