-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Watch directories for certificate hot-reload #4159
Watch directories for certificate hot-reload #4159
Conversation
Hi @pavolloffay, I noticed that you did the certificate reload implementation few years back. It would be greatly appreciated if you could take a look at this PR! Thank you 🙏 |
52b2d2b
to
0c8af96
Compare
I've corrected a formatting mistake. @yurishkuro could you please re-trigger the tests. Thank you! |
Codecov ReportBase: 97.12% // Head: 97.09% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #4159 +/- ##
==========================================
- Coverage 97.12% 97.09% -0.04%
==========================================
Files 296 296
Lines 17452 17493 +41
==========================================
+ Hits 16950 16984 +34
- Misses 404 409 +5
- Partials 98 100 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Coverage results were accidentally decreased, hopefully fixed now with the latest commit. Not sure about CIT OpenSearch failure though. It looks to me there might have been similar errors before with that same job, maybe a fluke? |
Yes apparently CIT OpenSearch failure was a fluke. However coverage was still under target %. I think one test was missing even before: replace CA certs with bad ones during update. I added that now. Besides that one, the missed lines are error branches that might be difficult to produce without stubbing the library calls. |
Yeah OS tests have been failing quite often recently. I'll rerun them till they pass. Just curious, why was it necessary to regenerate test certificates? |
Ok, thank you!
I wanted to have second server certificate for new test case BTW as a shameless plug, I've created a tool to generate test certs with Go API to avoid storing test certs and keys in the first place. But since it is just maintained by me alone, I assume Jaeger does not want that as a new dependency :-) |
I merged some fixes for opensearch integration test, please rebase, hopefully it should clear the failure |
Instead of adding inotify watches on files, which mostly get removed during certificate renewal, add watches for the parent directories and consider inotify events only as indication that some of the files might have changed. The change is required for the most typical atomic file update schemes, for example for the scheme used by Kubernetes to update secret volumes. Signed-off-by: Tero Saarni <tero.saarni@est.tech>
8e0380b
to
07dde79
Compare
@yurishkuro I've now rebased, as well as removed the test data changes that were not necessary and squashed commits. |
@yurishkuro It is OK to me if you push #4165 instead of this PR. Thank you! |
Signed-off-by: Yuri Shkuro <github@ysh.us>
I ended up doing cherry-pick here as well but either way is OK! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
## Which problem is this PR solving? This PR fixes the certificate hot-reload on Kubernetes. Resolves jaegertracing#3968 ## Short description of the changes This PR moves the inotify watch from certificate files to monitoring the parent directory instead, as also recommended [here](https://github.com/fsnotify/fsnotify/blob/c6f5cfa163edb0f1bb78be3a77053ee14c48a3ce/backend_inotify.go#L246-L254) by the [fsnotify](https://github.com/fsnotify/fsnotify) library. Without this change, certificate hot-reload only works if the content of certificate or key files is overwritten. That approach is rarely used for update since it is not atomic and it risks files being read in the middle of the update. Therefore more often the files are replaced by a rename operation, by swapping the old file with a new one. It allows the file to be completely written and closed before exposing it to the application. However, in this update scheme, the old file gets deleted and the current inotify watch on file level is automatically terminated. No reload happens and warning message like following was printed into log ```json {"msg":"Certificate has been removed, using the last known version","certificate":"/certs/server-ca.pem"} ``` Example use case: When Kubelet writes certificate and key files on Kubernetes Secret volume, the target files are symbolic links to files in a different directory - a hidden subdirectory in a same level as the files themselves. During update, that directory is swapped with a new one, while the symbolic links remains the same. This guarantees atomic swap for both certificate and key files at once. It also means that any rename event received at the parent directory level might indicate that the files were replaced, even if name of the renamed file was not any of the files being monitored. Therefore this PR proposes checking the hashes of the files to detect changes. With this change, following update schemes will work: * Replace file content in-place (old behaviour: e.g. `cat newfile > cert.pem`) * Update the files by making any rename operation in the parent directory (Kubernetes secret volume update scheme: `mv ..data_tmp ..data`) * Update the files by renaming the target files (e.g. `mv cert.pem.tmp cert.pem`) Signed-off-by: Tero Saarni <tero.saarni@est.tech> Signed-off-by: Tero Saarni <tero.saarni@est.tech> Signed-off-by: Yuri Shkuro <github@ysh.us> Co-authored-by: Yuri Shkuro <github@ysh.us>
## Which problem is this PR solving? This PR fixes the certificate hot-reload on Kubernetes. Resolves jaegertracing#3968 ## Short description of the changes This PR moves the inotify watch from certificate files to monitoring the parent directory instead, as also recommended [here](https://github.com/fsnotify/fsnotify/blob/c6f5cfa163edb0f1bb78be3a77053ee14c48a3ce/backend_inotify.go#L246-L254) by the [fsnotify](https://github.com/fsnotify/fsnotify) library. Without this change, certificate hot-reload only works if the content of certificate or key files is overwritten. That approach is rarely used for update since it is not atomic and it risks files being read in the middle of the update. Therefore more often the files are replaced by a rename operation, by swapping the old file with a new one. It allows the file to be completely written and closed before exposing it to the application. However, in this update scheme, the old file gets deleted and the current inotify watch on file level is automatically terminated. No reload happens and warning message like following was printed into log ```json {"msg":"Certificate has been removed, using the last known version","certificate":"/certs/server-ca.pem"} ``` Example use case: When Kubelet writes certificate and key files on Kubernetes Secret volume, the target files are symbolic links to files in a different directory - a hidden subdirectory in a same level as the files themselves. During update, that directory is swapped with a new one, while the symbolic links remains the same. This guarantees atomic swap for both certificate and key files at once. It also means that any rename event received at the parent directory level might indicate that the files were replaced, even if name of the renamed file was not any of the files being monitored. Therefore this PR proposes checking the hashes of the files to detect changes. With this change, following update schemes will work: * Replace file content in-place (old behaviour: e.g. `cat newfile > cert.pem`) * Update the files by making any rename operation in the parent directory (Kubernetes secret volume update scheme: `mv ..data_tmp ..data`) * Update the files by renaming the target files (e.g. `mv cert.pem.tmp cert.pem`) Signed-off-by: Tero Saarni <tero.saarni@est.tech> Signed-off-by: Tero Saarni <tero.saarni@est.tech> Signed-off-by: Yuri Shkuro <github@ysh.us> Co-authored-by: Yuri Shkuro <github@ysh.us> Signed-off-by: shubbham1215 <sawaikershubham@gmail.com>
Which problem is this PR solving?
This PR fixes the certificate hot-reload on Kubernetes.
Resolves #3968
Short description of the changes
This PR moves the inotify watch from certificate files to monitoring the parent directory instead, as also recommended here by the fsnotify library.
Without this change, certificate hot-reload only works if the content of certificate or key files is overwritten. That approach is rarely used for update since it is not atomic and it risks files being read in the middle of the update. Therefore more often the files are replaced by a rename operation, by swapping the old file with a new one. It allows the file to be completely written and closed before exposing it to the application. However, in this update scheme, the old file gets deleted and the current inotify watch on file level is automatically terminated. No reload happens and warning message like following was printed into log
Example use case:
When Kubelet writes certificate and key files on Kubernetes Secret volume, the target files are symbolic links to files in a different directory - a hidden subdirectory in a same level as the files themselves. During update, that directory is swapped with a new one, while the symbolic links remains the same. This guarantees atomic swap for both certificate and key files at once. It also means that any rename event received at the parent directory level might indicate that the files were replaced, even if name of the renamed file was not any of the files being monitored. Therefore this PR proposes checking the hashes of the files to detect changes.
With this change, following update schemes will work:
cat newfile > cert.pem
)mv ..data_tmp ..data
)mv cert.pem.tmp cert.pem
)Signed-off-by: Tero Saarni tero.saarni@est.tech