-
Notifications
You must be signed in to change notification settings - Fork 330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix race: Make informed watcher start wait for Add event 🏎️ #2036
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2036 +/- ##
==========================================
+ Coverage 67.05% 67.37% +0.31%
==========================================
Files 213 215 +2
Lines 8986 9073 +87
==========================================
+ Hits 6026 6113 +87
- Misses 2686 2687 +1
+ Partials 274 273 -1
Continue to review full report at Codecov.
|
7ba9278
to
68f36a2
Compare
I can't tell if the pull-knative-pkg-unit-tests failure is legit 🤔 |
/test pull-knative-pkg-unit-tests |
// Pretend that all the defaulted ConfigMaps were just created. This is done before we start | ||
// the informer to ensure that if a defaulted ConfigMap does exist, then the real value is | ||
// processed after the default one. | ||
i.triggerAddEventForDefaultedConfigMaps(addConfigMapEvent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think WaitForAllKeys
won't actually wait for config map updates from the API server since processing the default configmaps here lowers the wait group count to 0. I'm assuming that's expected just confirming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, that's the case but only when using defaulted config maps - i wasnt totally clear on the expectations when using defaulted them, does this match what you think it should do? (i can add some comments at least explaining this if you can confirm this is what we want)
it DOES wait when you're not using defaulted config maps (e.g. in tekton pipelines afaik we're not using defaulted configmaps: https://github.com/tektoncd/pipeline/blob/93e368dbcb014e8650a6ba8ad01ca3c5f6eb6554/vendor/knative.dev/pkg/configmap/store.go#L114)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with what you described - ie. if you include defaulted config maps then we don't wait for a sync from the API
Can you add the comment to both Start
& WatchWithDefault
?
Thanks for the PR! Yeah metrics tests are flakey |
should be ready for another look @dprotaso ! |
/retest |
a7e569b
to
bf56e75
Compare
When using the informed watcher to watch a config map, previously add events were being processed in a goroutine with no syncrhonization making it so that code may try to access the values backed by the configmaps before they are initialized. This commit makes it so that the Start method of the informer will wait for the add event to occur at least once for all config maps it is watching. This commit also undoes the workaround added in knative#1929 which was working around the race condition identified in knative#1907 (and in tektoncd/pipeline#3720). This means that if the synchronization was removed, the impacted test would start flaking again. If we wanted it to reliably fail in that case, we could introduce a sleep in the callback but that doesn't seem worth it. I also tested this change by manually patching the changes into my clone of tektoncd/pipeline and following the repro steps at tektoncd/pipeline#2815 (comment) Before the change I can reproduce the issue, and after the change, I can't! :D Fixes knative#1960
These utility objects don't really make sense to expose as part of the informed watcher package and are only used by the informed watcher. Writing tests for unexported code makes me a bit :( but hopefully these will get moved to some other package one day. And there's really no reason to expose these to users of knative/pkg at the moment.
@dprotaso I've updated the comments, lemme know what you think! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the changes!
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dprotaso The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Changes
When using the informed watcher to watch a config map, previously add
events were being processed in a goroutine with no syncrhonization
making it so that code may try to access the values backed by the
configmaps before they are initialized.
This commit makes it so that the Start method of the informer will wait
for the add event to occur at least once for all config maps it is
watching.
This commit also undoes the workaround added in #1929 which was working
around the race condition identified in #1907 (and in
tektoncd/pipeline#3720). This means that if
the synchronization was removed, the impacted test would start flaking
again. If we wanted it to reliably fail in that case, we could introduce
a sleep in the callback but that doesn't seem worth it.
I also tested this change by manually patching the changes into my
clone of tektoncd/pipeline and following the repro steps at
tektoncd/pipeline#2815 (comment)
Before the change I can reproduce the issue, and after the change, I
can't! :D
Fixes #1960 🐛
@dprotaso @mattmoor
/kind bug
Release Note
Docs