-
Notifications
You must be signed in to change notification settings - Fork 600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multitenant PingSource #2716
Multitenant PingSource #2716
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Produced via:
gofmt -s -w $(find -path './vendor' -prune -o -path './third_party' -prune -o -type f -name '*.go' -print)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Produced via:
gofmt -s -w $(find -path './vendor' -prune -o -path './third_party' -prune -o -type f -name '*.go' -print)
I think this is the wrong direction to go. I would rather use jobs: #2478 |
@n3wscott The problem with CronJob is that a pod is created each time an event is sent. Creating/deleting pods is not cheap specially compare to goroutine. Note that SinkBinding + CronJob is still valid when the scope of PingSource is set to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Produced via:
gofmt -s -w $(find -path './vendor' -prune -o -path './third_party' -prune -o -type f -name '*.go' -print)
It is cheap if it only runs once a week.
It is valid regardless, the question is where that job is launched. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Produced via:
gofmt -s -w $(find -path './vendor' -prune -o -path './third_party' -prune -o -type f -name '*.go' -print)
@@ -14,12 +14,14 @@ See the License for the specific language governing permissions and | |||
limitations under the License. | |||
*/ | |||
|
|||
package pingsource | |||
package controller | |||
|
|||
import ( | |||
"context" | |||
"encoding/json" | |||
"fmt" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Format Go code:
"fmt" | |
"fmt" | |
I still don't think we should go this way. We should use real cronjobs and sinkbindings. |
Sure, but that's not what this PR is about. We want to be able to handle thousand of cronjobs, each potentially triggering every second. |
| I still don't think we should go this way I'm quite puzzled. Why? It seems you think cronjobs can handle thousand of pingsources, each potentially triggering every second. |
Cron is only supported every minute, even today in PingSource 0.13. If you need a load tester, write a load tester. |
oops right I meant every minute. BTW I did write a load tester and currently porting it to |
deploymentInformer.Informer().AddEventHandler(cache.FilteringResourceEventHandler{ | ||
FilterFunc: controller.Filter(v1alpha1.SchemeGroupVersion.WithKind("PingSource")), | ||
Handler: controller.HandleAll(impl.EnqueueControllerOf), | ||
}) | ||
|
||
// Watch for the global deployment | ||
gr := func(obj interface{}) { | ||
impl.GlobalResync(pingSourceInformer.Informer()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need a global resync, maybe add a comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume it's because the shared deployment for multi-tenant ones will need to requeue all the ping sources, but if there's only one there, can we not just use tracker for it and only watch / notify on changes to that deployment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between the tracker and the informer with a FilterFunc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I'm not sure why we are doing a global resync here. with the tracker you know if it's using the shared one and you can set up the tracker to only queue the particular ping source when things change. Here you're saying (unless I'm mistaken) that whenever the shared deployment changes, enqueue all the ping sources regardless of if they are affected by the change or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah got it, thanks! You are not mistaken.
This is related to the open discussion about whether to allow mixing cluster and namespace scoped dispatchers/jobrunners. Let's say we allow it (and offer config knobs to restrict it) then using the tracker is a win in that case. Working on it...
I was also curious about the seconds, vs. minutes, as we'd have to expand the types to have more granular time duration. So, the plan is to only have it fire at most once a minute? |
It takes a reserve in the cluster. So it is non-zero always.
This is a job of combining the jobs that have the same schedule, which could be a different task than what is being proposed. Then at most it is one job per minute. |
I am totally on board with this multi-tenant implementation. I have helped develop and support the IBM Cloud Functions alarm/cron event provider for the last 4 years and know from experience that we see far more customer adoption for this event provider than any of the others. My new team is exploring Knative Eventing and would love to see a multi-tenant implementation for cron such as this. The current cron job implementation in Knative Eventing would be far too costly for us. |
@jasonpet does the IBM Cloud Functions alarm/cron event provider allow customers to create schedules with period less than 1 minute (e.g. every 10 seconds)? If so, do customers use that feature? |
@grantr it used to but this feature has been removed. The minimal period is now 1 minute. |
tracker has been added. |
/test pull-knative-eventing-integration-tests |
@vaikas I implemented the tracker and added a couple of more tests. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: lionelvillard, vaikas The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
||
cronRunner *cronJobsRunner | ||
pingsourceLister sourceslisters.PingSourceLister | ||
impl *controller.Impl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks unused, and is kind of a strange thing to have the reconciler have as a field (it creates a cycle in the layering).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's coming from the in-mem reconciler. Same problem there...
} | ||
|
||
func (r *Reconciler) reconcile(key string, source *v1alpha2.PingSource) error { | ||
if source.DeletionTimestamp != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without finalizers, it isn't guaranteed that you'll actually flow through here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed! thanks!
Proposed Changes
knative-eventing
eventing.knative.dev/scope: resource
are handled by a single pod (as before)Release Note
Docs
knative/docs#2288