-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a randomized start before running CheckMonitors. #546
Conversation
@ryanbreen there is also a function in util, Otherwise I think this is a good idea! |
Agreed with @ryanuber! Also some tests for this would be good :) |
Thanks for the feedback, guys. I'm struggling a bit with how best to test a random check start time in a deterministic fashion. |
@ryanbreen Probably enough to ensure the check is run within a timely manner (e.g. <= Interval). |
Sounds good. I'll get that added to check_test.go. |
Let me know if that test doesn't cover it. Thanks! |
next := time.After(0) | ||
// Get the randomized initial pause time | ||
initialPauseTime := randomStagger(c.Interval) | ||
c.Logger.Printf("[DEBUG] agent: pausing %ds before first invocation of %s", int(initialPauseTime.Seconds()), c.Script) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you just print a time.Duration
directly here as a string, it will get formatted in a nice, human-friendly way by its String()
method. That way we don't have to explicitly cast to an int
, and we get a more accurate value in the log.
@ryanbreen I left a few comments, but looking good! |
Great feedback, thanks @ryanuber! Let me know if this doesn't address those points adequately. |
@ryanbreen LGTM, Thanks for doing this! |
Add a randomized start before running CheckMonitors.
We have situations where systems use many checks (dozens to hundreds) with fairly long intervals (say, every 60s). Starting them all at the same time leads to undesirable spikes in resource utilization. This patch randomly staggers the start time of CheckMonitors to avoid this. It should address our issue without altering behavior much in the common case.