Add scale-in cooldown support #6

Merged
merged 1 commit into from
Mar 25, 2019

Conversation

etaoins
Contributor

@etaoins etaoins commented Mar 3, 2019

This reimplements the ASG scale-in cooldown inside the Lambda. It takes two parameters, which correspond to the existing ASG parameters in the Elastic CI Stack.

  1. `SCALE_IN_COOLDOWN_PERIOD` is the cooldown time between scale-in events. This defaults to the existing 5 minutes.

  2. `SCALE_IN_ADJUSTMENT` is the maximum adjustment during a scale-in event. Unlike the ASG, we may scale in by less if the desired capacity is closer to the current capacity than the adjustment. This defaults to the existing -1.

This cheats a bit by storing `lastScaleInTime` in a global variable, which means we'll forget about our cooldown on a cold start. That should happen fairly infrequently and would just make us a bit more aggressive about scaling in; it shouldn't affect correctness.
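
The behaviour described above can be sketched roughly as follows. This is a minimal illustration in Python, not the code from this PR: the function `next_desired`, its signature, and the choice to exempt scale-out from the cooldown are all assumptions made for the example. Only the two parameter names, their defaults, and the idea of keeping the last scale-in time in a global come from the description.

```python
import time
from typing import Optional

# Defaults taken from the description: 5 minutes and -1.
SCALE_IN_COOLDOWN_PERIOD = 300.0  # seconds
SCALE_IN_ADJUSTMENT = -1          # largest (most negative) change per scale-in

# Module-level state: kept across warm invocations, lost on a cold start.
last_scale_in_time: Optional[float] = None


def next_desired(
    current: int,
    desired: int,
    now: Optional[float] = None,
    adjustment: int = SCALE_IN_ADJUSTMENT,
    cooldown: float = SCALE_IN_COOLDOWN_PERIOD,
) -> int:
    """Hypothetical helper: capacity to request after cooldown and clamping."""
    global last_scale_in_time
    now = time.time() if now is None else now

    if desired >= current:
        # Assumption: scale-out (or no change) ignores the scale-in cooldown.
        return desired

    if last_scale_in_time is not None and now - last_scale_in_time < cooldown:
        # Still cooling down from the previous scale-in: hold steady.
        return current

    # Step down by at most |adjustment|, or by less if desired is closer.
    step = max(desired - current, adjustment)
    last_scale_in_time = now
    return current + step
```

With the defaults, a fleet of 10 instances that only needs 4 would drop to 9, then hold there until the cooldown has elapsed. After a cold start `last_scale_in_time` is `None` again, so the next scale-in is allowed immediately but is still limited by the adjustment.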

@lox
Contributor

lox commented Mar 25, 2019

This is awesome @etaoins! Somehow I totally missed this PR.

@lox
Contributor

lox commented Mar 25, 2019

I wonder how often there will be a cold start on a timed lambda 🤔

@etaoins
Contributor Author

etaoins commented Mar 25, 2019

I had a timed Lambda running for a few weeks that would cold start about once a day at an unpredictable time. It looks like even if the Lambda is warm, its instance will sometimes disappear anyway. That's only a single data point, and it's about a year out of date; I wouldn't be surprised if there are other situations or configurations that could trigger it more often.

I don't think that's too bad here, as it will only make a difference if a scale-in is occurring, and we will still be limited by the maximum adjustment. My gut instinct would be to try this with local state; if there are some pathological cases, it could then be persisted somewhere. I'm just worried about the complexity and reliability of setting up something like a DynamoDB table to track this.
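
One way to keep that option open is to put the timestamp behind a small interface, so local state can be swapped for a persistent store later without touching the cooldown check. The sketch below is illustrative only: `CooldownStore`, `InMemoryStore`, and `in_cooldown` are hypothetical names, and nothing here reflects how this PR is implemented.

```python
from typing import Optional, Protocol


class CooldownStore(Protocol):
    """Hypothetical interface for remembering the last scale-in time."""

    def get(self) -> Optional[float]: ...

    def set(self, timestamp: float) -> None: ...


class InMemoryStore:
    """Local state: simple, but forgotten whenever the Lambda cold starts."""

    def __init__(self) -> None:
        self._last: Optional[float] = None

    def get(self) -> Optional[float]:
        return self._last

    def set(self, timestamp: float) -> None:
        self._last = timestamp


def in_cooldown(store: CooldownStore, now: float, period: float = 300.0) -> bool:
    """True if a scale-in was recorded less than `period` seconds before `now`."""
    last = store.get()
    return last is not None and now - last < period
```

A fresh `InMemoryStore()` behaves like a cold start: `in_cooldown` returns `False` until a scale-in is recorded. A persistent implementation of the same two methods could replace it if the pathological cases turn out to matter.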

@lox lox merged commit e3e571a into buildkite:master Mar 25, 2019