Make batchWaitTime configurable in the Allocator #2586
Comments
Relevant code block: agones/pkg/gameserverallocations/allocator.go Lines 457 to 537 in a8993fe
You might run into other issues when decreasing the batching time, e.g. more load on the system from listing and sorting GameServers. But if the batch time is configurable, you can tune the balance for your game between allocation latency and the amount of CPU you are willing to churn.
Curious - what change did you make to the batch time to get this result?
We built two images with different hardcoded `batchWaitTime` values. The node hosting the controller is an EC2 t3.xlarge.
With the default 500ms value we have the following results: [results not preserved]
With the 200ms value: [results not preserved]
With the 100ms value: [results not preserved]
CPU/memory usage was comparable across all three test cases.
How did you see it changing throughput? What sort of scenario did you run it against? (i.e. size of cluster, churn in cluster, etc?)
I guess my thoughts here are:
Not to say I'm against making this a configurable value - just wanting to make sure that the effort to change it is worth it, and we aren't looking at a premature optimisation.
GameServer count per node as well as cluster topology (node size AND node count) can vary greatly based on which game is consuming Agones. Churn rate is indeed artificially high in those test cases, and the tests are very synthetic in their rate of allocation. With all that taken into account, making that sleep configurable would allow us to better tailor performance on a game-by-game basis. I understand your concern about premature optimization, but from my point of view the effort to add that knob is pretty minimal, and it is backward compatible because we retain the default 500ms value.
Totally fair enough - just wanted to ask the question to make sure 👍🏻
* Make Allocator batchWaitTime configurable
Resolves #2586
* removed `allocation` prefix on var declaration
* Updated documentation
* Include `allocationBatchWaitTime` documentation entries
Co-authored-by: Mark Mandel <markmandel@google.com>
Is your feature request related to a problem? Please describe.
We'd like to make the `batchWaitTime` configurable on the Allocator. With certain allocation patterns, the hard-coded 500ms sleep is too aggressive and degrades allocation latency.
Describe the solution you'd like
Make the `batchWaitTime` configurable through an environment variable, with a default of 500ms to avoid any breaking change.
Describe alternatives you've considered
`batchWaitTime` instead of making it configurable.
Those two solutions would lead to a change in existing behavior, so I don't think they should be considered.
Additional context
From our internal testing, the impact on CPU usage is negligible, but the improvement in p95 allocation latency and in throughput (allocations per second) is significant.
In any case the default value won't change so existing behaviour is kept intact.