Pick worker with lowest memory use by percentage, not absolute #7266
Comments
This boils down to why people would use a heterogeneous cluster in the first place.
In the second case, you'll likely have a wealth of tasks with average memory usage and without any restrictions, plus a handful of tasks with […]

FYI: AMM ReduceReplicas, graceful worker retirement, rebalance, and the future AMM Rebalance all use absolute optimistic memory as a metric (I'm fine with scheduling using managed memory instead, as it's a lot more responsive).
That use-case makes sense. However, I just find it hard to justify picking worker A in this situation:
To me, using absolute memory is making too much of an assumption that someone's use-case and intent looks like the one you've described. But you might just have heterogeneous workers because that's what you got, whether it's the machines you had around in your lab, or the instances Coiled gave you because you allowed a range of instance types for faster cluster startup time. I feel like the safest generic choice to make is the one that's the least likely to put a worker under memory pressure. If someone has high-memory workers, but they want to save them for particular tasks, then they can always use resource restrictions to accomplish that. As a user, I'd be confused if my low-memory workers kept getting overloaded and dying but my high-memory workers stayed nearly empty, unless I'd explicitly used restrictions to make this happen.
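For reference, a minimal sketch of reserving high-memory workers for particular tasks with Dask resource restrictions; the `"MEMORY"` label, the amounts, and the scheduler address are illustrative, not taken from this issue:

```python
# Illustrative only: the high-memory workers would be started declaring a
# resource, e.g.:
#   dask worker tcp://scheduler:8786 --resources "MEMORY=100e9"

from distributed import Client

client = Client("tcp://scheduler:8786")  # hypothetical scheduler address


def big_join(key):
    ...  # placeholder for a memory-hungry task


# Only workers advertising at least this much of the "MEMORY" resource are
# eligible to run this task; unrestricted tasks land on any worker.
future = client.submit(big_join, "some-key", resources={"MEMORY": 70e9})
```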
Currently the `worker_objective` function uses worker managed memory as a tiebreaker if it looks like a task will start in the same amount of time on multiple workers (distributed/scheduler.py, line 3236 at 00bf8ed).
In a heterogeneous cluster, this means we might pick a small worker that holds fewer absolute bytes but has little memory to spare, instead of a large worker that holds more total data in memory but has far more headroom.
Maybe we should compare by percentage of memory used, rather than total bytes used:
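A minimal sketch of the idea (not the actual scheduler code; `Worker` below is a stand-in for the scheduler's `WorkerState`, assuming its `nbytes` and `memory_limit` attributes):

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    nbytes: int          # managed memory currently in use (bytes)
    memory_limit: int    # configured memory limit (bytes)


def tiebreak_absolute(ws: Worker) -> float:
    """Current behaviour: prefer the worker holding fewer absolute bytes."""
    return ws.nbytes


def tiebreak_fraction(ws: Worker) -> float:
    """Proposed behaviour: prefer the worker using the smallest fraction of
    its memory limit, so large, mostly idle workers are not penalised."""
    return ws.nbytes / ws.memory_limit if ws.memory_limit else 0.0


small = Worker("small", nbytes=2 * 2**30, memory_limit=4 * 2**30)   # 50% full
large = Worker("large", nbytes=8 * 2**30, memory_limit=64 * 2**30)  # 12.5% full

# Comparing absolute bytes picks the small, already half-full worker...
assert min([small, large], key=tiebreak_absolute) is small
# ...while comparing by percentage picks the large worker with plenty of headroom.
assert min([small, large], key=tiebreak_fraction) is large
```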
#7248 does this for root tasks when queuing is enabled. I think it would make sense to do in all cases though.
cc @fjetter @crusaderky