vine, wq: ramp down mode #3485

Merged: 4 commits, Sep 11, 2023

btovar (Member) commented Sep 11, 2023

Adds m.tune("ramp-down-heuristic", 1). When this is enabled and there are more workers than ready tasks, then tasks are allocated all the free resources of a worker. When looking for workers by files or by time, ties are broken by worst fit. This mode avoids unnecessary resource exhaustion when there are more workers than tasks. If the monitoring watchdog is not enabled, then this heuristic has no effect. By default it is disabled (set to 0).
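The allocation rule described above can be sketched in plain Python. This is only an illustration of the stated behavior, not the cctools implementation: the `Resources` type and the `allocation_for_task` and `pick_worker` helpers are hypothetical names, and measuring "worst fit" by free cores alone is an assumption made to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cores: int
    memory_mb: int
    disk_mb: int


def allocation_for_task(requested, worker_free, ready_tasks, connected_workers,
                        ramp_down_enabled, watchdog_enabled):
    """Return the resources to give a task on a worker (hypothetical helper)."""
    ramp_down = (
        ramp_down_enabled
        and watchdog_enabled  # without the monitoring watchdog the heuristic has no effect
        and connected_workers > ready_tasks
    )
    if ramp_down:
        # More workers than ready tasks: hand over everything the worker has free.
        return Resources(worker_free.cores, worker_free.memory_mb, worker_free.disk_mb)
    # Otherwise keep the usual allocation (modeled here as the plain request).
    return requested


def pick_worker(candidates):
    """Pick from (name, score, free) tuples; lower score wins, ties go to worst fit.

    'score' stands in for the files/time criterion; worst fit is approximated
    as the worker with the most free cores (an assumption of this sketch).
    """
    return min(candidates, key=lambda c: (c[1], -c[2].cores))[0]
```

With two ready tasks and five workers, a task that asked for one core would be given the whole free capacity of its worker; with ten ready tasks it would keep its normal allocation.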

It also adds m.tune("hungry-minimum-factor", 2). A manager is hungry if the number of waiting tasks is less than the number of workers times this factor. By default it is set to 2, to keep ramp-down mode from triggering when the workflow is not really terminating.
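As a rough sketch of how the two thresholds relate, the predicates below restate the conditions from this description in plain Python. The function names are hypothetical, and the real hungry check in the manager may consider more than these two counts.

```python
def is_hungry(waiting_tasks, connected_workers, hungry_minimum_factor=2):
    """Hungry: fewer waiting tasks than workers times the factor (hypothetical helper)."""
    return waiting_tasks < connected_workers * hungry_minimum_factor


def ramp_down_applies(waiting_tasks, connected_workers):
    """Ramp-down condition: fewer waiting tasks than connected workers."""
    return waiting_tasks < connected_workers


# With 10 workers and 15 waiting tasks the manager already reports hungry,
# but the ramp-down condition has not been reached yet.
print(is_hungry(15, 10), ramp_down_applies(15, 10))
```

Under this reading, a factor of 2 leaves a gap between the two thresholds: the manager reports hungry while it still has more waiting tasks than workers, so an application that keeps submitting work should not drop into ramp-down by accident.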

With this mode enabled I was able to run the processing+accumulation part of topEFT using the old wq executor in 40 minutes, as the expensive final accumulations are never retried. Together with other advances in taskvine, 30 min processing time should be within reach.

A queue is hungry if the number of waiting tasks is less than (number of
workers) * hungry_minimum_factor.

When enabled (aq.tune("ramp-down-heuristic", 1)) and the number of waiting tasks is less than the number of
connected workers, tasks are allocated all the free resources of a
worker. This avoids tasks being terminated for resource exhaustion when there
are more resources than tasks.

If monitoring is not enabled, or if it is only set to measure, then this
heuristic has no effect. By default it is disabled.

dthain (Member) commented Sep 11, 2023

Nice! I like it when a simple idea has a big impact.

Does "ramp down" also apply at any other time (not the end) when tasks are fewer than workers?

btovar (Member, Author) commented Sep 11, 2023

Yes. If the mode is activated and there are fewer tasks than workers, then tasks get all the free resources of a worker. Ideally, whatever is managing vine would activate this mode when it knows the workflow is about to end (e.g., in topcoffea, when all the processing tasks are done and only accumulation tasks remain).

It can be seen this way: if the queue is starving (i.e., more than hungry), then be aggressive in allocating resources to the tasks.
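A minimal sketch of that activation pattern, assuming the driving application tracks how many processing and accumulation tasks remain. `RecordingManager` is a hypothetical stand-in that only mimics the `tune(name, value)` call shape, and `maybe_enable_ramp_down` is an illustrative policy, not part of cctools.

```python
class RecordingManager:
    """Hypothetical stand-in exposing only a tune(name, value) method."""

    def __init__(self):
        self.tuned = {}

    def tune(self, name, value):
        self.tuned[name] = value


def maybe_enable_ramp_down(manager, processing_left, accumulation_left):
    """Enable ramp-down once only accumulation tasks remain (illustrative policy)."""
    if processing_left == 0 and accumulation_left > 0:
        manager.tune("ramp-down-heuristic", 1)
        return True
    return False


m = RecordingManager()
maybe_enable_ramp_down(m, processing_left=12, accumulation_left=3)  # too early, no change
maybe_enable_ramp_down(m, processing_left=0, accumulation_left=3)   # workflow is ending
```

In a real workflow the same call would go to the actual manager object in place of the stand-in.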

btovar (Member, Author) commented Sep 11, 2023

RTM

@dthain dthain merged commit d8b36f3 into cooperative-computing-lab:master Sep 11, 2023
6 checks passed
Jbrocket pushed a commit to Jbrocket/cctools that referenced this pull request Sep 11, 2023
* wq: adds hungry minimum factor tune parameter

A queue is hungry if the number of waiting tasks is less than (number of
workers) * hungry_minimum_factor.

* vine: hungry-minimum-factor

* adds ramp-down mode

When enabled (aq.tune("ramp-down-heuristic", 1)) and the number of waiting tasks is less than the number of
connected workers, tasks are allocated all the free resources of a
worker. This avoids tasks being terminated for resource exhaustion when there
are more resources than tasks.

If monitoring is not enabled, or if it is only set to measure, then this
heuristic has no effect. By default it is disabled.

* vine: ramp down mode
@btovar btovar deleted the ramp_down branch December 4, 2023 18:55