Internal queues: member prioritisation #1196

Open
arjclark opened this issue Oct 24, 2014 · 13 comments

@arjclark
Contributor

Several areas have now asked us whether it is possible to prioritise within queues. It is not always possible to capture these kinds of requirements within cylc dependencies.

An example might be wanting to run longstep ensemble members before shortstep ones.

It may be possible to simply modify cylc to achieve an ordering preference in queues but we need to consider the consequences carefully.

@arjclark added this to the later milestone on Oct 24, 2014
@hjoliver
Member

To avoid additional configuration items to handle prioritization, maybe we could prioritize according to the order in which tasks are assigned to a queue? Then for queued tasks, release them up to the queue limit by iterating through the ordered list of members.
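As a rough illustration of this idea (not cylc's actual implementation), the release step could be sketched in Python as below. The function name `release_by_member_order` and its parameters are hypothetical; the sketch assumes the queue's member list has already been expanded to task names in configuration order.

```python
def release_by_member_order(member_order, queued, n_active, limit):
    """Pick queued tasks to release, walking members in configuration order.

    member_order: task names in the order they were assigned to the queue
    queued:       names of tasks currently waiting in the queue
    n_active:     number of tasks from this queue that are already active
    limit:        the queue limit
    """
    free_slots = max(limit - n_active, 0)
    waiting = set(queued)
    released = []
    for name in member_order:
        if len(released) >= free_slots:
            break
        if name in waiting:
            released.append(name)
    return released


# Hypothetical example: one task already active, queue limit of 3.
order = ["long01", "long02", "short01", "fred"]
print(release_by_member_order(order, ["fred", "short01", "long02"], 1, 3))
```

With two free slots, the earliest-listed waiting members are released first, regardless of the order in which they became ready.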

@arjclark
Contributor Author

By "prioritize according to the order in which tasks are assigned to a queue" do you mean the order of their entries in the config entry for queue members, e.g. would this:

[scheduling]
  [[queues]]
    [[[ensemble]]]
      members = LONG, SHORT, FRED
...
[runtime]
  [[long01...longNN]]
    inherit = LONG
...

result in LONG family members having priority over those in SHORT with all members of LONG having equal priority to each other (i.e. pri(long1) == pri(long2) == pri(long3) etc.)?

@hjoliver
Member

Well, yes and no. Here LONG is just shorthand for all the members of the LONG family, so long01 would technically have the highest priority, then long02 and so on; but all members of LONG would have higher priority than any members of SHORT (and then FRED). That's fine if internal prioritization of LONG doesn't matter (if it did matter you'd just have to assign to the queue by member names rather than family name).

@arjclark
Contributor Author

I think we'd need to go off the namespace under which the task is queued rather than the explicit expansion, wouldn't we?

If we didn't, I worry that the default queue would cause problems by expanding root, meaning you could end up unfairly starving tasks of the chance to run.

e.g. for the following with a default queue limit of 1 (extreme I know):

graph = """
f[-P1] => a => b => c => d => e => f
LONG => f
"""

If a-f were quick cycling then members of LONG would struggle to get to the front of the queue until a-e had finished, even though they'd been sitting around ready from the start of the suite.

For things in families I think we'd just want first-in, first-out ordering, as no member of a family has priority over another (all family members are equal ;) ).
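A minimal sketch of what that could look like, assuming priority is keyed on the namespace listed in the queue config and ties are broken by arrival order. The class `NamespaceFifoQueue` and its methods are hypothetical names for illustration, not part of cylc.

```python
import heapq
from itertools import count


class NamespaceFifoQueue:
    """Order tasks by namespace rank, then first in, first out within it."""

    def __init__(self, namespace_order):
        # Lower rank means higher priority, e.g. ["LONG", "SHORT", "FRED"].
        self._rank = {ns: i for i, ns in enumerate(namespace_order)}
        self._arrival = count()  # monotonically increasing tie-breaker
        self._heap = []

    def push(self, task, namespace):
        # The unique arrival number means task names are never compared.
        entry = (self._rank[namespace], next(self._arrival), task)
        heapq.heappush(self._heap, entry)

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


q = NamespaceFifoQueue(["LONG", "SHORT", "FRED"])
q.push("short01", "SHORT")
q.push("long02", "LONG")
q.push("long01", "LONG")
print([q.pop() for _ in range(len(q))])
```

Here long02 is released before long01 because it arrived first, while both still come out ahead of short01.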

@hjoliver
Member

Yep, fair point!

@trwhitcomb
Collaborator

Another option for prioritization would be to preferentially execute earlier cycle points in order to avoid runahead snags (where one or two tasks that still haven't executed in an earlier cycle point greatly slow down the suite). Order of insertion doesn't help in this case, because if several cycle points are running at once, insertion order would be interleaved. Setting the sequential attribute may help here, but I really hate trying to do resource management with that, since I think of sequential as an explicit limitation of dependencies rather than a way to avoid overloading the system running the tasks.

One possibility:

[scheduling]
    [[queues]]
        [[[ensemble]]]
            members  = LONG, SHORT, FRED
            priority = fifo
        [[[transfer]]]
            members  = TransferGrib # this is a family
            priority = earliest
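To make the difference between the two proposed settings concrete, here is a hedged Python sketch. The `priority` values come from the proposal above; the function names, the `POLICIES` table, and the use of integer cycle points are assumptions made purely for illustration.

```python
def order_fifo(queued):
    """Keep tasks in the order they entered the queue."""
    return list(queued)


def order_earliest(queued):
    """Earlier cycle points first; sorted() is stable, so ties stay FIFO."""
    return sorted(queued, key=lambda task: task[1])


# Maps the proposed `priority` setting to an ordering function.
POLICIES = {"fifo": order_fifo, "earliest": order_earliest}


def next_to_release(queued, priority, free_slots):
    """queued is a list of (task_name, cycle_point) in insertion order."""
    return POLICIES[priority](queued)[:free_slots]


# Interleaved insertion across two cycle points (1 and 2).
queued = [("grib_a", 2), ("grib_a", 1), ("grib_b", 2), ("grib_b", 1)]
print(next_to_release(queued, "fifo", 2))
print(next_to_release(queued, "earliest", 2))
```

With `fifo` the first two inserted tasks are released even though one belongs to the later cycle point; with `earliest` both cycle point 1 tasks go first.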

@benfitzpatrick
Contributor

We have a volunteer here who wants to do this!

@hjoliver
Member

> We have a volunteer here who wants to do this!

Who?

@benfitzpatrick
Contributor

A member of our post-processing team - I think they need to use quite a lot of queues for their suites.

@benfitzpatrick
Contributor

How about:

  1. no priority ordering for the default queue (status quo)
  2. for all non-default queues, use input family/task name ordering (prioritize according to the order in which tasks are assigned to a queue [in the configuration]), then:
     • for each task, earlier cycle points take priority
     • for each family, earlier cycle points take priority; within each cycle point, tasks are sorted by task definition order
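The ordering proposed for non-default queues amounts to a three-part sort key. The sketch below shows one way to express it in Python; the `Task` tuple, the `sort_key` helper, and the integer cycle points are hypothetical and only meant to illustrate the proposal.

```python
from collections import namedtuple

# member is the queue config entry (family or task name) that matched the task.
Task = namedtuple("Task", ["name", "member", "cycle"])


def sort_key(task, member_rank, definition_rank):
    """Config order first, then cycle point, then task definition order."""
    return (
        member_rank[task.member],
        task.cycle,
        definition_rank[task.name],
    )


members = ["LONG", "SHORT", "fred"]               # order in the queue config
defined = ["long01", "long02", "short01", "fred"]  # task definition order
member_rank = {m: i for i, m in enumerate(members)}
definition_rank = {t: i for i, t in enumerate(defined)}

queued = [
    Task("short01", "SHORT", 1),
    Task("long02", "LONG", 2),
    Task("long02", "LONG", 1),
    Task("long01", "LONG", 2),
    Task("fred", "fred", 1),
]
ordered = sorted(queued, key=lambda t: sort_key(t, member_rank, definition_rank))
print([(t.name, t.cycle) for t in ordered])
```

All LONG tasks sort ahead of SHORT and fred; within LONG the cycle point 1 task comes first, and within cycle point 2 long01 precedes long02 because it is defined earlier.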

@benfitzpatrick
Contributor

Or, for 1), to make the default queue more compatible, order by cycle point then task definition order

@TomekTrzeciak
Contributor

I've put up a PR to get the ball rolling on this one - it implements FIFO behaviour for queues. Not exactly what was suggested in this issue, but perhaps enough to alleviate the main problems.

@matthewrmshin changed the title from "prioritisation within cylc queues" to "Internal queues: member prioritisation" on May 9, 2018
@matthewrmshin modified the milestones: later, some-day on Jun 22, 2018
@oliver-sanders
Member

Update: near-future plans:

This opens the door for swappable alternative queue implementations to suit all purposes without creating back-compat issues.

7 participants