Edit task environment variables #503
Comments
It's a bit of an edge case, but I definitely see how this is a good QoL feature. I'm thinking about the best way to do this. Maybe we should add a new subcommand for this? This would also make it quite easy to edit variables for groups or multiple commands, as I feel like those flags would be very confusing on the current
Yes, the more I think about this, the more this sounds like a separate subcommand.
If you like, feel free to start hacking on this :) Your contributions have always been top-notch and I believe that you'll make the right design decisions ;D
Just wanted to chime in with another use case. I currently schedule jobs on a machine with GPUs using:

```shell
function pugrun {
    GPUS=${GPUS:-0}
    pueue group add gpu-${GPUS}
    CUDA_VISIBLE_DEVICES=${GPUS} pueue add -g gpu-${GPUS} -- "${@}"
    unset GPUS
}
```

Editing/adding environment variables to groups would be a great addition! Thank you for the amazing project!
@activatedgeek It sounds like you could make good use of the https://github.com/Nukesor/pueue/wiki/Get-started#load-balancing feature. It would handle the load balancing across your GPUs for you :)
Thank you. My understanding was that this behavior would require me to change existing code to parse out the

But of course, nothing too complicated, since it can be plugged in similarly: instead of calling the GPU job directly, call another bash script that does the transform before invoking the GPU job.

Another tricky scenario with the example was this series of steps, say for group
Now experiment 1 and experiment 3 both run on GPU 0, which in many of my cases with large models would lead to an out-of-memory error on GPU 0. Of course, this could be fixed with slightly smarter scheduling at the expense of more plumbing, which is why I took the easier route of limiting each group to a fixed set of GPU IDs. Let me know if the scenario makes sense, and if I am missing something simpler here.
Exactly. That's how it was designed to be used, if the IDs aren't directly mappable to the external resources. (Except the round-robin)
That's actually not how it would work. In your example, the last line would look like this:
The ID is not the task id, but rather the internal "worker" or "slot" id. A queue with 5 parallel tasks will get 5 slots with ids.

If it doesn't work this way, this is definitely a bug! That system was specifically designed to handle your use case: machines/pools with a set number of limited resources, to which the tasks are then assigned in a greedy manner.

If the current wiki entry doesn't properly convey this behavior, it would be awesome if you could rephrase it! The wiki is free to be edited by anyone :)
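To make the slot behavior concrete, here is a small illustrative sketch. The per-task worker id variable (`PUEUE_WORKER_ID`) is taken from the load-balancing wiki page linked above; the `slot_to_gpu` helper and the identity mapping are purely hypothetical, not part of pueue:

```shell
#!/bin/sh
# Illustrative sketch only: a group with 3 parallel slots hands out the
# slot ids 0, 1, 2 and reuses an id as soon as its task finishes, so the
# id can be mapped directly onto a fixed pool of GPUs.

slot_to_gpu() {
  # Map a slot id to a GPU assignment; here the mapping is the identity.
  echo "CUDA_VISIBLE_DEVICES=$1"
}

for worker in 0 1 2; do
  echo "slot $worker -> $(slot_to_gpu "$worker")"
done
```

A real task could then pick up its slot id at runtime, e.g. `pueue add -g gpu -- 'CUDA_VISIBLE_DEVICES=$PUEUE_WORKER_ID python train.py'` (assuming the variable name from the wiki).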
Ahaa! That makes much more sense. Thank you for clarifying. Let me run a quick test simulating this scenario, and I will report back. I'll update the wiki with this example so that the behavior is more explicit.
Sorry for the delay. I have verified this behavior, and this actually solves my specific problem. I also realized I missed this:
In that case, the wiki already explains it precisely, and I will skip the wiki edit.
@mjpieters I'm thinking about adding a

The idea came up during #540. What do you think about a

Is the nesting too big or is this reasonable?
Ping @mjpieters |
I love the nesting, actually. Grouping like this can really help with command clarity. E.g. the

I would almost say grouping is overdue for pueue; you have commands that operate on either a group or a task, which could maybe do with splitting across command groups, with aliases for backwards compatibility / ease of typing:
and the following first-tier top-level commands:
Note that for many commands that accepted a single
I really like the

I'm not sure about the

I guess this would also be confusing, as I would expect different commands to perform different tasks? But permanently moving all those commands (

In general, I really like the idea of restructuring the CLI, especially since the next version will be a breaking major version anyway. I need to think a bit about your proposal and make up my mind about it.
Fair enough; it's why I included aliases for the original top-level commands in there (as well as thinking about backwards compatibility). It could perhaps depend on how the aliases are represented when you run
I thought a lot about the CLI redesign and still really wasn't happy with any solution. This issue was kind of blocking me and preventing the 4.0 release, which included a lot of other stuff that needed to be released, so I decided to stick to the old pattern for now and tackle this in a later release :) |
A detailed description of the feature you would like to see added.
Currently, you can edit three aspects of a scheduled task: the command to be run, the path where the task runs, and its label.
Tasks have another component: the environment variables under which the command is executed. Can we have an option to edit these?
A stretch goal would be the ability to add, set or remove environment variables across all tasks or a task group. This lets you do things like set bandwidth limits on a whole group of tasks where the commands in those tasks honour a bandwidth limit set in an environment variable. Powerful stuff!
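As a sketch of that stretch goal, here is how a task's command could honour such a variable. `BW_LIMIT` is an assumed name for illustration only, not an existing pueue feature; a future group-wide "edit env" command could then raise or lower it for every queued task at once:

```shell
#!/bin/sh
# Hypothetical sketch: BW_LIMIT is an assumed variable name, not a pueue
# feature. The task command reads it with a shell default, so changing
# the variable on the group would throttle every task in it.
BW_LIMIT="${BW_LIMIT:-1M}"

# The task would then hand the limit to a tool that supports one, e.g.:
echo "curl --limit-rate $BW_LIMIT -O https://example.com/big-file"
```

Tools such as `curl` (via `--limit-rate`) already accept limits like this on the command line; the environment variable is just the indirection that would let a group-wide edit reach every task.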
Explain your use case for the requested feature
Loads of UNIX commands change their behaviour based on the environment variables present when they start. Sometimes we need to change which environment variables are present for a given task, because circumstances changed or we made a mistake with the env vars when the task was added.
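For instance, even a command as basic as `date` behaves differently depending purely on its environment; the command line is identical, only the variable differs:

```shell
#!/bin/sh
# date prints the time in whatever zone the TZ variable names.
TZ=UTC date +%Z          # prints "UTC"
TZ=Asia/Tokyo date +%Z   # prints the Tokyo zone abbreviation where tzdata is installed
```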
The ability to alter environment variables across all tasks is another powerful option. This may need to be a separate command (or even a separate FR!).
Alternatives
I've actively edited the command of tasks to add extra environment variables. This is a poor workaround, however, as it is easy to accidentally alter the command itself.
Additional context
No response