-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add worker config options #1051
Conversation
I still need to work with tests for this but please take a look. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
general question on workers vs. min/max workers : does one ever override the other (i.e. if we set workers, does that invalidate min/max scaling)?
I didn't check if there's a ray doc that covers this, so if there is feel free to tell me to go read the #$%#$%# docs 😄
Args: | ||
title: program name | ||
entrypoint: is a script that will be executed as a job | ||
ex: job.py | ||
env_vars: env vars | ||
dependencies: list of python dependencies to execute a program | ||
working_dir: directory where entrypoint file is located (max size 50MB) | ||
description: description of a program | ||
version: version of a program |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should the args here be workers, cpu, etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The configuration is specified in run
method of Quantum_serverless
serverless.run(program, config=Configuration(max_workers=12, min_workers=5)
I'll update the doc string of the function
@psschwei The |
I was a bit out these last weeks so maybe this question is a bit dumb so sorry in advance (and maybe this may be discussed in the issue). Is it has sense to let the user to modify the min/max number of workers? And if so, I see valuable to have an option that prevent to do that in case the user doesn't want to allow that kind of behavior from external users. Opinions: @IceKhan13 , @psschwei , @akihikokuroda ? (maybe a dumb question, as I said 😅 ). |
@Tansito The max allowed values for the workers are specified in the settings.py (read from env variables). So users are not allowed to set values bigger than those. |
it came up during the internal tests |
I see so we are going to allow only modifications inside a threshold. Thanks @akihikokuroda ❤️ Happy to return to the work, I will try to catch-up and return 100% as soon as I can haha. Thanks team! |
so, for example, something like this serverless.run(program, config=Configuration(workers=2, min_workers=1, max_workers=5, auto_scaling=True)) would ignore the We should probably add a doc on program configuration (to go along with the one for client configuration) to spell out all the options. |
gateway/api/views.py
Outdated
@@ -86,11 +87,10 @@ def run(self, request): | |||
tracer = trace.get_tracer("gateway.tracer") | |||
ctx = TraceContextTextMapPropagator().extract(carrier=request.headers) | |||
with tracer.start_as_current_span("gateway.program.run", context=ctx): | |||
serializer = self.get_serializer(data=request.data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serializer should handle all related entities, so you do not need to created them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I restore the use of the ProgramSerializer. Thanks!
7b0f714
to
43d1e8e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had one question (and one nit), but overall LGTM.
Will leave for @IceKhan13 to approve (as he had requested a few changes)
gateway/api/ray.py
Outdated
"workers": program_config.workers, | ||
"min_workers": program_config.min_workers, | ||
"max_workers": program_config.max_workers, | ||
"auto_scaling": program_config.auto_scaling, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if bad values are passed? For example, a user sets min=5 and max=2... what kind of error message would they see? Is that caught by Ray or do we need to catch it? (I haven't tried yet, but based on past experience users will pass all kinds of combinations of bad values when configuring...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried it with max=1, min=2. Ray didn't care, it just created 1 worker.
I have to add the config to the existing program path. |
LGTM |
4a08b4c
to
2ea1df9
Compare
Agree to attach the configuration to a Job! |
Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
6e0dba7
to
bcc8ce1
Compare
ProgramConfig has been changed to JobConfig and it is related to Job. Please review. Thanks! |
I'm on it! Thank you Aki! |
gateway/api/models.py
Outdated
null=True, | ||
validators=[ | ||
MinValueValidator(0), | ||
MaxValueValidator(settings.RAY_CLUSTER_WORKER_REPLICAS_MAX), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
configuration values in models will create conflict. If I change my settings now it will generate new migration diff and django will be complaining about migrations. Validation for this feilds should live in serializers
user: Any, | ||
cluster_name: Optional[str] = None, | ||
cluster_data: Optional[str] = None, | ||
job_config: Optional[JobConfig] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we accept optional job_config here, but later on we always act like it is not None.
Here we should check for None and fall back to default from settings.
gateway/api/views.py
Outdated
@@ -150,12 +157,18 @@ def run(self, request): | |||
if not serializer.is_valid(): | |||
return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST) | |||
|
|||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we replace it with
jobconfig = None
config_data = request.data.get("config")
if config_data:
config_serializer = JobConfigSerializer(data=json.loads(config_data))
if not config_serializer.is_valid():
return Response(config_serializer.errors, status=status.HTTP_400_BAD_REQUEST)
jobconfig = config_serializer.save()
then we resolve validation + errors will be in a nice understandable format
QuantumServerlessException:
| Message: Http bad request.
| Code: 400
| Details: {"workers":["Ensure this value is less than or equal to 5."]}
and we will not longer need save_jobconfig
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there :)
Remaining things to fix:
- remove settings variables from models and migrations
- move validation to serializer ref
- write tests for serializer and api view with configuration
@IceKhan13 Thanks for review. I'll make chages. |
No, thank you for handling such a big change! |
The runtime code changes are done. I'm working for the tests. |
There remaining fixes have done. Please take a look! Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Thank you @akihikokuroda for working on it!
Thanks! |
Reopened #1085 Sorry messed up with branches |
Summary
Fix #1031
This PR is rework of PR #1025
Details and comments
Summary
expose number of workers option (number of workers, min workers and max workers when auto scaling is enabled)
Details and comments
The number of worker can be specified in the Configuration object for QuantumServerless.run function.
provider = ServerlessProvider(
username=os.environ.get("GATEWAY_USER", "user"),
password=os.environ.get("GATEWAY_PASSWORD", "password123"),
# token=os.environ.get("GATEWAY_TOKEN", ""), # token can be used instead of user/password combination
host=os.environ.get("GATEWAY_HOST", "http://localhost:8000"),
)
serverless = QuantumServerless(provider)
program = Program(
title="First program", entrypoint="program.py", working_dir="./source_files/"
)
job = serverless.run(program, config=Configuration(workers=2, min_workers=1, max_workers=5, auto_scaling=True))