feat(api,ui,sdk): Make CPU limits configurable #381
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff             @@
##             main     #381      +/-   ##
==========================================
+ Coverage   62.19%   68.06%   +5.87%
==========================================
  Files         124      149      +25
  Lines        9755    11809    +2054
==========================================
+ Hits         6067     8038    +1971
- Misses       2954     3033      +79
- Partials      734      738       +4

Flags with carried forward coverage won't be shown.
Thanks, LGTM. +1 for moving the platform default out and making it part of the service builder.
Context
Similar to caraml-dev/merlin#586, this PR aims to make CPU limits configurable for the end user.
At present, users are not able to configure the CPU limits of the pods in which Turing routers/enrichers/ensemblers (docker and pyfunc) are deployed - the limits are instead determined automatically at the platform level (Turing API server). Depending on how the API server has been configured, the limit is either derived from the CPU request via a platform-configured scaling factor, or left unset (when that scaling factor is 0).
This PR introduces a new workflow that allows users to override the platform-level CPU limits (described above) on a per-component basis. The workflow is available via the UI, the SDK and, by extension, direct calls to the Turing API server's endpoints.
UI: a new form group for specifying CPU limits (CPULimitsFormGroup.js, listed in the modifications below).
SDK: a new CPU limit field on the resource request class.
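For illustration, a minimal sketch of the SDK workflow, assuming the limit is exposed as a cpu_limit argument on ResourceRequest (the import path follows the file touched by this PR; the remaining argument names are assumptions and should be checked against the class itself):

```python
# Sketch only: `cpu_limit` is the new field introduced by this PR; the other
# argument names are assumed and should be verified against
# sdk/turing/router/config/resource_request.py.
from turing.router.config.resource_request import ResourceRequest

resource_request = ResourceRequest(
    min_replica=1,
    max_replica=3,
    cpu_request="500m",
    cpu_limit="1",          # when set, overrides the platform-level CPU limit
    memory_request="512Mi",
)
```

When cpu_limit is omitted, the platform-level behaviour described in the Context section continues to apply.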
In addition, this PR adds a new configuration, DefaultEnvVarsWithoutCPULimits, which is a list of env vars that automatically get added to all Turing routers/enrichers/ensemblers (docker and pyfunc) when CPU limits are not set. This allows the Turing API server's operators to set env vars platform-wide that can potentially improve these deployments' performance, e.g. env vars involving concurrency.

Modifications
- api/turing/cluster/knative_service.go - Removal of platform-level fields from the KnativeService struct
- api/turing/cluster/servicebuilder/service_builder.go - Addition of platform-level configs to clusterSvcBuilder, and new helper methods to set default env vars when CPU limits are not explicitly set and when the CPU limit scaling factor is set to 0
- api/turing/config/config.go - Addition of the new field DefaultEnvVarsWithoutCPULimits
- sdk/turing/router/config/resource_request.py - Addition of a new CPU limit field to the resource request class
- ui/src/router/components/form/components/CPULimitsFormGroup.js - Addition of a new form group to allow CPU limits to be specified on the UI