-
Notifications
You must be signed in to change notification settings - Fork 61
GPU Type #419
Conversation
Codecov ReportPatch coverage has no change and project coverage change:
Additional details and impacted files@@ Coverage Diff @@
## master #419 +/- ##
==========================================
+ Coverage 75.92% 78.48% +2.55%
==========================================
Files 18 18
Lines 1458 1250 -208
==========================================
- Hits 1107 981 -126
+ Misses 294 212 -82
Partials 57 57
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
* Pass task execution metadata from agent Signed-off-by: Hongxin Liang <honnix@users.noreply.github.com> * Add doc Signed-off-by: Hongxin Liang <honnix@users.noreply.github.com> * Update protos/flyteidl/admin/agent.proto Co-authored-by: Kevin Su <pingsutw@gmail.com> Signed-off-by: Honnix <honnix@users.noreply.github.com> * Regenerate --------- Signed-off-by: Hongxin Liang <honnix@users.noreply.github.com> Signed-off-by: Honnix <honnix@users.noreply.github.com> Co-authored-by: Kevin Su <pingsutw@gmail.com> Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
* add tags to execution spec Signed-off-by: Kevin Su <pingsutw@apache.org> * add tags to execution spec Signed-off-by: Kevin Su <pingsutw@apache.org> * add comment Signed-off-by: Kevin Su <pingsutw@apache.org> --------- Signed-off-by: Kevin Su <pingsutw@apache.org> Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
Signed-off-by: Katrina Rogan <katroganGH@gmail.com> Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
Signed-off-by: Kevin Su <pingsutw@apache.org> Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com> Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com> Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
Signed-off-by: Daniel Rammer <daniel@union.ai> Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
|
||
// Additional metadata associated with resources to allocate to a task | ||
message ResourceMetadata { | ||
oneof accelerator_value { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why oneof? if it's not set, it'll just be nil, do you want to allow nullifying a field? if so maybe that should go into the override message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thought was to support other accelerator types like TPUs in the future. One option would have been to just add all possible accelerator types as separate fields without the guarantee that oneof provides - that only one type is set at any given point in time.
I actually did not consider nullifying an accelerator specified in the task metadata, in the override. That does sound nice. If we were to do that, would it be a bad idea to add a accelerator_not_set
to the oneof?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since oneofs do not change wire format, I would keep it out and kick the can down the road when we add a different accelerator to decide then.
There isn't a great way to nullify in Protos unfortunately...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reverts commit ef37788. Signed-off-by: Jeev B <jeevb@users.noreply.github.com>
* add field Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com> Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * Pass task execution metadata from agent (#422) * Pass task execution metadata from agent Signed-off-by: Hongxin Liang <honnix@users.noreply.github.com> * Add doc Signed-off-by: Hongxin Liang <honnix@users.noreply.github.com> * Update protos/flyteidl/admin/agent.proto Co-authored-by: Kevin Su <pingsutw@gmail.com> Signed-off-by: Honnix <honnix@users.noreply.github.com> * Regenerate --------- Signed-off-by: Hongxin Liang <honnix@users.noreply.github.com> Signed-off-by: Honnix <honnix@users.noreply.github.com> Co-authored-by: Kevin Su <pingsutw@gmail.com> Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * Add tags to execution spec (#414) * add tags to execution spec Signed-off-by: Kevin Su <pingsutw@apache.org> * add tags to execution spec Signed-off-by: Kevin Su <pingsutw@apache.org> * add comment Signed-off-by: Kevin Su <pingsutw@apache.org> --------- Signed-off-by: Kevin Su <pingsutw@apache.org> Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * Correct comment for array job max parallelism (#431) Signed-off-by: Katrina Rogan <katroganGH@gmail.com> Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * Add the scalar to the operand (#427) Signed-off-by: Kevin Su <pingsutw@apache.org> Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * add selector Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com> Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * move selectors from container to task metadata Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com> Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * drop only_preferred Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * Updating boilerplate to lock golangci-lint version (#435) Signed-off-by: Daniel Rammer <daniel@union.ai> Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * add unpartitioned selector Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * refactor Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * refactor Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * fix oneof names Signed-off-by: Jeev B <jeevb@users.noreply.github.com> * add build.os for read the docs Signed-off-by: Jeev B <jeevb@users.noreply.github.com> --------- Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com> Signed-off-by: Jeev B <jeevb@users.noreply.github.com> Signed-off-by: Hongxin Liang <honnix@users.noreply.github.com> Signed-off-by: Honnix <honnix@users.noreply.github.com> Signed-off-by: Kevin Su <pingsutw@apache.org> Signed-off-by: Katrina Rogan <katroganGH@gmail.com> Signed-off-by: Daniel Rammer <daniel@union.ai> Co-authored-by: Honnix <honnix@users.noreply.github.com> Co-authored-by: Kevin Su <pingsutw@gmail.com> Co-authored-by: Kevin Su <pingsutw@apache.org> Co-authored-by: Katrina Rogan <katroganGH@gmail.com> Co-authored-by: Jeev B <jeevb@users.noreply.github.com> Co-authored-by: Dan Rammer <daniel@union.ai>
TL;DR
Adds support for passing along extra metadata related to resources (e.g. accelerator type) to the backend
Type
Are all requirements met?
Complete description
How did you fix the bug, make the feature etc. Link to any design docs etc
Tracking Issue
Remove the 'fixes' keyword if there will be multiple PRs to fix the linked issue
fixes https://github.com/flyteorg/flyte/issues/
Follow-up issue
NA
OR
https://github.com/flyteorg/flyte/issues/