Support TaskSpec level error handling #26
Labels
kind/feature
Categorizes issue or PR as related to a new feature.
Comments
k82cn added the kind/feature label on Mar 19, 2019.
wangyuqing4 pushed a commit to wangyuqing4/volcano-1 that referenced this issue on Apr 30, 2019:
[Issue volcano-sh#26] fix Delay Pod Creation
If **PG.Phase != v1alpha1.PodGroupInqueue**, do reclaim, allocate, backfill, preempt action.
minA is 1, pods total num is 100, 1 pod Running -> PG is Running, PG Running != v1alpha1.PodGroupInqueue, resource shortage can't do other actions. resource can not release, so 99 pods will always Pending.
so fix **PG.Phase == v1alpha1.PodGroupPending**.
Issues info: Issue ID: 26 Title: Delay Pod Creation Issue url: CBU-PaaS/Community/volcano/volcano#26
See merge request CBU-PaaS/Community/volcano/volcano!113
@TommyLike, I think this is done with the latest code.
kevin-wangzefeng pushed a commit to kevin-wangzefeng/volcano that referenced this issue on Jun 28, 2019:
Ignore nodes if out of syc.
yolgun added a commit to yolgun/volcano that referenced this issue on Sep 13, 2022:
Co-authored-by: Yunus Olgun <yunuso@spotify.com>
Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature
Description:
Currently, we only support Job-level and Task-instance-level error handling. TaskSpec-level error handling is also necessary, e.g. an MPI job should be marked completed when the mpirun Pod completes successfully.
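To make the request concrete, here is a minimal sketch of how a TaskSpec-level rule might take precedence over a Job-level one. It is not Volcano's implementation: the `TaskSpec` and `Job` classes, the `resolve_action` method, the task names, and the event and action strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical event and action names, used for illustration only.
POD_FAILED = "PodFailed"
TASK_COMPLETED = "TaskCompleted"
RESTART_JOB = "RestartJob"
COMPLETE_JOB = "CompleteJob"


@dataclass
class TaskSpec:
    name: str
    replicas: int
    # Per-TaskSpec rules mapping event -> action (the level this issue asks for).
    policies: Dict[str, str] = field(default_factory=dict)


@dataclass
class Job:
    tasks: List[TaskSpec]
    # Job-level rules, used when the TaskSpec has no rule for the event.
    policies: Dict[str, str] = field(default_factory=dict)

    def resolve_action(self, task_name: str, event: str) -> Optional[str]:
        """Return the action for an event, preferring the TaskSpec-level rule."""
        for task in self.tasks:
            if task.name == task_name and event in task.policies:
                return task.policies[event]
        return self.policies.get(event)


# An MPI-style job: the job should complete once the mpirun task completes,
# while a failed Pod in any task falls back to the Job-level rule.
job = Job(
    tasks=[
        TaskSpec("mpimaster", replicas=1, policies={TASK_COMPLETED: COMPLETE_JOB}),
        TaskSpec("mpiworker", replicas=4),
    ],
    policies={POD_FAILED: RESTART_JOB},
)
```

In this sketch, `job.resolve_action("mpimaster", TASK_COMPLETED)` yields the complete-job action, while the same event on `mpiworker` yields no action because neither level defines a rule for it.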