Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: add global maxRetry option #3846

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dongjiang1989
Copy link
Contributor

@dongjiang1989 dongjiang1989 commented Nov 27, 2024

add global jobRetry option

ref: #3838

@volcano-sh-bot volcano-sh-bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Nov 27, 2024
@volcano-sh-bot volcano-sh-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 28, 2024
@dongjiang1989 dongjiang1989 changed the title doc: add global jobRetry option doc: add global maxRetry option Nov 28, 2024
flows:
- name: a
patch:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the function of patch

Copy link
Contributor Author

@dongjiang1989 dongjiang1989 Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the maxRetry of task a is different from the global maxRetry setting, the patch method is supported.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I mean, I don't know the necessity of this field. It seems to be fine without it. What do you think?

Copy link
Contributor Author

@dongjiang1989 dongjiang1989 Nov 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... 🤔️ Thanks @hwdef
In my usage scenario, I still need to Patch some information to replace the content of the JobTemplate. Avoid partial differences causing an explosion in the number of JobTemplates.

ref: https://github.com/volcano-sh/volcano/blob/master/docs/design/jobflow/README.md Patch section

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I mean, I don't know the necessity of this field. It seems to be fine without it. What do you think?

+1, I also think that this field is unnecessary. Or you can reflect the purpose of this field in the design doc (code logic design), which will make it easier for everyone to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I mean, I don't know the necessity of this field. It seems to be fine without it. What do you think?

+1, I also think that this field is unnecessary. Or you can reflect the purpose of this field in the design doc (code logic design), which will make it easier for everyone to understand.

Thanks @googs1025
One User case:

apiVersion: flow.volcano.sh/v1alpha1
kind: JobTemplate
metadata:
  name: a
spec:
  tasks:
    - spec:
          containers:
            - image: tf2.16-cuda11-xxx:v1.0.1
              command:
                - sh
                - -c
                - ./script.py --dateset a.txt --mode 1
              imagePullPolicy: IfNotPresent
              name: nginx
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure
---
apiVersion: flow.volcano.sh/v1alpha1
kind: JobFlow
metadata:
  name: test-flow1
  namespace: default
spec:
  jobRetainPolicy: delete
  flows:
    - name: a
---
apiVersion: flow.volcano.sh/v1alpha1
kind: JobFlow
metadata:
  name: test-flow2
  namespace: default
spec:
  jobRetainPolicy: delete
    - name: a
      patch:
          spec:
             tasks:
               - spec:
                     containers:
                        - image: tf2.16-cuda11-xxx:v1.0.2 # just different image
                           command:
                              - sh
                              - -c
                              - ./script.py --dateset b.txt --mode 2 # just different args

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am more concerned about the role of patch and whether this patch is necessary because it will make cr more complicated. The patch field alone cannot let users know what the role is.

@hwdef
Copy link
Member

hwdef commented Nov 29, 2024

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Nov 29, 2024
@dongjiang1989
Copy link
Contributor Author

Hi @hwdef When will volcano/apis be released? Use new apis version to generate new CRD.

@hwdef
Copy link
Member

hwdef commented Nov 29, 2024

sorry I do not know about this.
Please ask @Monokaix

@Monokaix
Copy link
Member

Hi @hwdef When will volcano/apis be released? Use new apis version to generate new CRD.

Can you tell me which CRD are you going to use?

@dongjiang1989
Copy link
Contributor Author

Hi @hwdef When will volcano/apis be released? Use new apis version to generate new CRD.

Can you tell me which CRD are you going to use?

ref:volcano-sh/apis#145

@hwdef
Copy link
Member

hwdef commented Dec 2, 2024

/lgtm cancel

@volcano-sh-bot volcano-sh-bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 2, 2024
@hwdef
Copy link
Member

hwdef commented Dec 2, 2024

I prefer the following definition

type Flow struct {
	// +kubebuilder:validation:MinLength=1
	// +required
	Name string `json:"name"`
	// +optional
	DependsOn *DependsOn `json:"dependsOn,omitempty"`
	// +optional
	MaxRetry *MaxRetry `json:"maxRetry,omitempty"`
}

@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign hwdef
You can assign the PR to them by writing /assign @hwdef in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 3, 2024
@dongjiang1989
Copy link
Contributor Author

I prefer the following definition

type Flow struct {
	// +kubebuilder:validation:MinLength=1
	// +required
	Name string `json:"name"`
	// +optional
	DependsOn *DependsOn `json:"dependsOn,omitempty"`
	// +optional
	MaxRetry *MaxRetry `json:"maxRetry,omitempty"`
}

I agree. Change it to design done.
Thanks @hwdef

@hwdef
Copy link
Member

hwdef commented Dec 3, 2024

@googs1025
What do you think of this?

@googs1025
Copy link
Member

@googs1025 What do you think of this?

I'm ok with this.

@googs1025
Copy link
Member

I think we should squash commit to one

@dongjiang1989 dongjiang1989 force-pushed the add-jobflow-retry-doc branch 2 times, most recently from cd69649 to f8bf78d Compare December 5, 2024 06:47
@dongjiang1989
Copy link
Contributor Author

I think we should squash commit to one

Thanks @googs1025 @hwdef
Squash commits done.
Please re-check it.

@dongjiang1989
Copy link
Contributor Author

@hwdef Please add label to merge it, thanks

Signed-off-by: dongjiang <dongjiang1989@126.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants