fix: Consider expanded tasks in getTaskFromNode #2756
Conversation
@@ -591,7 +602,7 @@ func findLeafTaskNames(tasks []wfv1.DAGTask) []string {
 }

 // expandTask expands a single DAG task containing withItems, withParams, withSequence into multiple parallel tasks
-func (woc *wfOperationCtx) expandTask(task wfv1.DAGTask) ([]wfv1.DAGTask, error) {
+func expandTask(task wfv1.DAGTask) ([]wfv1.DAGTask, error) {
The woc receiver was not used.
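As a rough illustration of what this expansion does (a minimal sketch using a hypothetical stripped-down `Task` type, not the actual `wfv1.DAGTask`), a task with `withItems` becomes one renamed copy per item, following the `name(index:item)` convention discussed below:

```go
package main

import "fmt"

// Task is a hypothetical stand-in for wfv1.DAGTask, reduced to the two
// fields needed to illustrate the expansion idea.
type Task struct {
	Name      string
	WithItems []string
}

// expandTaskSketch returns one task per item, each renamed "name(index:item)".
// A task without withItems is returned unchanged.
func expandTaskSketch(t Task) []Task {
	if len(t.WithItems) == 0 {
		return []Task{t}
	}
	out := make([]Task, 0, len(t.WithItems))
	for i, item := range t.WithItems {
		out = append(out, Task{Name: fmt.Sprintf("%s(%d:%s)", t.Name, i, item)})
	}
	return out
}

func main() {
	for _, t := range expandTaskSketch(Task{Name: "generate", WithItems: []string{"a", "b"}}) {
		fmt.Println(t.Name)
	}
}
```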
Hi @simster7, first, thank you for considering this fix, I really appreciate it! Since I depend heavily on this feature for the scalability of my workflows, I wanted to test it myself. It looks like it is not working yet, and I am not sure whether you have finished working on this, but just in case I wanted to share what is happening.
If parameters are passed through the parameter aggregation JSON list, the workflow fails with an error.
By the way, the workflow mentioned in the previous issue is working as intended. Thanks!
Kudos, SonarCloud Quality Gate passed! 0 Bugs
// taskNameFromNodeName formulates the corresponding task name for a dag node. Note that this is not simply the inverse of
// taskNodeName. A node name might be from an expanded task, in which case it will not have an explicit task defined for it.
// When that is the case, we formulate the name of the original expanded task by removing the fields after "("
func (d *dagContext) taskNameFromNodeName(nodeName string) string {
After much thinking, I've found this to be the most robust way to get a task name from a node name: expanding all tasks to match them to the node is costly and would require a code refactor; storing a node-name-to-task-name map is futile because tasks are not expanded on every operation of the Workflow; and adding a field to NodeStatus is undesirable because it exposes implementation details to the API, and the field would only be used by DAG nodes.
This function is safe because node names generated from task names are deterministic, so we should always be able to recover the task name from the node name.
If the naming convention for node names changes, this function will also have to change. To ensure that happens, I have created tests that cover this and have also added assert statements that verify a node name can be reversed into the correct task name every time one is created.
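The reversal described above can be sketched as follows (a hypothetical simplified function, assuming the convention that expanded node names append "(index:item)" after the original task name, so everything from the first "(" is trimmed):

```go
package main

import (
	"fmt"
	"strings"
)

// taskNameFromNodeNameSketch is a simplified illustration, not the actual
// implementation: expanded node names look like "task(0:item)", so trimming
// from the first "(" recovers the original (unexpanded) task name. A node
// name without "(" is already a task name and is returned unchanged.
func taskNameFromNodeNameSketch(nodeName string) string {
	if i := strings.Index(nodeName, "("); i >= 0 {
		return nodeName[:i]
	}
	return nodeName
}

func main() {
	fmt.Println(taskNameFromNodeNameSketch("generate(0:foo)")) // expanded task node
	fmt.Println(taskNameFromNodeNameSketch("generate"))        // plain task node
}
```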
@@ -336,6 +343,10 @@ func (woc *wfOperationCtx) executeDAGTask(dagCtx *dagContext, taskName string) {
 	dependenciesCompleted := true
 	dependenciesSuccessful := true
 	nodeName := dagCtx.taskNodeName(taskName)
+	// Ensure that the generated taskNodeName can be reversed into the original (not expanded) task name
+	if dagCtx.taskNameFromNodeName(nodeName) != task.Name {
+		panic("unreachable: node name cannot be reversed into task name; please file a bug on GitHub")
I've made sure that this code is unreachable now, but if the naming convention changes in the future, this check is meant to catch the discrepancy before the code can be shipped.
Also, panics are recovered in the operator, so they fail gracefully: https://github.com/argoproj/argo/blob/c6ef1ff19e1c3f74b4ef146be37e74bd0b748cd7/workflow/controller/operator.go#L159-L168
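For illustration, the defer/recover pattern referenced above looks roughly like this (a minimal sketch, not the actual operator code at the linked location):

```go
package main

import "fmt"

// operate shows how a deferred recover converts a panic into an error via a
// named return value, so the caller can record a failure instead of crashing.
func operate() (err error) {
	defer func() {
		if r := recover(); r != nil {
			// The panic is caught here and surfaced as a regular error.
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	panic("unreachable: node name cannot be reversed into task name")
}

func main() {
	fmt.Println(operate())
}
```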
You should panic if there is a nasty bug - we log and report on panics.
Thanks for the catch @nenadzaric!
Thank you for the tests.
Back-ported to 2.7
Fixes: #2743