Handle dag in pipelineresolution #2821
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing /approve in a comment.

This PR cannot be merged: expecting exactly one kind/ label
Force-pushed from 5769f6b to 0c7d850
It feels like pipelinerunresolution already has so many responsibilities that I'm not sure giving it another one will improve it. Maybe we need a new package with new responsibilities? I also think it might be worth stepping back and taking a look at the data structures we are using; I often find that when code is getting complex and confusing, changing the shapes of the data structures being used can really help (e.g. PipelineRunState might not be the best data structure to be passing around going forward).

Sorry @afrittoli, after looking at #2661 some more, I do agree with moving the call to the DAG logic into the resolution logic. I do still think that 1) pipelineresolution does too much and needs to be broken up, and 2) the data structures need to change, but I agree this is an improvement!
Thanks for the review.

That sounds amazing!
Force-pushed from 0c7d850 to f8234d5
@bobcatfish @pritidesai I've tried to keep this PR reasonable in size, not very successfully, so some of the refactors are not included here. The main objectives are to move the dag handling into pipelinerunresolution, and to ensure that the various helpers do what their names say, so that the code is hopefully more readable and maintainable. Things that are missing (maybe more), to be handled in separate PRs:

One thing that I noticed is that today we:

I wonder if we should change the order to this instead:

Just to make it clear that the new taskruns do not have an impact on the state reported in the current reconcile cycle. One last bit: I think it might be nice to signal the running of finally tasks with a different reason.
Force-pushed from 22c2e9c to efc275f
The following is the coverage report on the affected files.
```go
if err := pipelineSpec.Validate(ctx); err != nil {
	// This Run has failed, so we need to mark it as failed and stop reconciling it.
	var reason = ReasonFailedValidation
	if err.Details == "Invalid Graph" {
		// (diff truncated here: a graph-specific failure reason is selected)
	}
	pr.Status.MarkFailed(reason,
```
Because the graph is created in pipelinerunresources, validation happens first. In this context it was not clear that the error was an invalid-graph one, so I added some extra context in the error to be able to catch it and use the specific reason.

The alternatives are:
- skip validation here, but that's only a temporary solution
- move validation into ResolvePipelineRun, which might make sense, but not in this PR
- simply return a validation error and rely on the error stack to provide more insight to the user

I wonder if we really need to run validation here, but I guess we will need to once tasks / pipelines may come from a container registry or git URL.
@afrittoli pipelineSpec is being validated here for many different checks/conditions in addition to invalidGraph. I think we do need to run validation here; I don't see validation happening elsewhere, am I missing something? 🤔
We could certainly move the check on invalid graph to ResolvePipelineRun.

Also, such a reason could be generated where the error is created in pipeline_validation.go instead of being calculated here. It's a great idea to set the reason to the specific validation error in addition to returning the error.
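A minimal sketch of that suggestion, assuming the knative.dev/pkg/apis FieldError type that Tekton validation already returns; the helper name, constant, and paths below are hypothetical:

```go
import "knative.dev/pkg/apis"

// errInvalidGraphDetails is a hypothetical well-known value that callers can
// match on, instead of parsing the human-readable error message.
const errInvalidGraphDetails = "Invalid Graph"

// invalidGraphError is an illustrative helper for pipeline_validation.go: it
// tags the FieldError with machine-readable Details at the point of creation.
func invalidGraphError(cause error) *apis.FieldError {
	return &apis.FieldError{
		Message: "invalid pipeline graph: " + cause.Error(),
		Paths:   []string{"tasks"},
		Details: errInvalidGraphDetails,
	}
}
```

The reconciler could then compare err.Details against the shared constant rather than a string literal.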
> @afrittoli pipelineSpec is being validated here for many different checks/conditions in addition to invalidGraph. I think we do need to run validation here; I don't see validation happening elsewhere, am I missing something? 🤔

Validation is done by the webhook when resources are submitted to etcd. However, it will not be performed for resources that are not stored in etcd, like tasks or pipelines coming from an external source (git/registry). I'm not sure about embedded specs; I'd expect them to go through the same validation as normal resources.
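To make the distinction concrete, here is a hedged sketch of a reconciler path where the webhook never ran; fetchRemoteSpec is a hypothetical helper, and the MarkFailed call mirrors the snippet quoted earlier:

```go
// A spec fetched from git or an OCI registry never went through the
// admission webhook, so the reconciler must validate it explicitly.
pipelineSpec, err := fetchRemoteSpec(ctx, ref) // hypothetical git/registry fetch
if err != nil {
	return err
}
if err := pipelineSpec.Validate(ctx); err != nil {
	// This is the first validation this spec gets; fail the run rather
	// than executing an invalid pipeline.
	pr.Status.MarkFailed(ReasonFailedValidation,
		"Pipeline spec from remote source failed validation: %v", err)
	return nil
}
```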
> We could certainly move the check on invalid graph to ResolvePipelineRun.

That's the way I implemented it originally, but because we validate the spec here, we don't really catch the error in ResolvePipelineRun. The only way would be to move the whole validation of the spec into ResolvePipelineRun, which is something we could do eventually, but I didn't want to do it here as well.

> Also, such a reason could be generated where the error is created in pipeline_validation.go instead of being calculated here. It's a great idea to set the reason to the specific validation error in addition to returning the error.
```diff
@@ -451,8 +451,8 @@ func TestReconcile_InvalidPipelineRuns(t *testing.T) {
 	))),
 	tb.PipelineRun("pipeline-invalid-final-graph", tb.PipelineRunNamespace("foo"), tb.PipelineRunSpec("", tb.PipelineRunPipelineSpec(
 		tb.PipelineTask("dag-task-1", "taskName"),
-		tb.FinalPipelineTask("final-task-1", "taskName"),
+		tb.FinalPipelineTask("final-task-1", "taskName")))),
```
This error does not trigger an invalid dag when going through validation; it fails first on validatePipelineTaskName.
```go
t.Errorf("Expected two ConditionCheck TaskRuns to be created, but it wasn't.")
// Check that the expected TaskRun were created
actions := clients.Pipeline.Actions()
if !actions[1].Matches("create", "taskruns") || !actions[2].Matches("create", "taskruns") {
```
Doing a cast without checking first leads to an unmanaged failure in the tests when the test fails. My code changes initially made this test fail, and the error was not very informative.
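A small sketch of the checked-cast pattern this comment argues for, using the k8s.io/client-go/testing action types that these fake clients record; the helper name is hypothetical:

```go
import (
	"testing"

	ktesting "k8s.io/client-go/testing"
)

// requireCreateAction verifies the action kind and performs a checked type
// assertion, so a failing test reports what it actually got instead of
// panicking on a bad cast.
func requireCreateAction(t *testing.T, action ktesting.Action, resource string) ktesting.CreateAction {
	t.Helper()
	if !action.Matches("create", resource) {
		t.Fatalf("expected a create %s action, got %v", resource, action)
	}
	create, ok := action.(ktesting.CreateAction)
	if !ok {
		t.Fatalf("expected a CreateAction, got %T", action)
	}
	return create
}
```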
/remove-lifecycle rotten
@vdemeester: Reopened this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Issues go stale after 90d of inactivity. /lifecycle stale. Send feedback to tektoncd/plumbing.
The pipelinerun state is made of resolved pipelinerun tasks (rprt), which are built from the actual status of the associated taskruns. It is computationally easy to know whether a taskrun started, or completed successfully or unsuccessfully; however, determining whether a taskrun has been skipped, or will be skipped in the pipeline run execution, requires evaluating the entire pipeline run status and associated dag.

The Skip method used to apply to a single rprt: it evaluated the entire pipeline run status and dag, returned whether the specific rprt was going to be skipped, and threw away the rest. We used to invoke the Skip method on every rprt in the pipeline state to calculate candidate tasks for execution. To make things worse, we also invoked the "Skip" method as part of "isDone", defined as a logical OR between "isSuccessful", "isFailed" and "Skip".

With this change we compute the list of tasks to be skipped once, incrementally, by caching the results of each invocation of "Skip". We store the result in a map in the pipelinerun facts, along with the pipelinerun state and associated dags. We introduce a new method on the pipelinerun facts called "ResetSkippedCache".

This solution manages to hide some of the details of the skip logic from the core reconciler logic, but it still requires the cache to be manually reset in a couple of places. I believe further refactoring could help, but I wanted to keep this PR as small as possible. I will further pursue this work by reviving #2821.

This change adds a unit test that reproduces the issue in #3521, which used to fail (with timeout 30s) and now succeeds for pipelines of roughly up to 120 tasks / 120 links. On my laptop, going beyond 120 tasks/links takes longer than 30s, so I left the unit test at 80 to avoid introducing a flaky test in CI.

There is still work to do to improve this further; some profiling / tracing work might help. Breaking large pipelines into logical groups (branches or pipelines in pipelines) would help reduce the complexity and computational cost of very large pipelines.

Fixes #3521

Co-authored-by: Scott <sbws@google.com>
Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
(cherry picked from commit fda7a81)
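A hedged sketch of the caching the commit message describes; PipelineRunFacts and ResetSkippedCache are named in the message, while the field names, computeSkip, and the simplified types around them are illustrative:

```go
// Placeholder types standing in for the real ones in the resources package.
type PipelineRunState []*ResolvedPipelineRunTask
type ResolvedPipelineRunTask struct{ PipelineTaskName string }

type PipelineRunFacts struct {
	State PipelineRunState
	// ... the dags for tasks and finally tasks live here too ...

	// skipCache memoizes Skip results per pipeline task name.
	skipCache map[string]bool
}

// computeSkip is a stub for the real skip evaluation, which walks the whole
// pipeline run state and dag.
func (facts *PipelineRunFacts) computeSkip(name string) bool {
	return false // real logic omitted
}

// IsTaskSkipped returns the memoized skip status of a pipeline task,
// computing and caching it on first use.
func (facts *PipelineRunFacts) IsTaskSkipped(name string) bool {
	if facts.skipCache == nil {
		facts.skipCache = map[string]bool{}
	}
	if skipped, ok := facts.skipCache[name]; ok {
		return skipped
	}
	skipped := facts.computeSkip(name)
	facts.skipCache[name] = skipped
	return skipped
}

// ResetSkippedCache discards cached results; per the commit message it must be
// called manually when the underlying state changes, e.g. after new taskruns
// are created within the same reconcile cycle.
func (facts *PipelineRunFacts) ResetSkippedCache() {
	facts.skipCache = map[string]bool{}
}
```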
Stale issues rot after 30d of inactivity. /lifecycle rotten. Send feedback to tektoncd/plumbing.
@afrittoli: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected; please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@afrittoli: The following tests failed, say /retest to rerun all failed tests:

Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@afrittoli do you have any strong feelings about keeping this open vs closing it for now?

I really want to have time to hack on the controller again :D but it's ok to close for now.
Changes
In the pipelinerun controller, today we follow this logic:
- build the dag from the pipeline spec in the reconciler
- resolve the pipeline run through the resources module
- obtain the candidate tasks from the dag on the basis of the state
- compute the list of candidates and the pipeline run status

The separation of concerns between the dag, resources and reconciler modules feels a bit mixed up.

This is just a PoC that resolves part of the issue, by moving the invocation of the dag building, as well as obtaining the list of candidates, into the dag module, and by aggregating the dag into the pipeline state struct; a sketch of that shape follows below.

I did not update the tests yet; this is for discussion purposes.
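A hypothetical sketch of the aggregated shape described above; all names here are illustrative, not the PR's actual types:

```go
import "github.com/tektoncd/pipeline/pkg/reconciler/pipeline/dag" // assumed import path

// Placeholder for the resolved-task type produced by the resources module.
type ResolvedPipelineRunTask struct{}

// ResolvedPipelineRun carries the dag together with the resolved state, so the
// reconciler no longer builds or queries the graph itself.
type ResolvedPipelineRun struct {
	State []*ResolvedPipelineRunTask // resolved taskrun state, as produced today
	Graph *dag.Graph                 // built once, during resolution
}

// candidatesFromGraph stands in for the dag-module call that computes
// schedulable tasks; the real implementation belongs to the dag package.
func candidatesFromGraph(g *dag.Graph, completed []string) ([]string, error) {
	return nil, nil // stub
}

// NextCandidates illustrates candidate selection living next to the data it
// needs instead of in the reconciler.
func (r *ResolvedPipelineRun) NextCandidates(completed []string) ([]string, error) {
	return candidatesFromGraph(r.Graph, completed)
}
```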
Submitter Checklist
These are the criteria that every PR should meet; please check them off as you review them:

See the contribution guide for more details.

Double check this list of stuff that's easy to miss:
- If you've added a new binary under the cmd dir, please update the release Task to build and release this image.
Reviewer Notes
If API changes are included, additive changes must be approved by at least two OWNERS and backwards incompatible changes must be approved by more than 50% of the OWNERS, and they must first be added in a backwards compatible way.
/kind cleanup