-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: IgnoreNotFound prevents a workflow error when the resource does not exist #1834
Conversation
I've checked, and the test is broken in trunk, and the error does not come from this code. Also, I have already signed the CLA |
@@ -265,6 +272,12 @@ func (we *WorkflowExecutor) SaveResourceParameters(resourceNamespace string, res | |||
if param.ValueFrom == nil { | |||
continue | |||
} | |||
if we.Template.Resource.IgnoreNotFound && resourceName == "" { | |||
// Resource wasn't found, so we populate result with sentinel value | |||
output := "<NOT_FOUND>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you provided a sentinel value? Using kubectl get ... -ignore-not-found=true
directly doesn't provide one and I don't think any user would expect a value to be filled in if a resource doesn't exist.
Considering your use case from your proposal:
...
- name: check-for-existence
resource:
action: get
ignoreNotFound: true
manifest: |
apiVersion: v1
kind: Pod
metadata:
name: does-not-exist
namespace: argo-workflows
outputs:
parameters:
- name: not-found
valueFrom:
jsonPath: '{.metadata.name}'
A user would have to know what the sentinel value is before checking it against the result of '{.metadata.name}'
to determine if a resource actually exists or not, or would have to check if the result of '{.metadata.name}'
and name
match. It would be more logical to simply check if '{.metadata.name}'
is empty or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason that there is a sentinel value is because we have three separate possibilities from any jsonpath expression: 1) non-empty value like "foo", 2) empty value like "{.metadata.annotations.imageregistry}" == "", and 3) non-existent entity.
If we don't provide a sentinel value, it's still possible to detect the difference between cases (2) and (3), but it requires the developer to either use two steps (one with a known existing jsonpath like .metadata.name followed by the desired jsonpath) or by including a marker in the jsonpath expression (e.g. '<{.metadata.annotations.imageregistry}>' which can than be compared with "<>" for empty vs "" for non-existent). Both of these require considerable extra costs to development, either extra runtime costs or string parsing with every result.
Using a sentinel value provides a nice balance that makes usage simple and efficient for most use cases. In the case where the user happens to be using the exact sentinel value chosen here, they then have to choose a solution like adding markers to the jsonpath.
Adding a custom sentinel value with an additional flag would also solve that problem, but that would be adding more code and configuration for a use case that is unlikely to occur in actual usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point, thanks for explaining.
@alexec What do you think about this? I am leaning towards not including it. Even though it will force developers to have to do an extra check, I think it is more intuitive for users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My first thought wast that it should probably be a sentinel error. They are more clearly defined. However, as I understand it, you want to:
- Ignore "not found" errors.
- Allow subsequent steps to understand that the resource was not found and act on it.
Am I correct? If that's the case, could we say that .metadata.creationTImestamp is null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexec The main point of this feature is to disassociate the kubectl not-found error from the workflow error domain.
Right now, the "resource" template subsumes all kubectl errors, which would normally be a useful behavior, but in the case of kubectl, they made a decision to return an error code when looking for a resource that doesn't exist (or for deleting one).
While this can be useful in the guise of a shell script, where you can add per-command error handling, in argo-workflows there is no per-step handling of errors, so gracefully handling this case is problematic.
We could switch the behavior over wholesale, but that might be surprising to existing users. This PR just adds the option to the user to set the --ignore-not-found flag on get or delete operations. With this set, deleting a resource becomes idempotent in the case of a race condition (delete if found, ignore if someone else deleted it first), and getting a resource can now also be used to check for the existence of a resource (something that I need enough that I'm already maintaining a local fork to do so).
The added behavior that we're discussing here is specifically about how to return this additional piece of information to the developer in a way that is useful to consume. We only return a single string, so we have to choose how to encode the new information.
As you've indicated, we could force all developers who are trying to retrieve some field value to use two steps in their workflow (One to check a value like.metadata.creationTimestamp and one to check the desired value). This performs an additional API RPC call, adds argo executor overhead, and requires additional workflow complexity.
We could also suggest that the user add their own markers, but that requires them to split out the marker on every step to get a useful value.
The code here returns a fixed sentinel value that is unlikely to appear in the wild. I've chosen the characters to be outside of the set used in kubernetes identifiers, and unlikely to occur in a natural language sentence. This way, the check is as simple as a string comparison for the sentinel.
@simster7 I've assigned you an owner as you've already looked into this. |
@dmayle Any plans on finishing this up? We would like to get this merged |
Yes, absolutely, but I'm out of the office until next week. BTW - would
you rather I stack changes when updating, or rebase and flatten?
…On Thu, Jan 2, 2020, 20:03 Simon Behar ***@***.***> wrote:
@dmayle <https://github.com/dmayle> Any plans on finishing this up? We
would like to get this merged
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1834>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAQLWPLERQXFYBVNJR3C4TQ3ZB7PANCNFSM4JZUCP4A>
.
|
No preference! Commits get squashed when merging to master anyway |
accd328
to
6363107
Compare
Codecov Report
@@ Coverage Diff @@
## master #1834 +/- ##
=======================================
Coverage 10.63% 10.63%
=======================================
Files 35 35
Lines 24992 24992
=======================================
Hits 2657 2657
Misses 21996 21996
Partials 339 339 Continue to review full report at Codecov.
|
The argo CI check failed because of a transient network error connecting to Github. Is there anyway to retrigger just that check? |
c0f5ce9
to
4ad58f0
Compare
Don't worry about the Argo CI check, we treat that one as flakey |
4ad58f0
to
b36a1da
Compare
I've had to rebase again onto master because of the generated workflow file. |
…ds which support it
2c3d7f2
to
87c497f
Compare
Any thoughts? |
(@dmayle Just FYI, we're a bit busy with the upcoming API Server release and will get to this a couple of days after the release) |
@simster7 thanks for the heads up! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please can you do a git merge master
?
Thanks @dmayle. Sorry for the absolute crazy delay on this |
This implements the solution proposed in:
#1823
Adds an 'ignoreNotFound' value to the resource object in the workflow manifest so that kubectl won't throw an error when the resource does not exist.
While this was implemented in order to allow checking for the existence of a resource, it is also useful in other cases, such as an idempotent delete step.