[corechecks/containerlifecycle] Add Pod finished timestamp to containerlifecycle pod events #16153
NOTE: this PR is waiting on others to be merged in first, so some bits may change
What does this PR do?
This PR adds a pod exit timestamp to the PodEvents emitted by the container-lifecycle check, which is defined here
Motivation
Currently, we have a timestamp for when a container terminates, but no equivalent timestamp for when a pod terminates.
Additional Notes
This PR is waiting on the linked agent-payload PR to be merged, which itself is waiting on another agent-payload PR to be merged, which might affect the underlying data model used in this PR. As a result, some syntax might change between now and when the PR is marked as ready.
Possible Drawbacks / Trade-offs
We do not actually get the timestamp at which the kubelet kills a pod. To work around this, I use the timestamp of the first workloadmeta pull in which the pod no longer appears. As a consequence, the exit timestamp is an approximation, and the implementation currently gives every pod removed between two consecutive pulls the same exit timestamp.
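To make the trade-off concrete, here is a minimal sketch of the "first pull without the pod" approach. It is not the code in this PR: podTracker, newPodTracker, and observe are hypothetical names, pods are identified by plain string UIDs, and timestamps are assumed to be Unix seconds.

```go
package main

import (
	"fmt"
	"sort"
)

// podTracker is a hypothetical helper (not the agent's actual type).
// It remembers which pod UIDs were present in the previous pull.
type podTracker struct {
	seen map[string]struct{}
}

func newPodTracker() *podTracker {
	return &podTracker{seen: map[string]struct{}{}}
}

// observe takes the pod UIDs returned by one pull and the time of that
// pull (assumed here to be Unix seconds). It returns an exit timestamp
// for every pod that was present in the previous pull but is absent now.
// All such pods get the same timestamp: the time of the current pull.
func (t *podTracker) observe(current []string, pullTS int64) map[string]int64 {
	now := make(map[string]struct{}, len(current))
	for _, uid := range current {
		now[uid] = struct{}{}
	}

	exited := map[string]int64{}
	for uid := range t.seen {
		if _, ok := now[uid]; !ok {
			exited[uid] = pullTS
		}
	}

	t.seen = now
	return exited
}

func main() {
	t := newPodTracker()
	t.observe([]string{"pod-a", "pod-b", "pod-c"}, 100)

	// pod-a and pod-c disappeared at some point between the two pulls.
	exited := t.observe([]string{"pod-b"}, 110)

	// Sort the UIDs so the printed order is deterministic.
	uids := make([]string, 0, len(exited))
	for uid := range exited {
		uids = append(uids, uid)
	}
	sort.Strings(uids)
	for _, uid := range uids {
		fmt.Printf("%s exited_at=%d\n", uid, exited[uid])
	}
}
```

In this sketch both removed pods are reported with exited_at=110, even if they actually stopped at different moments after the pull at 100. That is the shared-timestamp behaviour described above.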
Describe how to test/QA your changes
Testing this is tricky because, as far as I know, we do not have a way to inspect the payload generated by this check. I validated the change locally by building the agent with debug flags set and then using delve to set breakpoints at the return statements of toPayloadModel and toEventModel in pkg/collector/corechecks/containerlifecycle/event.go. With those breakpoints in place, I confirmed that the field is populated.
Reviewer's Checklist
Triage milestone is set.
major_change label if your change either has a major impact on the code base, is impacting multiple teams, or is changing important well-established internals of the Agent. This label will be used during QA to make sure each team pays extra attention to the changed behavior. For any customer-facing change, use a releasenote.
changelog/no-changelog label has been applied.
qa/skip-qa label is not applied.
team/.. label has been applied, indicating the team(s) that should QA this change.
need-change/operator and need-change/helm labels have been applied.
k8s/<min-version> label, indicating the lowest Kubernetes version compatible with this feature.