fix: crash if targetContainer does not exist #292
Merged
This pull request is a naive patch for the crash that occurs when `targetContainer` does not exist.

Specifically, the operator crashes when it cannot find the pod to execute a command on. When the pod being queried does not exist, `getContainerID` returns '-1'. The operator detects the error, but instead of aborting immediately, it continues to access `pod.Spec.Containers[]` with index `targetContainer`, leading to a crash.

The detailed information and steps to reproduce the problem are included in #291.
This pull request avoids the crash by adding a return inside every block of error handling code, as can be seen from the changes I made.
Additional Comments
This crash is caused by printing error logs without handling the errors themselves. The functions continue execution even in the presence of errors. For example, in `executeCommand`, even though errors are detected and error logs are printed, the function continues execution and prints the `Successfully executed the command` log.

In the current code base, there are many other places where errors are detected but not dealt with. To make the operator more robust, I think we should refactor the error handling code in many functions. As the desired error handling behavior is unknown to me, I cannot make that fix now. Nonetheless, I am more than happy to discuss it with you and prepare the patches.