trt-1538: Wait for monitor resources cleanup #28760
@jluhrsen: This pull request references trt-1538 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.z" version, but no target version was set.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: jluhrsen
@@ -186,7 +186,7 @@ func (o *RunMonitorOptions) Run() error {

 	fmt.Fprintf(o.Out, "Monitor shutting down, this may take up to five minutes...\n")

-	cleanupContext, cleanupCancel := context.WithTimeout(context.Background(), 5*time.Minute)
+	cleanupContext, cleanupCancel := context.WithTimeout(context.Background(), 15*time.Minute)
I wound up bumping this to 20 minutes on master (run_monitor_command).
fixed
func (w *availability) Cleanup(ctx context.Context) error {
	if w.imageRegistryRoute != nil {
		err := w.routeClient.RouteV1().Routes("openshift-image-registry").Delete(ctx, w.imageRegistryRoute.Name, metav1.DeleteOptions{})
		if err != nil {
			return fmt.Errorf("failed to delete route: %w", err)
		}

		startTime := time.Now()
		err = wait.PollUntilContextTimeout(ctx, 15*time.Second, 15*time.Minute, true, w.routeDeleted)
20
fixed
func (pna *podNetworkAvalibility) Cleanup(ctx context.Context) error {
	if len(pna.namespaceName) > 0 && pna.kubeClient != nil {
		if err := pna.kubeClient.CoreV1().Namespaces().Delete(ctx, pna.namespaceName, metav1.DeleteOptions{}); err != nil {
			return err
		}

		startTime := time.Now()
		err := wait.PollUntilContextTimeout(ctx, 15*time.Second, 15*time.Minute, true, pna.namespaceDeleted)
20
fixed
func (w *availability) Cleanup(ctx context.Context) error {
	if len(w.namespaceName) > 0 && w.kubeClient != nil {
		if err := w.kubeClient.CoreV1().Namespaces().Delete(ctx, w.namespaceName, metav1.DeleteOptions{}); err != nil {
			return err
		}
	}

	startTime := time.Now()
	err := wait.PollUntilContextTimeout(ctx, 15*time.Second, 15*time.Minute, true, w.namespaceDeleted)
and 20 minutes here as well
fixed
c101b4b to d652ede
func (w *availability) Cleanup(ctx context.Context) error {
	if len(w.namespaceName) > 0 && w.kubeClient != nil {
		if err := w.kubeClient.CoreV1().Namespaces().Delete(ctx, w.namespaceName, metav1.DeleteOptions{}); err != nil {
			return err
		}
	}

	startTime := time.Now()
	err := wait.PollUntilContextTimeout(ctx, 20*time.Second, 15*time.Minute, true, w.namespaceDeleted)
I think you meant to bump the timeout (minutes) not the interval
sloppy... fixed.
d652ede to 2a07b9f
/retest-required
@neisw, is there any value in this PR? Its purpose was to prove that OCPBUGS-31868 is not a regression in 4.16. Assuming it's not a regression, this PR has a chance of making new failures show up in 4.15. Or we can get this in and backport the change that allows for some errors when checking namespaces.
@jluhrsen The issue in 4.16 is occurring infrequently, I believe. If you don't have a strong push to get this in and compare the results, it seems reasonable to hold off. We didn't backport it originally even though the root issue that surfaced the need for this was in 4.15. I don't think it is super risky since we have been running it for a while, but if you are OK without it we could close this.
Totally agree, @neisw. Closing this now.
@jluhrsen: The following tests failed.
No description provided.