Workspaces can get stuck in failed state if DevWorkspaceRouting cannot be processed #21694

Closed
amisevsk opened this issue Sep 10, 2022 · 3 comments
Labels
area/che-operator: Issues and PRs related to Eclipse Che Kubernetes Operator
kind/bug: Outline of a bug - must adhere to the bug report template.
lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
severity/P2: Has a minor but important impact to the usage or development of the system.

Comments

@amisevsk
Contributor

Describe the bug

Due to a bug [1] in the DevWorkspace Operator, workspaces can get stuck in a failed state if the Che Operator encounters a temporary issue while reconciling DevWorkspaceRoutings. Once a DevWorkspaceRouting has failed, further reconciles exit early and cannot clear the failed status, even after the original issue is resolved.

This issue is for tracking in the Che repo; the fix will have to come in the DevWorkspace Operator. There are workarounds listed in the DWO issue.

[1] - devfile/devworkspace-operator#923

Che version

next (development version)

Steps to reproduce

  1. Install Che as normal
  2. Create a second CheCluster in another namespace. This will cause all workspace starts to fail with the error:
    Unable to provision networking for DevWorkspace: workspace routing is invalid: the routing does not specify any Che manager in its configuration but there are 2 Che managers in the cluster
    
  3. Create a DevWorkspace and wait for it to enter the failed state
  4. Remove the second CheCluster from step 2.
  5. New workspaces or workspaces that didn't enter the failed state due to the second CheCluster can be started as normal, but any workspaces that failed cannot be started.

Expected behavior

Failed status should be cleared when a workspace is restarted.
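
To make the difference between the current and expected behavior concrete, here is a minimal sketch in Python. It is a toy model, not the DevWorkspace Operator's actual code (the operator is not written in Python, and the real fix is tracked in devfile/devworkspace-operator#923). The `Routing` class, the phase names, and the functions `provision`, `reconcile_buggy` and `reconcile_expected` are all hypothetical names chosen for illustration; only the error message text is taken from this report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Routing:
    """Hypothetical stand-in for the status of a DevWorkspaceRouting."""
    phase: str = "Preparing"  # assumed phase names: Preparing, Ready, Failed
    message: str = ""


def provision(che_managers: int) -> Optional[str]:
    """Return an error message if routing cannot be provisioned, else None."""
    if che_managers > 1:
        return (
            "workspace routing is invalid: the routing does not specify any "
            f"Che manager in its configuration but there are {che_managers} "
            "Che managers in the cluster"
        )
    return None


def _apply(routing: Routing, che_managers: int) -> None:
    """Attempt provisioning and record the outcome on the routing."""
    error = provision(che_managers)
    if error is not None:
        routing.phase, routing.message = "Failed", error
    else:
        routing.phase, routing.message = "Ready", ""


def reconcile_buggy(routing: Routing, che_managers: int) -> None:
    """Models the reported behavior: a failed routing is never re-evaluated."""
    if routing.phase == "Failed":
        return  # early exit, so the failed status can never be cleared
    _apply(routing, che_managers)


def reconcile_expected(routing: Routing, che_managers: int,
                       restarted: bool = False) -> None:
    """Models the expected behavior: a restart clears the failed status."""
    if routing.phase == "Failed":
        if not restarted:
            return
        routing.phase, routing.message = "Preparing", ""
    _apply(routing, che_managers)


if __name__ == "__main__":
    stuck = Routing()
    reconcile_buggy(stuck, che_managers=2)   # second CheCluster present
    reconcile_buggy(stuck, che_managers=1)   # second CheCluster removed
    print("buggy:", stuck.phase)

    recovered = Routing()
    reconcile_expected(recovered, che_managers=2)
    reconcile_expected(recovered, che_managers=1, restarted=True)
    print("expected:", recovered.phase)
```

In this model the first routing stays failed after the second CheCluster is removed, which mirrors step 5 above, while the second one recovers once the workspace is restarted.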

Runtime

Kubernetes (vanilla)

Screenshots

No response

Installation method

other (please specify in additional context)

Environment

Linux

Eclipse Che Logs

No response

Additional context

No response

@amisevsk amisevsk added kind/bug Outline of a bug - must adhere to the bug report template. area/che-operator Issues and PRs related to Eclipse Che Kubernetes Operator labels Sep 10, 2022
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Sep 10, 2022
@l0rd l0rd added severity/P2 Has a minor but important impact to the usage or development of the system. and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Sep 12, 2022
@che-bot
Contributor

che-bot commented Mar 11, 2023

Issues go stale after 180 days of inactivity. lifecycle/stale issues rot after an additional 7 days of inactivity and eventually close.

Mark the issue as fresh with /remove-lifecycle stale in a new comment.

If this issue is safe to close now please do so.

Moderators: Add lifecycle/frozen label to avoid stale mode.

@che-bot che-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 11, 2023
@amisevsk amisevsk added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Mar 11, 2023
@amisevsk
Contributor Author

/remove-lifecycle stale

@che-bot che-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 11, 2023
@tolusha
Contributor

tolusha commented Sep 25, 2023

@amisevsk
Can we close this issue since devfile/devworkspace-operator#923 is resolved?

@amisevsk amisevsk closed this as completed Oct 2, 2023