[Environments] Stop failed/pending environments #787

Closed
pameladelgado opened this issue Feb 26, 2020 · 4 comments · Fixed by #1605

@pameladelgado
Contributor

Motivation

Users often run into problems when starting an environment, for various reasons:

  • error in the entrypoint of the Dockerfile
  • unavailable resources
  • registry unavailable for a newly ported project
  • etc

As a result, the environment is stuck in the Failed or Pending state and the user cannot do anything about it.

Proposal

Add the option to stop the failed/pending environment.

Screenshot

[Image: unschedulable]

@rokroskar
Member

This is more or less a duplicate of SwissDataScienceCenter/renku-notebooks#62

@pameladelgado
Contributor Author

This issue has become more relevant now that users can request a storage size, and that size is enforced either with a dynamically created volume or on the general emptyDir.

Case 1

If the user asks for less storage than needed (especially with auto-fetch LFS enabled):

  • the session is not able to start
  • the user does not get feedback on what happened and why
  • and most importantly, they can't do anything to get out of the loop other than asking an admin to stop their session or starting a session from a new commit

Some other scenarios can still happen:

Case 2

When there are not enough resources in the cluster, the user should also be able to understand what happened and stop the session from being started.

@olevski
Member

olevski commented Nov 11, 2021

I can confirm that as far as the notebook service is concerned there are no issues sending a DELETE request to the api/notebooks/servers/<server-name> endpoint even if that server has not fully started. The notebook service actually does the exact thing any admin would do to clean things up: just runs kubectl delete jupyterserver <server-name>.

There is an optional force parameter that can be passed. This should not be used. If it is passed and set to true, it sets the termination grace period in k8s to zero. It defaults to false, and the UI should not send a request with it set to true for this application (or for any other, for that matter).
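
For illustration, here is a minimal Python sketch of how a client could build that DELETE request. Only the api/notebooks/servers/<server-name> path comes from the comment above; the base URL, the server name and the helper name `build_stop_request` are hypothetical, and authentication is left out. The force parameter is deliberately not set, so the service default applies.

```python
from urllib.parse import quote
from urllib.request import Request


def build_stop_request(base_url: str, server_name: str) -> Request:
    """Build (but do not send) the DELETE request that stops a session.

    base_url and server_name are placeholders supplied by the caller.
    The force parameter is intentionally omitted.
    """
    # Escape the server name so it is safe to use as a single path segment.
    path = f"/api/notebooks/servers/{quote(server_name, safe='')}"
    return Request(base_url.rstrip("/") + path, method="DELETE")


# Hypothetical deployment URL and server name, for illustration only.
req = build_stop_request("https://renku.example.org", "my-server")
print(req.get_method(), req.full_url)
```

A real client would send this with `urllib.request.urlopen(req)` after adding whatever credentials the deployment requires.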

@olevski
Member

olevski commented Nov 11, 2021

Btw, this DELETE request is the same one that is sent when a session is stopped from the UI. So the UI can now enable this option even if a session has not fully started.
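
As a rough sketch of the UI-side change, the stop action would be offered for sessions that have not fully started as well as for running ones. The status strings and the `can_stop` helper below are assumptions made for illustration; the actual values reported by the notebook service may differ.

```python
# Assumed status names: "failed" and "pending" mirror the states named in
# this issue, and "running" stands in for a fully started session.
STOPPABLE_STATES = {"running", "pending", "failed"}


def can_stop(status: str) -> bool:
    """Return True if the UI should offer the stop action for this status."""
    return status.strip().lower() in STOPPABLE_STATES


for status in ("Running", "Pending", "Failed"):
    print(status, can_stop(status))
```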
