-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add optional-components/jupyterhub-stop-idle
#389
Conversation
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2153/Result : failure BIRDHOUSE_DEPLOY_BRANCH : jupyterhub-stop-idle DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-118.rdext.crim.ca PAVICS-e2e-workflow-tests Pipeline ResultsTests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1365/NOTEBOOK TEST RESULTS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just a few documentation and clean-up comments below:
${JUPYTERHUB_STOP_IDLE_CONFIG} | ||
" | ||
|
||
#export DELAYED_EVAL=" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented code please
# idle timeout of a user jupyterhub server to be shut down automatically (default: 1 day) | ||
export JUPYTERHUB_STOP_IDLE_TIMEOUT=${JUPYTERHUB_STOP_IDLE_TIMEOUT:-86400} | ||
|
||
# maximum age before stoping user servers (includiung active/running ones, regardless of last activity status) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this maximum age in seconds? Pleas specify the unit of time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing optional-components/README.rst
and CHANGES.md
update.
Also an existing idle culling mechanism was already possible via env.local
, see
birdhouse-deploy/birdhouse/env.local.example
Lines 338 to 350 in 2da7a69
#export JUPYTERHUB_CONFIG_OVERRIDE=" | |
# | |
# Sample below will shutdown idle server after 3 days and idle kernel after 1 day. | |
# | |
#c.Spawner.args.extend([ | |
## Shut down the server after N seconds with no kernels or terminals running and no activity. | |
#'--NotebookApp.shutdown_no_activity_timeout={}'.format(3*24*60*60) , # 3 days | |
## Timeout (in seconds) after which a kernel is considered idle and ready to be culled. | |
#'--MappingKernelManager.cull_idle_timeout={}'.format(24*60*60), # 1 day | |
## Culling kernels which have one or more connections for idle but open notebooks. | |
## Otherwise, browser have to be closed for culling to work. | |
#'--MappingKernelManager.cull_connected=True', | |
#]) |
Not clear to me how better or different this external idle culling differ but I wonder if this is some changes between Jupyterhub v1 vs v4. I would say to keep this external idle culler in case the built-in idle culler do not work anymore in v4.
Please in the optional-components/README.rst
, mention the existing internal idle culler documented in env.local.example
, to give users an option if one of the 2 works better or just plain do not work anymore.
Has this newer culler been tested by lowering the timeout to very short amount?
@@ -187,4 +187,5 @@ blocked_users = {'authtest', '${CATALOG_USERNAME}', 'anonymous'} | |||
c.Authenticator.blacklist = blocked_users # v0.9+ | |||
c.Authenticator.blocked_users = blocked_users # v1.2+ | |||
|
|||
${JUPYTERHUB_CONFIG_EXTENDED} # noqa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not re-use the existing JUPYTERHUB_CONFIG_OVERRIDE
?
If you want to keep both, please leave 2 lines between the 2 variables so that when both are actually in use, they do not look mixed up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried doing that, but given that some instances already use export JUPYTERHUB_CONFIG_OVERRIDE=...
in their env.local
, which is sourced when runninig pavics-compose.sh
, any overrides applied along the way get ignored.
Because of how JUPYTERHUB_CONFIG_EXTENDED
gets defined using extra newlines, they are already separated by multiple extra lines. I could add more though, or even add comments in the template to separate them explicitly.
# Any component extending this configuration using this variable *MUST* include it back to allow further | ||
# components to extend it as well. Also, they *MUST* extend the variable with each new definitions applied on | ||
# a new line (without indents) to ensure that Python configurations will be executable without syntax error. | ||
# This variable differs from 'JUPYTERHUB_CONFIG_OVERRIDE' that does not require users to include it back to make |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "include it back"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing something in the form:
export JUPYTERHUB_CONFIG_EXTENDED="
${JUPYTERHUB_CONFIG_EXTENDED}
<extra python here>
"
The README of the external jupyterhub-idle-culler also mention about the JupyterHub can already cull idle jupyter itself without explaining why would someone need to use this external version https://github.com/jupyterhub/jupyterhub-idle-culler/blob/737dfa155b809453e6e795dd9de42d6a926fd4a0/README.md?plain=1#L263-L274 This is rather confusing to me. @mishaschwartz what do you think about keeping 2 different culling options? Would this confuse more the users? |
Yeah, I'd rather keep one version or the other. It will definitely confuse users. Can we test out if the old version still works and if not, remove the comment from env.local. |
Yes. I would like to test existing approach, and if it works, only keep that one. |
Yes, true, anything in production will need this to avoid long running idle jupyter containers wasting ram and cpu. Then I would suggest moving that snippet of code into the default jupyterhub config and just make those timeout limits config variables.
Being a production server with many users, we faced this issue #67 and although this culling might not fixed the root cause, it probably helped. |
@tlvu @mishaschwartz From my reading, it would seem the internal culler is sufficient for our use case. So the config might need some extra parameters, but I would stick with only the internal parameters if those work by themselves. |
## Overview Add new variables to easily configure idle jupyter user instances. ## Changes **Non-breaking changes** - Jupyterhub configurable idle server culling. - Add optional variables `JUPYTER_IDLE_SERVER_CULL_TIMEOUT`, `JUPYTER_IDLE_KERNEL_CULL_TIMEOUT` and `JUPYTER_IDLE_KERNEL_CULL_INTERVAL` that allows fined-grained configuration of user-kernel and server-wide docker image culling when their activity status reached a certain idle timeout threshold. - Enable idle kernel culling by default with a timeout of 1 day, and user server culling with timeout of 3 days. - Avoids the need for custom `JUPYTERHUB_CONFIG_OVERRIDE` specifically for idle server culling. If similar argument parameters should be defined using an older `JUPYTERHUB_CONFIG_OVERRIDE` definition, the new configuration strategy can be skipped by setting `JUPYTER_IDLE_KERNEL_CULL_TIMEOUT=0`. **Breaking changes** - n/a ## Related Issue / Discussion - Closes #389 (replaces) - Closes Ouranosinc/jupyterhub#21 (not required anymore)
Overview
Jupyter instances that have been started by users are simply left in a running status indefinitely if not manually logged of.
This causes dockers
jupyter-<user>
to remain active, using resources unnecessarily.This feature adds a new component that helps quickly detecting idle jupyter servers, and stopping them after a given inactivity timeout. The logic behind it depends on what JupyterHub reports on its activity API endpoint for the given user.
To have the utility available, the following requirements must be applied: Ouranosinc/jupyterhub#21
To test quickly, simply build the Docker on this branch and override
JUPYTERHUB_DOCKER
andJUPYTER_VERSION
accordingly, and enableoptional-components/jupyterhub-stop-idle
inEXTRA_CONF_DIRS
. SettingJUPYTERHUB_STOP_IDLE_TIMEOUT
can be set to adjust the timeout interval.Changes
Non-breaking changes
optional-components/jupyterhub-stop-idle
allowing culling of idle jupyter servers of users.Breaking changes
Related Issue / Discussion
To Do
Still investigating
The timeout frequency is properly applied and docker logs show that activity status checks are performed, but servers are still considerd in a "running" status for some reason, even if nothing is running on them, nor any tab refreshing them being open.
I will look more into that, but I want to open this PR right away to gather early feedback.