Fault tolerant storage for JupyterHub #19
Comments
As we discussed - this should just need additional config if we want PV per pod. If we want something that also allows sharing like NFS, we may need to add the necessary config to run that. cc/ @yuvipanda for thoughts on sharing between notebooks. |
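For the sharing case, a rough sketch of what the extra config could look like: one NFS-backed volume mounted into every notebook pod. This only builds the Kubernetes `volumes` / `volumeMounts` entries as plain dicts. The function name, the server address, the export path and the mount path are all made up for illustration, and running the NFS server itself is not covered.

```python
def shared_nfs_volume(server, export_path,
                      mount_path="/home/jovyan/shared", name="shared"):
    """Return (volume, mount) entries for an NFS share used by all notebook pods.

    All arguments are placeholders; nothing here is taken from the repo.
    """
    # Pod-level volume backed directly by an NFS export.
    volume = {"name": name, "nfs": {"server": server, "path": export_path}}
    # Container-level mount that refers to the volume by name.
    mount = {"name": name, "mountPath": mount_path}
    return volume, mount


# Hypothetical server and export path.
volume, mount = shared_nfs_volume("nfs.example.internal", "/exports/notebooks")
```

The two dicts would go into the notebook pod spec next to any per-user volume, so every user sees the same directory.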
I'd be happy with whatever's easiest. Do you have a pointer to the config that needs to change? I'd like to start using JupyterLab on Kubeflow to write examples for Kubeflow. But I don't want to do that until we have fault tolerant storage. |
The quickest thing may be using a PV per Jupyter pod. It needs adding a spawner option.
It may need a storage class also to be set through |
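To make the PV-per-pod idea concrete, here is a minimal sketch of the objects involved: one PersistentVolumeClaim per user, plus the volume and mount that attach it to that user's notebook pod. It only builds plain dicts and is not the actual spawner change. The function name, the `claim-<username>` naming, the `10Gi` default and the `/home/jovyan` mount path are assumptions for illustration, and it assumes the username is already a valid Kubernetes object name.

```python
def user_storage(username, storage_class=None, capacity="10Gi",
                 mount_path="/home/jovyan"):
    """Return (pvc, volume, mount) dicts giving one user a persistent home dir.

    Names and defaults are illustrative assumptions, not project settings.
    """
    pvc_name = "claim-{}".format(username)
    volume_name = "volume-{}".format(username)

    spec = {
        # One pod writes to the claim at a time, so ReadWriteOnce is enough.
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": capacity}},
    }
    # Only set a storage class when asked; otherwise the cluster default applies.
    if storage_class is not None:
        spec["storageClassName"] = storage_class

    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": pvc_name},
        "spec": spec,
    }
    # Pod-level volume that points at the claim.
    volume = {
        "name": volume_name,
        "persistentVolumeClaim": {"claimName": pvc_name},
    }
    # Container-level mount that refers to the volume by name.
    mount = {"name": volume_name, "mountPath": mount_path}
    return pvc, volume, mount


pvc, volume, mount = user_storage("alice", storage_class="standard")
```

Because the claim lives outside the pod, a restarted pod for the same user can bind to the same claim and find the notebooks still there.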
Note that for this and other JupyterHub config things, we have a helm chart in https://github.com/jupyterhub/zero-to-jupyterhub-k8s/tree/master/jupyterhub that's very actively maintained and heavily used. I'd love to re-use that here rather than build from scratch. |
See docs about chart at z2jh.jupyter.org |
To be clear, I'm happy for this to be using KubeSpawner + vanilla JupyterHub too! But we've put in a bunch of effort into the helm chart to provide easy OOTB solutions to things like:
1. Updating config of hub without disrupting users
2. Automatic HTTPS
3. Multiple authenticators
4. Load tests (https://github.com/yuvipanda/jupyterhub-loadtest) to validate performance at scale, etc
We want to make sure you can re-use all that work, and we can re-use any improvements you make to the hub deployment without having to re-invent it. I understand that not everyone wants to use helm, but I do want to try to find a path to not having y'all duplicate all our work there... |
As examples of other projects that include the JupyterHub helm chart as a dependency and build on it, see http://github.com/jupyterhub/binderhub. As examples of direct deployments that use JupyterHub and other charts, see http://github.com/jupyterhub/mybinder.org-deploy/ or http://github.com/berkeley-dsep-infra/datahub/ or http://github.com/berkeley-dsep-infra/data8xhub or http://github.com/yuvipanda/paws :) |
Good point, and I totally agree. But I do think we need to come up with a
cohesive solution here suitable for hub, and the other components in this
repo. Having different deployment mechanisms for different parts of this
effort will lead to more confusion. So, as soon as we get to that unified
deployment solution, we can try to match upstream as much as possible.
…On Dec 7, 2017 11:04 PM, "Yuvi Panda" ***@***.***> wrote:
To be clear, I'm happy for this to be using KubeSpawner + vanilla
JupyterHub too! But we've put in a bunch of effort into the helm chart to
provide easy OOTB solutions to things like:
1. Updating config of hub without disrupting users
2. Automatic HTTPS
3. Multiple authenticators
4. Load tests (https://github.com/yuvipanda/jupyterhub-loadtest) to
validate performance at scale, etc
We want to make sure you can re-use all that work, and we can re-use
any improvements you make to the hub deployment without having to re-invent
it. I understand that not everyone wants to use helm, but I do want to try
to find a path to not having y'all duplicate all our work there...
|
I agree too! The way I'd do it would be to make a Helm Chart for the TF controller and one for the Model Server, and Kubeflow then just configures them together (similar to the http://github.com/jupyterhub/mybinder.org-deploy/ pattern). IMO that's much more user friendly than having users edit Kubernetes objects directly and then apply them... I agree that using Helm for JupyterHub but something else for the other tools isn't a valid long term strategy. I'm happy to take an initial shot at doing this if you'd like, which can easily be discarded too without hurting any of my feelings :) |
In the meantime I've also created #22 which provides persistent storage for each user with the current setup :) |
@jlewi and I had also discussed ksonnet as one potential mechanism here and
I'm not sure where the others stand.
Another thought is that we could be more opinionated here in our config and
possibly have fewer knobs since we are targeting a very specific ML use-case
as opposed to the upstream project and chart which are aiming at a broader
use-case.
Happy to discuss this more. Perhaps we should have a "how do we manage
config" issue, enumerate the options, pros and cons and go from there.
…On Dec 7, 2017 11:12 PM, "Yuvi Panda" ***@***.***> wrote:
I agree too! The way I'd do it would be to make a Helm Chart for the TF
controller and one for the Model Server, and Kubeflow then just configures
them together (similar to the
https://github.com/jupyterhub/mybinder.org-deploy/
pattern). IMO that's much more user friendly than having users edit
Kubernetes objects directly and then apply them... I agree that using Helm
for JupyterHub but something else for the other tools isn't a valid long
term strategy.
I'm happy to take an initial shot at doing this if you'd like, which can
easily be discarded too without hurting any of my feelings :)
|
That sounds like a good idea! ksonnet looks cool too! I am not attached to helm particularly, only against having end users directly edit kubernetes object specifications. 99% of the work that needs to happen to fully support this is in kubespawner anyway, so it's not a very big deal! +1 on opening another issue. This discussion is already pretty off-topic for this issue :) |
This issue is about user-pod storage persistence, related to #145. However, the data used by the hub itself is also not persistent. IIRC the z2jh helm chart has a PVC for the hub as well. |
Jupyter pods store their data in the pod's own volume, so if the pod dies you lose any notebook/file edits.
We should be using a fault tolerant volume so that if the pod dies we don't lose our data.
/cc @foxish