Fault tolerant storage for JupyterHub #19

Closed
jlewi opened this issue Dec 7, 2017 · 14 comments

Comments

@jlewi
Contributor

jlewi commented Dec 7, 2017

Jupyter pods are storing data in the pod volume. So if the pod dies you would lose any notebook/file edits.

We should be using a fault tolerant volume so that if the pod dies we don't lose our data.

/cc @foxish

@foxish
Contributor

foxish commented Dec 8, 2017

As we discussed - this should just need additional config if we want PV per pod. If we want something that also allows sharing like NFS, we may need to add the necessary config to run that.

cc/ @yuvipanda for thoughts on sharing between notebooks.

@jlewi
Contributor Author

jlewi commented Dec 8, 2017

I'd be happy with whatever's easiest. Do you have a pointer to the config that needs to change?

I'd like to start using JupyterLab on Kubeflow to write examples for Kubeflow. But I don't want to do that until we have fault tolerant storage.

@foxish
Contributor

foxish commented Dec 8, 2017

The quickest thing may be using a PV per Jupyter pod. That requires adding a spawner option:

c.KubeSpawner.user_storage_pvc_ensure = True

It may also need a storage class, set through c.KubeSpawner.user_storage_class, if the environment it's running in doesn't already auto-provision a PV. There's a sample config here.
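
A minimal sketch of how those options might sit together in `jupyterhub_config.py`, assuming the `user_storage_*` trait names KubeSpawner used at the time of this thread (newer releases, as far as I know, drop the `user_` prefix). The storage class name, capacity, and PVC name template are illustrative assumptions, not values from this issue. The `SimpleNamespace` fallback is only there so the snippet can be run outside JupyterHub.

```python
# jupyterhub_config.py (sketch, not the project's actual config)
from types import SimpleNamespace

# JupyterHub injects `c` when it loads this file. The fallback below only
# exists so the snippet can be executed on its own for inspection.
try:
    c
except NameError:
    c = SimpleNamespace(KubeSpawner=SimpleNamespace())

# Ask KubeSpawner to create a PersistentVolumeClaim for each user if missing.
c.KubeSpawner.user_storage_pvc_ensure = True

# Assumed values: pick a size and a storage class that exist in your cluster.
c.KubeSpawner.user_storage_capacity = "10Gi"
c.KubeSpawner.user_storage_class = "standard"

# Assumed naming scheme for the per-user claim.
c.KubeSpawner.pvc_name_template = "claim-{username}"
```

These settings only cover creating the claim; the claim still has to be listed as a volume and mounted into the notebook container before data written there survives a pod restart.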

@yuvipanda
Contributor

Note that for this and other JupyterHub config things, we have a helm chart in https://github.com/jupyterhub/zero-to-jupyterhub-k8s/tree/master/jupyterhub that's very actively maintained and heavily used. I'd love to re-use that here rather than build from scratch.

@yuvipanda
Contributor

See the docs about the chart at z2jh.jupyter.org

@yuvipanda
Contributor

To be clear, I'm happy for this to use KubeSpawner + vanilla JupyterHub too! But we've put a bunch of effort into the helm chart to provide easy OOTB solutions to things like:

  1. Updating config of hub without disrupting users
  2. Automatic HTTPS
  3. Multiple authenticators
  4. Load tests (http://github.com/yuvipanda/jupyterhub-loadtest) to validate performance at scale, etc

We want to make sure you can re-use all that work, and that we can re-use any improvements you make to the hub deployment, without anyone having to re-invent it. I understand that not everyone wants to use helm, but I do want to try to find a path that doesn't have y'all duplicating all our work there...

@yuvipanda
Contributor

As examples of other projects that include the JupyterHub helm chart as a dependency and build on it, see http://github.com/jupyterhub/binderhub. As examples of direct deployments that use JupyterHub and other charts, see http://github.com/jupyterhub/mybinder.org-deploy/ or http://github.com/berkeley-dsep-infra/datahub/ or http://github.com/berkeley-dsep-infra/data8xhub or http://github.com/yuvipanda/paws :)

@foxish
Contributor

foxish commented Dec 8, 2017 via email

@yuvipanda
Contributor

I agree too! The way I'd do it would be to make a Helm Chart for the TF controller and one for the Model Server, and KubeFlow then just configures them together (similar to the http://github.com/jupyterhub/mybinder.org-deploy/ pattern). IMO that's much more user friendly than having users edit Kubernetes objects directly and then apply them... I agree that using Helm for JupyterHub but something else for the other tools isn't a valid long term strategy.

I'm happy to take an initial shot at doing this if you'd like, which can easily be discarded too without hurting any of my feelings :)

@yuvipanda
Contributor

In the meantime I've also created #22 which provides persistent storage for each user with the current setup :)

@foxish
Contributor

foxish commented Dec 8, 2017 via email

@yuvipanda
Contributor

That sounds like a good idea! ksonnet looks cool too! I am not attached to helm particularly, only against having end users directly edit kubernetes object specifications. 99% of the work that needs to happen to fully support this is in kubespawner anyway, so it's not a very big deal!

+1 on opening another issue. This discussion is already pretty off-topic for this issue :)

@jlewi
Contributor Author

jlewi commented Dec 8, 2017

@foxish So I modified the JupyterHub config (jlewi#2); the PVC and PV are created. But it doesn't look like the pod running Jupyter for me is attaching it as a volume:

    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: no-api-access-please
      readOnly: true
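
One way to check this kind of symptom is to look at the pod spec for PVC-backed volumes and for containers that mount them. The helper below is a hypothetical sketch (the function name and the example spec are illustrative, and the `emptyDir` type of the `no-api-access-please` volume is an assumption); it works on the plain dict form of a pod spec.

```python
def pvc_mount_report(pod_spec):
    """Summarize PVC-backed volumes in a pod spec and which ones are unmounted."""
    # Map volume name -> claim name for every volume backed by a PVC.
    claims = {
        volume["name"]: volume["persistentVolumeClaim"]["claimName"]
        for volume in pod_spec.get("volumes", [])
        if "persistentVolumeClaim" in volume
    }
    # Collect every volume name that at least one container mounts.
    mounted = {
        mount["name"]
        for container in pod_spec.get("containers", [])
        for mount in container.get("volumeMounts", [])
    }
    return {"claims": claims, "unmounted": sorted(set(claims) - mounted)}


# Illustrative spec mirroring the output above: only the service account
# shadow volume is present, so no claim is referenced at all.
observed_spec = {
    "volumes": [{"name": "no-api-access-please", "emptyDir": {}}],
    "containers": [
        {
            "name": "notebook",
            "volumeMounts": [
                {
                    "name": "no-api-access-please",
                    "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
                    "readOnly": True,
                }
            ],
        }
    ],
}

report = pvc_mount_report(observed_spec)
print(report)  # {'claims': {}, 'unmounted': []}
```

An empty `claims` mapping means the pod never references the PVC, so the claim exists in the cluster but nothing attaches it. The input can be taken from the `spec` field of `kubectl get pod <pod-name> -o json`.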

@clkao

clkao commented Jan 28, 2018

This issue is about user-pod storage persistence, related to #145. However, the data used by the hub itself is also not persistent. IIRC the z2jh helm chart has a hub PVC as well.

@jlewi jlewi added this to the Kubecon Europe milestone Feb 21, 2018
@inc0 inc0 mentioned this issue Feb 21, 2018
jlewi pushed a commit that referenced this issue Feb 23, 2018
While we were adding pvc for every jupyter instance, we didn't mount it
anywhere.
Let's mount it to work dir, as I assume this is dir where users will
likely put their notebooks. This will ensure that work will be retained
even if pod dies.

Fix #19
Fix #22
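
A hedged sketch of the kind of change the commit message describes, using KubeSpawner's `volumes` and `volume_mounts` options: the per-user claim is declared as a pod volume and mounted at the notebook work directory. The claim name pattern, the volume name, and the `/home/jovyan/work` path are assumptions for illustration, not values taken from the commit.

```python
# jupyterhub_config.py (sketch, not the actual commit)
from types import SimpleNamespace

# JupyterHub injects `c` when it loads this file. The fallback below only
# exists so the snippet can be executed on its own for inspection.
try:
    c
except NameError:
    c = SimpleNamespace(KubeSpawner=SimpleNamespace())

# Assumed to match the name of the PVC created for each user.
PVC_NAME = "claim-{username}"
# Hypothetical volume name and work directory inside the notebook image.
VOLUME_NAME = "volume-{username}"
WORK_DIR = "/home/jovyan/work"

# Declare the claim as a pod volume...
c.KubeSpawner.volumes = [
    {"name": VOLUME_NAME, "persistentVolumeClaim": {"claimName": PVC_NAME}},
]
# ...and mount it where users are expected to keep their notebooks.
c.KubeSpawner.volume_mounts = [
    {"name": VOLUME_NAME, "mountPath": WORK_DIR},
]
```

The volume name in `volume_mounts` has to match the one in `volumes`; otherwise the pod spec is rejected or the claim stays unmounted, which is the symptom reported earlier in this thread.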
kimwnasptd pushed a commit to arrikto/kubeflow that referenced this issue Mar 5, 2019
yanniszark pushed a commit to arrikto/kubeflow that referenced this issue Feb 15, 2021
Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
VaishnaviHire pushed a commit to VaishnaviHire/kubeflow that referenced this issue Jul 22, 2022
Add e2e tests for the Notebook Controllers
4 participants