Migrate Farallon Staging hub to this repository. #379

Merged
merged 14 commits into from
May 7, 2021

yuvipanda
Member

@yuvipanda yuvipanda commented May 3, 2021

kops produces kubeconfig files we can check-in to the repo,
and use to authenticate to kops clusters on AWS.

We test this by moving the farallon staging hub to be deployed
from here, for #368

TODO:

  • Figure out why kubeconfig from kops doesn't work after about 12h
  • Fix timeouts when mounting EFS shares
  • Fix hub deployment test when used with profiles
  • Fix "Put GCP-only 'scratch bucket' behind a flag" (#374 (comment)), which is causing failures because we're mounting the configmap unconditionally

@yuvipanda
Member Author

Getting closer!

Now at:

2021-05-04T18:10:45Z [Warning] MountVolume.SetUp failed for volume "farallon-staging-home-nfs" : mount failed: exit status 32 Mounting command: systemd-run Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/338bdd08-d55c-4a43-acf6-f653bcb87f27/volumes/kubernetes.io~nfs/farallon-staging-home-nfs --scope -- mount -t nfs -o noresvport,retrans=2,rsize=1048576,soft,timeo=600,wsize=1048576 fs-7b129903.efs.us-east-2.amazonaws.com:/homesfarallon-staging /var/lib/kubelet/pods/338bdd08-d55c-4a43-acf6-f653bcb87f27/volumes/kubernetes.io~nfs/farallon-staging-home-nfs Output: Running scope as unit: run-rf210704da9ed4f5bbe7837b331747708.scope mount.nfs: Connection timed out 

@yuvipanda
Member Author

> Figure out why kubeconfig from kops doesn't work after about 12h

      --admin duration[=18h0m0s]   export a cluster admin user credential with the given lifetime and add it to the cluster context

for kops export kubecfg farallon-2i2c.k8s.local

Makes sense it expires after 18h!
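
To make the lifetime arithmetic concrete, here is a minimal sketch that parses a Go-style duration such as the `18h0m0s` default shown above and adds it to an export time. The helper names `parse_go_duration` and `credential_expiry` are hypothetical, only the `h`, `m` and `s` units are handled, and it assumes the credential lifetime is counted from the moment of export.

```python
import re
from datetime import datetime, timedelta, timezone

# Seconds per unit for the subset of Go duration units handled in this sketch.
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_go_duration(text: str) -> timedelta:
    """Parse a simple Go-style duration such as '18h0m0s' or '730h'."""
    parts = re.findall(r"(\d+(?:\.\d+)?)([hms])", text)
    # Reject input that contains anything other than number+unit pairs.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    seconds = sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)
    return timedelta(seconds=seconds)


def credential_expiry(exported_at: datetime, lifetime: str) -> datetime:
    """Return when a credential exported at `exported_at` would expire."""
    return exported_at + parse_go_duration(lifetime)


if __name__ == "__main__":
    # Hypothetical export time, used only to show the arithmetic.
    exported = datetime(2021, 5, 3, 12, 0, tzinfo=timezone.utc)
    print(credential_expiry(exported, "18h0m0s"))
```

With these helpers, a lifetime of `730h` works out to 30 days and 10 hours.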

@yuvipanda
Member Author

#380 fixes the timeouts. Turns out if you don't have a directory named /homes/farallon-staging but try to mount it, it'll just error with a timeout.

https://pilot-hubs.2i2c.org/en/latest/topic/storage-layer.html has more info about how we do storage and why this matters.
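
One way to avoid the missing-directory timeout described above is to create the per-hub directory before anything tries to mount it. The sketch below is not what #380 does; it assumes the root of the NFS share is already mounted at some local path (a temporary directory stands in for it here), and `ensure_hub_home_dir` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def ensure_hub_home_dir(nfs_root: Path, hub_name: str) -> Path:
    """Create <nfs_root>/homes/<hub_name> if it is missing and return it."""
    home_dir = nfs_root / "homes" / hub_name
    # parents=True also creates 'homes'; exist_ok=True makes reruns harmless.
    home_dir.mkdir(parents=True, exist_ok=True)
    return home_dir


if __name__ == "__main__":
    # A temporary directory stands in for the mounted share root (assumption).
    with tempfile.TemporaryDirectory() as root:
        print(ensure_hub_home_dir(Path(root), "farallon-staging"))
```

Because the call is idempotent, it could run on every deploy without first checking whether the directory already exists.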

@damianavila
Contributor

> #380 fixes the timeouts. Turns out if you don't have a directory named /homes/farallon-staging but try to mount it, it'll just error with a timeout.

Great! I was going to suggest checking the mount targets and security groups, which sometimes cause timeouts when mounting EFS.

@yuvipanda
Member Author

@damianavila for the expiry of the generated kubeconfig, I see two options:

  1. Generate an AWS IAM role that can be used to ephemerally generate a kubeconfig, and modify our deploy script to use that
  2. Put a large number into --admin

Actually, we can probably do both - do (2) first and follow up with (1)

@yuvipanda
Member Author

I've done (2), and opened #381 to track (1)

@damianavila
Contributor

> I've done (2), and opened #381 to track (1)

It makes sense to me in the current context.

- Needs to be nested under singleuser
- Explicitly set mem_limit too - otherwise, the default 1G
  limit of memory stays in, and k8s doesn't allow a limit to
  be smaller than a guarantee
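
The constraint in the note above can be checked before deploying. This is a minimal sketch, assuming the hub values are available as a dict with the memory settings under `singleuser.memory.limit` and `singleuser.memory.guarantee`; that key layout, the binary size suffixes, and the helper names are assumptions for illustration. The `1G` default comes from the note itself.

```python
# Multipliers for the size suffixes handled here (binary units, as an assumption).
_SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Default limit that stays in place when no limit is set, per the note above.
DEFAULT_MEM_LIMIT = "1G"


def to_bytes(size: str) -> int:
    """Convert a size string such as '1G' or '512M' to a number of bytes."""
    suffix = size[-1].upper()
    if suffix in _SUFFIXES:
        return int(float(size[:-1]) * _SUFFIXES[suffix])
    return int(size)


def validate_memory(values: dict) -> None:
    """Raise ValueError if the memory limit is smaller than the guarantee."""
    memory = values.get("singleuser", {}).get("memory", {})
    guarantee = memory.get("guarantee")
    if guarantee is None:
        return  # nothing to compare against
    limit = memory.get("limit", DEFAULT_MEM_LIMIT)
    if to_bytes(limit) < to_bytes(guarantee):
        raise ValueError(
            f"memory limit {limit} is smaller than guarantee {guarantee}"
        )


if __name__ == "__main__":
    validate_memory({"singleuser": {"memory": {"guarantee": "4G", "limit": "4G"}}})
    print("memory settings are consistent")
```

A config that sets only a guarantee above `1G` fails this check, which mirrors the situation the note describes.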

EFS doesn't let you set ver=4.2

We mount this into all daskhub user accounts regardless -
by default they don't do anything

Needs to be automated eventually
With KUBECONFIG=secrets/farallon.yaml kops export kubecfg --admin=730h farallon-2i2c.k8s.local

Referencing base-hub from helm charts templates is basically
impossible - go templates can not use '-' in their name!
This makes referencing them easier

@yuvipanda yuvipanda changed the title from "[WIP] Add support for 'raw' kubeconfig files" to "Migrate Farallon Staging hub to this repository." on May 6, 2021
Contributor

@damianavila damianavila left a comment

OK, I think this one is ready!

@yuvipanda yuvipanda merged commit 3e22399 into 2i2c-org:master May 7, 2021
@yuvipanda
Member Author

Thanks for all the review, @damianavila!

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this pull request May 7, 2021

2 participants