-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storage Options: Using Amazon EFS as Persistent Volume with Kubeflow 1.3 #32
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work, have few comments.
I also noticed that perviously the entire driver deploy section was getting copied here.
Is there a way we can avoid this ?
4411a01
to
80ffc22
Compare
Not Sure I follow this comment, could you elaborate a little. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you also test the dynamic provision?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apologies for the number of comments. I was watching a reinvent session and took sweet time to review this
distributions/aws/aws-efs-csi-driver/overlays/stable/kustomization.yaml
Outdated
Show resolved
Hide resolved
distributions/aws/aws-efs-csi-driver/base/controller-serviceaccount.yaml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better now, few comments
|
||
1. Use the `$file_system_id` you recorded before or use the following command to get the efs filesystem id - | ||
``` | ||
aws efs describe-file-systems --query "FileSystems[*].FileSystemId" --output text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assign to variable so it can be used in next command.
also, please add --region
in the command
recorded before? as in previous step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually those steps have been moved out of the README, let me reword this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This results in multiple ids and there is no direct way to filter these. This step will have to be slightly manual if the specified $file_system_id
variable if not already populated from section 4.
### 2. Build and Push the Docker image | ||
In the `training-sample` directory, we have provided a sample training script and Dockerfile which you can use as follows to build a docker image- | ||
``` | ||
export dockerImage=image-classification:no-data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: export vars are caps in other readmes
also, callout to replace the docker image uri?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean callout to use their own image if they wanted to ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I mean they will have to change the URI if they have to push to their repo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I actually removed the hardcoded defaults altogether, they didnt add any value.
e9e3214
to
008e89b
Compare
## 5.0 Using EFS Storage in Kubeflow | ||
|
||
## 5.1 Provisioning Options | ||
### Option 1 - Static Provisioning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the option 2 for dynamic provisioning ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in standup (and somewhere on this PR) I was planning to add that in a separate PR since this one has been long pending and that needs additional testing.
kubectl apply -f storage-efs/sample/pv.yaml | ||
kubectl apply -f storage-efs/sample/pvc.yaml | ||
kubectl apply -f storage-efs/sample/sc.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This order does not make sense to me.
you are referring referring storageClassName: efs-sc
in pv and pvc files.
I think the right order is sc, pv and pvc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this before on this PR -
#32 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still does not make sense that you are applying something in future but referencing it in present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
kubectl apply -f storage-efs/sample/sc.yaml | ||
``` | ||
|
||
## 5.2 Check your Setup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I missed that but you don't need to see dashboard to verify the volumes are up.
I think command on get pvc
should tell you the status of pvc
whether it has been bounded
or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either option works and we are anyway logging onto the dashboard to create a notebook, any reason to prefer one over the other ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess its fine to check the dashboard too..but like how you're verifying the csi driver, you can verify the pvc is bounded or not. This maintains the consistency in your instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding both options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take a closer look, some of the comments were made before as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for taking care of all the comments.
…1.3 (awslabs#32) * efs-kubeflow-branch-13 * Add TrainingJob Sample * Refactor the EFS CSI Driver installation * Address Review Comments * Directory restructure * Minor comments addressed * More comments
Which issue is resolved by this Pull Request:
Adds the option to use Amazon EFS as Storage with Kubeflow.
Description of your changes:
This is an initial PR with the following changes -
examples/aws/storage-efs/sample
directory includes 4 spec files to demonstrate creating a persistentVolume, PersistentVolumeClaim, StorageClass and the README has instructions on how to use the same for static provisioning and usage via the Kubeflow Notebooks.Testing Done:
TBD:
Checklist:
Make sure you have installed kustomize == 3.2.1
make generate-changed-only
make test