Skip deallocating Gid when static Gid set #733
/cc @Ashley-wenyizha
This issue is also currently preventing us from being able to leverage the newer versions of the efs-csi-driver for our production environments.
Having this same issue as well in our environment since we are setting GIDs. Would appreciate it if this could be reviewed and merged please!
@MarkSpencerTan @daghaian just so you're aware: if you're running into this issue, the EFS CSI driver has already failed to mount your EFS. The bug is unfortunately in the error-handling path, so the driver never gets a chance to output a usable error message. I recommend checking CloudTrail for any AccessDenied errors. In my case I was using a version of the IAM policy that did not grant efs:CreateAccessPoint to the CSI controller service account's IAM role.
@kyanar Thanks for the suggestion. We did see AccessPointAlreadyExists errors being thrown in CloudTrail, so it's plausible something is going on there.
@kyanar thanks for the suggestions! We tried to figure out what was causing the AccessPointAlreadyExists errors we were getting. They seem to happen at random times, possibly due to some sort of race condition, and the EFS CSI Driver appears to handle them by simply retrying, after which it works again. This is why we never really saw any issues until we started setting the Gid. When the Gid is set and this problem arises, the EFS CSI Driver pod errors out continuously, preventing any further PVCs from being created until the problematic PVCs are cleared out. I've captured the logs of the EFS CSI Driver pod when this error happens without the Gid being set, so you can see what it does normally: https://gist.github.com/MarkSpencerTan/96775d9b2b3043ce7647693ecce309be#file-gistfile1-txt-L163 Would appreciate it if this fix becomes available, since this is causing the efs-csi-driver to be very unstable in cases where the Gid is being set.
Unfortunately, none of the approvers for this project appear to be active on GitHub (the most recent activity was on 18th August), so I can't find anyone to review it.
/cc @Ashley-wenyizha
@Ashley-wenyizha this is a one line fix for an issue affecting quite a few users, can this be looked at please?
Pull #850 corrects this behaviour.
Is this a bug fix or adding a new feature?
Bug fix
What is this PR about? / Why do we need it?
The efs-plugin crashes with a segmentation fault when creating an access point fails and the storage class is defined with a fixed uid and gid. In that case the error path attempts to deallocate the gid using an uninitialised gidAllocator. Because the code that logs the error runs after the point of the crash, the error is never logged for the cluster admin to resolve.
This PR adds a check to see if the "allocated gid" is the default Go int value (0), and skips the attempt to deallocate it if so.
What testing is done?
This is covered by the existing test case for fixed uid/gid allocation.