OFFLINE resizing woes #44
Comments
I do not like holding … It is possible, however, to stop …
That hasn't been happening, and, I admit, I haven't tested this scenario. The issue is not in the concurrent calls, but with any call of …
For volume types that don't implement attach/detach, it might be trickier to decide whether they are published to a node or not.
But it should be possible to detect attachment from Volume objects? Or have I not yet adopted the CSI way of thinking? :)
Okay, it may be fixable, but part of me also thinks that the CSI spec should be relaxed to use the MAY rather than the MUST wording for this part (I still have not made up my mind!). In general, though, why is this a problem if …
I am not saying the driver should do any such detection, but we can implement this check in external-resizer. It is, however, somewhat prone to race conditions.
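For illustration only, roughly what such a check could look like from a sidecar's point of view, using client-go to list VolumeAttachment objects (a sketch under those assumptions, not the actual external-resizer code; the package and function names are made up):

```go
// Sketch: decide whether a PV is still published anywhere by looking at
// VolumeAttachment objects. Not the real external-resizer implementation.
package publishcheck

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// isVolumePublished returns true if any VolumeAttachment still reports the
// given PV as attached to a node.
func isVolumePublished(ctx context.Context, cs kubernetes.Interface, pvName string) (bool, error) {
	vaList, err := cs.StorageV1().VolumeAttachments().List(ctx, metav1.ListOptions{})
	if err != nil {
		return false, fmt.Errorf("listing VolumeAttachments: %w", err)
	}
	for _, va := range vaList.Items {
		src := va.Spec.Source.PersistentVolumeName
		if src != nil && *src == pvName && va.Status.Attached {
			return true, nil // still published to va.Spec.NodeName
		}
	}
	return false, nil
}
```

As noted, this is inherently racy: a Pod can be scheduled and the volume re-attached between this check and the actual expand call, and it does not help for volume types that skip attach/detach entirely.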
I've successfully implemented this logic. It does return the error. Unfortunately, if I delete the Pod and the Volume gets Unpublished, Kubernetes almost instantaneously schedules it back.
Is this because the pod is backed by a Deployment etc.? Shouldn't you really scale the Deployment down to 0, so that no replacement pod is created? Even if we implemented checks in external-resizer to not call expansion while the volume is published (possible after container-storage-interface/spec#374), this would still mean that the in-tree controller-manager could ControllerPublish the volume while the resize was pending. So in reality what you want is to prevent …
It is. And I do just that, but it seems hacky, doesn't it? How does the in-tree offline resizing work? I've never had a problem with it, no matter the storage provider: I simply deleted a Pod, and some machinery (pardon my ignorance) applied the resize procedure without any problems whatsoever. Although, perhaps it instantaneously resized the volume in the cloud provider, and only executed …
It works the same way for in-tree offline expansion.
Well, then I've got nothing. Perhaps I've imagined a problem, and it's possible to solve my woes through other means. Feel free to close the issue; I can't find any more arguments. :)
I had not thought about ensuring the Volume resize at the ControllerPublishVolume stage; that will happen before the volume is attached to the node. I feel so dumb...
Well, it worked. I don't have anything to add; I should've come to this solution much earlier.
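A rough Go sketch of that final approach, for anyone landing here later (the package, the driver type, and its helper methods are hypothetical placeholders, not taken from any real driver; only the CSI types and gRPC status codes are real):

```go
// Sketch of the approach described above: complete any pending offline resize
// inside ControllerPublishVolume, before the volume is actually attached.
package publishexample

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

type driver struct{} // hypothetical placeholder for a real CSI controller

// Hypothetical helpers a real driver would back with its storage API.
func (d *driver) resizePending(volumeID string) (pending bool, newSizeBytes int64) { return false, 0 }
func (d *driver) finishResize(ctx context.Context, volumeID string, newSizeBytes int64) error {
	return nil
}
func (d *driver) attach(ctx context.Context, volumeID, nodeID string) (map[string]string, error) {
	return map[string]string{}, nil
}

func (d *driver) ControllerPublishVolume(ctx context.Context, req *csi.ControllerPublishVolumeRequest) (*csi.ControllerPublishVolumeResponse, error) {
	if req.GetVolumeId() == "" {
		return nil, status.Error(codes.InvalidArgument, "volume ID is required")
	}

	// If an expansion was requested while the volume was detached, apply it
	// now, while the volume is still guaranteed to be unpublished.
	if pending, newSize := d.resizePending(req.GetVolumeId()); pending {
		if err := d.finishResize(ctx, req.GetVolumeId(), newSize); err != nil {
			return nil, status.Errorf(codes.Internal, "finishing pending resize: %v", err)
		}
	}

	// Only then proceed with the normal attach flow.
	publishContext, err := d.attach(ctx, req.GetVolumeId(), req.GetNodeId())
	if err != nil {
		return nil, status.Errorf(codes.Internal, "attaching volume: %v", err)
	}
	return &csi.ControllerPublishVolumeResponse{PublishContext: publishContext}, nil
}
```

Since ControllerPublishVolume runs before the volume is attached to the node, the expansion here still happens while the volume is effectively offline.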
There seems to be a problem with operation ordering in the OFFLINE resize situation.
First of all, I indicate the supported plugin resize capability with
PluginCapability_VolumeExpansion_OFFLINE
and implement a ControllerExpandVolume method.
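For reference, advertising that capability from a Go driver looks roughly like the sketch below (the package and driver type are hypothetical placeholders; the csi types come from github.com/container-storage-interface/spec/lib/go/csi):

```go
// Sketch: a CSI driver advertising OFFLINE volume expansion support.
package capexample

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

type driver struct{} // hypothetical placeholder for a real CSI identity server

func (d *driver) GetPluginCapabilities(ctx context.Context, req *csi.GetPluginCapabilitiesRequest) (*csi.GetPluginCapabilitiesResponse, error) {
	return &csi.GetPluginCapabilitiesResponse{
		Capabilities: []*csi.PluginCapability{
			{
				Type: &csi.PluginCapability_Service_{
					Service: &csi.PluginCapability_Service{
						Type: csi.PluginCapability_Service_CONTROLLER_SERVICE,
					},
				},
			},
			{
				// Expansion is only supported while the volume is unpublished.
				Type: &csi.PluginCapability_VolumeExpansion_{
					VolumeExpansion: &csi.PluginCapability_VolumeExpansion{
						Type: csi.PluginCapability_VolumeExpansion_OFFLINE,
					},
				},
			},
		},
	}, nil
}
```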
First naïve solution
Just hope that external-resizer somehow understands the state of a given PVC (Volume) and calls resize only when the Volume is Unpublished.
Does not work: external-resizer calls ControllerExpandVolume as soon as the PVC in the Kubernetes API is resized.
Return a gRPC error solution
There is an option to send back the gRPC error
9 FAILED_PRECONDITION
which should be interpreted by the caller (external-resizer) as "Caller SHOULD ensure that volume is not published and retry with exponential back off". It kind of works, but there is another problem: a Pod may be scheduled and ControllerPublishVolume may be called before the back off expires. Perhaps there is a way to hold ControllerPublishVolume until the resize completes? But is there a way to know that a Volume has a resize pending?
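A rough Go sketch of this second approach (the package, driver type, and helper methods are hypothetical placeholders; only the CSI types and gRPC status codes are real):

```go
// Sketch of the "return a gRPC error" approach: refuse to expand while the
// volume is still published, so the caller retries with back off.
package expandexample

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

type driver struct{} // hypothetical placeholder for a real CSI controller

// Hypothetical helpers a real driver would back with its storage API.
func (d *driver) isPublished(ctx context.Context, volumeID string) (bool, error) { return false, nil }
func (d *driver) expand(ctx context.Context, volumeID string, sizeBytes int64) (int64, error) {
	return sizeBytes, nil
}

func (d *driver) ControllerExpandVolume(ctx context.Context, req *csi.ControllerExpandVolumeRequest) (*csi.ControllerExpandVolumeResponse, error) {
	volumeID := req.GetVolumeId()
	if volumeID == "" {
		return nil, status.Error(codes.InvalidArgument, "volume ID is required")
	}

	published, err := d.isPublished(ctx, volumeID)
	if err != nil {
		return nil, status.Errorf(codes.Internal, "checking publish state: %v", err)
	}
	if published {
		// FAILED_PRECONDITION tells the caller to ensure the volume is not
		// published and then retry with exponential back off.
		return nil, status.Errorf(codes.FailedPrecondition,
			"volume %s is still published to a node; offline expansion requires it to be detached", volumeID)
	}

	newSize, err := d.expand(ctx, volumeID, req.GetCapacityRange().GetRequiredBytes())
	if err != nil {
		return nil, status.Errorf(codes.Internal, "expanding volume: %v", err)
	}
	return &csi.ControllerExpandVolumeResponse{
		CapacityBytes:         newSize,
		NodeExpansionRequired: true,
	}, nil
}
```

Even with this in place, the back-off window is exactly where the Pod can be rescheduled and the volume re-published, which is the ordering problem described above.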