Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OFFLINE resizing woes #44

Closed
zuzzas opened this issue Jul 17, 2019 · 13 comments
Closed

OFFLINE resizing woes #44

zuzzas opened this issue Jul 17, 2019 · 13 comments

Comments

@zuzzas
Copy link
Contributor

zuzzas commented Jul 17, 2019

There seems to be a problem with operations ordering in OFFLINE resize situation.
First of all, I indicate supported plugin resize capabilities with PluginCapability_VolumeExpansion_OFFLINE and implement a ControllerExpandVolume method.

First naïve solution

Just hope that external-resizer somehow understands that state of a given PVC (Volume) and calls resize only when the Volume is Unpublished.

Does not work, calls ControllerExpandVolume once PVC in Kubernetes API is resized.

Return a gRPC error solution

There is an options to send back a gRPC error 9 FAILED_PRECONDITION which should be interpreted by caller (external-resizer) as Caller SHOULD ensure that volume is not published and retry with exponential back off. It kind of works, but there is another problem: Pod may be scheduled and ControllerPublishVolume may be called earlier than back off expires. Perhaps, there is a way to hold ControllerPublishVolume until resize completes? But is there a way to know that Volume has a resize pending?

@gnufied
Copy link
Contributor

gnufied commented Jul 24, 2019

I do not like holding ControllerPublishVolume until resize completes because that creates hard to recover failures if resizing fails and it keeps failing. Can you explain bit more - what happens if ControllerPublishVolume and ControllerExpandVolume do get called concurrently?

It is possible however to stop ControllerExpandVolume from being called if plugin does not have ONLINE capability and is attached somewhere. I was thinking we can do this by checking VolumeAttachment object for corresponding PV. We will have to probably loop with exponential backoff though and retry. It is still possible that - between the moment we check VolumeAttachment object and the moment resize was started, the state of the volume could change, so you could end up ControllerExpandVolume and ControllerPublishVolume being called concurrently..

@zuzzas
Copy link
Contributor Author

zuzzas commented Jul 24, 2019

Can you explain bit more - what happens if ControllerPublishVolume and ControllerExpandVolume do get called concurrently?

That haven't been happening, and, I admit, haven't tested this scenario. The issue is not in the concurrent calls, but with any call of ControllerExpandVolume when the Volume is published.

@gnufied
Copy link
Contributor

gnufied commented Jul 24, 2019

For volume types that don't implement attach/detach - it might be more tricky to decide if they are published to a node or not.

@zuzzas
Copy link
Contributor Author

zuzzas commented Jul 24, 2019

But is should be possible to detect attachment from Volume objects? Or have I not yet adopted the CSI way of thinking? :)

@gnufied
Copy link
Contributor

gnufied commented Jul 24, 2019

The issue is not in the concurrent calls, but with any call of ControllerExpandVolume when the Volume is published.

okay, it may be fixable but part of me also thinks that the CSI spec should be relaxed to use MAY rather than MUST (I still have not made up my mind!) word for this part.

In general though - why is this a problem if ControllerExpandVolume does get called when volume is already published? Is this a fencing issue? Can storage provider which only supports OFFLINE expansion not detect such a state and return error?

@gnufied
Copy link
Contributor

gnufied commented Jul 24, 2019

But is should be possible to detect attachment from Volume objects? Or have I not yet adopted the CSI way of thinking? :)

I am not saying driver should do any such detection but we can implement this check in external-resizer. But it is somewhat prone to race conditions.

@zuzzas
Copy link
Contributor Author

zuzzas commented Jul 24, 2019

In general though - why is this a problem if ControllerExpandVolume does get called when volume is already published? Is this a fencing issue? Can storage provider which only supports OFFLINE expansion not detect such a state and return error?

I've successfully implemented this logic. It does return error. Unfortunately, if I delete Pod, and Volume gets Unpublished, Kubernetes almost instantaneously schedules it back. external-resizer currently cannot insert ControllerExpandVolume call when Volume is Unpublised and Pod is non-existent.

@gnufied
Copy link
Contributor

gnufied commented Sep 11, 2019

I've successfully implemented this logic. It does return error. Unfortunately, if I delete Pod, and Volume gets Unpublished, Kubernetes almost instantaneously schedules it back.

Is this because pod is backed by a deployment etc? Shouldn't you really scale down deployment to 0, so as no replacement pod is created?

Even if we implemented checks in external-resizer to not call expansion if volume is published (possible after container-storage-interface/spec#374), this would still mean that in-tree controller-manager could controllerpublish the volume while resize was pending. So in reality what you want is to to prevent ControllerPublish if a volume has resize operation pending. IMO - that is going to be very hard to implement and I am not sure benefits it will bring.

@zuzzas
Copy link
Contributor Author

zuzzas commented Sep 11, 2019

Is this because pod is backed by a deployment etc? Shouldn't you really scale down deployment to 0, so as no replacement pod is created?

It is. And I do just that, but it seems hacky, doesn't it? How does the in-tree offline resizing work? I've never had a problem with it, no matted the storage provider, I've simply deleted a Pod, and some machinery (pardon my ignorance) applied the resize procedure without any problems whatsover.

Although, perhaps it instantaneously resized the volume in the cloud provider, and only executed resize2fs on the node once the Pod came back after deletion...

@gnufied
Copy link
Contributor

gnufied commented Sep 11, 2019

It works same for in-tree offline expansion.

@zuzzas
Copy link
Contributor Author

zuzzas commented Sep 11, 2019

Well, then I've got nothing. Perhaps I've imagined a problem, and it's possible to solve my woes through other means. Feel free to close the issue, I can't find any more arguments. :)

@zuzzas
Copy link
Contributor Author

zuzzas commented Sep 30, 2019

I have not thought about ensuring Volume resize at ControllerPublishVolume stage. That will happen before the volume is attached to the node. I feel so dumb...
Will report once I've successfully (hopefully) implement it this week.

@zuzzas
Copy link
Contributor Author

zuzzas commented Nov 1, 2019

Well, it worked. Don't have anything to add, should've come to this solution much earlier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants