Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redesign mount propagation #659

Merged
merged 5 commits into from
Aug 22, 2017

Conversation

jsafrane
Copy link
Member

The proposal won't work as it was merged, it makes too many directories as shared (see #648).

A different approach is needed, I've chosen 'Add an option in VolumeMount API', but I would be fine also with 'Add an option in HostPathVolumeSource', there is only very small difference to me.

The proposal also describes how it will be implemented, especially during alpha phase.

Fixes #648

@kubernetes/sig-node-proposals @kubernetes/sig-storage-proposals

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 24, 2017
@k8s-ci-robot
Copy link
Contributor

@jsafrane: These labels do not exist in this repository: sig/node, sig/storage.

In response to this:

The proposal won't work as it was merged, it makes too many directories as shared (see #648).

A different approach is needed, I've chosen 'Add an option in VolumeMount API', but I would be fine also with 'Add an option in HostPathVolumeSource', there is only very small difference to me.

The proposal also describes how it will be implemented, especially during alpha phase.

Fixes #648

@kubernetes/sig-node-proposals @kubernetes/sig-storage-proposals

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jsafrane
Copy link
Member Author

For alpha annotation, I know there is attempt to have real fields in alpha in https://docs.google.com/document/d/1wuoSqHkeT51mQQ7dIFhUKrdi3-1wbKrNWeIL4cKb9zU/edit#, but it seems it has not been implemented yet. Alpha annotation is still described as preferred way to introduce new fields in https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md

@yujuhong
Copy link
Contributor

Adding a couple of folks who were on the original proposal.

/cc @vishh @euank

@k8s-ci-robot k8s-ci-robot requested review from euank and vishh May 26, 2017 00:27
@dchen1107
Copy link
Member

We discussed at sig-node meeting, and agreed with the redesign the proposal scope. But this feature is not targeted for 1.7 release.

@inc0
Copy link

inc0 commented May 30, 2017

This is required change for openstack-on-kubernetes too. Could we please add sig-openstack to this review?

@kfox1111
Copy link

+1 for sig-openstack.

I would also greatly benefit from it for some high energy physics work I do.

If the patch set is ready for 1.7 will it go in?

Is there anything I can do to help get it ready faster?

@ddysher
Copy link
Contributor

ddysher commented Jun 6, 2017

This is also required for local volume provisioner. The provisioner is running as a daemonset to monitor local volume and post PV to kubernetes. To correctly detect volume capacity, the daemonset pod must mount hostpath (path where volume is detected) as rshared.

Are we targeting alpha in 1.8 now?

@jsafrane
Copy link
Member Author

jsafrane commented Jun 6, 2017

yes, we missed 1.7

@ddysher
Copy link
Contributor

ddysher commented Jun 6, 2017

Thanks for confirm. I'd be happy to test it out when implementation is out.

@jsafrane
Copy link
Member Author

jsafrane commented Jun 8, 2017

I updated the PR with the latest development:

  • added links to Kubernetes PRs (waiting for review)
  • added note that the default propagation should be rslave. This is what was agreed in the previous proposal, now it's more explicit.
  • I tested my PRs with various distros and I stumbled upon Google COS (former GCI). It runs Docker in its own mount namespace with slave mount propagation. I.e. it can't ever run a pod with shared mount propagation. There should be a check in kubelet that tests if it can run a pod with shared mount propagation, at least on /var/lib/kubelet, and refuse to start if it can't.
    • This test is quite tricky, the best would be to run some dummy container (gcr.io/*/pause?) with docker run -v /var/lib/kubelet:/tmp:rshared and check if it failed.
    • Note that COS-60 fixes this issue, however the test might be useful if someone runs another old or weird unsupported distro.

it will be marked as deprecated from the beginning.
Developers / testers can enable it in their clusters manually.
Mount propagation may be redesigned or even removed in any future release.
* The default mount propagation will be `rslave`, which is different to current
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier, it says the default is private, to not break backwards compatiblity. Why is it rslave here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for spotting this, the default should be rslave, which should not break backward compatibility (testing needed!) and enables this use case kubernetes/kubernetes#46444 (comment)

@kfox1111
Copy link

Ping. What is the status of getting this into 1.8? I don't see it marked for 1.8.

I'm stuck on k8s 1.4 currently, as I need this kind of functionality and docker 1.9 is the last version to default mounts to shared and k8s 1.4 is the last version to support docker 1.9.

I would be happy if this was merged but hidden behind an experimental flag or something, just so those of us that need to make forward progress can do so.

@kfox1111
Copy link

ping

@luxas
Copy link
Member

luxas commented Jul 20, 2017

@jsafrane It would indeed be nice to have this as alpha in v1.8

@jsafrane
Copy link
Member Author

@luxas, all that's missing is review of #46444

@saad-ali
Copy link
Member

/lgtm
/approve

We discussed this design in a meeting on Monday July 24. I am fine with this.

One thing I just remembered is that there is a pending design to add "HostPathType" to hostpath volumesource: kubernetes/kubernetes#46597
Let's make sure that these two options (propagation and hostPathType play well together)

CC @timstclair @dixudx

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 28, 2017
@dixudx
Copy link
Member

dixudx commented Jul 28, 2017

@saad-ali propagation will not conflict with HostPathType.

@jsafrane
Copy link
Member Author

@timstclair was there any discussion about feature gate vs --exeprimental cmdline argument? I am happy with both, I just need a decision.

@jsafrane
Copy link
Member Author

jsafrane commented Aug 9, 2017

I would like to understand what the pros/cons of a new volume type would be.

So, let there be LinuxVolumeSource:

type LinuxVolume struct {
    MountPropagation string
    WrappedVolume *VolumeSource
}

type VolumeSource struct {
   ... aws, gce, hostPath, emptyDir, git, ...
    LinuxVolume *LinuxVolumeSource
}
  1. It's recursive. Recursive structures in API are IMO ugly.
  2. Does not help with application portability at all. If an application needed mount propagation it won't work on macOS / Windows at all, regardless if LinuxVolume or HostPath was used.
  3. I do not see any added value other than increased complexity and confusion.

In my world, VolumeSource describes where to get the data from. I still personally think that it should not include things like readOnly or mountPropagation. It should be in VolumeMount struct which says how the data should be made available. It's much easier to validate it, it goes well along with VolumeMount.readOnly. I can accept that HostPathVolumeSource and EmptyDirVolumeSource is the right place, it will work well for all use cases that I know about. I don't see any reason why to invent a new volume type.

@bgrant0607
Copy link
Member

@kfox1111
Copy link

kfox1111 commented Aug 9, 2017

I agree. Properties on how something gets mounted is a totally different thing than what the thing storing the data is. hostPath, emptyDir, etc will need the flag. not, recreate volume types for rsharedHostPath and rsharedEmptyDir. Once namedEmptyDir becomes a thing, it will want it too.

@kfox1111
Copy link

kfox1111 commented Aug 9, 2017

@thockin yeah, I get the desire for a correct api. Totally a good thing to get right. Just saying, for an alpha, getting it perfect is not so critical. I'm totally fine accepting the risk of writing software against an api that is expected to drastically change in future releases. That part is easy. Living without this feature is really really hard. I'm facing a 1.4 -> 1.8 migration path already, with broken steps in between. :/ Several of these releases are already unsupported.

As for openstack on k8s, it can't really be a complete thing until this feature merges in some form or another. Neutron really needs it.

* Node conformance suite will check that mount propagation in /var/lib/kubelet
works.
* During alpha, all the behavior above must be explicitly enabled by
`kubelet --feature-gates=MountPropagation=true`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ICYMI

Field gate proposal: #869

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@@ -73,13 +79,18 @@ type VolumeMount struct {
// Required.
MountPath string `json:"mountPath"`
// Optional.
Propagation string `json:"propagation"`
Propagation PropagationMode `json:"propagation,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field name is very vague. So is the file name, by the way. Propagation of what? In what direction?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field name is very vague... Propagation of what? In what direction?

Renamed to MountPropagation with more documentation around. Still, I don't want to put whole kernel documentation there.

So is the file name

What file name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposal filename.

@thockin
Copy link
Member

thockin commented Aug 11, 2017

This change is Reviewable

@jsafrane
Copy link
Member Author

We had a productive meeting yesterday with @thockin and @saad-ali and we're back at "Add an option in VolumeMount API".

  • While we do not see any use case for any other volume than HostPath or EmptyDir to be rshared, we will not block it, at least not in alpha. It must be explicitly enabled and having rshared GCE volume is not more harmful than HostPath.

  • Renamed (r)shared and (r)slave in the API to Bidirectional and HostToContainer.

  • Removed private as we do not see any use case for it.

  • We use enum instead of simple bool to be able to add private and non-recursive shared and slave if we need so. We can't extend bool so easily.

I did more edits describing what exactly will be in alpha, see the last commit.

@bgrant0607, can you please look at the API again? PR with Kubernetes changes will follow very shortly to catch 1.8

@saad-ali
Copy link
Member

Revised proposal:

/lgtm
/approve

Thanks @jsafrane

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Aug 15, 2017
@saad-ali
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 22, 2017
@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit c9f14de into kubernetes:master Aug 22, 2017
@bgrant0607
Copy link
Member

Github notifications are useless :-(

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this pull request Sep 2, 2017
Automatic merge from submit-queue (batch tested with PRs 45724, 48051, 46444, 51056, 51605)

Mount propagation in kubelet

Together with #45724 it implements mount propagation as proposed in kubernetes/community#659

There is:

- New alpha annotation that allows user to explicitly set propagation mode for each `VolumeMount` in pod containers (to be replaced with real `VolumeMount.Propagation` field during beta) + validation + tests. "Private" is the default one (= no change to existing pods).

  I know about proposal for real API fields for alpha feature in https://docs.google.com/document/d/1wuoSqHkeT51mQQ7dIFhUKrdi3-1wbKrNWeIL4cKb9zU/edit, but it seems it's not implemented yet. It would save me quite lot of code and ugly annotation.

- Updated CRI API to transport chosen propagation to Docker.

- New `kubelet --experimental-mount-propagation` option to enable the previous bullet without modifying types.go (worked around with changing `KubeletDeps`... not nice, but it's better than adding a parameter to `NewMainKubelet` and removing it in the next release...)

```release-note
kubelet has alpha support for mount propagation. It is disabled by default and it is there for testing only. This feature may be redesigned or even removed in a future release.
```

@derekwaynecarr @dchen1107 @kubernetes/sig-node-pr-reviews
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this pull request Apr 13, 2018
Automatic merge from submit-queue (batch tested with PRs 60476, 62462, 61391, 62535, 62394). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Private mount propagation

This PR changes the default mount propagation from "rslave" (newly added in 1.10) to "private" (default in 1.9 and before). "rslave" as default causes regressions, see below.

Value `"None"` has to be added to `MountPropagationMode` enum in API ("I don't want any propagation at all"), which translates to "private" on Linux. [We did not have use cases for it](kubernetes/community#659 (comment)), but we have them now.

**Which issue(s) this PR fixes**
Fixes #62397, fixes #62396

**Special notes for your reviewer**:
CRI already has an option for private mount propagation in volumes, however it's called "PRIVATE", while Kubernetes API value is "None". I did not change PRIVATE to NONE to keep the interface stable. See `kubelet_pods.go`.

**Release note**:
```release-note
Default mount propagation has changed from "HostToContainer" ("rslave" in Linux terminology) to "None" ("private") to match the behavior in 1.9 and earlier releases. "HostToContainer" as a default caused regressions in some pods.
```


/sig storage
/sig node
k8s-publishing-bot added a commit to kubernetes/api that referenced this pull request Apr 13, 2018
Automatic merge from submit-queue (batch tested with PRs 60476, 62462, 61391, 62535, 62394). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Private mount propagation

This PR changes the default mount propagation from "rslave" (newly added in 1.10) to "private" (default in 1.9 and before). "rslave" as default causes regressions, see below.

Value `"None"` has to be added to `MountPropagationMode` enum in API ("I don't want any propagation at all"), which translates to "private" on Linux. [We did not have use cases for it](kubernetes/community#659 (comment)), but we have them now.

**Which issue(s) this PR fixes**
Fixes #62397, fixes #62396

**Special notes for your reviewer**:
CRI already has an option for private mount propagation in volumes, however it's called "PRIVATE", while Kubernetes API value is "None". I did not change PRIVATE to NONE to keep the interface stable. See `kubelet_pods.go`.

**Release note**:
```release-note
Default mount propagation has changed from "HostToContainer" ("rslave" in Linux terminology) to "None" ("private") to match the behavior in 1.9 and earlier releases. "HostToContainer" as a default caused regressions in some pods.
```

/sig storage
/sig node

Kubernetes-commit: 92065407b783a6de56ce181967c2a7778aa6521a
MadhavJivrajani pushed a commit to MadhavJivrajani/community that referenced this pull request Nov 30, 2021
Automatic merge from submit-queue

Redesign mount propagation

The proposal won't work as it was merged, it makes too many directories as shared (see kubernetes#648).

A different approach is needed, I've chosen 'Add an option in VolumeMount API', but I would be fine also with 'Add an option in HostPathVolumeSource', there is only very small difference to me.

The proposal also describes how it will be implemented, especially during alpha phase.

Fixes kubernetes#648

@kubernetes/sig-node-proposals @kubernetes/sig-storage-proposals
danehans pushed a commit to danehans/community that referenced this pull request Jul 18, 2023
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.