Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support idmap mounts for volumes #3717

Merged
merged 11 commits into from
Jul 17, 2023
Merged

Conversation

rata
Copy link
Member

@rata rata commented Feb 1, 2023

This PR implements support for this runtime-spec change that added idmap mounts support: opencontainers/runtime-spec#1143

We open the idmap source paths and call mount_setattr() in runc PARENT,
as we need privileges in the init userns for that, and then sends the
fds to the child process. For this fd passing we use the same mechanism
used in other parts of thecode, the _LIBCONTAINER_ env vars.

The mount is finished (unix.MoveMount) from go code, inside the userns,
so we reuse all the prepareBindMount() security checks and the remount
logic for some flags too.

This PR only supports idmap mounts when userns are used AND the mappings
are the same specified for the userns mapping. This limitation is to
simplify the initial implementation, as all our users so far only need
this, and we can avoid sending over netlink the mappings, creating a
userns with this custom mapping, etc. Future PRs will remove this
limitation.

As the idmap case is quite similar to the existing mount sources case we
open with O_PATH, some simple refactors are done to share more code and
to group the slices of fds in go code. To that end, we created the
mountFds struct, and add all the slices of fds there.

This replaces PR #3429, as that PR tries to mount already when we are inside the user namespace. AFAIK, that will never work and therefore this PR tries a completely different way to do the idmap mounts.

This PR is co-authored-by: Francis Laniel <flaniel@linux.microsoft.com>

cc @eiffel-fl

Question

One open question I have is I added most validations in libcontainer/configs/validate/validator.go but I couldn't help to notice that other parts of the code do some other validations. All examples that I tried hit the validations, but am I missing adding the validations in some other part, that don't use this maybe when runc is called differently? Or adding them there should be enough?

TODO

  • Pass idmap mount sources using bootstrap data (now hardcoded an example path)
  • Add integration tests
  • Modify the mountToRootfs() function to not move_mount and finish the idmap mount
  • Update runtime spec to have the new fields for UID/GID mappings
  • Address all TODOs I added in this PR

Changelog entry

  • Support idmap mounts as specified in the OCI runtime-spec. Currently the mount mappings need to be identical to the mappings used in the user namespace section.
  • Enforce absolute paths for mounts

Closes: #3429
Closes: #3020

@rata rata force-pushed the rata/idmap branch 2 times, most recently from df4f448 to 95170b5 Compare February 1, 2023 11:38
@rata
Copy link
Member Author

rata commented Feb 1, 2023

@AkihiroSuda can we add this to the 1.2 milestone? 🙏

@rata
Copy link
Member Author

rata commented Feb 3, 2023

We have something working now. Will polish and push next week. Still tests and that missing.

@rata rata force-pushed the rata/idmap branch 2 times, most recently from ee112d5 to 409c31e Compare February 9, 2023 12:23
@rata rata changed the title WIP: Support idmap mounts for volumes Support idmap mounts for volumes Feb 9, 2023
@rata rata marked this pull request as ready for review February 9, 2023 12:23
@rata rata force-pushed the rata/idmap branch 6 times, most recently from 39913c4 to 854940e Compare February 9, 2023 13:35
@rata
Copy link
Member Author

rata commented Feb 9, 2023

Test failures seem unrelated, for example:

make[1]: Entering directory '/go/src/github.com/opencontainers/runc'
fatal: detected dubious ownership in repository at '/go/src/github.com/opencontainers/runc'
To add an exception for this directory, call:

	git config --global --add safe.directory /go/src/github.com/opencontainers/runc
go build -trimpath -buildmode=pie -a -tags "seccomp urfave_cli_no_docs netgo osusergo" -ldflags "-X main.gitCommit= -X main.version=1.1.0+dev -linkmode external -extldflags --static-pie -w -s -buildid=" -o runc .
error obtaining VCS status: exit status 128
	Use -buildvcs=false to disable VCS stamping.
make[1]: *** [Makefile:69: static] Error 1

or go 1.19 with -race failing due to criu:

Run # criu repo
Warning: apt-key output should not be parsed (stdout is not a terminal)
curl: (92) HTTP/2 stream 0 was not closed cleanly: PROTOCOL_ERROR (err 1)
gpg: no valid OpenPGP data found.
Error: Process completed with exit code 2.

@AkihiroSuda
Copy link
Member

Please rebase, then the CI should be green

@thaJeztah
Copy link
Member

We've had some issues with packages in the containerd repository (which was related to mirrors configured on the workers that only have a subset of architectures).

I gave CI a kick (hoping it was just a fluke) 🤞

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks so much!

⚠️ ⚠️ I did not review the C code extensively; I'm not a C-coder, and don't want to pretend being one, so other than "a proper glance", haven't reviewed those commits. Reviews from others on those files therefore would be appreciated 👍

Copy link
Member

@lifubang lifubang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small suggestions. we aleady have 3 LGTMs, I think we can move forward now.

libcontainer/nsenter/nsexec.c Outdated Show resolved Hide resolved
libcontainer/nsenter/nsexec.c Outdated Show resolved Hide resolved
rata and others added 2 commits July 17, 2023 13:30
This commit adds support for idmap mounts as specified in the runtime-spec.

We open the idmap source paths and call mount_setattr() in runc PARENT,
as we need privileges in the init userns for that, and then sends the
fds to the child process. For this fd passing we use the same mechanism
used in other parts of thecode, the _LIBCONTAINER_ env vars.

The mount is finished (unix.MoveMount) from go code, inside the userns,
so we reuse all the prepareBindMount() security checks and the remount
logic for some flags too.

This commit only supports idmap mounts when userns are used AND the mappings
are the same specified for the userns mapping. This limitation is to
simplify the initial implementation, as all our users so far only need
this, and we can avoid sending over netlink the mappings, creating a
userns with this custom mapping, etc. Future PRs will remove this
limitation.

Co-authored-by: Francis Laniel <flaniel@linux.microsoft.com>
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Co-authored-by: Francis Laniel <flaniel@linux.microsoft.com>
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
@rata
Copy link
Member Author

rata commented Jul 17, 2023

@lifubang thanks, fixed that too. Feel free to merge now! 🎉

@lifubang
Copy link
Member

Thanks everyone working on this.

@lifubang lifubang merged commit f73b05d into opencontainers:main Jul 17, 2023
@rata rata deleted the rata/idmap branch July 17, 2023 14:14
Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this was merged while I was reviewing it. I guess I'll send a fix-up PR then... (I had 6-7 other comments but GitHub won't let me post them because the "diff has changed"...)

if m.idmapFD == -1 {
return fmt.Errorf("error creating mount %+v: idmapFD is invalid, should point to a valid fd", m)
}
if err := unix.MoveMount(m.idmapFD, "", -1, dest, unix.MOVE_MOUNT_F_EMPTY_PATH); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if err := unix.MoveMount(m.idmapFD, "", -1, dest, unix.MOVE_MOUNT_F_EMPTY_PATH); err != nil {
if err := unix.MoveMount(m.idmapFD, "", unix.AT_FDCWD, dest, unix.MOVE_MOUNT_F_EMPTY_PATH); err != nil {

Comment on lines +65 to +73
static inline int sys_mount_setattr(int dfd, const char *path, unsigned int flags, struct mount_attr *attr, size_t size)
{
return syscall(__NR_mount_setattr, dfd, path, flags, attr, size);
}

static inline int sys_open_tree(int dfd, const char *filename, unsigned int flags)
{
return syscall(__NR_open_tree, dfd, filename, flags);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We don't usually call these wrappers sys_foo, we just call them foo, but I guess it doesn't really matter.

continue;
}

int fd_tree = sys_open_tree(-EBADF, idmap_src,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int fd_tree = sys_open_tree(-EBADF, idmap_src,
int fd_tree = sys_open_tree(AT_FDCWD, idmap_src,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are enforcing that the path is a abs dir. If we use AT_FDCWD it will work (if the validation is skipped for some reason), while with this IIRC it will fail if the path is a rel path.

Why is this suggestion better?

@rata
Copy link
Member Author

rata commented Jul 17, 2023

@cyphar Sure, feel free to open a PR and tag me if that works for you. Otherwise, let me know what suggested changes you had in mind

@lifubang
Copy link
Member

lifubang commented Jul 18, 2023

Ah, this was merged while I was reviewing it.

Sorry to interrupt your code review. Maybe you can set this PR's label to status/2-codereview if the label is status/4-merge(This is learned from @thaJeztah ^_^) when you are doing an important code review. Thanks.

By the way, I think all the maintainers should have a msg group in a instant messaging tool If convenient. Then we can announce some important things in this msg group.

@AkihiroSuda
Copy link
Member

msg

Do you have an account in opencontainers.slack.com ?
If you don't have, you can create one in https://communityinviter.com/apps/opencontainers/join-the-oci-community

@lifubang
Copy link
Member

msg

Do you have an account in opencontainers.slack.com ? If you don't have, you can create one in https://communityinviter.com/apps/opencontainers/join-the-oci-community

Has created one account: Lifubang. Can you bring me in runc user group if there is? Thanks.

@AkihiroSuda
Copy link
Member

Done, but not much active currently

@cyphar cyphar mentioned this pull request Mar 14, 2024
dims pushed a commit to dims/libcontainer that referenced this pull request Oct 19, 2024
This was a warning already and it was requested to make this an error
while we will add validation of idmap mounts:
	opencontainers/runc#3717 (comment)

I've also tested a k8s cluster and the config.json generated by
containerd didn't use any relative paths. I tested one pod, so it was
definitely not an extensive test.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

v1.x.x: enforce absolute paths for mounts again
8 participants