-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --preserve-file-descriptors=N to create #1320
Conversation
utils_linux.go
Outdated
|
||
if len(r.listenFDs) > 0 && r.preserveFDs > 0 { | ||
return -1, fmt.Errorf("Preserve FDs and socket activation (LISTEN_FDS) are mutually incompatible") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like a bit of a cop out, but I couldn't think of a better way. The only alternative I thought of was to specify that preserveFds preserves N over and above ${LISTEN_FDS:-0}
but that seemed pretty fugly. It seems unlikely to me that one would write an application which uses both socket activation and expects other random open fds to be present upon exec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You an just start by using len(process.ExtraFiles)
, adding 3 stdio and always preserve from what is already in the ExtraFiles
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@crosbymichael thanks, so to check I understand, you are suggesting that --preserve-fds
should be cumulative over SD_LISTEN
(or anything else which extends process.ExtraFiles
in the future). IOW given LISTEN_FDS=3
and --preserve-fds=5
then we would preserve [3, 11)
(in addition to stdio). That's 3, 4 & 5 from LISTEN_FDS=3
and 6, 7, 8, 9 & 10 from --preserve-fds=5
.
I'll do that.
create.go
Outdated
@@ -46,6 +46,10 @@ command(s) that get executed on start, edit the args parameter of the spec. See | |||
Name: "no-new-keyring", | |||
Usage: "do not create a new session keyring for the container. This will cause the container to inherit the calling processes session key", | |||
}, | |||
cli.IntFlag{ | |||
Name: "preserve-file-descriptors", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty long flag name, Maybe go for something like --preserve-fd
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I think I'll go for --preserve-fds
to keep it plural. Maybe --nr-preserve-fds
would be clearer though (but then the length creeps back up...). --nr-keep-fds
?
utils_linux.go
Outdated
|
||
if len(r.listenFDs) > 0 && r.preserveFDs > 0 { | ||
return -1, fmt.Errorf("Preserve FDs and socket activation (LISTEN_FDS) are mutually incompatible") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You an just start by using len(process.ExtraFiles)
, adding 3 stdio and always preserve from what is already in the ExtraFiles
I still think the |
@ijc25 i'm curious, do you have a usecase for this noted down somewhere ? |
@crosbymichael @dqminh basically I'd like to pass an open socket or file fd to my application so I can reduce its privileges to not include opening sockets/files. |
I also don't like -nr, have not seen this in a CLI before. |
Could you either squash or rework the commits? |
cli.IntFlag{ | ||
Name: "preserve-fds", | ||
Usage: "Pass N additional file descriptors to the container (stdio + $LISTEN_FDS + N in total)", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we also add this to run.go? They almost have same options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that would make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They were even similar enough that git show create.go | patch run.go
did the right thing ;-)
edited: I can't see ... I can't see how adding the option to
I see similar messages in some other recent master failures e.g. |
utils_linux.go
Outdated
if len(r.listenFDs) > 0 { | ||
process.Env = append(process.Env, fmt.Sprintf("LISTEN_FDS=%d", len(r.listenFDs)), "LISTEN_PID=1") | ||
process.ExtraFiles = append(process.ExtraFiles, r.listenFDs...) | ||
} | ||
|
||
baseFd := 3 + len(process.ExtraFiles) | ||
for i := baseFd; i < baseFd+r.preserveFDs; i++ { | ||
process.ExtraFiles = append(process.ExtraFiles, os.NewFile(uintptr(i), "PreserveFD")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the name be different for each fd instead of "PreserveFD"? Maybe "PreserveFD:i" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nicer for debugging to do that, It does not really make a difference functionality wise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
This preserves the given number of file descriptors on top of the 3 stdio and the socket activation ($LISTEN_FDS=M) fds. If LISTEN_FDS is not set then [3..3+N) would be preserved by --preserve-fds=N. Given LISTEN_FDS=3 and --preserve-fds=5 then we would preserve fds [3, 11) (in addition to stdio). That's 3, 4 & 5 from LISTEN_FDS=3 and 6, 7, 8, 9 & 10 from --preserve-fds=5. Signed-off-by: Ian Campbell <ian.campbell@docker.com>
@ijc25 It's not a flakiness in the CI, it was a race condition that we've been hitting a lot recently. It's been fixed in #1237. |
@@ -57,6 +57,10 @@ command(s) that get executed on start, edit the args parameter of the spec. See | |||
Name: "no-new-keyring", | |||
Usage: "do not create a new session keyring for the container. This will cause the container to inherit the calling processes session key", | |||
}, | |||
cli.IntFlag{ | |||
Name: "preserve-fds", | |||
Usage: "Pass N additional file descriptors to the container (stdio + $LISTEN_FDS + N in total)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to have this sort of API, I would prefer that we make it more explicit than this. For example, make it so that you have to specify which file descriptors you want to pass (IntSliceFlag
). This would also mean that cases where you don't want to use low file descriptors (see this SQLite option for more reasons why that might matter).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it would look like
runc run --preserve-fd 512 --preserve-fd 1337 ctr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current interface doesn't allow you to pass through fds 0, 1, or 2 in this way so I think the SQLlite reasoning for avoiding those fds doesn't really apply here.
I can take a look switching things to behave like you suggest but it would have to be contingent on Go's os/exec
Cmd.ExtraFiles
supporting the necessary semantics (i.e. a bug slice full of nil
s apart from at offset 512
and 1337
in your example). Personally I don't think it is terribly useful to be able to pass arbitrary fds, it's pretty easy (and arguably more controlled) for the caller to dup2
things into the right place after fork
but before exec
of runc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this i'm not sure defining individual fds is optimal. If you have a large amount of fds to pass you would run out of space for all the --preserve-fd 3 --preserve-fd 4 --preserve-fd 5...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I was a bit surprised that IntSliceFlag
didn't support comma separated lists or even N-M
spans (or if it does that the syntax wasn't more obvious...).
FWIW I did confirm that Cmd.ExtraFiles
looks on first glance to DTRT with nil
in the list, although finalizeNamespace
and CloseExecFrom
would probably need extending to handle a sparse fd space to be absolutely safe.
Thanks the the clarification ! I was wondering why reusing the |
@dqminh ya, and it would be overloading the systemd functionality that people expect with listen_ |
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Expose --preserve-fds option added in opencontainers/runc#1320
This allows extra open fds to be passed through into the container, relying on opencontainers/runc#1320 to actually pass them to the init process. These fds are opened in the order they appear in the ExtraFDs array and begin at fd 3 after stdio. Initially only two kinds of extra fd are supported, /dev/null (mostly as a PoC to have more than one, its not actually very useful) and a named tap device. Other useful types which could be considered might include anything which could be returnable by Go's net.Dial or Fds received as ancillary data over a Unix domain socket. The fds are opened in the main containerd process and passed through to the shim as a simple count. This slightly complicates passing the open gRPC socket fd through to the shim since inserting the extra fds before it means it is no longer fixed at fd 3. The number is now communicated through an environment variable, this seemed to have less potential for confusion than retaining the fd at 3 and then requiring containerd-shim to shuffle the extra fds down one as it does the exec of runc. NB revendoring crosbymichael/go-runc also pulls in db0c87b Remove - from temp process name c098a32 Remove process options from ExecOpts as well as the changes from containerd/go-runc#8. Signed-off-by: Ian Campbell <ian.campbell@docker.com>
This preserves the given number of file descriptors on top of the 3 stdio, that
is, fds [3..3+N) will be passed through.
Since Socket Activation (via $LISTEN_FDS) also claims fd 3 onwards the use of
this new flag is incompatible with socket activation.
Signed-off-by: Ian Campbell ian.campbell@docker.com