Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pebble exec, when in enforced interactive mode, is prone to losing some of the command output #306

Closed
dmitry-lyfar opened this issue Sep 14, 2023 · 7 comments · Fixed by #466

Comments

@dmitry-lyfar
Copy link
Contributor

If one enforces the pebble exec interactive mode when executing a command and stdin returns EOF sooner than reading from the commands stdout ends, it may result in some output lost. Consider:

> pebble exec -i -t -- tee < sometextfile
> pebble exec -i -t -- ls </dev/null

In both cases, the command's output will either be nonexistent or partial. This is likely due to the client closing the websocket connection with the server (/v1/tasks/<task-id>/websocket/stdio) before the command's stdout is fully read. In the interactive mode, stdio websocket is used to redirect both stdin and stdout, hence, if the client's stdin go routine sends the end message to the websocket server it affects the routine reading stdout (see the line 68):

go func(conn MessageWriter, r io.Reader) {
in := ReaderToChannel(r, bufferSize)
for {
buf, ok := <-in
if !ok {
break
}
err := conn.WriteMessage(websocket.BinaryMessage, buf)
if err != nil {
logger.Debugf("Got err writing %s", err)
break
}
}
conn.WriteMessage(websocket.TextMessage, endCommandJSON)
close(ch) // NOTE(benhoyt): this was "ch <- true", but that can block
}(conn, r)

If sending the end message is delayed until the stdout routine reads all the produced output, it resolves the issue.

@benhoyt
Copy link
Contributor

benhoyt commented Sep 14, 2023

Thanks for the report, @dmitry-lyfar. Just for the record, the three different cases I've seen are as follows (with the first being the most common):

$ PEBBLE=~/pebble ./pebble exec -i -t -- tee <go.mod
[exit code 129]

$ PEBBLE=~/pebble ./pebble exec -i -t -- tee <go.mod
error: cannot perform the following tasks:
- exec command "tee" (fork/exec /usr/bin/tee: input/output error)
[exit code 1]

$ PEBBLE=~/pebble ./pebble exec -i -t -- tee <go.mod
module github.com/canonical/pebble

go 1.20

require (
	github.com/canonical/go-flags v0.0.0-20230403090104-105d09a091b8
	github.com/canonical/x-go v0.0.0-20
[exit code 129]

@benhoyt benhoyt added the 24.10 label Mar 13, 2024
@dmitry-lyfar
Copy link
Contributor Author

dmitry-lyfar commented Jun 4, 2024

@benhoyt This is the relevant issue from LXD that I mentioned to you before: canonical/lxd#12285. Also, please see this comment on the GetPollRevents: canonical/lxd#12210 (comment).

@dmitry-lyfar
Copy link
Contributor Author

Extending the list of the known exit codes (no output):

go run ./cmd/pebble exec -i -t -- tee < HACKING.md
exit status 137

@dmitry-lyfar
Copy link
Contributor Author

@benhoyt @IronCore864 I think I narrowed the issue to a POC that potentially solves it. The root problem is that by using -i we enforce the server to use a single master pty descriptor to read and write data from/to the client. If the descriptor is closed at the time Pebble finishes reading stdin (but not writing to stdout), we get the issues above.

Pebble starts two go routines that mirror stdin/stdout from the client.

  1. Read and mirror stdout from master to the websocket (over to the client).
  2. Read stdin from the websocket and send it to master.

So, on:

go run ./cmd/pebble exec -i -- tee < ./HACKING.md

the go routine from p. 2 reads the input entirely, writes it to the master, sees that there is an EOF (actually, "end" from the websocket) and closes the master descriptor. The go routine from p.1 is still working and reading the output from the master descriptor to mirror it to the websocket. Given that it has just been closed, it detects POLINVAL and exists even if there is data to read.

This is not a problem when the interactive mode is set automatically due to the command being run in the terminal as the reading go routine (p.2) would only return when a user terminates the process somehow.

The suggested solution is simple -- write EOF to the pty to indicate the end of stdin to the process. This way the examples using tee would terminate correctly upon writing the remaining output and exiting due to EOF. I see that there are multiple places where the master descriptor can be closed (i.e. in MirrorToWebsocket and in afterClosers). I could not identify any scenarios where this solution would leak a file descriptor.

See the POC #466.

@benhoyt
Copy link
Contributor

benhoyt commented Aug 5, 2024

This is a nice find and fix, thanks a lot @dmitry-lyfar! I'll look at the change a bit more closely in context in the next few days. It certainly seems to work nicely at first blush.

@IronCore864
Copy link
Contributor

@benhoyt @IronCore864 I think I narrowed the issue to a POC that potentially solves it. The root problem is that by using -i we enforce the server to use a single master pty descriptor to read and write data from/to the client. If the descriptor is closed at the time Pebble finishes reading stdin (but not writing to stdout), we get the issues above.

Pebble starts two go routines that mirror stdin/stdout from the client.

  1. Read and mirror stdout from master to the websocket (over to the client).
  2. Read stdin from the websocket and send it to master.

So, on:

go run ./cmd/pebble exec -i -- tee < ./HACKING.md

the go routine from p. 2 reads the input entirely, writes it to the master, sees that there is an EOF (actually, "end" from the websocket) and closes the master descriptor. The go routine from p.1 is still working and reading the output from the master descriptor to mirror it to the websocket. Given that it has just been closed, it detects POLINVAL and exists even if there is data to read.

This is not a problem when the interactive mode is set automatically due to the command being run in the terminal as the reading go routine (p.2) would only return when a user terminates the process somehow.

The suggested solution is simple -- write EOF to the pty to indicate the end of stdin to the process. This way the examples using tee would terminate correctly upon writing the remaining output and exiting due to EOF. I see that there are multiple places where the master descriptor can be closed (i.e. in MirrorToWebsocket and in afterClosers). I could not identify any scenarios where this solution would leak a file descriptor.

See the POC #466.

Thanks @dmitry-lyfar for the diagnosis and the PoC.

Before this, I had been trying to fix it using various methods, including porting the fix from the LXD PR you mentioned, but without success.

After reading this PoC it came to me that the close of the master descriptor here is premature; it will be closed in the afterClosers.

I tested it with long inputs like the tee command and short ones like a simple ls, and both worked as expected. Let's merge your PR.

@benhoyt
Copy link
Contributor

benhoyt commented Aug 6, 2024

Once again, much appreciated, @dmitry-lyfar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants