Pebble exec, when in enforced interactive mode, is prone to losing some of the command output #306

dmitry-lyfar · 2023-09-14T03:39:30Z

If one enforces the pebble exec interactive mode when executing a command and stdin returns EOF sooner than reading from the commands stdout ends, it may result in some output lost. Consider:

> pebble exec -i -t -- tee < sometextfile
> pebble exec -i -t -- ls </dev/null

In both cases, the command's output will either be nonexistent or partial. This is likely due to the client closing the websocket connection with the server (/v1/tasks/<task-id>/websocket/stdio) before the command's stdout is fully read. In the interactive mode, stdio websocket is used to redirect both stdin and stdout, hence, if the client's stdin go routine sends the end message to the websocket server it affects the routine reading stdout (see the line 68):

pebble/internals/wsutil/wsutil.go

Lines 54 to 70 in 6134298

    
           go func(conn MessageWriter, r io.Reader) { 
        
           	in := ReaderToChannel(r, bufferSize) 
        
           	for { 
        
           		buf, ok := <-in 
        
           		if !ok { 
        
           			break 
        
           		} 
        
           		err := conn.WriteMessage(websocket.BinaryMessage, buf) 
        
           		if err != nil { 
        
           			logger.Debugf("Got err writing %s", err) 
        
           			break 
        
           		} 
        
           	} 
        
           	conn.WriteMessage(websocket.TextMessage, endCommandJSON) 
        
           	close(ch) // NOTE(benhoyt): this was "ch <- true", but that can block 
        
           }(conn, r)

If sending the end message is delayed until the stdout routine reads all the produced output, it resolves the issue.

The text was updated successfully, but these errors were encountered:

benhoyt · 2023-09-14T22:37:16Z

Thanks for the report, @dmitry-lyfar. Just for the record, the three different cases I've seen are as follows (with the first being the most common):

$ PEBBLE=~/pebble ./pebble exec -i -t -- tee <go.mod
[exit code 129]

$ PEBBLE=~/pebble ./pebble exec -i -t -- tee <go.mod
error: cannot perform the following tasks:
- exec command "tee" (fork/exec /usr/bin/tee: input/output error)
[exit code 1]

$ PEBBLE=~/pebble ./pebble exec -i -t -- tee <go.mod
module github.com/canonical/pebble

go 1.20

require (
	github.com/canonical/go-flags v0.0.0-20230403090104-105d09a091b8
	github.com/canonical/x-go v0.0.0-20
[exit code 129]

dmitry-lyfar · 2024-06-04T22:57:57Z

@benhoyt This is the relevant issue from LXD that I mentioned to you before: canonical/lxd#12285. Also, please see this comment on the GetPollRevents: canonical/lxd#12210 (comment).

dmitry-lyfar · 2024-08-02T00:54:20Z

Extending the list of the known exit codes (no output):

go run ./cmd/pebble exec -i -t -- tee < HACKING.md
exit status 137

dmitry-lyfar · 2024-08-02T07:05:22Z

@benhoyt @IronCore864 I think I narrowed the issue to a POC that potentially solves it. The root problem is that by using -i we enforce the server to use a single master pty descriptor to read and write data from/to the client. If the descriptor is closed at the time Pebble finishes reading stdin (but not writing to stdout), we get the issues above.

Pebble starts two go routines that mirror stdin/stdout from the client.

Read and mirror stdout from master to the websocket (over to the client).
Read stdin from the websocket and send it to master.

So, on:

go run ./cmd/pebble exec -i -- tee < ./HACKING.md

the go routine from p. 2 reads the input entirely, writes it to the master, sees that there is an EOF (actually, "end" from the websocket) and closes the master descriptor. The go routine from p.1 is still working and reading the output from the master descriptor to mirror it to the websocket. Given that it has just been closed, it detects POLINVAL and exists even if there is data to read.

This is not a problem when the interactive mode is set automatically due to the command being run in the terminal as the reading go routine (p.2) would only return when a user terminates the process somehow.

The suggested solution is simple -- write EOF to the pty to indicate the end of stdin to the process. This way the examples using tee would terminate correctly upon writing the remaining output and exiting due to EOF. I see that there are multiple places where the master descriptor can be closed (i.e. in MirrorToWebsocket and in afterClosers). I could not identify any scenarios where this solution would leak a file descriptor.

See the POC #466.

benhoyt · 2024-08-05T04:57:33Z

This is a nice find and fix, thanks a lot @dmitry-lyfar! I'll look at the change a bit more closely in context in the next few days. It certainly seems to work nicely at first blush.

IronCore864 · 2024-08-05T08:36:46Z

@benhoyt @IronCore864 I think I narrowed the issue to a POC that potentially solves it. The root problem is that by using -i we enforce the server to use a single master pty descriptor to read and write data from/to the client. If the descriptor is closed at the time Pebble finishes reading stdin (but not writing to stdout), we get the issues above.

Pebble starts two go routines that mirror stdin/stdout from the client.

Read and mirror stdout from master to the websocket (over to the client).

Read stdin from the websocket and send it to master.

So, on:
go run ./cmd/pebble exec -i -- tee < ./HACKING.md
the go routine from p. 2 reads the input entirely, writes it to the master, sees that there is an EOF (actually, "end" from the websocket) and closes the master descriptor. The go routine from p.1 is still working and reading the output from the master descriptor to mirror it to the websocket. Given that it has just been closed, it detects POLINVAL and exists even if there is data to read.

This is not a problem when the interactive mode is set automatically due to the command being run in the terminal as the reading go routine (p.2) would only return when a user terminates the process somehow.

The suggested solution is simple -- write EOF to the pty to indicate the end of stdin to the process. This way the examples using tee would terminate correctly upon writing the remaining output and exiting due to EOF. I see that there are multiple places where the master descriptor can be closed (i.e. in MirrorToWebsocket and in afterClosers). I could not identify any scenarios where this solution would leak a file descriptor.

See the POC #466.

Thanks @dmitry-lyfar for the diagnosis and the PoC.

Before this, I had been trying to fix it using various methods, including porting the fix from the LXD PR you mentioned, but without success.

After reading this PoC it came to me that the close of the master descriptor here is premature; it will be closed in the afterClosers.

I tested it with long inputs like the tee command and short ones like a simple ls, and both worked as expected. Let's merge your PR.

benhoyt · 2024-08-06T04:55:25Z

Once again, much appreciated, @dmitry-lyfar.

benhoyt added the 24.10 label Mar 13, 2024

benhoyt mentioned this issue Aug 6, 2024

fix(overlord): send EOF to master pty in interactive mode #466

Merged

benhoyt closed this as completed in #466 Aug 6, 2024

benhoyt closed this as completed in dcd37f3 Aug 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pebble exec, when in enforced interactive mode, is prone to losing some of the command output #306

Pebble exec, when in enforced interactive mode, is prone to losing some of the command output #306

dmitry-lyfar commented Sep 14, 2023

benhoyt commented Sep 14, 2023 •

edited

Loading

dmitry-lyfar commented Jun 4, 2024 •

edited

Loading

dmitry-lyfar commented Aug 2, 2024

dmitry-lyfar commented Aug 2, 2024

benhoyt commented Aug 5, 2024

IronCore864 commented Aug 5, 2024

benhoyt commented Aug 6, 2024

Pebble exec, when in enforced interactive mode, is prone to losing some of the command output #306

Pebble exec, when in enforced interactive mode, is prone to losing some of the command output #306

Comments

dmitry-lyfar commented Sep 14, 2023

benhoyt commented Sep 14, 2023 • edited Loading

dmitry-lyfar commented Jun 4, 2024 • edited Loading

dmitry-lyfar commented Aug 2, 2024

dmitry-lyfar commented Aug 2, 2024

benhoyt commented Aug 5, 2024

IronCore864 commented Aug 5, 2024

benhoyt commented Aug 6, 2024

benhoyt commented Sep 14, 2023 •

edited

Loading

dmitry-lyfar commented Jun 4, 2024 •

edited

Loading