subprocess documentation incorrectly implies text-mode translation options apply to files #99864

Open
cemysce opened this issue Nov 29, 2022 · 1 comment
Labels
docs Documentation in the Doc dir topic-subprocess Subprocess issues.

Comments

@cemysce
Contributor

cemysce commented Nov 29, 2022

Documentation

The documentation for the subprocess module (I'm looking at both 3.8 and 3.11) makes the following misleading or inaccurate claims (excerpts taken from 3.11, but similar statements exist in both):

  • Under subprocess.run (3.8, 3.11):
    • If encoding or errors are specified, or text is true, file objects for stdin, stdout and stderr are opened in text mode using the specified encoding and errors or the io.TextIOWrapper default. The universal_newlines argument is equivalent to text and is provided for backwards compatibility. By default, file objects are opened in binary mode.
  • Under "Frequently Used Arguments" (3.8, 3.11):
    • If encoding or errors are specified, or text (also known as universal_newlines) is true, the file objects stdin, stdout and stderr will be opened in text mode using the encoding and errors specified in the call or the defaults for io.TextIOWrapper.
    • For stdin, line ending characters '\n' in the input will be converted to the default line separator os.linesep. For stdout and stderr, all line endings in the output will be converted to '\n'. For more information see the documentation of the io.TextIOWrapper class when the newline argument to its constructor is None.
    • If text mode is not used, stdin, stdout and stderr will be opened as binary streams. No encoding or line ending conversion is performed.

In reality, even if text mode is enabled (as required by the above statements) these translations only occur for a stream (stdin, stdout, or stderr) if that stream is specified as or implied to be subprocess.PIPE. It's not clear to a reader that if they pass a file object (for instance a TextIOWrapper) for a stream, the encapsulation (i.e. translation features) provided by that file object is completely circumvented because subprocess actually just passes the underlying file descriptor directly to the child process.
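A minimal sketch of the behavior described above, assuming a hypothetical child (`CHILD`) that writes one `'\n'`-terminated line to stderr as raw bytes. The helper names are made up for illustration. The parent's own write goes through the `TextIOWrapper` and is translated, while the child's output reaches the file untouched even though `text=True` is passed:

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical child: writes a single "\n"-terminated line to stderr as raw bytes.
CHILD = r"import sys; sys.stderr.buffer.write(b'hello\n')"


def run_child_into_text_file(path):
    """Direct the child's stderr to a text file opened with newline='\\r\\n'."""
    # newline="\r\n" asks the TextIOWrapper to write every "\n" as "\r\n".
    with open(path, "w", newline="\r\n") as f:
        f.write("header\n")  # written through the wrapper, so it is translated
        f.flush()            # flush so the child's output lands after the header
        # text=True has no effect on stderr here: the child gets the raw fd.
        subprocess.run([sys.executable, "-c", CHILD], stderr=f, text=True, check=True)
    with open(path, "rb") as f:
        return f.read()


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return run_child_into_text_file(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())  # b'header\r\nhello\n'
```

Only the `header` line carries `'\r\n'`; the child's `hello` line keeps its bare `'\n'`.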

There are also issues with phrases like "file objects ... stdin, stdout and stderr ... opened in text mode" (variations of this appear a few times):

  • This phrase is an oversimplification, and consequently it's confusing. It's not the passed-in file objects that are opened in text mode; it's the file objects constructed within Popen.__init__ (which wrap internally constructed pipes, not the passed-in files). But this is an implementation detail, and the documentation shouldn't get bogged down in trying to explain it. Instead it should simply state that these translation options apply only to streams specified as subprocess.PIPE, and that in that case (and only in text mode) the pipes are read and written using a TextIOWrapper with the specified encoding and errors parameters (or TextIOWrapper's defaults for those parameters).
  • The stream names (stdin, etc.) are not italicized, and it's not clear if this is a mistake or intentional. If it's a mistake, they should be italicized. If it's intentional, then the statements should be rephrased to clarify exactly what they refer to (and perhaps addressing the bullet point above will address this one as well).
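To illustrate the wording proposed above, here is a small sketch in which the stream is a `subprocess.PIPE`, so the text-mode options do apply. The child script and the `capture` helper are hypothetical and exist only for this example:

```python
import subprocess
import sys

# Hypothetical child: writes one "\r\n"-terminated UTF-8 line to stdout as raw bytes.
CHILD = r"import sys; sys.stdout.buffer.write(b'caf\xc3\xa9\r\n')"


def capture(**kwargs):
    """Run the child with stdout as a pipe and return whatever was captured."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        check=True,
        **kwargs,
    )
    return result.stdout


if __name__ == "__main__":
    print(repr(capture()))                  # binary mode: bytes, no translation
    print(repr(capture(encoding="utf-8")))  # text mode: decoded str, "\r\n" -> "\n"
```

In binary mode the captured value is the bytes exactly as the child wrote them; with `encoding` given, the pipe is read in text mode, so the result is a decoded `str` with the line ending normalized to `'\n'`.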

Just to be clear, I think the actual implementation of subprocess is totally reasonable. To honor a file object's translations, subprocess would have to use a pipe under the hood. If stdin was passed a file object, subprocess would have to read from the file object, performing the appropriate translations, then write to a pipe and pass the pipe's read fd as the child's stdin. And if stdout/stderr was passed a file object, subprocess would have to pass a pipe's write fd as the child's stdout/stderr, then read from that pipe and perform translations when writing to the file object. All this functionality is best implemented by the application itself.
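For the stdout/stderr direction, an application-side version might look roughly like the following. This is a sketch, not something subprocess provides: `run_with_translated_stderr`, `CHILD`, and `demo` are hypothetical names, and only a single stream is piped (handling several pipes at once would need threads or `communicate()` to avoid deadlocks).

```python
import io
import subprocess
import sys

# Hypothetical child: writes one "\n"-terminated line to stderr as raw bytes.
CHILD = r"import sys; sys.stderr.buffer.write(b'warning: x\n')"


def run_with_translated_stderr(args, sink):
    """Run args and forward the child's stderr through sink, a text file object.

    The pipe is read in text mode, and each line is written via sink.write(),
    so sink's own newline and encoding settings are applied.
    """
    with subprocess.Popen(args, stderr=subprocess.PIPE, encoding="utf-8") as proc:
        for line in proc.stderr:
            sink.write(line)
    # Leaving the with-block closes the pipe and waits for the child.
    return proc.returncode


def demo():
    raw = io.BytesIO()
    # Stand-in for a real file opened with open(path, "w", newline="\r\n").
    sink = io.TextIOWrapper(raw, encoding="utf-8", newline="\r\n")
    returncode = run_with_translated_stderr([sys.executable, "-c", CHILD], sink)
    sink.flush()
    return returncode, raw.getvalue()


if __name__ == "__main__":
    print(demo())  # (0, b'warning: x\r\n')
```

Because the data passes through `sink.write()`, the child's bare `'\n'` comes out as `'\r\n'` in the destination.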

My issue is just about the documentation, which implies the above functionality when in fact all translation is bypassed.

@cemysce cemysce added the docs Documentation in the Doc dir label Nov 29, 2022
@cemysce
Contributor Author

cemysce commented Nov 29, 2022

FYI I came across this issue because I was investigating a program I'm calling with subprocess.run on Windows, whose stderr I'm directing to a file. (It happens to be a Python program, though I don't think that actually matters here.) The program has its own bug, whereby it prints to stderr with '\n' line terminators, regardless of the platform on which it is running.

So I was expecting to be able to pass the file object (which is actually a TextIOWrapper with an appropriate value for newline) to subprocess.run and have the line endings seamlessly translated.

@vstinner vstinner added the topic-subprocess Subprocess issues. label Jul 8, 2024