Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modifications to std::io::Stdin on Windows so that there is no longer a 4-byte buffer minimum in read(). #91754

Merged

Conversation

Patrick-Poitras
Copy link
Contributor

@Patrick-Poitras Patrick-Poitras commented Dec 10, 2021

This is an attempted fix of issue #91722, where a too-small buffer was passed to the read function of stdio on Windows. This caused an error to be returned when read_to_end or read_to_string were called. Both delegate to std::io::default_read_to_end, which creates a buffer that is of length >0, and forwards it to std::io::Stdin::read(). The latter method returns an error if the length of the buffer is less than 4, as there might not be enough space to allocate a UTF-16 character. This creates a problem when the buffer length is in 0 < N < 4, causing the bug.

The current modification creates an internal buffer, much like the one used for the write functions

I'd also like to acknowledge the help of @agausmann and @hkratz in detecting and isolating the bug, and for suggestions that made the fix possible.

Couple disclaimers:

  • Firstly, I didn't know where to put code to replicate the bug found in the issue. It would probably be wise to add that case to the testing suite, but I'm afraid that I don't know where that test should be added.
  • Secondly, the code is fairly fundamental to IO operations, so my fears are that this may cause some undesired side effects or performance loss in benchmarks. The testing suite runs on my computer, and it does fix the issue noted in read_to_end / read_to_string returns Invalid Input ("buffer too small") on Windows MSVC #91722.
  • Thirdly, I left the "surrogate" field in the Stdin struct, but from a cursory glance, it seems to be serving the same purpose for other functions. Perhaps merging the two would be appropriate.

Finally, this is my first pull request to the rust language, and as such some things may be weird/unidiomatic/plain out bad. If there are any obvious improvements I could do to the code, or any other suggestions, I would appreciate them.

Edit: Closes #91722

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Mark-Simulacrum (or someone else) soon.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 10, 2021
@rust-log-analyzer

This comment has been minimized.

@pitdicker
Copy link
Contributor

I am not sure if there really is a bug here. Please see #58454 for a bit of history.

See this comment as a note for the reviewer, I currently don't have the time to form a real opinion.

@Patrick-Poitras
Copy link
Contributor Author

@agausmann I've made the two changes you requested above. Thanks so much!

On a sidenote, I'm trying to write a test that replicates the behavior, but I think the code is going around any sort of testing by checking if console mode is enabled. This is probably why automated testing didn't catch it before. (They're all probably using pipes, or not console mode)

@agausmann
Copy link
Contributor

agausmann commented Dec 10, 2021

@pitdicker

I am not sure if there really is a bug here.

The current behavior makes read_to_string and read_to_vec non-portable in a non-intuitive and undocumented way - they have unreliable behavior in Windows console environments and will return an error for very specific forms of input (though it is not that strange or uncommon) that would be fine on other platforms.

And I agree about what was said in #91722 (comment) about windows stdio - it ought to be liberal in what it accepts, specifically it should be able to handle all buffer sizes:

The io::Read implementation for Stdin should conform to the behavior specified in the documentation of the Read trait. Essentially calling read() with a buf of 0 < length < 4 should work and return (at least) Ok(1) with one byte of the UTF-8 sequence if EOF has not been reached.

@Patrick-Poitras Patrick-Poitras marked this pull request as ready for review December 10, 2021 20:50
@Patrick-Poitras
Copy link
Contributor Author

Patrick-Poitras commented Dec 10, 2021

I agree that we should conform to the specifications of the Read function. That was my main reason for favoring this implementation over modifying the caller to catch the errors it would throw, which I originally favored.

I've also changed this from a draft to an open pull request, after doing enough manual testing that I've convinced myself that the behavior is correct. I've still been unable to find where in the code this path gets tested, or an alternative method for testing it.

Finally, I've removed the request for a benchmark from the main post. Piped-in IO short circuits the modified part of the code so I don't see this being a problem that would bench unfavorably on non-console mode path. The console mode path would only be slowed down if the length is less than 4, which currently returns an error, so I believe that that covers any potential performance decreases.

@agausmann
Copy link
Contributor

agausmann commented Dec 11, 2021

but I think the code is going around any sort of testing by checking if console mode is enabled

Yeah, this section of code is only used when the I/O device is a handle to a Windows "console". I don't know much about this, but as I understand it, Rust converts the programmer's (UTF-8) sequence of bytes to/from a UTF-16 buffer which is needed by the low-level ReadConsole/WriteConsole calls.

It seems like there has been difficulty making automated tests for this in the past - e.g. in #58454, they say they "did some manual testing of reading and writing to console" (emphasis mine).

It may be possible to build a test harness for Windows consoles using pseudoconsoles? That might deserve its own issue, but regardless, I think it is worth looking into.

Copy link
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks okay -- left a couple small comments to clean the code up a little.

Test-wise I suspect a run-pass test which looks something like https://github.com/rust-lang/rust/blob/master/src/test/ui/issues/issue-13304.rs might work well, though there's not currently access to raw stdin. Maybe we can expose a doc(hidden) and unstable helper to get access to the raw stdin to test it...

library/std/src/sys/windows/stdio.rs Outdated Show resolved Hide resolved
library/std/src/sys/windows/stdio.rs Outdated Show resolved Hide resolved
library/std/src/sys/windows/stdio.rs Outdated Show resolved Hide resolved
library/std/src/sys/windows/stdio.rs Outdated Show resolved Hide resolved
library/std/src/sys/windows/stdio.rs Outdated Show resolved Hide resolved
@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 14, 2021
@ChrisDenton
Copy link
Member

Maybe we can expose a doc(hidden) and unstable helper to get access to the raw stdin to test it...

You can get the Windows console input handle by opening the special device \\.\CONIN$. See CreateFileW#Consoles

@apiraino apiraino added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Dec 14, 2021
@rust-log-analyzer

This comment has been minimized.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 14, 2021
@Mark-Simulacrum
Copy link
Member

Please squash commits, and one more nit -- then we can move ahead.

Maybe we can expose a doc(hidden) and unstable helper to get access to the raw stdin to test it...

You can get the Windows console input handle by opening the special device \\.\CONIN$. See CreateFileW#Consoles

This'll be a good start to getting the stdin for the child process right, thanks!

We will need the raw wrapper provided by std, in a UI test or similar, which means externally from std so need something ~public. I can help out directly with plumbing the bits of std out if that would be helpful, or provide some additional instructions beyond what I previously said.

@Patrick-Poitras
Copy link
Contributor Author

@Mark-Simulacrum All done!

@Mark-Simulacrum
Copy link
Member

Alright. I'll try to throw together a UI test skeleton and push it up in the next few days, though I don't have easy access to a windows machine to test on right now.

Thanks!

@Mark-Simulacrum
Copy link
Member

@ChrisDenton hm, is there a way to write to that windows console input so we can read to it from the child? My rough assumption is that this isn't necessarily the same as regular stdin for processes spawned through std::process, but maybe I'm wrong about that?

It's fairly easy to imagine exposing the read impl so we can directly test this, but to make that work well we'd need to manufacture a console handle with information to read from, which I'm not sure how to do.

FWIW, I'd be okay moving ahead here if we could verify manually that this works, even if a UI test is difficult to pull together.

@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jan 1, 2022
@bors
Copy link
Contributor

bors commented Jan 2, 2022

⌛ Testing commit d49d1d4 with merge 007f52dba2161c8de49b0905c05ecefde0448e21...

@rust-log-analyzer
Copy link
Collaborator

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

@bors
Copy link
Contributor

bors commented Jan 2, 2022

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jan 2, 2022
@matthiaskrgr
Copy link
Member

@bors retry apple runner did not start...?

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 2, 2022
@bors
Copy link
Contributor

bors commented Jan 2, 2022

⌛ Testing commit d49d1d4 with merge 51288ba26d420cbfbfc8f9382fb5fe1e45752167...

@bors
Copy link
Contributor

bors commented Jan 2, 2022

💥 Test timed out

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jan 2, 2022
@Patrick-Poitras
Copy link
Contributor Author

@matthiaskrgr Same runner on apple timed out. I checked the output while it was running and it was the last build remaining about 3 hours ago. The apple build is running significantly slower than other builds for no apparent reason. (Also this change is Windows-only, so unless something weird is happening, I don't think it's got anything to do with this PR)

@rust-log-analyzer
Copy link
Collaborator

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

@lqd
Copy link
Member

lqd commented Jan 3, 2022

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 3, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jan 3, 2022
…dio-windows, r=Mark-Simulacrum

Modifications to `std::io::Stdin` on Windows so that there is no longer a 4-byte buffer minimum in read().

This is an attempted fix of issue rust-lang#91722, where a too-small buffer was passed to the read function of stdio on Windows. This caused an error to be returned when `read_to_end` or `read_to_string` were called. Both delegate to `std::io::default_read_to_end`, which creates a buffer that is of length >0, and forwards it to `std::io::Stdin::read()`. The latter method returns an error if the length of the buffer is less than 4, as there might not be enough space to allocate a UTF-16 character. This creates a problem when the buffer length is in `0 < N < 4`, causing the bug.

The current modification creates an internal buffer, much like the one used for the write functions

I'd also like to acknowledge the help of `@agausmann` and `@hkratz` in detecting and isolating the bug, and for suggestions that made the fix possible.

Couple disclaimers:

- Firstly, I didn't know where to put code to replicate the bug found in the issue. It would probably be wise to add that case to the testing suite, but I'm afraid that I don't know _where_ that test should be added.
- Secondly, the code is fairly fundamental to IO operations, so my fears are that this may cause some undesired side effects ~or performance loss in benchmarks.~ The testing suite runs on my computer, and it does fix the issue noted in rust-lang#91722.
- Thirdly, I left the "surrogate" field in the Stdin struct, but from a cursory glance, it seems to be serving the same purpose for other functions. Perhaps merging the two would be appropriate.

Finally, this is my first pull request to the rust language, and as such some things may be weird/unidiomatic/plain out bad. If there are any obvious improvements I could do to the code, or any other suggestions, I would appreciate them.

Edit: Closes rust-lang#91722
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 4, 2022
…askrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#91754 (Modifications to `std::io::Stdin` on Windows so that there is no longer a 4-byte buffer minimum in read().)
 - rust-lang#91884 (Constify `Box<T, A>` methods)
 - rust-lang#92107 (Actually set IMAGE_SCN_LNK_REMOVE for .rmeta)
 - rust-lang#92456 (Make the documentation of builtin macro attributes accessible)
 - rust-lang#92507 (Suggest single quotes when char expected, str provided)
 - rust-lang#92525 (intra-doc: Make `Receiver::into_iter` into a clickable link)
 - rust-lang#92532 (revert rust-lang#92254 "Bump gsgdt to 0.1.3")

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 50a66b7 into rust-lang:master Jan 4, 2022
@rustbot rustbot added this to the 1.59.0 milestone Jan 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

read_to_end / read_to_string returns Invalid Input ("buffer too small") on Windows MSVC