Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lossy consistency #56787

Closed
wants to merge 1 commit into from
Closed

Lossy consistency #56787

wants to merge 1 commit into from

Conversation

jnqnfe
Copy link
Contributor

@jnqnfe jnqnfe commented Dec 13, 2018

fixes #56786

this is a change in behaviour

@rust-highfive
Copy link
Collaborator

r? @KodrAus

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 13, 2018
@jnqnfe
Copy link
Contributor Author

jnqnfe commented Dec 13, 2018

edit, the first commit is pull request #56142 which I had to base one minor change on; that pull request has been accepted but not yet merged

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:089c1140:start=1544727477699948191,finish=1544727578310518803,duration=100610570612
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#Pull-Requests-and-Security-Restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-5.0
---
travis_time:start:test_codegen
Check compiletest suite=codegen mode=codegen (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:24:59] 
[01:24:59] running 121 tests
[01:25:02] i..ii...iii..iiii.....i...i..........i..iii.............i.....i......ii...i..i.ii..............i...i 100/121
[01:25:03] i..ii.i.....iiii.....
[01:25:03] 
[01:25:03]  finished in 3.667
[01:25:03] travis_fold:end:test_codegen

---
travis_time:start:test_debuginfo
Check compiletest suite=debuginfo mode=debuginfo-both (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:25:18] 
[01:25:18] running 119 tests
[01:25:44] .iiiii...i.....i..i...i..i.i..i.ii..i.....i..i....i..........iiii.........i.i....i...i.......ii.i.i. 100/119
[01:25:48] i......iii.i.....ii
[01:25:48] 
[01:25:48]  finished in 29.612
[01:25:48] travis_fold:end:test_debuginfo

---
[01:45:05] thread '<unnamed>' panicked at 'owned string', src/libstd/thread/mod.rs:1656:13
[01:45:05] thread '<unnamed>' panicked at 'static string', src/libstd/thread/mod.rs:1642:13
[01:45:05] thread '<unnamed>' panicked at 'Box<Any>', src/libstd/thread/mod.rs:1688:13
[01:45:05] thread '<unnamed>' panicked at 'specified instant was later than self', src/libstd/sys/unix/time.rs:298:17
[01:45:05] ........F.......F...........................................a: Instant { tv_sec: 6487, tv_nsec: 379536891 }
[01:45:05] dur: 92ns
_64-unknown-linux-gnu/release
128420 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu
128416 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

jnqnfe added a commit to jnqnfe/gong that referenced this pull request Dec 14, 2018
On Windows, `std` lossily converts an unpaired surrogate encoding
into a single unicode replacement character, whereas on Unix, and
also via `String`'s raw byte conversion, three unicode replacement
characters are used.

This comes from the fact that different code paths are used which
take different approaches: `std::sys_common::wtf8::to_string_lossy`
for Windows `OsStr` lossy conversion, vs. `core`'s
`run_utf8_validation` function for Unix and for creaing a `String`
from raw bytes.

The inconsistency causes a problem in correct short option argument
byte consumption tracking in our `OsStr` parsing code, due to using
lossy `OsStr` conversion in combination with `std::str::from_utf8`.

A [bug report][1] and [pull request][2] have been filed against `std`.

[1]: rust-lang/rust#56786
[2]: rust-lang/rust#56787
@jnqnfe
Copy link
Contributor Author

jnqnfe commented Dec 14, 2018

update: rebased now that #56142 has been successfully merged

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:248ed1c3:start=1544794194650923229,finish=1544794196186367140,duration=1535443911
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#Pull-Requests-and-Security-Restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-5.0
---
travis_time:start:test_codegen
Check compiletest suite=codegen mode=codegen (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:50:58] 
[00:50:58] running 121 tests
[00:51:00] i..ii...iii..iiii.....i...i..........i..iii.............i.....i......ii...i..i.ii..............i...i 100/121
[00:51:01] i..ii.i.....iiii.....
[00:51:01] 
[00:51:01]  finished in 3.304
[00:51:01] travis_fold:end:test_codegen

---
travis_time:start:test_debuginfo
Check compiletest suite=debuginfo mode=debuginfo-both (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:51:15] 
[00:51:15] running 119 tests
[00:51:37] .iiiii...i.....i..i...i..i.i..i.ii..i.....i..i....i..........iiii.........i.i....i...i.......ii.i.i. 100/119
[00:51:41] i......iii.i.....ii
[00:51:41] 
[00:51:41]  finished in 26.291
[00:51:41] travis_fold:end:test_debuginfo

---
[01:08:24] thread '<unnamed>' panicked at 'owned string', src/libstd/thread/mod.rs:1656:13
[01:08:24] thread '<unnamed>' panicked at 'static string', src/libstd/thread/mod.rs:1642:13
[01:08:24] thread '<unnamed>' panicked at 'Box<Any>', src/libstd/thread/mod.rs:1688:13
[01:08:24] thread '<unnamed>' panicked at 'specified instant was later than self', src/libstd/sys/unix/time.rs:286:17
[01:08:24] ........F.......F...........................................a: Instant { tv_sec: 4203, tv_nsec: 596951049 }
[01:08:24] dur: 135ns
travis_time:start:1ae1ea3a
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
Fri Dec 14 14:38:38 UTC 2018
---
travis_time:end:11749350:start=1544798319788907819,finish=1544798319795240539,duration=6332720
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:0615b694
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

With this commit, lossy UTF-8 conversion of OsStr/OsString on Windows will
output three Unicode replacement characters (U+FFFD), one per byte, for
surrogate byte sequences, instead of just one, making it consistent with
lossy conversion on Unix and with the lossy conversion of raw bytes
sequences.

fixes rust-lang#56786
@jnqnfe
Copy link
Contributor Author

jnqnfe commented Dec 14, 2018

updated: forgot to update the test block at the bottom of the page

@jnqnfe
Copy link
Contributor Author

jnqnfe commented Dec 28, 2018

closing, as per #56786

@jnqnfe jnqnfe closed this Dec 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

utf-8 surrogate lossy conversion inconsistency
3 participants