Unhelpful Error Messages When Trying to Compile UTF16 Files #73979
Comments
@rustbot claim
There's still a question. Should |
@Stupremee I do not think rustc should parse the file as UTF-16, as this would introduce an ambiguity. I opened this issue only to address the poor UX of the current error messages.
@rustbot release-assignment
Maybe add a help message when the error "unknown start of token: \u{0}" is emitted.
Suggest character encoding is incorrect when encountering random null bytes
This adds a note whenever null bytes are unexpectedly seen at the start of a token, since those tend to come from UTF-16 encoded files without a [BOM](https://en.wikipedia.org/wiki/Byte_order_mark) (if a UTF-16 BOM appears, the file won't be valid UTF-8, but if there is no BOM, the file can be both valid UTF-16 and valid but garbled UTF-8). This approach was suggested in rust-lang#73979 (comment). Closes rust-lang#73979.
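As a rough illustration of why BOM-less UTF-16 slips through as valid UTF-8, and of the kind of heuristic discussed in this thread, here is a minimal sketch. It is not rustc's implementation: the names `EncodingGuess`, `to_utf16le_no_bom`, and `guess_utf16` are hypothetical, and the "at least half of the code units contain a null byte" threshold is an arbitrary assumption that only works well for ASCII-heavy source files.

```rust
/// Hypothetical result type for this sketch; not a rustc type.
#[derive(Debug, PartialEq)]
enum EncodingGuess {
    Utf16Le,
    Utf16Be,
    Unknown,
}

/// Encode `src` as UTF-16LE bytes without a BOM, mimicking an editor
/// that saved the source file in the wrong encoding.
fn to_utf16le_no_bom(src: &str) -> Vec<u8> {
    src.encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect()
}

/// Guess whether `bytes` is UTF-16 text. A BOM decides directly;
/// otherwise the positions of null bytes are used as a hint.
fn guess_utf16(bytes: &[u8]) -> EncodingGuess {
    // A UTF-16 BOM is never valid UTF-8, so it is unambiguous.
    if bytes.starts_with(&[0xFF, 0xFE]) {
        return EncodingGuess::Utf16Le;
    }
    if bytes.starts_with(&[0xFE, 0xFF]) {
        return EncodingGuess::Utf16Be;
    }

    let units = bytes.len() / 2;
    if units == 0 {
        return EncodingGuess::Unknown;
    }

    // ASCII characters in UTF-16 carry a null in their high byte:
    // the odd index for little-endian, the even index for big-endian.
    let even_nulls = bytes.iter().step_by(2).filter(|&&b| b == 0).count();
    let odd_nulls = bytes.iter().skip(1).step_by(2).filter(|&&b| b == 0).count();

    // Assumed threshold: at least half of the code units have a null
    // high byte, and no nulls appear in the other position.
    if odd_nulls * 2 >= units && even_nulls == 0 {
        EncodingGuess::Utf16Le
    } else if even_nulls * 2 >= units && odd_nulls == 0 {
        EncodingGuess::Utf16Be
    } else {
        EncodingGuess::Unknown
    }
}
```

Passing the output of `to_utf16le_no_bom("fn main() {}")` to `std::str::from_utf8` succeeds, because every byte is either ASCII or `\0`. The lexer therefore sees one stray null byte per character, which is consistent with the flood of `unknown start of token: \u{0}` errors described in the report.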
Saving a Hello World program as UTF-16:
and trying to compile it causes about as many errors as there are characters in the file, each complaining of
unknown start of token: \u{0}
Using some heuristics to determine that the file is saved as UTF-16 and printing a more helpful error message instead would be much friendlier to new users, who are the most likely to run into this issue.
Meta
rustc --version --verbose: