lexer: improve errors #9308

ben0x539 · 2013-09-19T01:13:27Z

Previously, the lexer calling rdr.fatal(...) would report the span of
the last complete token, instead of a span within the erroneous token
(besides one span fixed in 1ac90bb).

This branch adds wrappers around rdr.fatal(...) that sets the span
explicilty, so that all fatal errors in libsyntax/parse/lexer.rs now
report the offending code more precisely. A number of tests try to
verify that, though the compile-fail testing setup can only check that
the spans are on the right lines, and the "unterminated string/block
comment" errors can't have the line marked at all, so that's incomplete.

This closes #9149.

Also, the lexer errors now report the offending code in the error message,
not just via the span, just like other errors do.

ben0x539 · 2013-09-19T14:43:57Z

Added two more commits so that the error messages always mention the offending code in a hopefully helpful format.

Previously, the lexer calling `rdr.fatal(...)` would report the span of the last complete token, instead of a span within the erroneous token (besides one span fixed in 1ac90bb). This commit adds a wrapper around `rdr.fatal(...)` that sets the span explicilty, so that all fatal errors in `libsyntax/parse/lexer.rs` now report the offending code more precisely. A number of tests try to verify that, though the `compile-fail` testing setup can only check that the spans are on the right lines, and the "unterminated string/block comment" errors can't have the line marked at all, so that's incomplete. Closes rust-lang#9149.

... instead of giving their numeric codepoint, following the lead of fdaae34. So the error message for, say, '\_' mentions _ instead of 95, and '\●' now mentions \u25cf.

alexcrichton · 2013-09-19T18:58:32Z

src/libsyntax/parse/lexer.rs

@@ -327,7 +367,8 @@ fn consume_block_comment(rdr: @mut StringReader)
            bump(rdr);
        }
        if is_eof(rdr) {
-            rdr.fatal(~"unterminated block doc-comment");
+            fatal_span_verbose(rdr, start_bpos, rdr.last_pos,


Does this mean that the entire comment will be included in the error message?

It'd be nice to see a test for this.

This also may be subsumed by my comment above.

ben0x539 · 2013-09-19T19:49:24Z

It's been my impression that there's some redundancy in other error messages too, and I figured that was desirable. For example, unknown identifiers, wrong types or unexpected tokens also seem to get reproduced in the error messages in some context.

I might have overdone it with reproducing whole block comments and string literals, but at that point I was mainly going for consistency with the errors for erroneous numeric literals or escape sequences which I felt were reasonable.

A test for unterminated comments seemed useless to me since by virtue of extending to eof, they don't leave anywhere to put the "error should happen here" annotation, so there's just the fact that an error happens.

Do you think I should drop the code-in-error-message for unterminated strings, comments and doc comments, but leave it in for the other cases?

Here's the lexer errors for the test cases, for reference: http://ix.io/8ag (the span for the last one is displayed wrong presumably because of #8706, but it reproduces the unterminated character constant in the error message just fine, so it's correct in the lexer)

alexcrichton · 2013-09-19T20:17:41Z

I would remove the verbose error messages about the numeric literals and comments, but I think that the ones in reference to the tokens are fine.

In general if a message just spits back what it highlights, then it's probably not necessary to have it (in my opinion), so for things like type errors it makes sense because you didn't necessarily have the type right there, although for identifiers I guess it's a bit more debatable (resolve is always super tricky though).

We can always revisit these error messages once we've been using them for awhile to see if they were the best courses of action.

ben0x539 · 2013-09-19T21:26:39Z

Replaced the last commit with a less verbose one.

Previously, the lexer calling `rdr.fatal(...)` would report the span of the last complete token, instead of a span within the erroneous token (besides one span fixed in 1ac90bb). This branch adds wrappers around `rdr.fatal(...)` that sets the span explicilty, so that all fatal errors in `libsyntax/parse/lexer.rs` now report the offending code more precisely. A number of tests try to verify that, though the `compile-fail` testing setup can only check that the spans are on the right lines, and the "unterminated string/block comment" errors can't have the line marked at all, so that's incomplete. This closes #9149. Also, the lexer errors now report the offending code in the error message, not just via the span, just like other errors do.

As documented in issue #7945, these literal identifiers are all accepted by rust today, but they should probably be disallowed (especially `'''`). This changes all escapable sequences to being *required* to be escaped. Closes #7945 I wanted to write the tests with more exact spans, but I think #9308 will be fixing that?

Use `check_proc_macro` for `missing_const_for_fn` This uses `@Jarcho's` rust-lang#8694 implementation to fix `missing_const_for_fn` linting in proc-macros. I'm not 100% sure what I'm doing here, any feedback is appreciated. Previously: Jarcho/rust-clippy#1. Fixes rust-lang#8854. changelog: [`missing_const_for_fn`]: No longer lints in proc-macros

ben0x539 added 2 commits September 19, 2013 16:58

lexer: report problematic chars verbatim or as escape sequence

8009c97

... instead of giving their numeric codepoint, following the lead of fdaae34. So the error message for, say, '\_' mentions _ instead of 95, and '\●' now mentions \u25cf.

alexcrichton mentioned this pull request Sep 19, 2013

Disallow char literals which should be escaped #9335

Merged

alexcrichton reviewed Sep 19, 2013
View reviewed changes

lexer: further slight improvements to lexer errors

567c567

bors closed this Sep 20, 2013

bors merged commit 567c567 into rust-lang:master Sep 20, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lexer: improve errors #9308

lexer: improve errors #9308

ben0x539 commented Sep 19, 2013

ben0x539 commented Sep 19, 2013

alexcrichton Sep 19, 2013

alexcrichton Sep 19, 2013

ben0x539 commented Sep 19, 2013

alexcrichton commented Sep 19, 2013

ben0x539 commented Sep 19, 2013

lexer: improve errors #9308

lexer: improve errors #9308

Conversation

ben0x539 commented Sep 19, 2013

ben0x539 commented Sep 19, 2013

alexcrichton Sep 19, 2013

Choose a reason for hiding this comment

alexcrichton Sep 19, 2013

Choose a reason for hiding this comment

ben0x539 commented Sep 19, 2013

alexcrichton commented Sep 19, 2013

ben0x539 commented Sep 19, 2013