update rustdoc's syntax checker to work with error-recovering lexer #63284

matklad · 2019-08-05T12:39:28Z

Rustdoc processes code inside of ``` blocks in two ways:

First, the code is checked for lexer errors
Second, lexer-based syntax highlighting is done

The checking for lexer errors works by intercepting fatal errors from the lexer. However, since this was originally implemented, the lexer has moved from fatal erroring to error recovery (#63017 in particular tries to remove the last bit of fatal erroring). That means that the current approach intercepts only a fraction of lexer errors, while most of the errors are reported twice (once during the check, once during highlighting). Here's an example:

14:30:41|~/tmp
λ cat main.rs 
/// ```
/// '...'
/// ```
pub fn foo() {}

14:31:11|~/tmp
λ rustdoc main.rs
error: character literal may only contain one codepoint
 --> <doctest>:1:1
  |
1 | '...'
  | ^^^^^
help: if you meant to write a `str` literal, use double quotes
  |
1 | "..."
  |

error: character literal may only contain one codepoint
 --> <rustdoc-highlighting>:1:1
  |
1 | '...'
  | ^^^^^
help: if you meant to write a `str` literal, use double quotes
  |
1 | "..."

I think that, to fix this, we should configure the parsing session with a custom Emitter. For the code-checking pass, the emitter should downgrade all diagnostics to warnings and set a flag if there were any diagnostics. For the syntax-highlighting pass, we should use a "/dev/null" emitter which just doesn't emit anything.
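A minimal sketch of the two emitters described above, to make the idea concrete. Everything here is hypothetical: `Level`, `Diagnostic`, `Emitter`, `DowngradingEmitter`, `SilentEmitter` and the toy `lex` function are stand-ins invented for illustration and are not rustc's actual types or signatures.

```rust
// Hypothetical stand-ins; rustc's real diagnostics types differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Error,
    Warning,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Diagnostic {
    level: Level,
    message: String,
}

trait Emitter {
    fn emit(&mut self, diag: Diagnostic);
}

/// Emitter for the code-checking pass: downgrades every diagnostic to a
/// warning and records that at least one diagnostic was seen.
#[derive(Default)]
struct DowngradingEmitter {
    saw_diagnostic: bool,
    emitted: Vec<Diagnostic>,
}

impl Emitter for DowngradingEmitter {
    fn emit(&mut self, mut diag: Diagnostic) {
        diag.level = Level::Warning;
        self.saw_diagnostic = true;
        self.emitted.push(diag);
    }
}

/// "/dev/null" emitter for the highlighting pass: drops everything.
struct SilentEmitter;

impl Emitter for SilentEmitter {
    fn emit(&mut self, _diag: Diagnostic) {}
}

/// Toy error-recovering "lexer": splits on whitespace, reports over-long
/// character literals through the emitter, and keeps going.
/// Returns the number of tokens seen.
fn lex(src: &str, emitter: &mut dyn Emitter) -> usize {
    let mut tokens = 0;
    for word in src.split_whitespace() {
        let quoted = word.len() >= 2 && word.starts_with('\'') && word.ends_with('\'');
        if quoted && word[1..word.len() - 1].chars().count() > 1 {
            emitter.emit(Diagnostic {
                level: Level::Error,
                message: "character literal may only contain one codepoint".to_string(),
            });
        }
        tokens += 1;
    }
    tokens
}
```

With this shape, the checking pass would run `lex` with a `DowngradingEmitter` and inspect `saw_diagnostic` afterwards, while the highlighting pass would run the same lexer with `SilentEmitter`, so each problem is reported at most once.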

cc @euclio, @GuillaumeGomez


euclio · 2019-08-05T15:11:48Z

cc #56885

matklad · 2019-08-05T15:49:46Z

hm, #56885 makes me think that there are two ways to approach this:

use special emitter (what issue description is proposing)
make all lexer errors buffered

I don't know what would be better (someone with more broad compiler knowledge needs to decide this), but I like BufferingEmitter approach better, because it can be reused anywhere, while buffering in the lexer will be special-casing the lexer.
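For illustration, a buffering emitter could look roughly like the sketch below. The `Diagnostic` and `Emitter` types, the `take` method and the `produce` helper are assumptions made up for this example, not existing compiler APIs; the point is only that the buffering lives in the emitter, so whatever produces diagnostics needs no special casing.

```rust
// Hypothetical stand-ins; rustc's real diagnostics types differ.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Diagnostic {
    message: String,
}

trait Emitter {
    fn emit(&mut self, diag: Diagnostic);
}

/// Collects diagnostics instead of printing them, so the caller decides
/// later whether to show, downgrade, or discard them.
#[derive(Default)]
struct BufferingEmitter {
    buffered: Vec<Diagnostic>,
}

impl Emitter for BufferingEmitter {
    fn emit(&mut self, diag: Diagnostic) {
        self.buffered.push(diag);
    }
}

impl BufferingEmitter {
    /// Hands the buffered diagnostics to the caller and empties the buffer.
    fn take(&mut self) -> Vec<Diagnostic> {
        std::mem::take(&mut self.buffered)
    }
}

/// Stand-in for any producer (lexer, parser, ...) that reports through an
/// emitter without knowing whether its output is buffered.
fn produce(emitter: &mut dyn Emitter, messages: &[&str]) {
    for message in messages {
        emitter.emit(Diagnostic {
            message: message.to_string(),
        });
    }
}
```

Because `produce` only sees `&mut dyn Emitter`, the same emitter could be reused for the parser or any other diagnostics source.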

estebank · 2019-08-05T20:40:57Z

An alternative would be to have the lexer and parser have a "strict" mode where they bail early when encountering any error, that way the caller can handle the Err case themselves. This would also help people using the Parser as a library.
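As a rough sketch of what such a "strict" mode could mean, assuming a toy lexer: `Mode`, `LexError`, `LexOutput` and `lex` below are hypothetical names for illustration only, and the "error" rule (any token containing `#`) is made up.

```rust
// Hypothetical toy lexer; not the real rustc lexer or parser API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    /// Record errors and keep lexing.
    Recovering,
    /// Return the first error to the caller immediately.
    Strict,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct LexError {
    token_index: usize,
    message: String,
}

#[derive(Debug, Default, PartialEq, Eq)]
struct LexOutput {
    tokens: Vec<String>,
    errors: Vec<LexError>,
}

/// Splits on whitespace and treats any token containing `#` as an error.
fn lex(src: &str, mode: Mode) -> Result<LexOutput, LexError> {
    let mut out = LexOutput::default();
    for (token_index, word) in src.split_whitespace().enumerate() {
        if word.contains('#') {
            let err = LexError {
                token_index,
                message: format!("unexpected `#` in `{}`", word),
            };
            if mode == Mode::Strict {
                // Bail early so the caller handles the Err case itself.
                return Err(err);
            }
            out.errors.push(err);
        }
        out.tokens.push(word.to_string());
    }
    Ok(out)
}
```

In strict mode the caller gets a plain `Result` and nothing is printed; the cost is the extra `mode` branch in the lexing code itself.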

Centril · 2019-08-05T22:25:21Z

@estebank That sounds like it would complicate parser.rs (which is already in a sorry state!).

Mark-Simulacrum · 2019-08-11T02:13:03Z

A separate/custom emitter is definitely the way to go here; we should not complicate the parsing code any more than it already is.

use silent emitter for rustdoc highlighting pass Partially addresses rust-lang#63284.

jyn514 · 2020-12-15T22:44:31Z

@matklad this only gives one warning on master, I think it was fixed by #76068 ?

camelid · 2021-01-29T00:20:46Z

Should this be closed?

jonas-schievink added C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Aug 5, 2019

matklad mentioned this issue Aug 5, 2019

Remove special code-path for handing unknown tokens #63017

Merged

estebank mentioned this issue Aug 7, 2019

Parser silent exit on invalid input #63288

Closed

euclio mentioned this issue Nov 3, 2019

use silent emitter for rustdoc highlighting pass #66068

Merged

Centril added a commit to Centril/rust that referenced this issue Nov 5, 2019

Rollup merge of rust-lang#66068 - euclio:null-emitter, r=estebank

580dbb5

use silent emitter for rustdoc highlighting pass Partially addresses rust-lang#63284.

Centril added a commit to Centril/rust that referenced this issue Nov 6, 2019

Rollup merge of rust-lang#66068 - euclio:null-emitter, r=estebank

f5c5489

use silent emitter for rustdoc highlighting pass Partially addresses rust-lang#63284.

matklad mentioned this issue Aug 20, 2020

Switch rustdoc from lexer::StringReader to rustc_lexer #75619

Closed

jyn514 closed this as completed Apr 7, 2021
