Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Lexer's dependency on Parser. #134192

Merged
merged 1 commit into from
Dec 14, 2024

Conversation

nnethercote
Copy link
Contributor

Lexing precedes parsing, as you'd expect: Lexer creates a TokenStream and Parser then parses that TokenStream.

But, in a horrendous violation of layering abstractions and common sense, Lexer depends on Parser! The Lexer::unclosed_delim_err method does some error recovery that relies on creating a Parser to do some post-processing of the TokenStream that the Lexer just created.

This commit just removes unclosed_delim_err. This change removes Lexer's dependency on Parser, and also means that lex_token_tree's return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown in these tests:

  • tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra {.
  • tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code cleanup.

r? @chenyukang

@nnethercote nnethercote marked this pull request as ready for review December 12, 2024 01:05
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 12, 2024
@bors
Copy link
Contributor

bors commented Dec 12, 2024

☔ The latest upstream changes (presumably #134201) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@compiler-errors compiler-errors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me after rebase

Lexing precedes parsing, as you'd expect: `Lexer` creates a
`TokenStream` and `Parser` then parses that `TokenStream`.

But, in a horrendous violation of layering abstractions and common
sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err`
method does some error recovery that relies on creating a `Parser` to do
some post-processing of the `TokenStream` that the `Lexer` just created.

This commit just removes `unclosed_delim_err`. This change removes
`Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s
return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown
in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less
  explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff
  marker detection is no longer supported (because that detection is
  implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code
cleanup.
@nnethercote
Copy link
Contributor Author

I rebased.

@bors r=compiler-errors rollup

@bors
Copy link
Contributor

bors commented Dec 12, 2024

📌 Commit 2e412fe has been approved by compiler-errors

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 12, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Dec 12, 2024
…=compiler-errors

Remove `Lexer`'s dependency on `Parser`.

Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.

But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.

This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code cleanup.

r? `@chenyukang`
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 13, 2024
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#130060 (Autodiff Upstreaming - rustc_codegen_llvm changes)
 - rust-lang#132038 (Add lint rule for `#[deprecated]` on re-exports)
 - rust-lang#133937 (Keep track of parse errors in `mod`s and don't emit resolve errors for paths involving them)
 - rust-lang#133942 (Clarify how to use `black_box()`)
 - rust-lang#134081 (Try to evaluate constants in legacy mangling)
 - rust-lang#134192 (Remove `Lexer`'s dependency on `Parser`.)
 - rust-lang#134209 (validate `--skip` and `--exclude` paths)

Failed merges:

 - rust-lang#133099 (forbid toggling x87 and fpregs on hard-float targets)

r? `@ghost`
`@rustbot` modify labels: rollup
compiler-errors added a commit to compiler-errors/rust that referenced this pull request Dec 13, 2024
…=compiler-errors

Remove `Lexer`'s dependency on `Parser`.

Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.

But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.

This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code cleanup.

r? ``@chenyukang``
@chenyukang
Copy link
Member

I noticed there is another PR: #134189
Why do we want to remove diff marker detection? I don't know the context for it.

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 13, 2024
…mpiler-errors

Rollup of 10 pull requests

Successful merges:

 - rust-lang#130060 (Autodiff Upstreaming - rustc_codegen_llvm changes)
 - rust-lang#132150 (Fix powerpc64 big-endian FreeBSD ABI)
 - rust-lang#133942 (Clarify how to use `black_box()`)
 - rust-lang#134081 (Try to evaluate constants in legacy mangling)
 - rust-lang#134192 (Remove `Lexer`'s dependency on `Parser`.)
 - rust-lang#134208 (coverage: Tidy up creation of covmap and covfun records)
 - rust-lang#134209 (validate `--skip` and `--exclude` paths)
 - rust-lang#134211 (On Neutrino QNX, reduce the need to set archiver via environment variables)
 - rust-lang#134227 (Update wasi-sdk used to build WASI targets)
 - rust-lang#134229 (Fix typos in docs on provenance)

r? `@ghost`
`@rustbot` modify labels: rollup
Zalathar added a commit to Zalathar/rust that referenced this pull request Dec 13, 2024
…=compiler-errors

Remove `Lexer`'s dependency on `Parser`.

Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.

But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.

This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code cleanup.

r? `@chenyukang`
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 13, 2024
Rollup of 9 pull requests

Successful merges:

 - rust-lang#130060 (Autodiff Upstreaming - rustc_codegen_llvm changes)
 - rust-lang#132038 (Add lint rule for `#[deprecated]` on re-exports)
 - rust-lang#132150 (Fix powerpc64 big-endian FreeBSD ABI)
 - rust-lang#133633 (don't show the full linker args unless `--verbose` is passed)
 - rust-lang#133942 (Clarify how to use `black_box()`)
 - rust-lang#134081 (Try to evaluate constants in legacy mangling)
 - rust-lang#134192 (Remove `Lexer`'s dependency on `Parser`.)
 - rust-lang#134208 (coverage: Tidy up creation of covmap and covfun records)
 - rust-lang#134211 (On Neutrino QNX, reduce the need to set archiver via environment variables)

Failed merges:

 - rust-lang#133099 (forbid toggling x87 and fpregs on hard-float targets)

r? `@ghost`
`@rustbot` modify labels: rollup
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Dec 13, 2024
…=compiler-errors

Remove `Lexer`'s dependency on `Parser`.

Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.

But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.

This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code cleanup.

r? ``@chenyukang``
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 13, 2024
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#132038 (Add lint rule for `#[deprecated]` on re-exports)
 - rust-lang#132150 (Fix powerpc64 big-endian FreeBSD ABI)
 - rust-lang#133633 (don't show the full linker args unless `--verbose` is passed)
 - rust-lang#133942 (Clarify how to use `black_box()`)
 - rust-lang#134081 (Try to evaluate constants in legacy mangling)
 - rust-lang#134192 (Remove `Lexer`'s dependency on `Parser`.)
 - rust-lang#134208 (coverage: Tidy up creation of covmap and covfun records)
 - rust-lang#134211 (On Neutrino QNX, reduce the need to set archiver via environment variables)

r? `@ghost`
`@rustbot` modify labels: rollup
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Dec 13, 2024
…=compiler-errors

Remove `Lexer`'s dependency on `Parser`.

Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.

But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.

This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code cleanup.

r? ```@chenyukang```
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 13, 2024
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#132038 (Add lint rule for `#[deprecated]` on re-exports)
 - rust-lang#132150 (Fix powerpc64 big-endian FreeBSD ABI)
 - rust-lang#133633 (don't show the full linker args unless `--verbose` is passed)
 - rust-lang#133942 (Clarify how to use `black_box()`)
 - rust-lang#134081 (Try to evaluate constants in legacy mangling)
 - rust-lang#134192 (Remove `Lexer`'s dependency on `Parser`.)
 - rust-lang#134211 (On Neutrino QNX, reduce the need to set archiver via environment variables)

r? `@ghost`
`@rustbot` modify labels: rollup
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Dec 13, 2024
…=compiler-errors

Remove `Lexer`'s dependency on `Parser`.

Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.

But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.

This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code cleanup.

r? ````@chenyukang````
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 13, 2024
Rollup of 3 pull requests

Successful merges:

 - rust-lang#134081 (Try to evaluate constants in legacy mangling)
 - rust-lang#134192 (Remove `Lexer`'s dependency on `Parser`.)
 - rust-lang#134211 (On Neutrino QNX, reduce the need to set archiver via environment variables)

r? `@ghost`
`@rustbot` modify labels: rollup

try-job: dist-x86_64-linux
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Dec 13, 2024
…=compiler-errors

Remove `Lexer`'s dependency on `Parser`.

Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.

But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.

This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code cleanup.

r? `````@chenyukang`````
Zalathar added a commit to Zalathar/rust that referenced this pull request Dec 14, 2024
…=compiler-errors

Remove `Lexer`'s dependency on `Parser`.

Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.

But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.

This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code cleanup.

r? ``````@chenyukang``````
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 14, 2024
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#132150 (Fix powerpc64 big-endian FreeBSD ABI)
 - rust-lang#133633 (don't show the full linker args unless `--verbose` is passed)
 - rust-lang#133942 (Clarify how to use `black_box()`)
 - rust-lang#134081 (Try to evaluate constants in legacy mangling)
 - rust-lang#134192 (Remove `Lexer`'s dependency on `Parser`.)
 - rust-lang#134208 (coverage: Tidy up creation of covmap and covfun records)
 - rust-lang#134211 (On Neutrino QNX, reduce the need to set archiver via environment variables)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 14, 2024
…iaskrgr

Rollup of 6 pull requests

Successful merges:

 - rust-lang#132150 (Fix powerpc64 big-endian FreeBSD ABI)
 - rust-lang#133942 (Clarify how to use `black_box()`)
 - rust-lang#134081 (Try to evaluate constants in legacy mangling)
 - rust-lang#134192 (Remove `Lexer`'s dependency on `Parser`.)
 - rust-lang#134208 (coverage: Tidy up creation of covmap and covfun records)
 - rust-lang#134211 (On Neutrino QNX, reduce the need to set archiver via environment variables)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 14, 2024
…iaskrgr

Rollup of 6 pull requests

Successful merges:

 - rust-lang#132150 (Fix powerpc64 big-endian FreeBSD ABI)
 - rust-lang#133942 (Clarify how to use `black_box()`)
 - rust-lang#134081 (Try to evaluate constants in legacy mangling)
 - rust-lang#134192 (Remove `Lexer`'s dependency on `Parser`.)
 - rust-lang#134208 (coverage: Tidy up creation of covmap and covfun records)
 - rust-lang#134211 (On Neutrino QNX, reduce the need to set archiver via environment variables)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit ac6ac81 into rust-lang:master Dec 14, 2024
6 checks passed
@rustbot rustbot added this to the 1.85.0 milestone Dec 14, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Dec 14, 2024
Rollup merge of rust-lang#134192 - nnethercote:rm-Lexer-Parser-dep, r=compiler-errors

Remove `Lexer`'s dependency on `Parser`.

Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.

But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.

This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.

The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).

In my opinion this cost is outweighed by the magnitude of the code cleanup.

r? ```````@chenyukang```````
@nnethercote nnethercote deleted the rm-Lexer-Parser-dep branch December 15, 2024 22:30
@Kobzol
Copy link
Contributor

Kobzol commented Dec 23, 2024

@rust-timer build 3ca6e82

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (3ca6e82): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.4% [2.4%, 2.4%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.4% [2.4%, 2.4%] 1

Cycles

Results (primary -3.0%, secondary 3.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
8.4% [8.4%, 8.4%] 1
Improvements ✅
(primary)
-3.0% [-3.0%, -3.0%] 1
Improvements ✅
(secondary)
-2.1% [-2.1%, -2.1%] 1
All ❌✅ (primary) -3.0% [-3.0%, -3.0%] 1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 769.535s -> 769.55s (0.00%)
Artifact size: 333.08 MiB -> 333.10 MiB (0.01%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants