-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify the workings of StringReader in the lexer #36470
Clarify the workings of StringReader in the lexer #36470
Conversation
The meaning of the "last" character is non-obvious: it could be (a) the current character or (b) the character prior to that. `last_pos` uses meaning (a), but `curr_pos` makes that clearer. Even better, `curr_pos` matches `curr`. This commit also avoids a few unnecessary local variables.
This makes it clearer that it refers to the next character, and not to the current character.
This makes it clearer that it refers to the current character. (Prior to this commit, even the comment describing `col` got this wrong.)
First, assert! is redundant w.r.t. the unwrap() immediately afterwards. Second, `byte_offset_diff` is effectively computed as `current_byte_offset + ch.len_utf8() - current_byte_offset` (with `next` as an intermediate) which is silly and can be simplified.
This commit renames the variables to make it clearer which char each one refers to. It also slightly reorders and rearranges some statements.
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
The two branches of this `if` compute the same value. This commit gets rid of the first branch, which makes this calculation identical to the one in scan_block_comment().
This changes public API of libsyntax, so it needs special handling. cc #31645. |
@sfackler: 7 day review ping! |
So the patches above do these renamings:
Another possibility is this:
The idea being that we avoid |
☔ The latest upstream changes (presumably #36154) made this pull request unmergeable. Please resolve the merge conflicts. |
r? @eddyb, perhaps you can help take a look here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Manishearth r=me
Oh... I pulled out two of these commits in #36921, and I want to redo the renaming in the way I mentioned above, and I want to do some similar renamings in the parser. So: thank you for the review, @eddyb, but I'd like to hold off from landing these commits in their current form. (Well, they need a rebase, but I'm going to change them in a bigger way.) I will file a new PR for that to make things clearer. Apologies for the lack of clarity. |
Two lexer tweaks 19 days later, I haven't received a review of my commits in rust-lang#36470. In an attempt to make some progress, I'm going to split up the changes. Here are the ones that don't relate to renaming things.
I'm going to reuse this PR for my updated commits. |
Nope, that didn't work because I did the new patches in a different local clone. So I opened #36969. |
In a lexer you want it to be extremely clear what the current character is, as well as those before and after it. The current code and comments in StringReader don't achieve this:
curr
refers to the current character, but so doeslast_pos
;pos
refers to the next character; andcol
refers to the current character but its comment says it refers to the next character.This PR renames some of these fields and does some related clean-ups. This makes the lexer code easier to understand and marginally faster.