While working on doc gen (#21) I discovered that there is special code in the tokenizer state machine to "wrap up" the current state when EOF is encountered (stage1 and std/zig/tokenizer).
The logic here to terminate certain tokens or raise errors is duplicated from the main loop.
This final, extra state handler is a place where discrepancies or accidentally omitted cases could be introduced when updating tokenizer rules.
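To illustrate the duplication, here is a minimal sketch of a toy tokenizer in Python. It is not the stage1 or std/zig/tokenizer code: the grammar (identifiers, `#` line comments, double-quoted strings) and all names such as `State` and `tokenize` are assumptions made up for this example. The point is only the shape: the block after the loop repeats the termination rules that already appear inside the loop.

```python
from enum import Enum, auto


class State(Enum):
    START = auto()
    IDENTIFIER = auto()
    LINE_COMMENT = auto()
    STRING = auto()


def tokenize(src):
    """Toy tokenizer (hypothetical grammar, not Zig's)."""
    tokens = []
    state = State.START
    begin = 0
    i = 0
    while i < len(src):
        ch = src[i]
        if state is State.START:
            begin = i
            if ch.isalpha() or ch == "_":
                state = State.IDENTIFIER
            elif ch == "#":
                state = State.LINE_COMMENT
            elif ch == '"':
                state = State.STRING
            elif not ch.isspace():
                raise ValueError(f"unexpected character {ch!r} at {i}")
        elif state is State.IDENTIFIER:
            if not (ch.isalnum() or ch == "_"):
                tokens.append(("identifier", src[begin:i]))
                state = State.START
                continue  # re-scan the terminating character in START
        elif state is State.LINE_COMMENT:
            if ch == "\n":
                tokens.append(("comment", src[begin:i]))
                state = State.START
        elif state is State.STRING:
            if ch == '"':
                tokens.append(("string", src[begin:i + 1]))
                state = State.START
            elif ch == "\n":
                raise ValueError("unterminated string")
        i += 1

    # EOF wrap-up: every branch here duplicates a rule from the loop above.
    # A rule changed in the loop but not here would silently diverge.
    if state is State.IDENTIFIER:
        tokens.append(("identifier", src[begin:]))
    elif state is State.LINE_COMMENT:
        tokens.append(("comment", src[begin:]))
    elif state is State.STRING:
        raise ValueError("unterminated string")
    return tokens
```

In this sketch, `tokenize("foo # hi")` and `tokenize("foo # hi\n")` produce the same tokens, but only the first call reaches the EOF wrap-up branches.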
When I see a diff with no trailing newline, I cry. Just a drop, but all those tears add up.
🚫 ↩️ 😢
Inspired by #663 and feeling brassy, I thought I'd propose one further source file encoding requirement: The final character must be LF (0x0A).
Benefits:
Simpler tokenizing logic: upon EOF, no need to handle unterminated tokens.
Increases source code uniformity and reduces minor accidental diff noise.
Downsides:
Adds one more rule to source file validation (not sure of the status in stage1; according to #663, "Zig source encoding", the self-hosted tokenizer is compliant).
Adds a new restriction by which the programmer/editor must abide.
Questions from newcomers and those who have not configured their editors.
Unconventional.
Neutral:
This restriction on the programmer/editor would be of the same severity as The Hard Tabs Issue #544, and just as easy for them to acquiesce to, so we have a precedent.
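As a rough sketch of what the proposed validation rule could look like, here is a hypothetical Python check. The function name and error message are invented for illustration, and how an empty file should be treated is not specified by the proposal.

```python
def validate_final_newline(source: bytes) -> None:
    """Reject source whose final byte is not LF (0x0A).

    Assumption: an empty file has no final character, so this sketch
    treats it as invalid. The proposal does not say either way.
    """
    if not source.endswith(b"\n"):
        raise ValueError("source file must end with LF (0x0A)")
```

With such a check run before tokenizing, a tokenizer whose tokens all terminate on or before a newline would always be back in its start state at EOF, which is where the simplification would come from.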
andrewrk added the proposal label Nov 24, 2018
I'm making the call here: this is going to work the same as hard tabs and CRLFs, which means it is accepted by the stage2 parser. However, zig fmt fixes all whitespace issues, including this one.
> Simpler tokenizing logic: upon EOF, no need to handle unterminated tokens.
We could remove code from stage1, but since zig fmt has to be able to fix this, the code would have to stay in the self-hosted tokenizer.
> Increases source code uniformity and reduces minor accidental diff noise.
Here we have the separate issue of how much to enforce zig fmt. That's something that is reasonable to discuss, but I'm confident that "handle final newline the same as hard tabs & carriage returns" is the right approach here.
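For a sense of what "a formatter fixes it" means here, the following is a minimal Python sketch of whitespace normalization. It is not how zig fmt is implemented; `normalize_line_endings` is a hypothetical helper that only covers CRLF and the missing final newline, and leaves hard tabs alone because the thread does not say what they should become.

```python
def normalize_line_endings(text: str) -> str:
    """Convert CRLF to LF and make sure non-empty text ends with LF.

    Assumption: empty input is returned unchanged.
    """
    text = text.replace("\r\n", "\n")
    if text and not text.endswith("\n"):
        text += "\n"
    return text
```

Because the formatter must read files that lack the final newline in order to fix them, its tokenizer still has to cope with EOF in the middle of a token.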