Make final newline mandatory #1779

Closed
hryx opened this issue Nov 24, 2018 · 3 comments

@hryx
Contributor

hryx commented Nov 24, 2018

While working on doc gen (#21) I discovered that there is special code in the tokenizer state machine to "wrap up" the current state when EOF is encountered (stage1 and std/zig/tokenizer).

  • The logic here to terminate certain tokens or raise errors is duplicated from the main loop.
  • This final, extra state handler is a place where discrepancies or accidentally omitted cases could be introduced when updating tokenizer rules.
  • When I see a diff with no trailing newline, I cry. Just a drop, but all those tears add up.

🚫 ↩️ 😢

Inspired by #663 and feeling brassy, I thought I'd propose one further source file encoding requirement: The final character must be LF (0x0A).

Benefits:

  • Simpler tokenizing logic: upon EOF, no need to handle unterminated tokens.
  • We get to remove code: stage1 and std/zig/tokenizer
  • Increases source code uniformity and reduces minor accidental diff noise.
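
To make the first benefit concrete, here is a minimal sketch in Python of a toy state-machine tokenizer. It is not the stage1 or std/zig/tokenizer code: the grammar (identifiers plus `#` line comments), the function name `tokenize`, and the `require_final_newline` flag are all assumptions made up for illustration. The point is only that the EOF wrap-up block repeats the termination logic of the main loop, and that it becomes unnecessary once every non-empty input is known to end in LF.

```python
def _is_ident_char(ch):
    return ch.isalnum() or ch == "_"


def tokenize(src, require_final_newline):
    """Toy tokenizer (hypothetical grammar): identifiers and '#' line comments."""
    if require_final_newline and src and not src.endswith("\n"):
        raise ValueError("source file must end with LF (0x0A)")

    tokens = []
    state, start = "start", 0
    for i, ch in enumerate(src):
        # Terminate the token in progress, if this character ends it.
        if state == "ident" and not _is_ident_char(ch):
            tokens.append(("ident", src[start:i]))
            state = "start"
        elif state == "comment" and ch == "\n":
            tokens.append(("comment", src[start:i]))
            state = "start"
        # Possibly begin a new token at this character.
        if state == "start":
            if ch == "#":
                state, start = "comment", i
            elif _is_ident_char(ch):
                state, start = "ident", i

    if not require_final_newline:
        # EOF wrap-up: duplicates the termination logic of the main loop.
        # With a guaranteed final LF, the loop always ends in "start",
        # so this block is never needed.
        if state == "ident":
            tokens.append(("ident", src[start:]))
        elif state == "comment":
            tokens.append(("comment", src[start:]))
    return tokens
```

In this sketch, `tokenize("foo # hi\n", True)` and `tokenize("foo # hi", False)` produce the same tokens, but only the second call relies on the wrap-up block.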

Downsides:

  • Adds one more rule to source file validation (not sure of the status in stage1; according to "Zig source encoding" #663, self-hosted is compliant).
  • Adds a new restriction by which the programmer/editor must abide.
  • Questions from newcomers and those who have not configured their editors.
  • Unconventional.

Neutral:

  • This restriction on the programmer/editor would be the same level of severity as The Hard Tabs Issue #544, and just as easy for them to acquiesce to, so we have a precedent.
@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Nov 24, 2018
@andrewrk andrewrk added this to the 0.5.0 milestone Nov 24, 2018
@andrewrk
Member

Note that zig fmt enforces this, if the user chooses to run it.

@thejoshwolfe
Contributor

An empty source file doesn't end with a newline, and that should be ok.
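
As a rough sketch of the rule with that exception, the check could look like the following Python function. The name `ends_with_final_newline` is hypothetical and this is not taken from either Zig implementation; it only encodes "the final byte must be LF (0x0A), unless the file is empty".

```python
def ends_with_final_newline(source: bytes) -> bool:
    """Return True if `source` satisfies the proposed final-newline rule."""
    # An empty file has no final character, so it passes.
    # Indexing a bytes object yields an int, hence the comparison with 0x0A.
    return len(source) == 0 or source[-1] == 0x0A
```

Note that a file ending in CRLF also passes this particular check, because its last byte is still LF; carriage returns are a separate whitespace question.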

The final newline restriction is also in place for C/C++ (until C++11), so we're not doing anything too unprecedented.

I think the best argument in favor of this restriction is that it makes tokenizers easier to implement. I'm in favor of this.

@andrewrk andrewrk modified the milestones: 0.5.0, 0.6.0 Aug 16, 2019
@andrewrk
Member

I'm making the call here: this is going to work the same as hard tabs and CRLFs, which is to say that a missing final newline is accepted by the stage2 parser. However, zig fmt fixes all whitespace issues, including this one.

  • Simpler tokenizing logic: upon EOF, no need to handle unterminated tokens.
  • We get to remove code: stage1 and std/zig/tokenizer

We could remove code from stage1, but since zig fmt has to be able to fix this, the code would have to stay in the self-hosted tokenizer.

  • Increases source code uniformity and reduces minor accidental diff noise.

Here we have the separate issue of how much to enforce zig fmt. That's something that is reasonable to discuss, but I'm confident that "handle final newline the same as hard tabs & carriage returns" is the right approach here.
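
For illustration only, the "accept it, let the formatter fix it" approach can be sketched as a small normalizer in Python. This is not how zig fmt is implemented: the function name `normalize_whitespace` and the four-space replacement for tabs are assumptions, and a real formatter works on parsed tokens rather than raw text, so it would not rewrite tabs inside string literals the way this naive version does.

```python
def normalize_whitespace(source: str, indent: str = "    ") -> str:
    """Naive whitespace fixer (hypothetical): CRLF, hard tabs, final newline."""
    text = source.replace("\r\n", "\n")  # CRLF -> LF
    text = text.replace("\t", indent)    # hard tab -> spaces (assumed width)
    # Append the final newline only to non-empty input; an empty file stays empty.
    if text and not text.endswith("\n"):
        text += "\n"
    return text
```

Running the function on its own output returns the text unchanged, which is the property one would want from a formatter pass.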
