-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comment constructor #57
Comments
You'll probably need to add that support by using an annotated AST (there's a file for supplying annotation in the repository). |
So the parser just skips comments right now? |
Yes, I believe so. I'm not sure if the annotated parser also does, though. |
I'm out of time these days, but can Ichip-in or help otherwise to make this happen? This would be ground work for declarative documentation in Nix, to generate docs from the source code. |
Right now, comments are skipped in the lexer: https://hackage.haskell.org/package/hnix-0.3.4/docs/src/Nix-Parser-Library.html#commentStyle, with Comments are tricky: # function comment
# Something something
{ args /*here’s another comment*/
, moreArgs # but where did it come from
# and more important:
, evenMoreArgs
# where do these comments belong?
}:
/* should I be
explicitely kept as multiline comment?
*/
# if so, is it necessary to keep the info that this was not a ML-comment?
# Maybe pretty-printing should handle that by looking at the line size?
/* also
* what about
* comments formatted like this
*
* Should lines be preserved?
* What about the strange stars in front of each line?
And if the comment suddenly stops using stars?
What then?
*/
# not all that easy on first thought?
{
# docstring?
attrWithComment = "foo"; # and line comment
} Comments seem to be completely removed in the lexer phase, stuff like { foo /*uh oh*/ = /*oh uh*/ "23" /*test*/; }
/*comment*/ parses into That might make it a tad tricky to integrate them into the parsed AST in a sensible way? |
I thought about it a bit more. data Comment = Comment
{ commentType :: CommentType
, commentContent :: Text
, commentSpan :: SrcSpan }
data CommentType = SingleLine | MultiLine
data CommentSpan = CommentSpan
{ srcSpan :: SrcSpan
, comments :: Comment }
type NExprCommentF = AnnF CommentSpan NExprF Comments are parsed out into a list containing their contents & Right now the parser always creates a |
@Profpatsch I like the idea of adding the comments to the annotations. I wouldn't worry about the speed impact just yet. |
@Profpatsch, I'm slightly worried when you mention docstrings -- since they presumably can't be attached at arbitrary points, and also potentially due to the transformations you mention.. Do you envision being able to have an identity parser/pretty-printer with that scheme? |
I don’t really have a vision, just incoherent thoughts. :) hnix already keeps source spans by default:
So I don’t see how something like |
Hmm, so you're thinking of tying each comment to an AST item, if I understand correctly the type signature? I'm not sure how you'd be able to do it for all the examples you raised in #57 (comment): maybe full-line comments attach to the token after and part-of-line comments attach to the token before? (for the context, I'm currently trying to implement a tool for indenting nix code that basically parses then pretty-prints it, and comments are likely the last missing piece to integrate, so I'm looking this quite closely :)) |
@Ekleog I suspect you’d have to just decide on what lexer token you assign them to. The definition of data CommentSpan = CommentSpan
{ srcSpan :: SrcSpan
, comments :: [Comment] } because of stuff like |
Hmm, but then the issue is to decide to which lexer token to assign the comments? For a few comments it's easy, like: nothere
here // my comment
nothere nothere
here /* my comment */
nothere nothere
// my comment
here here
// my comment
nothere But for others even as a human without additional context I couldn't guess to which token the following are to be assigned? maybehere /* my comment */ orhere maybehere
// my comment
orhere Then I guess it's just good enough to autocratically pick one and say “this is it” :) |
It's been a long time since I looked at the
This allows the comment information to be fully preserved and accessible without making the AST cumbersome to use. |
In regards to the announced sprint tomorrow at nixcon it was said that adding parsing for comments is a wanted feature. I just wanted to ask that if this feature be added to https://github.com/NixOS/nix so that other language projects can benefit as well! I myself have been working on a python parser and the one thing that is dropped is comments due to the nix ast parser. Mic92/pythonix#2 |
I've had a look at the AST and it seems that for the purpose of formatting nix code reliably, we will need to have access to token spans. This is because the AST simply can not (and should not) represent the comments and/or whitespace in every possible place. Example:
vs
Of course preserving all comments is not a requirement for retaining comments, but both a formatter and some kind of comment processing seem to benefit from having a token stream, as @taktoa points out. I'll see how far I can still get with tokenizing today at the NixCon 2018 hackday. |
As I see people pointing to formatting nix code reliably, I'll point to https://github.com/Ekleog/nix-bikeshed/ so that people don't need to re-do the same thing twice :) It's more or less working, mostly missing comments at the time being. (actually, running it is how I found out #152: parsing, generating the file and re-parsing didn't give the same answer, and for my generation) So if some support for comments could be added, it'd be great! :) |
@Ekleog how does it compare to hnix formatter? |
If you're speaking of `Nix.Pretty`, then `nix-bikeshed` handles wrapping
overlong lines correctly, and doesn't hesitate to switch between a
`''foo''` string and a `"foo"` string (depending on line length), for
the two first changes I can see off-the-top-of-my-head.
If there's another hnix formatter, then I didn't find about it when I
started `nix-bikeshed` a few months ago (before putting it on hold
waiting for this issue to be fixed)
|
Sorry guys, I have given up on lexing. My findings:
|
+1 See https://github.com/NorfairKing/nixfmt for an attempt at a normalising formatter. |
@NorfairKing @Ekleog now we have three formatters :) It would be good to see what's the difference and just get |
@domenkozar |
You can format by invoking |
@domenkozar I think foo:
bar:
- a
- b
- c
baz:
- some long stuff
- other long stuff
- let us say this overflows max line length While formatting it would be like: foo:
bar: [a, b, c]
baz:
- some long stuff
- other long stuff
- let us say this overflows max line length As such, I think the objective of the two projects ( As for having |
The difference would only be setting how long can a line be, so setting it to 1 char would "pretty-print" it. So it could easily be under the same codebase, I think. |
Well, we totally agree that a formatter can almost do pretty-printing. However I think is that a pretty-printer can't do formatting at all :) (also, the difference also includes eg. whether |
Just my 2 cents: I don't much care about formatting vs pretty-printing. I just want a single normalized form of a given AST that is vaguely good for humans to edit. Something like |
@NorfairKing same here. I worked on formatter to do "something sane", but I don't prefer the discussion on the style as much as I value consistency. The current formatter is used in https://github.com/domenkozar/hnix-lsp and I'd love if hnix would preserve even most basic comments (after lines and in-between). |
If you're interested in something *vaguely* good for humans to edit,
you'll likely be OK with a pretty-printer.
If you want a tool to add to your CI to ensure your code is properly
formatted, you'll likely want a formatter.
I think this sums up the difference I see, and the reason why I think
`Nix.Pretty`, by its name, doesn't try to handle the problem
`nix-bikeshed` is trying to handle. But maybe I'm wrong here, I'll let
@jwiegley contradict me :)
Also, a reason why I don't think `nix-bikeshed` should live under `hnix`
is due to its name: it'll likely see a lot of churn because no one will
agree on the way things should be formatted, if we want to enforce
formatting across nixpkgs, and so keeping the issue tracker separate
sounded like a net win to me.
BTW, the reason why I didn't announce `nix-bikeshed` until now is
because it doesn't support comments yet, and I believe if it doesn't
support comments then it's not a formatter -- while it's already a
pretty-printer -- and I thus didn't want to ping people on a half-baked
thing that wouldn't be actually usable :)
|
One advantage of pretty printing over formatting: smaller diffs. With automated formatting, you never know when it's going to decide to completely rewrite a large block of code because some limit has been reached. |
Did anything ever come of this? I think that even a simple solution which recognizes comments only in specific places would be quite useful! |
I've written a pretty basic comment annotator here: https://github.com/expipiplus1/update-nix-fetchgit/blob/2e9df344c40e289920e94c82dbce2d87ae54ecc5/src/Nix/Comments.hs It simply attaches comments to the expression which immediately precedes them (so multiple expressions may own the same comment). It also only handles single line comments. |
Would mention the core point of current comment processing: Lines 498 to 506 in 57a55fd
So, indeed (as points the topic), currently, HNix parser processes comments as whitespaces. |
As far as I can see, there is no way to represent comments in the AST at the moment.
Since I’m generating files that should be human-readable as well, it would be nice to have them (only line comments as far as I’m concerned, but inline/block comments are probably a good idea as well).
The text was updated successfully, but these errors were encountered: