Comment constructor #57

Profpatsch · 2017-05-15T11:04:25Z

As far as I can see, there is no way to represent comments in the AST at the moment.

Since I’m generating files that should be human-readable as well, it would be nice to have them (only line comments as far as I’m concerned, but inline/block comments are probably a good idea as well).

jwiegley · 2017-05-15T14:55:55Z

You'll probably need to add that support by using an annotated AST (there's a file for supplying annotation in the repository).

Profpatsch · 2017-05-16T11:52:06Z

So the parser just skips comments right now?

jwiegley · 2017-05-28T03:56:30Z

Yes, I believe so. I'm not sure if the annotated parser also does, though.

domenkozar · 2017-06-27T11:43:31Z

I'm out of time these days, but can Ichip-in or help otherwise to make this happen?

This would be ground work for declarative documentation in Nix, to generate docs from the source code.

Profpatsch · 2017-06-28T04:32:22Z

Right now, comments are skipped in the lexer: https://hackage.haskell.org/package/hnix-0.3.4/docs/src/Nix-Parser-Library.html#commentStyle, with CommentStyle.

Comments are tricky:

# function comment
# Something something
{ args /*here’s another comment*/
, moreArgs # but where did it come from
# and more important:
, evenMoreArgs
# where do these comments belong?
}:

/* should I be
   explicitely kept as multiline comment?
*/

# if so, is it necessary to keep the info that this was not a ML-comment?
# Maybe pretty-printing should handle that by looking at the line size?

/* also
 * what about
 * comments formatted like this
 *
 * Should lines be preserved?
 * What about the strange stars in front of each line?

And if the comment suddenly stops using stars?
What then?
*/

# not all that easy on first thought?

{
  # docstring?
  attrWithComment = "foo"; # and line comment

}

Comments seem to be completely removed in the lexer phase, stuff like

{ foo /*uh oh*/ = /*oh uh*/ "23" /*test*/; }
/*comment*/

parses into { foo = "23"; } like a charm.

That might make it a tad tricky to integrate them into the parsed AST in a sensible way?

Profpatsch · 2017-06-28T17:09:32Z

I thought about it a bit more.
It doesn’t make very much sense to add comments to NExprF because they are on the lexer level and should probably be added to the annotations:

data Comment = Comment
  { commentType :: CommentType
  , commentContent :: Text
  , commentSpan :: SrcSpan }

data CommentType = SingleLine | MultiLine

data CommentSpan = CommentSpan
  { srcSpan :: SrcSpan
  , comments :: Comment }

type NExprCommentF = AnnF CommentSpan NExprF

Comments are parsed out into a list containing their contents & SrcSpans. Then a second pass can be used to convert this annotation to docstrings (by some more elaborate rules, like looking at the placement of comments and for instance combining multiple single line comments).

Right now the parser always creates a NExprLocF and then strips it away if one only uses nixExpr :: Parser NExpr. If we extend that do the same with comments, I’m not sure if the performance/memory footprint of nixExpr would suffer? Probably depends if the laziness of stripAnnotation <$> nixExprLoc reaches the many (annotateLocation (void $ symbol name))-part in nixExprLoc? I’m not sure.

jwiegley · 2018-04-02T23:07:28Z

@Profpatsch I like the idea of adding the comments to the annotations. I wouldn't worry about the speed impact just yet.

deepfire · 2018-04-09T00:30:50Z

@Profpatsch, I'm slightly worried when you mention docstrings -- since they presumably can't be attached at arbitrary points, and also potentially due to the transformations you mention..

Do you envision being able to have an identity parser/pretty-printer with that scheme?

Profpatsch · 2018-04-09T12:59:13Z

Do you envision being able to have an identity parser/pretty-printer with that scheme?

I don’t really have a vision, just incoherent thoughts. :)

hnix already keeps source spans by default:

parseNixFileLoc :: MonadIO m => FilePath -> m (Result NExprLoc)
type NExprLoc = Fix NExprLocF
type NExprLocF = AnnF SrcSpan NExprF

parseNixFile is just a parseNixFileLoc which throws away its SrcSpans (probably doesn’t even instantiate them into memory given enough laziness).

So I don’t see how something like parseNixFileLocComments :: MonadIO m => FilePath -> m (Result (Fix (AnnF CommentSpan (AnnF SourceSpan NExprF)))) changes the status quo.

Ekleog · 2018-04-09T14:28:08Z

Hmm, so you're thinking of tying each comment to an AST item, if I understand correctly the type signature? I'm not sure how you'd be able to do it for all the examples you raised in #57 (comment): maybe full-line comments attach to the token after and part-of-line comments attach to the token before?

(for the context, I'm currently trying to implement a tool for indenting nix code that basically parses then pretty-prints it, and comments are likely the last missing piece to integrate, so I'm looking this quite closely :))

Profpatsch · 2018-04-09T15:23:55Z

@Ekleog I suspect you’d have to just decide on what lexer token you assign them to.

The definition of CommentSpan in #57 (comment) is missing a list I think:

data CommentSpan = CommentSpan
  { srcSpan :: SrcSpan
  , comments :: [Comment] }

because of stuff like { foo /*uh oh*/ = /*oh uh*/ "23" /*test*/; }

Ekleog · 2018-05-09T02:43:09Z

Hmm, but then the issue is to decide to which lexer token to assign the comments?

For a few comments it's easy, like:

nothere
here // my comment
nothere

nothere
here /* my comment */
nothere

nothere

// my comment
here

here
// my comment

nothere

But for others even as a human without additional context I couldn't guess to which token the following are to be assigned?

maybehere /* my comment */ orhere

maybehere

// my comment

orhere

Then I guess it's just good enough to autocratically pick one and say “this is it” :)

taktoa · 2018-08-21T20:55:07Z

It's been a long time since I looked at the hnix source code, so I'm not sure how much rearchitecting this would take, but what do you guys think about this way of integrating comments:

Lex the source file into a stream of tokens, each of which has a corresponding source span
Parse this into an AST, but instead of having source spans in the AST, you have offsets into the stream of tokens (or perhaps you have both).
The stream of tokens contains comments, but they don't result in any AST nodes.

This allows the comment information to be fully preserved and accessible without making the AST cumbersome to use.

costrouc · 2018-10-26T14:05:10Z

In regards to the announced sprint tomorrow at nixcon it was said that adding parsing for comments is a wanted feature. I just wanted to ask that if this feature be added to https://github.com/NixOS/nix so that other language projects can benefit as well! I myself have been working on a python parser and the one thing that is dropped is comments due to the nix ast parser. Mic92/pythonix#2

roberth · 2018-10-27T14:48:13Z

I've had a look at the AST and it seems that for the purpose of formatting nix code reliably, we will need to have access to token spans. This is because the AST simply can not (and should not) represent the comments and/or whitespace in every possible place. Example:

inherit /*a*/ ( /*b*/ foo) bar;

vs Binding constructor:

Inherit !(Maybe r) ![NKeyName r] !SourcePos

Of course preserving all comments is not a requirement for retaining comments, but both a formatter and some kind of comment processing seem to benefit from having a token stream, as @taktoa points out.

I'll see how far I can still get with tokenizing today at the NixCon 2018 hackday.

Ekleog · 2018-10-27T15:55:14Z

As I see people pointing to formatting nix code reliably, I'll point to https://github.com/Ekleog/nix-bikeshed/ so that people don't need to re-do the same thing twice :)

It's more or less working, mostly missing comments at the time being. (actually, running it is how I found out #152: parsing, generating the file and re-parsing didn't give the same answer, and for my generation)

So if some support for comments could be added, it'd be great! :)

domenkozar · 2018-10-27T16:02:21Z

@Ekleog how does it compare to hnix formatter?

Ekleog · 2018-10-27T16:22:48Z

If you're speaking of `Nix.Pretty`, then `nix-bikeshed` handles wrapping overlong lines correctly, and doesn't hesitate to switch between a `''foo''` string and a `"foo"` string (depending on line length), for the two first changes I can see off-the-top-of-my-head. If there's another hnix formatter, then I didn't find about it when I started `nix-bikeshed` a few months ago (before putting it on hold waiting for this issue to be fixed)

roberth · 2018-10-27T17:25:53Z

Sorry guys, I have given up on lexing. My findings:

Error messages will be worse when encountering a lexing error. Most errors may not be lexing errors, but those that are will be awful.
The nix lexer has to support string interpolation. Chances that this can be ported to alex are low. Alex might have improved performance compared to parsing with combinators twice (lex-parse + parse)
Locations where we can't preserve whitespace/comments in the AST seem unlikely to be very problematic in practice
To support a comment in the empty list AST [/* empty because important */] we'll need to use the dual of Ann
We can reference the source text in the annotations to recover preceding and following comments on demand without IO
Reverse scanning can be done, but it won't be pretty. (Also, a better approach to it should exist and the approach should probably depend on the goal)

NorfairKing · 2018-11-03T17:02:37Z

+1

See https://github.com/NorfairKing/nixfmt for an attempt at a normalising formatter.
The only thing missing there are comments, as mentioned in the README.

domenkozar · 2018-11-04T11:42:28Z

@NorfairKing @Ekleog now we have three formatters :) It would be good to see what's the difference and just get hnix formatter to do the correct thing (or provide modes of operation).

NorfairKing · 2018-11-04T11:49:20Z

@domenkozar https://github.com/NorfairKing/nixfmt just provides a CLI wrapper around the hnix parser and pretty-printer, nothing more. If the hnix parser and pretty printer are fixed to deal with comments, nixfmt will work as intended.

domenkozar · 2018-11-04T11:52:14Z

You can format by invoking hnix <filepath> :)

Ekleog · 2018-11-04T14:52:44Z

@domenkozar I think Nix.Pretty doesn't try to format nix code but to pretty-print nix expressions, which is IMO a vastly different objective. For instance, pretty-printing YAML may be just setting it to always-indent like:

foo:
 bar:
  - a
  - b
  - c
 baz:
  - some long stuff
  - other long stuff
  - let us say this overflows max line length

While formatting it would be like:

foo:
 bar: [a, b, c]
 baz:
  - some long stuff
  - other long stuff
  - let us say this overflows max line length

As such, I think the objective of the two projects (Nix.Pretty and nix-bikeshed) don't match, and thus they aren't overlapping :)

As for having hnix do the correct thing, it's another problem, and it could very well call into a formatting library.

domenkozar · 2018-11-04T14:57:23Z

The difference would only be setting how long can a line be, so setting it to 1 char would "pretty-print" it. So it could easily be under the same codebase, I think.

Ekleog · 2018-11-04T15:00:48Z

Well, we totally agree that a formatter can almost do pretty-printing. However I think is that a pretty-printer can't do formatting at all :)

(also, the difference also includes eg. whether ''foo'' can be turned to "foo" which changes the AST and thus can't be included in a pretty-printer, and similar stuff)

NorfairKing · 2018-11-04T15:03:29Z

Just my 2 cents: I don't much care about formatting vs pretty-printing. I just want a single normalized form of a given AST that is vaguely good for humans to edit. Something like hindent or gofmt. If that's done with a formatter instead of a pretty-printer: great, but I'm not (personally) interested in bikeshedding what the format would be.

domenkozar · 2018-11-04T15:05:29Z

@NorfairKing same here. I worked on formatter to do "something sane", but I don't prefer the discussion on the style as much as I value consistency.

The current formatter is used in https://github.com/domenkozar/hnix-lsp and I'd love if hnix would preserve even most basic comments (after lines and in-between).

Ekleog · 2018-11-04T15:14:48Z

If you're interested in something *vaguely* good for humans to edit, you'll likely be OK with a pretty-printer. If you want a tool to add to your CI to ensure your code is properly formatted, you'll likely want a formatter. I think this sums up the difference I see, and the reason why I think `Nix.Pretty`, by its name, doesn't try to handle the problem `nix-bikeshed` is trying to handle. But maybe I'm wrong here, I'll let @jwiegley contradict me :) Also, a reason why I don't think `nix-bikeshed` should live under `hnix` is due to its name: it'll likely see a lot of churn because no one will agree on the way things should be formatted, if we want to enforce formatting across nixpkgs, and so keeping the issue tracker separate sounded like a net win to me. BTW, the reason why I didn't announce `nix-bikeshed` until now is because it doesn't support comments yet, and I believe if it doesn't support comments then it's not a formatter -- while it's already a pretty-printer -- and I thus didn't want to ping people on a half-baked thing that wouldn't be actually usable :)

jwiegley · 2018-11-05T17:40:27Z

One advantage of pretty printing over formatting: smaller diffs. With automated formatting, you never know when it's going to decide to completely rewrite a large block of code because some limit has been reached.

expipiplus1 · 2020-10-23T15:58:05Z

Did anything ever come of this?

I think that even a simple solution which recognizes comments only in specific places would be quite useful!

expipiplus1 · 2020-10-25T09:46:20Z

I've written a pretty basic comment annotator here: https://github.com/expipiplus1/update-nix-fetchgit/blob/2e9df344c40e289920e94c82dbce2d87ae54ecc5/src/Nix/Comments.hs

It simply attaches comments to the expression which immediately precedes them (so multiple expressions may own the same comment). It also only handles single line comments.

Requires a fix for #743 (#744)

Anton-Latukha · 2021-07-31T08:46:46Z

Would mention the core point of current comment processing:

hnix/src/Nix/Parser.hs

Lines 498 to 506 in 57a55fd

    
           whiteSpace :: Parser () 
        
           whiteSpace = do 
        
             put =<< getSourcePos 
        
             Lexer.space space1 lineCmnt blockCmnt 
        
            where 
        
             lineCmnt  = skipLineComment' "#" 
        
             blockCmnt = Lexer.skipBlockComment "/*" "*/"

So, indeed (as points the topic), currently, HNix parser processes comments as whitespaces.

Profpatsch mentioned this issue Jul 17, 2017

nixpkgs: poor mans doc blocks wip / rfc / etc NixOS/nixpkgs#27394

Closed

jwiegley added the enhancement Betterness, without new explicit feature possibilities label Apr 12, 2018

angerman mentioned this issue Feb 28, 2019

docs: Update import instructions in user guide input-output-hk/haskell.nix#77

Merged

Anton-Latukha added the documentation Art of how not to explain. label May 21, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comment constructor #57

Comment constructor #57

Profpatsch commented May 15, 2017 •

edited

Loading

jwiegley commented May 15, 2017

Profpatsch commented May 16, 2017

jwiegley commented May 28, 2017

domenkozar commented Jun 27, 2017 •

edited

Loading

Profpatsch commented Jun 28, 2017 •

edited

Loading

Profpatsch commented Jun 28, 2017 •

edited

Loading

jwiegley commented Apr 2, 2018

deepfire commented Apr 9, 2018 •

edited

Loading

Profpatsch commented Apr 9, 2018

Ekleog commented Apr 9, 2018

Profpatsch commented Apr 9, 2018

Ekleog commented May 9, 2018

taktoa commented Aug 21, 2018

costrouc commented Oct 26, 2018

roberth commented Oct 27, 2018

Ekleog commented Oct 27, 2018

domenkozar commented Oct 27, 2018

Ekleog commented Oct 27, 2018 via email

roberth commented Oct 27, 2018

NorfairKing commented Nov 3, 2018 •

edited

Loading

domenkozar commented Nov 4, 2018

NorfairKing commented Nov 4, 2018 •

edited

Loading

domenkozar commented Nov 4, 2018 •

edited

Loading

Ekleog commented Nov 4, 2018

domenkozar commented Nov 4, 2018

Ekleog commented Nov 4, 2018

NorfairKing commented Nov 4, 2018

domenkozar commented Nov 4, 2018 •

edited

Loading

Ekleog commented Nov 4, 2018 via email

jwiegley commented Nov 5, 2018

expipiplus1 commented Oct 23, 2020

expipiplus1 commented Oct 25, 2020 •

edited

Loading

Anton-Latukha commented Jul 31, 2021 •

edited

Loading

Comment constructor #57

Comment constructor #57

Comments

Profpatsch commented May 15, 2017 • edited Loading

jwiegley commented May 15, 2017

Profpatsch commented May 16, 2017

jwiegley commented May 28, 2017

domenkozar commented Jun 27, 2017 • edited Loading

Profpatsch commented Jun 28, 2017 • edited Loading

Profpatsch commented Jun 28, 2017 • edited Loading

jwiegley commented Apr 2, 2018

deepfire commented Apr 9, 2018 • edited Loading

Profpatsch commented Apr 9, 2018

Ekleog commented Apr 9, 2018

Profpatsch commented Apr 9, 2018

Ekleog commented May 9, 2018

taktoa commented Aug 21, 2018

costrouc commented Oct 26, 2018

roberth commented Oct 27, 2018

Ekleog commented Oct 27, 2018

domenkozar commented Oct 27, 2018

Ekleog commented Oct 27, 2018 via email

roberth commented Oct 27, 2018

NorfairKing commented Nov 3, 2018 • edited Loading

domenkozar commented Nov 4, 2018

NorfairKing commented Nov 4, 2018 • edited Loading

domenkozar commented Nov 4, 2018 • edited Loading

Ekleog commented Nov 4, 2018

domenkozar commented Nov 4, 2018

Ekleog commented Nov 4, 2018

NorfairKing commented Nov 4, 2018

domenkozar commented Nov 4, 2018 • edited Loading

Ekleog commented Nov 4, 2018 via email

jwiegley commented Nov 5, 2018

expipiplus1 commented Oct 23, 2020

expipiplus1 commented Oct 25, 2020 • edited Loading

Anton-Latukha commented Jul 31, 2021 • edited Loading

Profpatsch commented May 15, 2017 •

edited

Loading

domenkozar commented Jun 27, 2017 •

edited

Loading

Profpatsch commented Jun 28, 2017 •

edited

Loading

Profpatsch commented Jun 28, 2017 •

edited

Loading

deepfire commented Apr 9, 2018 •

edited

Loading

NorfairKing commented Nov 3, 2018 •

edited

Loading

NorfairKing commented Nov 4, 2018 •

edited

Loading

domenkozar commented Nov 4, 2018 •

edited

Loading

domenkozar commented Nov 4, 2018 •

edited

Loading

expipiplus1 commented Oct 25, 2020 •

edited

Loading

Anton-Latukha commented Jul 31, 2021 •

edited

Loading