Ensure every CST element has a SOURCE, suggest how to capture whitespace and comments #28

eschulte · 2018-05-24T19:06:55Z

I'd like to be able to parse a source file into a tree structure which includes all information required to build the source (including whitespace and comments). This would be easier if every parsed CST element has a valid associated source range. Currently that isn't the case for (at least) QUOTE elements. See:

SEL/LISP-DIFF> (labels ((maptree (fn tree)
                          (if (concrete-syntax-tree:consp tree)
                              (cons
                               (maptree fn (concrete-syntax-tree:first tree))
                               (maptree fn (concrete-syntax-tree:rest tree)))
                              (funcall fn tree))))
                 (maptree #'concrete-syntax-tree:source
                          (with-input-from-string (in "'(1 2 3 4)")
                            (cst-read in))))
(NIL ((2 . 3) (4 . 5) (6 . 7) (8 . 9)))

I've taken a shot at building this using cst-read and record-skipped-input with some custom code to fold everything together and add whitespace. See https://github.com/GrammaTech/sel/blob/master/ast-diff/lisp-diff.lisp#L71, is there a better way?

Thanks!

The text was updated successfully, but these errors were encountered:

scymtym · 2018-05-24T21:35:23Z

There are two separate questions here:

Can't we ensure that every CST node has an associate source range?

For eclector.concrete-syntax-tree:cst-read we take the position that not all CST nodes have a corresponding source range.
- One reason are reader macros which can fabricate arbitrary expressions not related to the source at all. The symbol quote in (quote 1 2 3 4) is a simple example of this.
- Another reason are parts of the list structure that have no representation in the input string:
```
(labels ((maptree (fn tree)
           (if (concrete-syntax-tree:consp tree)
               (list
                (funcall fn tree) ; CHANGED
                (maptree fn (concrete-syntax-tree:first tree))
                (maptree fn (concrete-syntax-tree:rest tree)))
               (funcall fn tree))))
  (maptree #'concrete-syntax-tree:source
           (with-input-from-string (in "'(1 2 3 4)")
             (eclector.concrete-syntax-tree:cst-read in))))
((0 . 10) NIL (NIL ((1 . 10) (2 . 3) (NIL (4 . 5) (NIL (6 . 7) (NIL (8 . 9) NIL)))) NIL))
```
A custom client can of course implement a different behavior. To this end, consider the wip-parse-result-protocol-2 branch and in particular https://github.com/robert-strandh/Eclector/blob/wip-parse-result-protocol-2/code/parse-result/second-climacs-test.lisp .
Is there a better way to integrate skipped input into the parse result?

Again, consider the wip-parse-result-protocol-2 branch and in particular https://github.com/robert-strandh/Eclector/blob/wip-parse-result-protocol-2/code/parse-result/second-climacs-test.lisp .

There is a good chance we will do something very similar in the master branch to better support clients other than cst-read.

eschulte · 2018-06-08T02:19:32Z

Thank you for the pointer to that branch. I hope this functionality makes its way into master. Using the branch and adopting the example you cited, I was able to get exactly what I wanted (a tree structure with sufficient information to fully reproduce the full text of the source file) with a much simpler implementation. See https://github.com/GrammaTech/sel/blob/wip-lisp-diff-uses-eclector-parse-result-protocol-2/ast-diff/lisp-diff.lisp

eschulte · 2019-09-06T16:40:07Z

To followup, over a year later we're still using this branch. Any chance this functionality will (or maybe has already) make its way to master?

scymtym · 2019-09-06T16:49:41Z

Any chance this functionality will (or maybe has already) make its way to master?

This has not happened yet, but several steps towards that goal are underway.

scymtym · 2019-09-06T16:52:42Z

To clarify, the parse-result protocol has made it into master, but I'm not sure it fully supports this use-case.

scymtym · 2019-09-07T16:30:43Z

I made GrammaTech/sel#11. Let me know if that works for you.

scymtym added the question Further information is requested label May 24, 2018

scymtym self-assigned this May 24, 2018

scymtym closed this as completed in c3de63d Aug 9, 2018

scymtym reopened this Sep 6, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ensure every CST element has a SOURCE, suggest how to capture whitespace and comments #28

Ensure every CST element has a SOURCE, suggest how to capture whitespace and comments #28

eschulte commented May 24, 2018

scymtym commented May 24, 2018

eschulte commented Jun 8, 2018

eschulte commented Sep 6, 2019

scymtym commented Sep 6, 2019

scymtym commented Sep 6, 2019

scymtym commented Sep 7, 2019

Ensure every CST element has a SOURCE, suggest how to capture whitespace and comments #28

Ensure every CST element has a SOURCE, suggest how to capture whitespace and comments #28

Comments

eschulte commented May 24, 2018

scymtym commented May 24, 2018

eschulte commented Jun 8, 2018

eschulte commented Sep 6, 2019

scymtym commented Sep 6, 2019

scymtym commented Sep 6, 2019

scymtym commented Sep 7, 2019