Add locations to the AST #21

Chris00 · 2013-09-04T15:50:44Z

Adding locations to the AST would be very useful for reporting errors. Say, for example, you want to run some code but it fails. Without locations, it is difficult to report a helpful error message to the user.

pw374 · 2013-09-04T15:52:00Z

Can you be more specific? I mean, there is no notion of error in Markdown...

Chris00 · 2013-09-04T16:01:05Z

It is not for Markdown processing in itself. It is for what you do once the markdown has been processed (e.g. check that the code is correct).

pw374 · 2013-09-04T16:12:59Z

OK, I understand that basically, for a node in the AST, you want to be able to know which Markdown expression led to it. I agree that it would be nice to have this feature.

I'm currently thinking that this might need a lot of code refactoring... I have to think more.
(The first problem I see is that we'd need good locations for the token list manipulated by the parser. When it comes out of the lexer, the list is certainly correct, but after a while the parser is likely to have messed with the list and made it weird. So the locations would have to be robust to the parser's manipulations. What I mean is that I don't think we can easily retrieve locations just by looking at the current token list, whereas if the parser were nicer with this list, we'd just have to read the list...)
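
To make the robustness concern concrete, here is a minimal sketch, assuming every token carries its own span from the lexer. All type and function names (`pos`, `span`, `token`, `span_of_tokens`) are hypothetical and are not OMD's actual definitions. Because the span of a group of tokens is computed from the spans the tokens carry, it does not depend on the order the parser left them in.

```ocaml
(* Hypothetical types for illustration only; not taken from OMD. *)
type pos = { line : int; col : int }
type span = { start : pos; stop : pos }

(* Each token remembers where the lexer found it. *)
type token = { text : string; at : span }

let compare_pos a b = compare (a.line, a.col) (b.line, b.col)

(* Smallest span that covers both arguments. *)
let merge a b =
  { start = (if compare_pos a.start b.start <= 0 then a.start else b.start);
    stop = (if compare_pos a.stop b.stop >= 0 then a.stop else b.stop) }

(* Span covering a list of tokens, whatever order they are in. *)
let span_of_tokens = function
  | [] -> None
  | t :: ts -> Some (List.fold_left (fun acc t -> merge acc t.at) t.at ts)
```

This only covers tokens that survive parsing; tokens the parser synthesizes or drops would still need a policy of their own.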

If you have suggestions on how to do it, don't hesitate ;)

Chris00 · 2013-09-04T16:18:47Z

There is more. Since we intend to pre-process some files, an annotation like the # in OCaml must be supported to be able to add HTML code without losing the locations. Think about the 99problems page on the web site: the fact that the answers are hidden requires wrapping them with HTML code, but if the solution contains an error (say, a syntax problem), then we want to tell the author using the locations in the original file.
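
As a rough sketch of what such an annotation could do, assume the pre-processor emits lines of the form `# 10 "answers.md"`, borrowed from OCaml line directives, meaning "the next line is line 10 of answers.md". The directive syntax and the names `parse_directive` and `resolve` are assumptions for illustration, not an existing OMD feature.

```ocaml
(* Hypothetical directive syntax: # <line> "<file>" on a line of its own. *)
let parse_directive line =
  try Scanf.sscanf line "# %d %S%!" (fun n file -> Some (n, file))
  with Scanf.Scan_failure _ | End_of_file | Failure _ -> None

(* Pair every non-directive line with its (file, line) in the original
   source. A directive resets the position for the lines that follow it. *)
let resolve ~file lines =
  let step (cur_file, cur_line, acc) l =
    match parse_directive l with
    | Some (n, f) -> (f, n, acc)
    | None -> (cur_file, cur_line + 1, (cur_file, cur_line, l) :: acc)
  in
  let _, _, acc = List.fold_left step (file, 1, []) lines in
  List.rev acc
```

A real design would also have to pick a syntax that cannot collide with Markdown headings, since a heading such as `# 2013 "quoted"` would be read as a directive here.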

pw374 · 2013-09-04T16:24:03Z

Yes of course, if we know where we come from (at the AST level), then it's easy to produce the locations, in say HTML comments or something more hack-ish like empty span tags. This could be for each block (e.g., paragraph, blockquote, ...), for instance...
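
A minimal sketch of the HTML-comment idea, assuming each rendered block already knows the source line it started on. The `block` record and `render_with_locations` are hypothetical names, not part of OMD.

```ocaml
(* Hypothetical: a rendered block plus the source line it came from. *)
type block = { line : int; html : string }

(* Prefix each block with an HTML comment of the form <!-- file:line -->. *)
let render_with_locations ~file blocks =
  blocks
  |> List.map (fun b -> Printf.sprintf "<!-- %s:%d -->\n%s" file b.line b.html)
  |> String.concat "\n"
```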

darioteixeira · 2014-08-08T11:26:01Z

I've added Markdown support to Lambdoc via OMD (see here for the code). And indeed, the major problem is still the lack of location information. This is necessary in Lambdoc for two main reasons: first, because Lambdoc allows customisable feature sets (you may want to forbid your users from formatting text as bold, for a silly example); second, because not all OMD features are present in Lambdoc (nesting beyond H3, for instance). In both cases I need to know the line number where the offending input occurred, so I can present a user-friendly error message.
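
To illustrate the kind of check this would enable, here is a small sketch under the assumption that each block carries the line it started on. The types and the `check` function are hypothetical and mirror neither OMD's nor Lambdoc's real definitions.

```ocaml
(* Hypothetical, simplified document types for illustration only. *)
type inline = Text of string | Bold of inline list
type desc = Heading of int * inline list | Paragraph of inline list
type block = { line : int; desc : desc }

(* A customisable feature set: is bold allowed, and how deep may headings go? *)
type features = { allow_bold : bool; max_heading : int }

let has_bold inlines =
  List.exists (function Bold _ -> true | Text _ -> false) inlines

(* Return (line, message) pairs for every block that uses a disabled feature. *)
let check features blocks =
  List.concat_map
    (fun b ->
      let inlines, level_errors =
        match b.desc with
        | Heading (n, inlines) when n > features.max_heading ->
            (inlines,
             [ (b.line, Printf.sprintf "heading level %d is not supported" n) ])
        | Heading (_, inlines) | Paragraph inlines -> (inlines, [])
      in
      let bold_errors =
        if (not features.allow_bold) && has_bold inlines then
          [ (b.line, "bold text is not allowed") ]
        else []
      in
      level_errors @ bold_errors)
    blocks
```

Without the `line` field, the same check could still detect the offending input but could not say where it is.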

nojb · 2020-06-20T14:11:54Z

I am not 100% sure, but this may make the cut for 2.0.

shonfeder · 2021-02-20T03:36:48Z

I think the discussion on #223 is relevant here (arguably #223 is a generalization of this issue).

IIUC, if we start recording the line numbers that certain tokens were taken from during parsing, then we are definitely not constructing an AST any more. We are building a parse tree.

It seems clear there are a lot of users who would benefit from having access to a detailed parse tree of the markdown!
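
One way to picture the distinction, offered only as a sketch and not as the design discussed in #223: a tree parameterised by an annotation type can carry locations when a detailed tree is wanted, and be mapped to a plain tree when it is not. The `inline` type, `map`, and `strip` below are hypothetical names.

```ocaml
(* Hypothetical tree whose nodes each carry an annotation of type 'a. *)
type 'a inline = Text of 'a * string | Emph of 'a * 'a inline list

(* Apply f to every annotation, keeping the shape of the tree. *)
let rec map f = function
  | Text (a, s) -> Text (f a, s)
  | Emph (a, xs) -> Emph (f a, List.map (map f) xs)

(* Forget all annotations, leaving a plain tree for higher-level uses. *)
let strip t = map (fun _ -> ()) t
```

With line numbers as the annotation, `strip (Emph (1, [ Text (1, "x") ]))` yields `Emph ((), [ Text ((), "x") ])`.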

shonfeder · 2021-02-20T03:44:38Z

I've suggested an approach to accommodate this kind of feature while still producing an AST for higher-level uses here: #223 (comment)

shonfeder · 2021-05-29T02:13:04Z

@sonologico found a better approach for an AST that can support this in #234, so I think we have a good way forward.

The next step will be to work out a sensible way of working this kind of additional information into the parsing routine.

artempyanykh · 2022-04-25T15:51:54Z

As another use-case, having location information would be great for LSP servers too.

For context, I'm talking about this kind of LSP server for Markdown. I wrote it in Rust but after some time found the ceremony around ownership/borrowing rather exhausting and now am thinking about rewriting it in either OCaml or F#.

pw374 mentioned this issue Oct 1, 2013

reference-links not handled like pandoc or github #31

Closed

darioteixeira mentioned this issue Aug 8, 2014

TyXML #82

Open

pw374 added this to the 1.1.0 milestone Aug 10, 2014

darioteixeira mentioned this issue Oct 13, 2014

markdown: reports incorrect location when unsupported feature is used darioteixeira/lambdoc#24

Closed

shonfeder mentioned this issue Mar 13, 2021

Add a parse tree structure (a CST) in addition to the AST in order to preserve all relevant details of the original markdown #223

Open

shonfeder removed this from the 1.1.0 milestone May 29, 2021
