-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AST representation changes #234
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few suggestions, but they are just that. Let me know if you think any of them aren't for the better :)
I'm psyched to get this in place!
| Hard_break of 'attr | ||
| Soft_break of 'attr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit strange at first to see these constructors take attributes, tho we have a good reason. Perhaps a comment on these constructors (or above this type declaration) to explain why we want to old space for attrs throughout would be helpful for future readers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. Haven't added a comment yet. Not sure if I want to add a comment here or at the "external interface" level (omd.mli
). I'm leaning towards adding it here. I'll add something before we merge this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a comment. Let me know if it's good enough.
src/ast.ml
Outdated
type raw = (string, attributes) block | ||
type t = (attributes inline, attributes) block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just an idea for consideration -- one reason we might think about nested modules here is for namespacing like:
module Raw = struct
type inline = string
type t = (inline, attributes) block
end
module Plain = struct
type inline = attributes inline
type t = (inline, attributes) block
end
module SourceMap = struct
type loc
type inline = loc inline
type t = (inline, loc) block
end
We might also end up with some functors that would apply to any of these ast representations. That might be too speculative, and perhaps not worth the complication currently, but just wanted to float the idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see the benefit of it. I don't feel strongly about it, but I would either
- keep it as is and wait until we have a need for functors (which is what I'd favour)
- do it in the previous style in which the constructors are part of a functor so that they are also namespaced
Let me know what you think after I have changed the rest of the code. If we're going with this, I think that removing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good to me! Just two small comments.
After having propagated the AST through the code base, what's your feeling on its merits compared to the current representation? Do you feel like it's an improvement to work with?
src/omd.mli
Outdated
| Html_block of 'attr * string | ||
| Definition_list of 'attr * 'a def_elt list | ||
|
||
type doc = (attributes, attributes inline) block list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about exposing the Ast.t
in this interface and then using it here?
type doc = (attributes, attributes inline) block list | |
type doc = attributes Ast.t list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the thing that is making me consider moving back to using a functor as we were using before in the Ast
module. There's no way to expose block without a parameter here. It feels weird to have Omd.block
, Omd.t
and Omd.doc
.
I'd like to either
- Pick a different name for
block
and use the nameblock
fort
and update theAst
module to match it - Go back to having the functor in the
Ast
module as it was before so thatblock
doesn't have a type parameter and we don't need at
.
Let me know what you think and I'll try to wrap up this up by tomorrow so that we can unblock the other things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid I don't quite follow. I think it makes sense to have a these two different type constructors exposed:
('attr, 'a) block
: types for block-level elements of any attributes and any content'attr Ast.t
: types for the standard markdown AST, over any attributes
And to have this additional type:
doc = attributes t list
: the type for the standard representation of a markdown document, with the standard attributes.
Tho I could also see an argument Ast.t
being defines as a list of blocks.
(Well, if I were designing from scratch, I'd probably call doc
Omd.t
, but that's inessential.)
That said, if you think some use of simple functors will improve the interface and make a nicer presentation of types, I wouldn't be opposed. Do you want to sketch out what you have in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Ast
module used a functor. I'm thinking of reintroducing that so that Omd
can have the following signature:
(* ... *)
type 'attr inline =
| Concat of 'attr * 'attr inline list
| Text of 'attr * string
| Emph of 'attr * 'attr inline
| Strong of 'attr * 'attr inline
| Code of 'attr * string
| Hard_break of 'attr
| Soft_break of 'attr
| Link of 'attr * 'attr inline link
| Image of 'attr * 'attr inline link
| Html of 'attr * string
(* ... *)
type 'attr block =
| Paragraph of 'attr * 'attr inline
| List of 'attr * list_type * list_spacing * 'attr block list list
| Blockquote of 'attr * 'attr block list
| Thematic_break of 'attr
| Heading of 'attr * int * 'attr inline
| Code_block of 'attr * string * string
| Html_block of 'attr * string
| Definition_list of 'attr * 'attr def_elt list
type doc = attributes block list
With the current setup in my branch, we can't really expose Ast.t in Omd without exposing the Ast module as well (maybe with a restricted signature). I find that this Omd signature, the solution currently in my branch, is awkward:
(* ... *)
type ('attr, 'a) block =
| Paragraph of 'attr * 'a
| List of 'attr * list_type * list_spacing * ('attr, 'a) block list list
| Blockquote of 'attr * ('attr, 'a) block list
| Thematic_break of 'attr
| Heading of 'attr * int * 'a
| Code_block of 'attr * string * string
| Html_block of 'attr * string
| Definition_list of 'attr * 'a def_elt list
type 'attr t = ('attr, 'attr inline) block (* equivalent to Ast.t *)
type doc = attributes t list
(** A markdown document *)
I would either change the name of block
to something else and change t
to block
or change t
to something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added back the functor to the Ast
module. There's no real cons internally besides a heavier weight syntax. I prefer the simpler external interface with a single parameter for the attributes on the block
and inline
types in Omd
.
If you give me your approval, I'll squash the commits and merge this.
I'm fine with narrower scoping of the module if you think it's an improvement 👍 |
I think it's an improvement overall.
|
d15477e
to
950bb76
Compare
c6d49b2
to
67cf3a1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not crazy about the functorial code here, but that's inherited :)
I suspect we can maybe find a nicer way of wrangling these types in the future, but I think the external API exposed in terms of the AST won't change!
LGTM!
* Remove the record wrapper for each node and inline the attribute field * Make the block type parametric with the goal of using this later to have a version of the AST with source locations and whatever information we might want per node.
67cf3a1
to
2bba77e
Compare
As you probably know, leaving it unconstrained can result in opam trying to build sail with omd 2.0.0~alpha2, which ends up producing an error something like: ``` File "latex.ml", line 226, characters 6-21: 226 | | Paragraph elems -> ^^^^^^^^^^^^^^^ Error: The constructor Paragraph expects 2 argument(s), but is applied here to 1 argument(s) ``` ... due to omd 2's revamp of the AST (<ocaml/omd#234>). Anyway, it turns out that opam can still end up picking sail 0.13 and earlier, which (unlike version 0.14) don't yet have this constraint, and so still fail fail to build in this manner. I'm not really sure why sail 0.14 chose the constraint "omd<1.4", but in the interest of monotonicity, I elected to apply the same one retroactively: ``` sh opam admin add-constraint 'omd<1.4' ``` It may seem alarming to update existing packages in this way, but apparently this is the state-of-the-art in opam repository administration at the moment. (Meaning: nobody's managed to figure out how to avoid needing to do it.) Apparently, nothing too terrible happens as long as you restrict such changes to tightening up version constraints, and don't do so in ways that would prevent something that worked before from working now. Since this change only excludes two versions of omd that currently exist (2.0.0~alpha1 and 2.0.0~alpha2), which as we've seen don't work for sail, we should be fine.
Opening the PR to continue the discussion started on #228.
Changes to the AST are: