Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let's talk about joined effort #156

Open
jaapio opened this issue Aug 13, 2021 · 71 comments
Open

Let's talk about joined effort #156

jaapio opened this issue Aug 13, 2021 · 71 comments

Comments

@jaapio
Copy link

jaapio commented Aug 13, 2021

Thanks @stof for reaching out to the phpDocumentor team. I would like to talk to you guys for a moment about the doctrine/rst-parser. Which we integrated in phpDocumentor. More that a year ago we had contact with Jonathan Wage about the library. If I remember it correctly he forked the original library to a doctrine version. Back than there was nearly any maintenance on the lib. And from the doctrine team there was not much interest to support any additional features and improvements. That’s why we (@mvriel and me) have forked the doctrine version and started working on it.

With phpDocumentor we are trying to get a combination of API docs like we are rendering for years, with hand written manuals like doctrine and symfony are providing. In combination with the templates phpDocumentor provides and all static information we know about the code this should raise the level of documentation for PHP in general. That’s our goal.

Beside the missing support from the doctrine team which slowed us down in the additional features for phpDocumentor we wanted to do some large refactoring in the code to make it faster and fit better into the application workflow of phpDocumentor. So we have split the parsing and rendering in our version and some other improvements which allow us in the end to cache the parser results of each document, do some compile steps like we always do on cached items and then start the rendering. This includes also some major changes that are reducing the memory footprint of the library.

But now the there is some more movement on this project I think we can all benefit from the changes we are doing on both sides. And maybe bring it back into a single project which is maintained by all of us.

Please let me know if you are interested to join our forces and make some nice magic happen with the documentation of all the great projects out there.

@greg0ire
Copy link
Member

greg0ire commented Aug 14, 2021

This sounds good to me! How do you propose we move forward with this? Are you going to send PRs to us, or would you rather have everyone work on a fork of doctrine/rst-parser in the phpDocumentor namespace, or some other namespace?

Consider joining #rst-parser at https://www.doctrine-project.org/slack

If you want to get merge permissions on this repository, that should be possible too.

@wouterj
Copy link
Collaborator

wouterj commented Aug 14, 2021

Hi @jaapio! Thanks for this issue, I didn't know phpDocumentor maintained their own fork.

As you might know already, we've started using this to render some docs on symfony.com (e.g. twig and swiftmailer) and are planning to slowly integrate it in all doc build processes. One of the main reasons to use this repository instead of starting from scratch was because we would start with something that is (somewhat) battle-tested and has other communities that are involved in maintenance. I think joining efforts makes all the sense, as it would mean that in the end we get a better package, and with less efforts from each individual contributor.

I'll let you and the Doctrine team figure out what the best way to join efforts is :)

@jaapio
Copy link
Author

jaapio commented Aug 14, 2021

I think our first step would be it describe what all projects need from this library? Do you want just an easy way to parse the RST into a html website? Or do you have more tooling around this library? When I look at the doctrine website project, I see a lot of overlap between phpDocumentor & planned features of phpDocumentor for example.

From my point of view, it would make sense to move the library into the phpDocumentor organisation. Since we are focussing on documentation :-) while Symfony is a framework, and doctrine is about databases. That would help people to look around and find a working application with this component. Documentation is our primary focus, so you can focus on what is actually needed for your projects. Writing that documentation :-)

Merging the work from both sides could be a challenge, we moved around a lot of parts. And I'm not entirely sure if we can just extract the Guides namespace from phpDocumentor at this moment. We didn't focus on keeping it as a separate component. But we can start working on that of course. To be complete, we forked this library, but it took too much effort for us to update the library outside phpDocumentor. If you want to have a look at what we did: https://github.com/phpDocumentor/phpDocumentor/tree/master/src/Guides. The fork is unmaintained.

Use phpDocumentor?

What is holding you back from using phpDocumentor, if we would support RST? I'm not trying to force you to use phpDocumentor, but I'm trying to understand what you need from this lib. Since the output of this library for you both is html rendered by twig. Which is exactly what phpDocumentor does. I think our current templating system is flexible enough to support everything you need?

Our plans / dreams

We are planning to have a tight integration between the api and guides. We are planning to do much more that just RST. Like integrating class diagrams based on packages defined in your code. Which will automatically update with your code.

Versions

It is not visible yet, but phpDocumentor will support versions in the future like the symfony documentation has right now. Which will allow you to link between versions.

Linking between projects

Inspired by sphinx, we would introduce some inter-project links. Which would allow you to link between projects, even from symfony to doctrine for example. If symfony and doctrine would both use phpDocumentor it would even allow other private projects to link to the symfony and doctrine documentation.

AST

Like API docs, RST parser in phpDocumentor should produce a cached AST, which would allow consumers of phpDocumentor to do the post processing themself. Like the transformation from objects to HTML or json?

Plugins

Support everything in phpDocumentor will be impossible. So phpDocumentor will support a way to affect our internal AST. Or extend the AST to render your own annotations for example. This is what we are planning to do for wordpress for example.

Final remark

Reading all this, I notice that I'm selling phpDocumentor here :-) and to be clear. I'm not stating that you MUST move to phpDocumentor. it is just to make clear what we are up to. ofcourse it would be an honor to serve projects like Doctrine and Symfony :-). Having users of your projects make working on it even more fun 😎

@greg0ire
Copy link
Member

What strikes me at first, is how different both trees look when I compare https://github.com/doctrine/rst-parser/tree/0.5.x/lib and https://github.com/phpDocumentor/phpDocumentor/tree/master/src/Guides … but maybe things are not so different in terms of usage?

Also, the README is not very reassuring 😅

Anything regarding Guide generation, in this folder or elsewhere, is to be considered experimental and prone to be changed or removed without notice or consideration for BC. This is a follow-up to an earlier POC and can, at best, seen as an incubator project.

From my point of view, it would make sense to move the library into the phpDocumentor organisation. Since we are focussing on documentation :-) while Symfony is a framework, and doctrine is about databases

Doctrine is no longer solely about databases, we have packages that are used independently from doctrine/dbal, like our coding standard, or the annotations library, or this library. Still, it would make sense to move the library to phpDocumentor. But it would really have to be a move I think, as in "a copy of the project", with only the namespaces changed, to make the migration easy. Then maybe you could port changes from https://github.com/phpDocumentor/phpDocumentor/tree/master/src/Guides while respecting semver (picking different branches depending on whether the change is BC or not?)

BC is important here because we have packages that depend on us: https://packagist.org/packages/doctrine/rst-parser/dependents?order_by=downloads

@mvriel
Copy link
Contributor

mvriel commented Aug 14, 2021

Also, the README is not very reassuring 😅

Please view the README from phpDocumentor's perspective. We needed to do a lot of work to make the library fit into a more abstract architecture.

As for the difference, we have had to (and still are) refactoring significant parts to make the library generate a cacheable AST. The version from which we took the fork had cross dependencies between the Node, Environment and Configuration that made it nigh impossible to split the process into the generation of a cachable AST, and in loading that AST and producing generated output with it.

Another big investment that we needed to do was to change the way the TOC was handled so that we could unify the TOCs of the API parser and the RST parser. We have brought the API parser towards the RST parser and vice versa to make this work.

And another significant challenge, is that the Span processing is currently not functioning to spec; the spec states it should be able to parse a Block, but the current implementation has a limited subparser which we are changing into handling complete blocks.

@mvriel
Copy link
Contributor

mvriel commented Aug 14, 2021

The thing I forgot to mention, I have tried hard to keep the top level API between the libraries similar so that the entrypoint of the original library could still be made to work. But for the Nodes and Processors it was inevitable to break BC due to architectural issues and coupling

@jaapio
Copy link
Author

jaapio commented Aug 15, 2021

Doctrine is no longer solely about databases, we have packages that are used independently from doctrine/dbal, like our coding standard, or the annotations library, or this library.

you are right, I was a bit too short there. :-)

I noticed I pointed you to the wrong directory...
https://github.com/phpDocumentor/phpDocumentor/tree/master/src/Guides/RestructuredText should look more familiar :-D

@greg0ire
Copy link
Member

It does look less different 😄

Do you plan to extract RestructuredText or at least Guides to a separate repository at some point? If yes, then a plan could be:

  • port work done on rst-parser since you forked to that separate repository;
  • migrate Doctrine and Symfony to it;
  • abandon rst-parser in favor of it.

@javiereguiluz
Copy link
Contributor

javiereguiluz commented Sep 8, 2021

From the Symfony point of view, the current situation is:

  1. We use Doctrine RST Parser strictly as a RST parser and nothing more
  2. We use Symfony Doc Builder (https://github.com/symfony-tools/docs-builder) on top of it to provide customizations and things strictly related to Symfony's RST and not RST in general

Would "Guides" be (1) + (2) (an RST parser + customizations needed for phpDocumentor) or keep being only (1) like current Doctrine RST parser?

If "Guides" is (1) + (2) and Doctrine RST parser is abandoned, we'll need to keep our own fork of this project. We really need an agnostic RST parser that only does that. Thanks!

@wouterj
Copy link
Collaborator

wouterj commented Sep 8, 2021

@javiereguiluz the "Guides" project started based on the symfony-tools/docs-builder, and now also includes doctrine/rst-parser. So "Guides" is (1) + (2).

However, we don't need to maintain a fork of this library, as it is also our docs-builder. We would only maintain a project that contains e.g. our custom theme and node customizations. At least, that is what I concluded when @jaapio shared some experiments he did with the Symfony docs (see also my posts on slack about this).

@javiereguiluz
Copy link
Contributor

javiereguiluz commented Sep 8, 2021

I'm all for joining efforts 💪 but if the new project is more than a RST parser, then sooner or later it will do things that we don't need or that even "hurt us" and we need to undo or neutralize with code somehow.

I really think we need "parser + custom doc builder" ((1) and (2) separated) instead of a "custom but general-purpose doc builder" ((1) + (2)).

@wouterj
Copy link
Collaborator

wouterj commented Sep 8, 2021

If the general-purpose doc builder is just as (if not more) configurable as the original parser, I don't see why it would be bad.

As I see it, we get more maintainers for both the parser and the docs builder. At the very least, this allows us to remove some features from the docs builder (as they are maintained by phpDocumentor Guides). In the most ideal scenario, this would allow us to remove everything but custom nodes, node visitors (to be implemented) and templates from the docs-builder.

@mvriel
Copy link
Contributor

mvriel commented Sep 9, 2021

The goal with phpDocumentor is to provide a complete documentation solution that will help you write documentation in RestructuredText but also generate (Code) API Documentation; and allow linking between the two with complete support for multiple versions and components.

In all honesty, the Symfony documentation is our inspiration on which features to support since it is the best, most extensive and feature-rich documentation we currently see. Our conviction is that if we can provide a splendid solution for Symfony, including customizability, then it will excel for other projects as well.

As such, I find it hard to think of a situation where we would hurt the documentation efforts; but rather make it easier. Especially tasks such as crosslinking between different versions of documents is a very interesting use-case that we want to fully support.

@mvriel
Copy link
Contributor

mvriel commented Sep 9, 2021

If you have any concerns or concrete examples where you see a possible issue; please do mention this. At the very least we could consider an extension/customization point for a challenge that falls beyond the realm that we can include in the core.

@javiereguiluz
Copy link
Contributor

@mvriel first of all, thanks again for your offering of joining efforts. I think it's the smart thing to do. My disagreement is only about the technical implementation of this idea, so please don't see it as something against you or phpDocumentor 🙏

Now, let me try to explain again my point using another example:

phpDocumentor publishes three packages related to Reflection:

Why 3 packages? They all are about the same thing. Why not a single package that does many things related to Reflection? If someone needs to add or remove a feature provided by that super package, they can use the extensibility points to do that.

Or ... even better ... given that Reflection and phpDoc are super closely related, why not merge the code inside this other package:

https://github.com/phpDocumentor/phpDocumentor

We all know the answer: a stand-alone reflection package is great because it provides an entire useful feature that others can use to build higher level features. I think the same happens with a RST parser: it's a very complex feature that it's perfectly suited to be published as a stand-alone package for others to build higher level features (phpDocumentor Guides, Doctrine website builder, Symfony doc builder, etc.)


Finally, please remember that Symfony Doc Builder doesn't follow any good practices from the rest of the Symfony project. We can follow or break semver, we can introduce tons of BC breaks, we can add or remove support for PHP and Symfony versions as we wish, etc. We need to do that ... but we can't do that with a published package such as Guides.

I hope it's more clear now. Thanks!

@mvriel
Copy link
Contributor

mvriel commented Sep 9, 2021

I would love to separate the RST Parser back into a separate repository; I choose to integrate it into phpDocumentor first with the intention to separate it again later because the original RST Parser had coupling in key area's that prevented us from integrating it into another project without major rework.

As you may have noticed yourself, the Gregwar RST Parser, and by extension Doctrine RST Parser, does not generate a standalone AST but rather has the whole parsing and transformation bit mashed together in the Nodes (if you check the SpanNode and SpanRenderer for example, it renders HTML during the parsing phase.

What we have done so far, is to integrate the library into phpDocumentor and start extracting all coupling related to rendering from the Nodes and change the TOC generation so that it can integrate into another over-arching TOC. This is a lot of work and we are not done yet, but my hope is that we can extract the library from phpDocumentor again (like we did with the libraries you mention above) as soon as it stabilizes and it only generates an AST that can be consumed by another application as a basis for rendering.

@mvriel
Copy link
Contributor

mvriel commented Sep 29, 2021

As an FYI: at phpDocumentor, we have been reworking the DocumentParser to bring it more in line with the theory and concepts of a context-sensitive Recursive Descent Parser. The effect of this is that adding new features will become easier since every Production Rule is responsible for furthering the line cursor, instead of doing this in the DocumentParser and switching back and forth between a state.

This also introduces the ability for elements to process a subset of elements. An example is the List Item that only supports Body Elements; something which is impossible in the previous implementation.

In the upcoming weeks more iterations will be done to allow for better support on the ReST spec, and also to move the definitions of (Compound) Production Rules onto configuration so that custom Directive Handlers and Production Rules can be applied by projects.

See phpDocumentor/phpDocumentor#3020 for the changes

@mvriel
Copy link
Contributor

mvriel commented Sep 29, 2021

Oh, and these changes fix a couple of bugs as well that occur. Such as the lack of support for all title letters as mentioned in the spec

@javiereguiluz
Copy link
Contributor

Mike, thanks a lot for your immense work to improve the RST parser 🙇 Having a fully-compliant RST parser in PHP (ideally as a stand-alone package) would be a dream come true 😍 Thanks

@jaapio
Copy link
Author

jaapio commented Sep 29, 2021

When we are ready with the major refactoring steps we can start extracting the parser and maybe some related components from the main repository of phpdocumentor.

I can't promise when that will happen. But the moment will come for sure.

@jaapio
Copy link
Author

jaapio commented Oct 10, 2021

An update from our side:

This week we started another enormous improvement in the parser. As many of you might now the parser and rendering are both using an Environment class which is used everywhere. This forces us to create the renderer and some related services every time when a new document is rendered. Since phpDocumentor has a service container, we wanted to have reusable services registered in the service container. For phpDocumentor this means that we are moving towards a option to render documents whenever we want. Like blocks of content that can be included in a twig template.

Extracting the environment also learned us a lot about the decoupling between the parser and renderers. Which will make us end with a pure parser, with a separate rendering layer.

phpDocumentor/phpDocumentor#3032

@mvriel
Copy link
Contributor

mvriel commented Oct 15, 2021

The quest continues:

We have just merged a change (phpDocumentor/phpDocumentor#3048) where we have moved the Guides into a path-based composer package as an incubator package. This doesn't mean it is available on packagist soon; but with this step, I want to segregate the boundaries between the phpDocumentor main application and the upcoming library or libraries.

There is a new PR, phpDocumentor/phpDocumentor#3049, in the works as well that splits the library even further where the main library provides the infrastructure and AST nodes, and subsequent libraries provide the translation for a specific language (such as RestructuredText and Markdown).

With phpDocumentor, it is our aim to at least support both RestructuredText and CommonMark Markdown; by making guides flexible enough by swapping 'transport' this should be doable without peppering the library with exceptions :D

(As an analogy, think symfony messenger with its redis-messenger and doctrine-messenger transports)

@wouterj
Copy link
Collaborator

wouterj commented Oct 16, 2021

Thanks for all the updates and work @mvriel and @jaapio, you're making incredible progress every week!

@jaapio
Copy link
Author

jaapio commented Dec 10, 2021

We had some more time to refactor things more. I started to extract all rendering from the parser phase. The parser is now just using the renderer in de directives. That will be a next step to take. Restructure the way directives work.

The directives are at the moment enriching the nodes produced by the parser, and in some cases the directives are calling the renderer to render sub nodes. We will try to get rid of this connection, so we will have a cachable AST that can be rendered.

Main while the shortcuts we took to integrate the library in to phpDocumentor are coming to the surface. As I see it right now we will be able to start extracting those as will and create a solid integration layer that will allow people to use the guides library with the RestructuredText as a stand alone library.

I hope we can continue making progress as we are doing right now, so I can soon notify you all about a new phpDocumentor library that allows everyone to improve the documentation tools that are used by the projects.

@jaapio
Copy link
Author

jaapio commented Dec 10, 2021

Beyond my own expectations I was able to refactor all existing directives and turn them in to normal nodes. 🎉

And the parser is now almost clean from any output format. This means that we can start thinking about a way we could add a cache layer, so we only have to parse changed files since the last run.
We have a lot of tests to write, and many things can be improved. But I mark this step as a major step forward towards a real RST parser in php.

I will send you more updates once we are making progress again.

@javiereguiluz
Copy link
Contributor

These are great news. Thanks for your efforts so far and we're excited about being able to test your parser refactor when it's ready. Please, continue keeping us posted about the progress. Thanks!

@jaapio
Copy link
Author

jaapio commented Dec 14, 2021

Spend some more time to dig in the parser code. And I think that I discovered a missing link. A simple step that would allow a lot of advanced features.

phpDocumentor is using 3 major steps in its process.

  1. Parsing
  2. Compiling
  3. Rendering

Each stage produces an output, which could be stored in some way. However we do not store the compiled AST consumed by the rendering stage. As the compiler uses dynamic information from other nodes to enrich the output. Saving this output could result into a complex mechanism to ensure the cache is correctly cleared when a parsed file is updated.

In this library, at least the version we are working on right now in the phpDocumentor repo, doesn't have the compile step. And this looks like the missing link in the process. As far as I get it right now the parser is complete decoupled from the rendering step, but internally the parser adds each Document into a map called Metas. Which also contains the output url. This url makes it more or less impossible to change the output format without reparsing all documents.

I did a small experiment to see what would happen if we wouldn't create the normal output url as it does right now. And basically nothing happens... just all urls are incorrect. But the output is still generated.
So I think when we would introduce an in-between step we would clear the paths to have a cached AST output from the parser.

This compiler step would also allow people to manipulate the AST after the parser is completed. I think this is part of #145. By simply adding more steps into the compiler, extra information can be added to the nodes when needed.

See you in the next update 👋

@jaapio
Copy link
Author

jaapio commented Jan 22, 2022

While I was trying a prototype of a possible caching layer I figured out that during the parsing phase of the parser a set of references is built to create the TOC for example. In my next update, this has been refactored. Allowing us to replace the way links are created.

In the phpDocumentor context this means that we can now render links like

:PHP:class:`phpDocumentor\Application`

Using our internal router, with this change the hacks are removed to link to project source code and other odd link handling. Allowing us and hopefully in the future you all to inject your own way of source code linking without any new directives or other complex stuff.

The other downside of the way references were handled, the parser always needed to run on each file. As we are now consuming the meta's, this dependency is gone. I will continue the experimental cache layer to see if we can skip the parsing of the unchanged files, and still be able to render them.

@jaapio
Copy link
Author

jaapio commented Apr 4, 2022

Today was a great day, I'm in-between two customer projects which allowed me to have a full day focus on the rst rendering part. The result is good enough to share it with you again.

I was able to create an example script that is now doing some basic parsing and rendering using the new infrastructure of the library.
For the people interested in what is now needed:
https://github.com/phpDocumentor/phpDocumentor/pull/3212/files#diff-2894f97ac5114facabb48d0348c126f0f0e02e299c3978ee1fc87819623486bb

When we started the refactoring we moved all the templates out of the library to be able to include them in the phpDocumentor templates. Today I implemented a way in phpDocumentor itself that allows us to move the templates back into the library. Besides that I added an extra layer to allow phpDocumentor to manipulate the twig environment creation as we need some more global variables which are not required by the rst lib.

This brings us yet another step closer to the extraction of the library in a separate project. As the example file shows you it is runnable as a stand-alone lib again with some bootstrapping and wiring. Tomorrow I will continue to improve the setup. But also want to have a look at the functional tests this repo contains. As that will tell me how much it will take to align both projects.

@wouterj
Copy link
Collaborator

wouterj commented Apr 5, 2022

Very cool!

I've just tried it out on the Symfony docs, and (as expected) I get quite a lot of PHP warnings about little things that we've fixed in this library in the past 2 years. I guess after the Guides extraction is finished, we can create a plan on how to include the bug fixes/features from this project in Guides (without having to redo all the tedious debugging efforts).
E.g. I noticed none of the references (the `Something`_ things) are parsed in the Symfony docs at the moment.

Also, I guess it's already known, but information about the original file/directory names is not included in the document renderer - making it impossible to render a directory of documents.

All in all, it's great to actually see the progress locally!

@jaapio
Copy link
Author

jaapio commented Apr 12, 2022

I decided to have a mono-repo for all guides related packages. This will help us to use the existing tests, but also maintain the libraries. As I do agree with you that in most cases we will need to add new nodes to the core project to be able to support them in the format-specific libraries.

I used the same repo. The current guides package will be renamed to guides-core as it will contain the core code to have guides rendering. I'm open to better names. :-D

@stof
Copy link
Member

stof commented Apr 12, 2022

The current package could keep the existing name, if you don't publish the mono-repo to Packagist.

@jaapio
Copy link
Author

jaapio commented Apr 12, 2022

I thought packagist forces the name of a repo to be equal to the name of the package since composer 2.

@stof
Copy link
Member

stof commented Apr 12, 2022

No. It never did that. The repo URL can be what you want.

What composer 2 did is forcing package names to be lowercase (instead of making them case-insensitive) and forbidding some chars in them that are not safe for usage in filesystem paths or URLs (packagist.org was already blocking those btw)

@wouterj
Copy link
Collaborator

wouterj commented Apr 13, 2022

Cool, looks great. Let's keep phpdocumentor/guides for the core package, that makes the most sense to me.

If you have any questions about splitsh, feel free to reach out to the Symfony team :)

@javiereguiluz
Copy link
Contributor

Maybe I'm not understanding things well, but I expected these packages:

phpdocumentor/rst-parser
phpdocumentor/guides
phpdocumentor/...

"rst-parser" would be the package required by Symfony and other projects needing RST parsing. "guides" would be the RST/formatting customizations needed by PHPDocumentor and it would be similar to the docs-builder Symfony package, which adds customizations on top of Doctrine's rst-parser

@jaapio
Copy link
Author

jaapio commented Apr 13, 2022

What the markup specific packages like the RST package are doing is parsing an text format into an AST, the guides lib contains the rendering part of the AST.

So the symfony docs-builder will use the guides library. and the guides-restructuredtext.

phpdocumentor specific stuff is part of our application. And will not be exposed in a separate repo.

@wouterj
Copy link
Collaborator

wouterj commented Apr 13, 2022

You can think of phpdocumentor/guides as replacement for the hardcore stuff in symfony-tools/docs-builder (like rendering). Unlike rst-parser, it'll do a bit more than just parsing reStructured Text. Which is great news, the more complexity is maintained by a bigger team, the better :)

Our symfony-tools/docs-builder package can be stripped down quite a bit this way. It'll only contain the custom theme for symfony.com and our custom directives and roles (e.g. :class: and .. versionadded::). You can see how Guides and Docs-Builder work together in my basic tryout: https://github.com/wouterj/symfony-docs-guides/tree/main/lib/docs-builder

@javiereguiluz
Copy link
Contributor

I think I disagree 🤔 For example, if this library does things such as traverse directories to look for RST files and then parses them ... this might be a drawback if we have other requirements:

(1) Traverse directories in certain way, excluding some dirs and some files applying certain custom logic
(2) Loading RST contents in memory, not via files, because we're parsing things on-the-fly (e.g. a blog post written in RST)

These are not imaginary examples, these are real requirements that I have today when using RST.

I know I'm insisting a lot ... but I think it'd be a mistake to have anything different from "a pure RST parser" + "other independent libraries that implement rich features on top of it".

@mvriel
Copy link
Contributor

mvriel commented Apr 13, 2022

Hey Javier,

A pure RST parser sounds good in theory, but the 'problem' is that quite a few features of RST depend on references to external locations and files. This means that you need to build some sort of context or TOC.

Your first point is quite well-supported in the base package and if you are missing a feature there, let's discuss on how to add that :)

@mvriel
Copy link
Contributor

mvriel commented Apr 13, 2022

Although I do not exactly remember if on-the-fly generation is impossible now (don't think so 🤔 ); but the combination of the two packages is a pure RST parser, the core package is mainly a framework

@stof
Copy link
Member

stof commented Apr 13, 2022

the second point is probably supported as well, given that the package relies on Flysystem to abstract the filesystem, and Flysystem has a memory adapter.

@mvriel
Copy link
Contributor

mvriel commented Apr 13, 2022

Exactly for that reason we had chosen Flysystem as an adapter layer, and with the intention to be able to make/use a github abstraction in the future so that people can literally configure a github repo to read from

@jaapio
Copy link
Author

jaapio commented Apr 13, 2022

I would like to elaborate a bit on the on going discussion.

A parser is for me a unit that takes some input and returns an AST. What is done with the AST is up to the consuming application of the parser.

According to this definition, the project doctrine/rst-parser isn't a pure RST-parser. It does the parsing and rendering part in technically one run. Not to attack anyone about what has been created but that's what it is right now.

In the new format, we separated the parsing and rendering. I think we could invest more on the splitting of classes so people could directly put a string content into the Markdown parser allowing a consumer to just focus on the parsing. This would just require us to move some classes around in other packages and splitting them into separate repositories. Like I did for the 3 current projects.

The result would be:
Guide-AST containing the nodes we identified as generic nodes between multiple markup formats.
Guide-RST-parser a pure parser producing an AST. this library would consume Guide-AST
Guide-Renderer containing all rendering logic of the AST. This would also contain logic to resolve references?
Guides the glue between the packages mentioned above.

Technically speaking the current Guides package is a combination of Guide-AST, Guide-Renderer and Guides. But for me this doesn't make sense, it would just increase the maintenance burden on all libraries. As they are somewhat connected. The current separation between guides, rst and markdown is just there because not everyone would combine RST and markdown.
But most people would need parsing and rendering in one go. Unless you are building a analyser for the AST?

The guides library is definitely missing a lot of documentation, but I think when using the correct classes you will be able to interact directly with the parser, fetch the AST and pass it into a different rendering engine. We tried to split as much as possible but also include a lot of glue in guides.

I hope this helps you to understand a bit more what is possible. I hear your concerns and I'm open for all suggestions to improve the system. If anyone thinks we should split the packages more, please let me know. I think we are heading to something very nice and flexible. So please get in touch when you think we have something to improve.

Feel free to ping me on slack, when you would like to have a chat. I'm working on the phpDocumentor project every Tuesday evening (CEST). If you want to have a call at some point, as it would be easier to communicate at some point just ping me on slack. So we can arrange something. Let's do this as a team.

@wouterj
Copy link
Collaborator

wouterj commented Apr 14, 2022

I just tried out parsing and rendering a string of rSt into HTML: https://github.com/wouterj/symfony-docs-guides/blob/main/build.php

While the DX can be improved (especially the RenderContext seems a bit convoluted), it's already quite possible without needing crazy stuff. So this doesn't look like an issue that cannot be overcome.

I think the best is just continuing the development and experiments, before deciding what to do with this new library on symfony.com. I can imagine it is a bit scary to rewrite symfony.com's code using a new library again, but I haven't yet seen any big blockers. Given Guides is also still experimental, there seems to be room to shape it to the needs of e.g. symfony.com if we continue experimenting early (it can be interesting to see if symfony.com has more advanced use cases than the 2 you already mentioned).

@jaapio
Copy link
Author

jaapio commented Apr 20, 2022

I pushed some improvements. And reintroduced the functional tests that were written for this project. As expected we broke some of the behavior this version supports.

The first focus for me will be backporting some of the fixes you did in the tables. The whole table thing is a bit funky. As it still has the mixed parsing and rendering stuff in there.

https://github.com/phpDocumentor/guides/runs/6101692188?check_suite_focus=true

@wouterj
Copy link
Collaborator

wouterj commented Apr 20, 2022

@jaapio I would highly recommend you to remove the weird email-addresses test case from guides. For some reason, this library included a testcase with raw email headers just to test email address detection. As reStructuredText is not compatible with raw headers, it produces lots of junk with unexpected cases (I highly doubt the rst-parser output is expected there).

@jaapio
Copy link
Author

jaapio commented Jul 21, 2022

It has been a while since I posted an update here. The development in phpdocumentor/guides is still going on.

Today I finished a PR, bringing us yet another step closer to a real AST. The sections in the current implementation were created using a begin and end section node. However, this caused several issues and strange code in other parts of the library. You might have noticed this as well. There were several locations where sections were closed. This is now resolved by introducing a new section node. These nodes are forming a real object tree now. And can be nested just like the Docutils spec is describing.

This step makes the rendering part also a bit easier as there is no need to track the current state, no more invalid closing divs.

I need to add more tests and many more improvements. But we are slowly getting closer to a more understandable parser.

@linawolf
Copy link
Collaborator

As we are looking to switch to PHP based ReST rendering for the TYPO3 docs I am starting to wonder if we should use the doctrine rst-parser or the phpdocumentor. it seems like both have pros and cons

@jaapio
Copy link
Author

jaapio commented Nov 29, 2022

I think that depends on your needs. The current doctrine project is very stable. It has its short comings, and does not always follow the rst standard docutils. But it works for both doctrine and symfony. Which makes it a reliable basis to build on.

The phpDocumentor version was based on this project but is heavily changed. We did some major refactoring to improve the parser and make it more extendable. But our work is far from finished. The current version in our repo is even broken. It's far from stable. I would not recommend to use it in a production like project yet. Regarding our plans with phpDocumentor and our new guides project I think projects like typo would benefit from using it. But not right now directly. It will take more time to bring our fork at a production ready level.

@jaapio
Copy link
Author

jaapio commented Dec 30, 2022

I think after a week of work, it is time to report something again about our progress.

This week was quite productive, I finished the rework on meta's that allows us to have proper toc-tree support. Our metas are now stored as the actual tree of sections and documents; this makes it much easier to render the toc-tree and transform it into something equal like sphinx does.

I finished the work Wouter started to introduce a visitor pattern in our parser. I introduced a new step I mentioned earlier called "Compiler". The compiler step gives us the option to transform/remove nodes from the AST we are now using this to build the final toc-tree node. The idea is that the result of the parser can be cached per document. The Compiler will create the final result before rendering, like resolved toc-tree's, references, and such. Things that might change because other documents have changed. The idea is based on the concept of pending nodes in docutils.

Rework of the list parsing; there was a lot of support missing, but the list parser also needed to be easier to understand. I created a brand new version that follows the docutils spec closely. And removed the mix of enumerated lists and bullet lists. There is some more work to do. But as far as I'm concerned, the functional tests are passing.

To close this, I have been working on a new table parser. The new table parser, specialized for simple tables, has full support for all body elements the project supports. This allows us to include images and other body elements in the tables. Not just lists. The next step is to do the same thing for grid tables.

Along the road, I fixed several issues I forgot because there were so many. Most of them were introduced when we started refactoring the parser.

Down the line, it turns out that the new parser structure is helping me a lot to understand what is going on. So I'm pretty happy with the current progress.

Enjoy your new years eve.

@jaapio
Copy link
Author

jaapio commented Feb 24, 2023

Today I completed the majority of the refactoring we did in the parser. There are still some issues to be resolved but the more significant parts are finished now. I would be happy to get any feedback on the new version. As far as anyone is concerned, we will never be able to merge the projects. We changed way too much for that.

I will continue working on our RST parser to make it more feature complete and cover all improvements in this project. And improve the DX, including a lot of missing documentation.

@javiereguiluz
Copy link
Contributor

@jaapio thanks for the update. So, if I understood you correctly, we should forget about these joined efforts. We need to keep working on this project to build a general-purpose RST-parser while you work in your tool, which is a custom-purpose doc building tool that parses RST files too.

It's a bit sad, but I totally understand it. Hopefully we can make a great and full-featured RST-parser that you can even use as a dependency in your project. Cheers!

@wouterj
Copy link
Collaborator

wouterj commented Feb 27, 2023

The phpDocumentor tool is just as usable to render documentation for symfony.com or doctrine-project.org as this library is. Even more, it includes many of the features that we've introduced in symfony-tools/docs-builder (i.e. less "non-standard" things to maintain).

As reStructured text is fully customizable, it still allows to register custom nodes (e.g. directives), roles (e.g. :method:), custom themes, etc. IIRC, the phpDocumentor team (read: Jaap & Mike) are even doing some extra effort to make sure phpdocumentor/guides is standalone from phpdocumentor/phpdocumentor.

In my honest opinion, the code in this library is an unstable foundation, given all the shared state and mixing of responsibilities between classes. The total rewrite done in phpdocumentor/guides gives a much more solid foundation for a future PHP-based reStructured Text parser.

@javiereguiluz
Copy link
Contributor

@wouterj then I didn't understand @jaapio comment at all. I'm sorry.

@wouterj
Copy link
Collaborator

wouterj commented Feb 27, 2023

I think what Jaap was trying to say is: the projects have diverged so much that you can no longer call phpdocumentor/guides a "fork" of this project. In practice, this means that (a) work done in this project can not easily be merged into the work done over at the guides library (or vise-versa) and (b) when changing (if we'll ever do so) it'll not be a smooth upgrade.

Regarding (b), fortunately for us as Symfony team, symfony-tools/docs-builder is an abstraction between symfony.com and this library. So we should be able to "just" refactor docs-builder when switching if we want.

@linawolf
Copy link
Collaborator

So I am wondering is it worth to keep working on this libary then or should I switch my effords?

@jaapio
Copy link
Author

jaapio commented Feb 27, 2023

Thanks for your kind words @wouterj :-)

I was hoping to be able to contribute to this library and make it a shared project. But when diving deeper into the code we realized that there was so much to improve that a merge back would not work. Basically, when you look at our version, you will see a different library. Based on the same original code, you might recognize some parts. But the majority of the code is refactored so much that creating a merge request would result in so many conflicts that it would never happen.

Everything we did in the past year was to build a stable foundation separate from phpDocumentor, but under the phpDocumentor flag to have full support for reStructured Text. The new library is what I think is more open to change than this one is. But I'm biased. If you are just using the output of the library, (HTML) our version could be a valid replacement. It's not a drop-in replacement, but it's doable.

I'm still open to a joint effort with the doctrine & Symfony team to maintain the new version together. However, I cannot force you to move to our package. And I do totally get it when you decide to stick to this project, you are already using it and it will take effort to make that move. If you decide to do so, I would be glad to help you with that effort.

@linawolf I cannot make that decision for you. It depends on the future of this project and your requirements. We (phpDocumentor Team) will continue to maintain the library we created. Your contributions are more than welcome, I'm happy to assist whenever needed. But as this repo is maintained by the Doctrine Team I cannot make a call about the future of this library. But as far as I can recommend you something I would make the switch. For the reasons, @wouterj mentioned earlier.

@linawolf
Copy link
Collaborator

@jaapio, we should urgendly talk to each other to prevent duplicate work if possible.I can see the flaws of the design In this libary. But I already put so much work into it that it would be quite hurtful to switch to another one now and loose all the progress.

I don't know how good is your test coverage but it seems like many features we (TYPO3 documentation team) urgendly need are not supported yet. So I would probably have to put a large amount of work into you project until I could use it in production. And much of it would be duplicate work as I already put it into the doctrine/rst-parser.

@jaapio
Copy link
Author

jaapio commented Feb 27, 2023

We are at the same timezone, so I suggest we have a call to discuss this. That's much easier than via this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants