Refactor list parser #143
The second line of each enumerated list item is checked for validity:

1. This is not a list.
It's a paragraph starting with a number.
Previously, there was a test that treated this behavior as "being recognized as a list" (tests/Functional/tests/ordered2/ordered2.rst). However, the markup specs are very explicit about this:
The second line of each enumerated list item is checked for validity. This is to prevent ordinary paragraphs from being mistakenly interpreted as list items, when they happen to begin with text identical to enumerators. For example, this text is parsed as an ordinary paragraph:
A. Einstein was a really
smart dude.
However, ambiguity cannot be avoided if the paragraph consists of only one line. This text is parsed as an enumerated list item:
A. Einstein was a really smart dude.
If a single-line paragraph begins with text identical to an enumerator ("A.", "1.", "(b)", "I)", etc.), the first character will have to be escaped in order to have the line parsed as an ordinary paragraph:
\A. Einstein was a really smart dude.
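The rule quoted above can be sketched in a few lines of Python. This is only an illustration of the spec text, not the code from this PR: the `ENUMERATOR` pattern and the `is_enumerated_item` helper are hypothetical names, and the pattern covers only a simplified subset of enumerators.

```python
import re

# Simplified enumerator pattern (assumption): "1.", "A.", "(b)", "I)", "#." ...
ENUMERATOR = re.compile(r"^\(?(\d+|[A-Za-z]|[ivxlcdm]+|[IVXLCDM]+|#)[.)]\s+")


def is_enumerated_item(lines):
    """Return True if the text block in `lines` starts an enumerated list item."""
    if ENUMERATOR.match(lines[0]) is None:
        # Also covers an escaped enumerator such as "\A.".
        return False
    if len(lines) == 1 or not lines[1].strip():
        # A single line is ambiguous and is parsed as a list item.
        return True
    second = lines[1]
    # The second line is valid if it is indented or is itself a list item.
    return second.startswith(" ") or ENUMERATOR.match(second) is not None
```

With this sketch, the two-line Einstein example is rejected as a list, the one-line version is accepted, and the escaped version is rejected again.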
@@ -1 +1,3 @@
Testing that this is escaped: <script>

\And escape \characters are ig\nored, but this one isn't: \\stdClass
According to the Sphinx parser, escape characters are not rendered. This was also required to make "\1. Single line paragraphs starting with a number may use an escape." pass.
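As a rough illustration of that escaping behavior (a Python sketch, not the code in this PR; `strip_escapes` is a hypothetical helper), each backslash is dropped and the character it protects is kept. Per the reStructuredText spec, escaped whitespace is removed together with the backslash.

```python
import re


def strip_escapes(text):
    """Drop each escaping backslash and keep the character it protects."""

    def replace(match):
        char = match.group(1)
        # Backslash-escaped whitespace is removed entirely.
        return "" if char.isspace() else char

    # "\\" followed by any character; a doubled backslash yields one backslash.
    return re.sub(r"\\(.)", replace, text, flags=re.DOTALL)
```

Applied to the test line from the diff above, the single backslashes disappear and `\\stdClass` becomes `\stdClass`.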
@@ -1,7 +0,0 @@
<p>Testing an indented list:</p>
This tested that indented lists are parsed as lists. In fact, reStructuredText doesn't allow any whitespace before the list marker: such whitespace (indentation) is recognized as a block quote.
This is now tested by tests/Functional/tests/list-indented/list-indented.rst
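A minimal Python sketch of that distinction, assuming the line begins a new top-level block after a blank line. `LIST_MARKER` and `classify_block_start` are hypothetical names, and the marker pattern covers only a few bullet characters and numeric enumerators.

```python
import re

# Simplified list markers (assumption): "*", "+", "-" bullets and "1." enumerators.
LIST_MARKER = re.compile(r"^([*+-]|\d+\.)\s+")


def classify_block_start(line):
    """Classify the first line of a new top-level block."""
    stripped = line.lstrip(" ")
    if not stripped:
        return "blank"
    if line != stripped:
        # Any leading indentation starts a block quote, even before a marker.
        return "block_quote"
    if LIST_MARKER.match(line):
        return "list"
    return "paragraph"
```

An indented "* item" is therefore classified as a block quote (which may in turn contain a list), not as a top-level list.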
@@ -79,8 +79,15 @@ <h1>
<p>Appendix A.4. Messages with Trace Fields</p>
<blockquote>
<hr />
<p>Received: from x.y.test by example.net via TCP with ESMTP id ABC12345 for <<a href="mailto:mary@example4.net">mary@example4.net</a>>; 21 Nov 1997 10:05:43 -0600
Received: from node.example by x.y.test; 21 Nov 1997 10:01:22 -0600
<dl>
Honestly, I'm blown away by this test file. It was added to test the automatic recognition of e-mail addresses. However, testing raw e-mail headers as reStructuredText seems very weird (resulting in strange things like this definition list).
@@ -43,15 +37,18 @@ public function isSpecialLine(string $line): ?string
return $letter;
}

public function isListLine(string $line, bool $isCode): bool
I'm not sure what the function of $isCode was here. Code blocks shouldn't be subparsed, so this should never be true. So I simply removed it.
The only place it was used is in DocumentParser... related to lists... and you have already refactored that part. So this is fine with me - probably some old, unwanted piece related to the odd "is code" handling in DocumentParser.
175c2d6 to 918b203
Updated this PR. Fixed behavior when mixing different list markers and fixed the LaTeX output (and checked its validity with a LaTeX parser).
a319cd8 to ddae0e2
This PR is terrifying... in a good way!
This does break some BC... but all in ways that clearly and drastically fix the list rendering. A big 👍 from me!
This refactors list parsing to use subparsers, so that list contents (like code blocks or other body elements) can be parsed. It also fixes some inconsistencies between the reStructuredText Markup specification and the behavior of this parser.
ddae0e2 to 5dc37a0
This PR should be ready. I've documented the BC breaks only minimally, as I don't really expect anyone to be hit by them.
I tried the new utility locally, it errors out like this:
I was looking into it because I wanted to understand if the new gitignored directory was going to be useful or if we should maybe use |
@greg0ire I intended the new utility to be used like
I put it in the project dir because I want to look at the generated HTML (that's the whole purpose of running Sphinx). Opening a file from the project dir is much easier (at least for me) than opening a tmp directory.
Thanks @wouterj !
Fixes #142, fixes https://github.com/weaverryan/docs-builder/issues/70, fixes https://github.com/weaverryan/docs-builder/issues/38
Previously, lists were parsed as a flat structure and the renderer took care of working out the nested lists. That didn't really work and didn't allow subparsing the list item (e.g. directives or code blocks nested in the list). This PR refactors it to parse lists using a subparser, similarly to definition lists and directives.
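To illustrate the subparser idea in isolation, here is a minimal Python sketch (not the PHP code in this PR). `parse_blocks` and `BULLET` are hypothetical names, and the sketch assumes bullet lists only, ignoring enumerators and mixed markers. Each item body is dedented and handed back to the same parser, so nested lists and other body elements fall out of the recursion instead of being reconstructed by the renderer.

```python
import re

# Simplified bullet marker (assumption): "*", "+" or "-" followed by spaces.
BULLET = re.compile(r"^[*+-] +")


def parse_blocks(lines):
    """Parse lines into nodes; each list item body is sub-parsed recursively."""
    nodes, i = [], 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip blank lines between blocks
            continue
        if BULLET.match(lines[i]) is None:
            # Paragraph: everything up to the next blank line.
            start = i
            while i < len(lines) and lines[i].strip():
                i += 1
            text = " ".join(part.strip() for part in lines[start:i])
            nodes.append(("paragraph", text))
            continue
        items = []
        while i < len(lines):
            match = BULLET.match(lines[i])
            if match is None:
                break
            indent = match.end()
            # Item body: the text after the marker plus all following lines
            # that are blank or indented at least as far as that text.
            body = [lines[i][indent:]]
            i += 1
            while i < len(lines) and (
                not lines[i].strip() or lines[i].startswith(" " * indent)
            ):
                body.append(lines[i][indent:])
                i += 1
            # The dedented body goes through the same parser (the "subparser").
            items.append(parse_blocks(body))
        nodes.append(("list", items))
    return nodes
```

Parsing an item that contains a blank line followed by indented bullets yields a nested `("list", ...)` node inside that item, rather than a flat sequence of items with different indentation levels.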
I also added a little tests/sphinx utility that allows you to easily run the functional tests using a locally installed Sphinx: ./tests/sphinx list will render tests/Functional/tests/list/list.rst. All tests in this PR are validated using Sphinx (there were quite a few tests that didn't really follow the official specs).
Todo
I wanted to get this PR ready for review as soon as possible. A few things still need to be done:
UPGRADE.md
Done, no changes needed and all green