Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix RST parsing when no indent after enum.item (fix #17249) #17257

Merged
merged 9 commits into from
Mar 12, 2021
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions compiler/docgen.nim
Original file line number Diff line number Diff line change
Expand Up @@ -1366,8 +1366,9 @@ proc commandRstAux(cache: IdentCache, conf: ConfigRef;
var d = newDocumentor(filen, cache, conf, outExt)

d.isPureRst = true
var rst = parseRst(readFile(filen.string), filen.string, 0, 1, d.hasToc,
{roSupportRawDirective, roSupportMarkdown}, conf)
var rst = parseRst(readFile(filen.string), filen.string,
line=LineRstInit, column=ColRstInit,
d.hasToc, {roSupportRawDirective, roSupportMarkdown}, conf)
var modDesc = newStringOfCap(30_000)
renderRstToOut(d[], rst, modDesc)
d.modDesc = rope(modDesc)
Expand Down
20 changes: 11 additions & 9 deletions doc/contributing.rst
Original file line number Diff line number Diff line change
Expand Up @@ -599,14 +599,16 @@ to existing modules is acceptable. For two reasons:

Conventions
-----------
1. New stdlib modules should go under `Nim/lib/std/`. The rationale is to require
users to import via `import std/foo` instead of `import foo`, which would cause
potential conflicts with nimble packages. Note that this still applies for new modules
in existing logical directories, e.g.:
use `lib/std/collections/foo.nim`, not `lib/pure/collections/foo.nim`.
1. New stdlib modules should go under `Nim/lib/std/`. The rationale is to
require users to import via `import std/foo` instead of `import foo`,
which would cause potential conflicts with nimble packages.
Note that this still applies for new modules in existing logical
directories, e.g.: use `lib/std/collections/foo.nim`,
not `lib/pure/collections/foo.nim`.

2. New module names should prefer plural form whenever possible, e.g.:
`std/sums.nim` instead of `std/sum.nim`. In particular, this reduces chances of conflicts
between module name and the symbols it defines. Furthermore, module names should
use `snake_case` and not use capital letters, which cause issues when going
from an OS without case sensitivity to an OS with it.
`std/sums.nim` instead of `std/sum.nim`. In particular, this reduces
chances of conflicts between module name and the symbols it defines.
Furthermore, module names should use `snake_case` and not use capital
letters, which cause issues when going from an OS without case
sensitivity to an OS with it.
15 changes: 8 additions & 7 deletions doc/manual.rst
Original file line number Diff line number Diff line change
Expand Up @@ -6073,13 +6073,14 @@ avoid ambiguity when there are multiple modules with the same path.

There are two pseudo directories:

1. ``std``: The ``std`` pseudo directory is the abstract location of Nim's standard
library. For example, the syntax ``import std / strutils`` is used to unambiguously
refer to the standard library's ``strutils`` module.
2. ``pkg``: The ``pkg`` pseudo directory is used to unambiguously refer to a Nimble
package. However, for technical details that lie outside the scope of this document,
its semantics are: *Use the search path to look for module name but ignore the standard
library locations*. In other words, it is the opposite of ``std``.
1. ``std``: The ``std`` pseudo directory is the abstract location of
Nim's standard library. For example, the syntax ``import std / strutils``
is used to unambiguously refer to the standard library's ``strutils`` module.
2. ``pkg``: The ``pkg`` pseudo directory is used to unambiguously refer to
a Nimble package. However, for technical details that lie outside the
scope of this document, its semantics are: *Use the search path to look for
module name but ignore the standard library locations*.
In other words, it is the opposite of ``std``.


From import statement
Expand Down
2 changes: 1 addition & 1 deletion lib/impure/db_sqlite.nim
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,7 @@
## The reasoning is as follows:
## 1. it's close to what many DBs offer natively (char**)
## 2. it hides the number of types that the DB supports
## (int? int64? decimal up to 10 places? geo coords?)
## (int? int64? decimal up to 10 places? geo coords?)
## 3. it's convenient when all you do is to forward the data to somewhere else (echo, log, put the data into a new query)
##
## See also
Expand Down
51 changes: 44 additions & 7 deletions lib/packages/docutils/rst.nim
Original file line number Diff line number Diff line change
Expand Up @@ -467,12 +467,20 @@ type
s*: PSharedState
indentStack*: seq[int]
filename*: string
line*, col*: int
line*, col*: int ## initial line/column of whole text or
## documenation fragment that will be added
## in case of error/warning reporting to
## (relative) line/column of the token.
hasToc*: bool
curAnchor*: string # variable to track latest anchor in s.anchors

EParseError* = object of ValueError

const
LineRstInit* = 1 ## Initial line number for standalone RST text
ColRstInit* = 0 ## Initial column number for standalone RST text
## (Nim global reporting adds ColOffset=1)

template currentTok(p: RstParser): Token = p.tok[p.idx]
template prevTok(p: RstParser): Token = p.tok[p.idx - 1]
template nextTok(p: RstParser): Token = p.tok[p.idx + 1]
Expand Down Expand Up @@ -542,8 +550,8 @@ proc initParser(p: var RstParser, sharedState: PSharedState) =
p.idx = 0
p.filename = ""
p.hasToc = false
p.col = 0
p.line = 1
p.col = ColRstInit
p.line = LineRstInit
p.s = sharedState

proc addNodesAux(n: PRstNode, result: var string) =
Expand Down Expand Up @@ -1439,8 +1447,8 @@ proc countTitles(p: var RstParser, n: PRstNode) =
if p.s.hTitleCnt >= 2:
break

proc tokenAfterNewline(p: RstParser): int =
result = p.idx
proc tokenAfterNewline(p: RstParser, start: int): int =
result = start
while true:
case p.tok[result].kind
of tkEof:
Expand All @@ -1450,6 +1458,9 @@ proc tokenAfterNewline(p: RstParser): int =
break
else: inc result

proc tokenAfterNewline(p: RstParser): int {.inline.} =
result = tokenAfterNewline(p, p.idx)

proc isAdornmentHeadline(p: RstParser, adornmentIdx: int): bool =
## check that underline/overline length is enough for the heading.
## No support for Unicode.
Expand Down Expand Up @@ -1937,13 +1948,34 @@ proc parseEnumList(p: var RstParser): PRstNode =
wildToken: array[0..5, int] = [4, 3, 3, 4, 3, 3] # number of tokens
wildIndex: array[0..5, int] = [1, 0, 0, 1, 0, 0]
# position of enumeration sequence (number/letter) in enumerator
result = newRstNodeA(p, rnEnumList)
let col = currentTok(p).col
var w = 0
while w < wildcards.len:
if match(p, p.idx, wildcards[w]): break
inc w
assert w < wildcards.len
proc checkAfterNewline(p: RstParser, report: bool): bool =
a-mr marked this conversation as resolved.
Show resolved Hide resolved
let j = tokenAfterNewline(p, start=p.idx+1)
if p.tok[j].kind notin {tkIndent, tkEof} and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getting there, it's almost perfect...

another regression:

1. hello 1
1. hello 2
   foo 2
1. hello 3
   foo 3
   1. sub1
   1. sub2
   1. sub3
   zook5
1. hello 4
1. hello 5
1. hello 6
 zook6
1. hello 7

1. hello 8


1. hello 9

1. hello 10

separator

1. hello 11
1. hello 12

before PR

image

after PR

image

  • sub3 gets displaced
  • there shouldn't be an interruption in numbering form 1. hello 5 to 1. hello 6

but this could potentially be addressed in future PR, wdyt ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything is working as expected here.

  • sub3 is not displaced. It's just parsed as normal paragraph according to RST specification because there is zook5 without indent after it
  • the same here: "1. hello 6" was parsed as a normal paragraph and zook6 became a block quote because it's indented by 1

And the warning is issued in both cases, so it looks ok to me.

rst2html.py does the same thing if you change 1s to #s.

Also note that presence of auto-numerator 1 does not mean that it's a real "markdown mode".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's possible to implement also markdown-style enumeration list parsing because it's basically just relaxation of the RST kind. But it would be another task.

p.tok[j].col < p.tok[p.idx+wildToken[w]].col and
(p.tok[j].col > col or
(p.tok[j].col == col and not match(p, j, wildcards[w]))):
if report:
let n = p.line + p.tok[j].line
let msg = "\n" & """
not enough indentation on line $2
(if it's continuation of enumeration list),
or no blank line after line $1 (if it should be the next paragraph),
or no escaping \ at the beginning of line $1
(if lines $1..$2 are a normal paragraph, not enum. list)""".
unindent(8)
rstMessage(p, mwRstStyle, msg % [$(n-1), $n])
result = false
else:
result = true
if not checkAfterNewline(p, report = true):
return nil
result = newRstNodeA(p, rnEnumList)
let autoEnums = if roSupportMarkdown in p.s.options: @["#", "1"] else: @["#"]
var prevAE = "" # so as not allow mixing auto-enumerators `1` and `#`
var curEnum = 1
Expand All @@ -1963,6 +1995,10 @@ proc parseEnumList(p: var RstParser): PRstNode =
result.add(item)
if currentTok(p).kind == tkIndent and currentTok(p).ival == col and
match(p, p.idx+1, wildcards[w]):
# don't report to avoid duplication of warning since for
# subsequent enum. items parseEnumList will be called second time:
if not checkAfterNewline(p, report = false):
break
let enumerator = p.tok[p.idx + 1 + wildIndex[w]].symbol
# check that it's in sequence: enumerator == next(prevEnum)
if "n" in wildcards[w]: # arabic numeral
Expand Down Expand Up @@ -2336,7 +2372,8 @@ proc selectDir(p: var RstParser, d: string): PRstNode =
of "warning": result = dirAdmonition(p, d)
of "default-role": result = dirDefaultRole(p)
else:
rstMessage(p, meInvalidDirective, d)
let tok = p.tok[p.idx-2] # report on directive in ".. directive::"
rstMessage(p, meInvalidDirective, d, tok.line, tok.col)

proc prefix(ftnType: FootnoteType): string =
case ftnType
Expand Down
8 changes: 6 additions & 2 deletions lib/packages/docutils/rstgen.nim
Original file line number Diff line number Diff line change
Expand Up @@ -1506,7 +1506,8 @@ proc rstToHtml*(s: string, options: RstParseOptions,
initRstGenerator(d, outHtml, config, filen, options, myFindFile,
rst.defaultMsgHandler)
var dummyHasToc = false
var rst = rstParse(s, filen, 0, 1, dummyHasToc, options)
var rst = rstParse(s, filen, line=LineRstInit, column=ColRstInit,
a-mr marked this conversation as resolved.
Show resolved Hide resolved
dummyHasToc, options)
result = ""
renderRstToOut(d, rst, result)

Expand All @@ -1518,4 +1519,7 @@ proc rstToLatex*(rstSource: string; options: RstParseOptions): string {.inline,
var option: bool
var rstGenera: RstGenerator
rstGenera.initRstGenerator(outLatex, defaultConfig(), "input", options)
rstGenera.renderRstToOut(rstParse(rstSource, "", 1, 1, option, options), result)
rstGenera.renderRstToOut(
rstParse(rstSource, "", line=LineRstInit, column=ColRstInit,
option, options),
result)
9 changes: 9 additions & 0 deletions tests/stdlib/trstgen.nim
Original file line number Diff line number Diff line change
Expand Up @@ -708,6 +708,15 @@ Test1
doAssert count(output7, "<li>") == 3
doAssert "start=\"3\"" in output7 and "class=\"upperalpha simple\"" in output7

# check that it's not recognized as enum.list without indentation on 2nd line
let input8 = dedent """
A. string1
string2
"""
# TODO: find out hot to catch warning here instead of throwing a defect
expect(AssertionDefect):
let output8 = input8.toHtml

test "Markdown enumerated lists":
let input1 = dedent """
Below are 2 enumerated lists: Markdown-style (5 items) and RST (1 item)
Expand Down