Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generate: demote headings, and group link reference definitions #620

Merged
merged 26 commits into from
Jul 1, 2022
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
a098a9c
generate: demote level of non-first headers
ee7 Jun 17, 2022
8e4bc8a
generate: refactor top-level header removal
ee7 Jun 17, 2022
b62775e
tests: generate: add test for header processing
ee7 Jun 17, 2022
5284615
generate: don't alter `#` line inside fenced code block
ee7 Jun 19, 2022
cb47886
generate: add starting level-2 heading with concept's name
ee7 Jun 19, 2022
2f65060
generate: simplify
ee7 Jun 22, 2022
118fd27
generate: refactor via `continuesWith`
ee7 Jun 22, 2022
da1d695
generate: support HTML comment block
ee7 Jun 22, 2022
3a514c2
generate: don't insert h2 when under h2; demote dynamically
ee7 Jun 22, 2022
d7cea65
generate: avoid exception for unknown concept in placeholder
ee7 Jun 22, 2022
ff23522
tests: generate: use a shorter concept name
ee7 Jun 24, 2022
dc291af
generate: don't alter `#` line inside `~~~` block
ee7 Jun 24, 2022
42a8d14
generate: put all reference links at the bottom
ee7 Jun 24, 2022
d1a1ab1
generate: rename `getSlugLookup` proc
ee7 Jun 28, 2022
394221c
generate: rename `header` to `heading`
ee7 Jun 28, 2022
3b0afc2
generate: move `writeError` proc down
ee7 Jun 28, 2022
0d88452
generate: add doc comment
ee7 Jun 28, 2022
94878f0
generate: rename `links` to `linkDefs`
ee7 Jun 28, 2022
42b1460
generate: improve comments about headings
ee7 Jun 28, 2022
5d333b8
generate: rename `title` to `h2`
ee7 Jun 28, 2022
09f4fca
generate: simplify demotion
ee7 Jun 28, 2022
c244d1a
generate: avoid backticks
ee7 Jun 28, 2022
1ecc512
generate: use distinct Slug
ee7 Jun 28, 2022
607f369
generate, sync, types: move borrowed procs for Slug
ee7 Jun 28, 2022
4ee45ef
generate: refer to "link reference definitions"
ee7 Jun 28, 2022
7727f73
generate: ensure exactly 1 newline at EOF
ee7 Jun 30, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
62 changes: 47 additions & 15 deletions src/generate/generate.nim
Original file line number Diff line number Diff line change
@@ -1,5 +1,11 @@
import std/[strbasics, strformat, strscans, terminal]
import ".."/[cli, helpers]
import std/[parseutils, strbasics, strformat, strscans, tables, terminal]
import ".."/[cli, helpers, types_track_config]

proc getSlugLookup(trackDir: Path): Table[string, string] =
ee7 marked this conversation as resolved.
Show resolved Hide resolved
let concepts = TrackConfig.init(readFile(trackDir / "config.json")).concepts
result = initTable[string, string](concepts.len)
for `concept` in concepts:
result[`concept`.slug] = `concept`.name

proc writeError(description: string, path: Path) =
let descriptionPrefix = description & ":"
Expand All @@ -10,21 +16,43 @@ proc writeError(description: string, path: Path) =
stderr.writeLine(path)
stderr.write "\n"

proc conceptIntroduction(trackDir: Path, slug: string,
func alterHeaders(s: string, title: string): string =
# Markdown implementations differ on whether a space is required after the
# final '#' character that begins the header.
result = newStringOfCap(s.len)
var i = 0
i += s.skipWhitespace()
# Skip the top-level header (if any)
if i < s.len and s[i] == '#' and i+1 < s.len and s[i+1] == ' ':
i += s.skipUntil('\n', i)
result.add &"## {title}"
# Demote other headers
var inFencedCodeBlock = false
while i < s.len:
result.add s[i]
if s[i] == '\n':
# When inside a fenced code block, don't alter a line that begins with '#'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The heading could also be in a comment block I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. In 5284615 I mentioned that it doesn't support HTML blocks, and there are a lot of them beside comment blocks.

However, in Exercism's markdown spec we write:

Prefer Markdown comments instead of HTML comments (e.g. use [comment]: # (Actual comment...) rather than <!-- Actual comment -->

So do we want to support the below anyway?

# Introduction

<!--
# Not a header
-->

And if we add markdownlint to every track, would we enable MD033? If "yes" to the latter, then we probably don't need to avoid demoting # inside HTML blocks in this PR.

The markdown handling in this PR is very naive, but doing it robustly requires a full markdown parser. And sadly, it doesn't seem like there's a mature, robust, and pure-Nim markdown parser that allows:

markdown -> AST -> manipulate to demote headers -> markdown

It looks like the best option is to use the wrapper for libcmark, but that will make it more complicated to keep configlet as a dependency-free binary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more specific, it looks like soasme/nim-markdown at best only supports

markdown -> AST -> manipulate AST -> HTML

The best I could do was along the lines of:

import std/[lists, strbasics, strutils, tables]
import pkg/markdown {.all.}

proc main =
  var s = """
    # Introduction

    Foo.

    ## Some header

    Bar.
  """.unindent()
  s.strip(chars={'\n'})

  # Avoid the overhead from calling the `markdown` proc
  # which uses a non-inline `strip`, and calls `render`.
  let root = Document(doc: s)
  var state = State(references: initTable[string, Reference](), config: initCommonmarkConfig())
  state.parse(root)
  for child in root.children:
    if child of Heading:
      echo child

when isMainModule:
  main()
$ nim r foo.nim
<h1>Introduction</h1>
<h2>Some header</h2>

There are also performance killers like token.children.toSeq.map((t: Token) => $t).join("\n") everywhere in the critical path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't worry much about performance, as it is likely in the range of milliseconds? But the fact that you can't go back to markdown is quite annoying.

So do we want to support the below anyway?

I think most editors would use that comment format by default. But maybe nobody actually uses headings in comments? 🤷

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't worry much about performance

I meant more like "this is code smell to me". It'd push me further towards "just use libcmark", even if soasme/nim-markdown supported markdown -> AST -> markdown.

I think most editors would use that comment format by default.

Mmm. Probably true.

Supporting <!-- --> with the approach in this PR isn't much work. But do we want to support a line beginning with # inside any others from https://spec.commonmark.org/0.30/#html-blocks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed da1d695 to add support for HTML comment blocks

if i+1 < s.len and s[i+1] == '#' and not inFencedCodeBlock:
result.add '#'
elif i+3 < s.len and s[i+1] == '`' and s[i+2] == '`' and s[i+3] == '`':
inFencedCodeBlock = not inFencedCodeBlock
result.add "```"
i += 3
inc i
strip result

proc conceptIntroduction(trackDir: Path, slug: string, title: string,
templatePath: Path): string =
## Returns the contents of the `introduction.md` file for a `slug`, but
## without any top-level heading, and without any leading/trailing whitespace.
## Returns the contents of the `introduction.md` file for a `slug`, but:
## - Without a first top-level header.
## - Adding a starting a second-level header containing `title`.
## - Demoting the level of any other header.
## - Without any leading/trailing whitespace.
let conceptDir = trackDir / "concepts" / slug
if dirExists(conceptDir):
let path = conceptDir / "introduction.md"
if fileExists(path):
result = readFile(path)
var i = 0
# Strip the top-level heading (if any)
if scanp(result, i, *{' ', '\t', '\v', '\c', '\n', '\f'}, "#", +' ',
+(~'\n')):
result.setSlice(i..result.high)
strip result
result = path.readFile().alterHeaders(title)
else:
writeError(&"File {path} not found for concept '{slug}'", templatePath)
quit(1)
Expand All @@ -33,7 +61,8 @@ proc conceptIntroduction(trackDir: Path, slug: string,
templatePath)
quit(1)

proc generateIntroduction(trackDir: Path, templatePath: Path): string =
proc generateIntroduction(trackDir: Path, templatePath: Path,
slugLookup: Table[string, string]): string =
## Reads the file at `templatePath` and returns the content of the
## corresponding `introduction.md` file.
let content = readFile(templatePath)
Expand All @@ -48,7 +77,8 @@ proc generateIntroduction(trackDir: Path, templatePath: Path): string =
if scanp(content, i,
"%{", *{' '}, "concept", *{' '}, ':', *{' '},
+{'a'..'z', '-'} -> conceptSlug.add($_), *{' '}, '}'):
result.add conceptIntroduction(trackDir, conceptSlug, templatePath)
let title = slugLookup[conceptSlug]
result.add conceptIntroduction(trackDir, conceptSlug, title, templatePath)
else:
result.add content[i]
inc i
Expand All @@ -60,9 +90,11 @@ proc generate*(conf: Conf) =

let conceptExercisesDir = trackDir / "exercises" / "concept"
if dirExists(conceptExercisesDir):
let slugLookup = getSlugLookup(trackDir)
for conceptExerciseDir in getSortedSubdirs(conceptExercisesDir):
let introductionTemplatePath = conceptExerciseDir / ".docs" / "introduction.md.tpl"
if fileExists(introductionTemplatePath):
let introduction = generateIntroduction(trackDir, introductionTemplatePath)
let introduction = generateIntroduction(trackDir, introductionTemplatePath,
slugLookup)
let introductionPath = introductionTemplatePath.string[0..^5] # Removes `.tpl`
writeFile(introductionPath, introduction)
1 change: 1 addition & 0 deletions tests/all_tests.nim
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
import "."/[
test_binary,
test_fmt,
test_generate,
test_json,
test_lint,
test_probspecs,
Expand Down
18 changes: 8 additions & 10 deletions tests/test_binary.nim
Original file line number Diff line number Diff line change
Expand Up @@ -872,17 +872,16 @@ proc testsForSync(binaryPath: static string) =
&"failed to merge '{mainBranchName}' in " &
&"problem-specifications directory: '{probSpecsDir}'")

proc prepareIntroductionFiles(trackDir, header, placeholder: string;
removeIntro: bool) =
proc prepareIntroductionFiles(trackDir, placeholder: string; removeIntro: bool) =
# Writes an `introduction.md.tpl` file for the `bird-count` Concept Exercise,
# containing the given `header` and `placeholder`. Also removes the
# `introduction.md` file if `removeIntro` is `true`.
# containing the given `placeholder`. Also removes the `introduction.md` file
# if `removeIntro` is `true`.
let
docsPath = trackDir / "exercises" / "concept" / "bird-count" / ".docs"
introPath = docsPath / "introduction.md"
templatePath = introPath & ".tpl"
templateContents = fmt"""
# {header}
# Introduction

{placeholder}
""".unindent()
Expand All @@ -897,7 +896,7 @@ proc testsForGenerate(binaryPath: string) =

# Setup: clone a track repo, and checkout a known state
setupExercismRepo("elixir", trackDir,
"f3974abf6e0d4a434dfe3494d58581d399c18edb") # 2021-05-09
"91ccf91940f32aff3726c772695b2de167d8192a") # 2022-06-12

test "`configlet generate` exits with 0 when there are no `.md.tpl` files":
execAndCheck(0, generateCmd, "")
Expand All @@ -906,8 +905,7 @@ proc testsForGenerate(binaryPath: string) =
checkNoDiff(trackDir)

# Valid placeholder syntax without spaces, and invalid slug
prepareIntroductionFiles(trackDir, "Recursion",
"%{concept:not-a-real-concept-slug}",
prepareIntroductionFiles(trackDir, "%{concept:not-a-real-concept-slug}",
removeIntro = false)

test "`configlet generate` exits with 1 for an invalid placeholder usage":
Expand All @@ -917,7 +915,7 @@ proc testsForGenerate(binaryPath: string) =
checkNoDiff(trackDir)

# Valid placeholder syntax without spaces, and valid slug
prepareIntroductionFiles(trackDir, "Recursion", "%{concept:recursion}",
prepareIntroductionFiles(trackDir, "%{concept:recursion}",
removeIntro = true)

test "`configlet generate` exits with 0 for a valid `.md.tpl` file":
Expand All @@ -927,7 +925,7 @@ proc testsForGenerate(binaryPath: string) =
checkNoDiff(trackDir)

# Valid placeholder syntax with spaces, and valid slug
prepareIntroductionFiles(trackDir, "Recursion", "%{ concept : recursion }",
prepareIntroductionFiles(trackDir, "%{ concept : recursion }",
removeIntro = true)

test "`configlet generate` exits with 0 for valid placeholder usage with spaces":
Expand Down
70 changes: 70 additions & 0 deletions tests/test_generate.nim
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
import std/[strutils, unittest]
from "."/generate/generate {.all.} import alterHeaders

proc testGenerate =
suite "generate":
test "alterHeaders":
const s = """
# Header 1

The quick brown fox jumps over a lazy dog.

The five boxing wizards jump quickly.

## Header 2

The quick brown fox jumps over a lazy dog.

The five boxing wizards jump quickly.

### Header 3

The quick brown fox jumps over a lazy dog.

```nim
# This line is not a header
echo "hi"
```

## Header 4

The quick brown fox jumps over a lazy dog.

The five boxing wizards jump quickly.
""".unindent()

const expected = """
## Operator Overloading

The quick brown fox jumps over a lazy dog.

The five boxing wizards jump quickly.

### Header 2

The quick brown fox jumps over a lazy dog.

The five boxing wizards jump quickly.

#### Header 3

The quick brown fox jumps over a lazy dog.

```nim
# This line is not a header
echo "hi"
```

### Header 4

The quick brown fox jumps over a lazy dog.

The five boxing wizards jump quickly.""".unindent() # No final newline

check alterHeaders(s, "Operator Overloading") == expected

proc main =
testGenerate()

main()
{.used.}