Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generate: demote headings, and group link reference definitions #620

Merged
merged 26 commits into from
Jul 1, 2022

Conversation

ee7
Copy link
Member

@ee7 ee7 commented Jun 17, 2022

To-do:

  • Decide on desired behavior.
  • Implement it.
  • Move reference links to the bottom, ordering by usage, and deduplicating them.
  • Support ~~~ code blocks?

As an example of something that isn't supported, GFM multi-line block quotes:

>>>
This line is part of a multi-line block quote
### This is not a heading in GFM
>>>

Closes: #328
Closes: #626


This PR will also demote the level of a line that lacks a whitespace after the final # character, like:

###Some heading

But previously we only removed a first top-level heading if the separating space was present.

Edge case that I don't expect to be important: the current implementation will demote a level-6 heading (beginning with ######) to "a level-7 heading". But a line beginning with 7 # characters is actually no longer a heading.

ee7 added 4 commits June 19, 2022 12:47
To-do: update the integration tests accordingly.

Closes: 328
Previously, `configlet generate` assumed that every line beginning
with a '#' character inside a concept `introduction.md` is a header.
For our purposes, the most common exception to that idea is probably
a comment inside a fenced code block. With this commit, we no longer
add a '#' to such a line.

The markdown handling here is rudimentary, but it may be good enough to
avoid a full markdown parser for `configlet generate`.

But this doesn't support other blocks, like HTML blocks.
@ee7 ee7 force-pushed the generate-demote-headers branch from e3111b5 to 5284615 Compare June 19, 2022 10:47
@ee7 ee7 marked this pull request as ready for review June 19, 2022 10:52
@ee7 ee7 requested a review from ErikSchierboom as a code owner June 19, 2022 10:52
@ee7
Copy link
Member Author

ee7 commented Jun 19, 2022

@ErikSchierboom Given this introduction.md.tpl, and this concept introduction.md, what is the expected generated introduction.md file?

Currently, this PR produces the following diff to the exercise's introduction.md in the common-lisp repo, meaning that it contains only level-1 and level-3 headers:

-## Comparing characters.
+### Comparing characters.
 
 Characters can be compared for equality by `char=` or `char-equal` (the latter being case-insensitive).
 
 Characters can be ordered with functions such as `char<`, `char>`, `char<=`, `char>=` and their case-insensitive versions `char-lessp` and `char-greaterp`.
 
-## Types of characters
+### Types of characters
 
 Characters can be categorized as different types: graphical, alphabetic, alpha-numeric, digit, upper-case or lower-case. 
 A character can be more than one type such as an upper-case alphabetic character.
 There are predicates for each of these types: `graphic-char-p`, `alpha-char-p`, `alphanumericp`, `digit-char-p`, `upper-case-p` and `lower-case-p`.
 
-## Converting character case
+### Converting character case
 
 Some characters can be upper or lower case and can be converted between them with `char-upcase` and `char-downcase`.

Is configlet generate supposed to add a header with the name of the concept? That is, to generate the second of these files:

does introduction.md.tpl contain

# Introduction

## Binaries

%{concept:binaries}

or this?

# Introduction

%{concept:binaries}

And if the latter, should we set the contents of the inserted level-2 header from the concept's name in the track-level config.json file?

Also update the tests to assert that `configlet generate` makes no
changes to the current state of the elixir track repo.

In a later PR we'll update those tests to assert the same for
`configlet generate -uy`.
@ee7
Copy link
Member Author

ee7 commented Jun 19, 2022

Pushed cb47886 assuming that the answer is "yes" to both of:

Is configlet generate supposed to add a header with the name of the concept?

should we set the contents of the inserted level-2 header from the concept's name in the track-level config.json file?

(I'll try to improve some variable names like title later).

@ErikSchierboom
Copy link
Member

Is configlet generate supposed to add a header with the name of the concept?
should we set the contents of the inserted level-2 header from the concept's name in the track-level config.json file?

Personally, I think the answer to both should be "yes", but I'll double-check with the other maintainers.

Copy link
Member

@ErikSchierboom ErikSchierboom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Let's hold off on merging until other maintainers have chimed in.

while i < s.len:
result.add s[i]
if s[i] == '\n':
# When inside a fenced code block, don't alter a line that begins with '#'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The heading could also be in a comment block I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. In 5284615 I mentioned that it doesn't support HTML blocks, and there are a lot of them beside comment blocks.

However, in Exercism's markdown spec we write:

Prefer Markdown comments instead of HTML comments (e.g. use [comment]: # (Actual comment...) rather than <!-- Actual comment -->

So do we want to support the below anyway?

# Introduction

<!--
# Not a header
-->

And if we add markdownlint to every track, would we enable MD033? If "yes" to the latter, then we probably don't need to avoid demoting # inside HTML blocks in this PR.

The markdown handling in this PR is very naive, but doing it robustly requires a full markdown parser. And sadly, it doesn't seem like there's a mature, robust, and pure-Nim markdown parser that allows:

markdown -> AST -> manipulate to demote headers -> markdown

It looks like the best option is to use the wrapper for libcmark, but that will make it more complicated to keep configlet as a dependency-free binary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more specific, it looks like soasme/nim-markdown at best only supports

markdown -> AST -> manipulate AST -> HTML

The best I could do was along the lines of:

import std/[lists, strbasics, strutils, tables]
import pkg/markdown {.all.}

proc main =
  var s = """
    # Introduction

    Foo.

    ## Some header

    Bar.
  """.unindent()
  s.strip(chars={'\n'})

  # Avoid the overhead from calling the `markdown` proc
  # which uses a non-inline `strip`, and calls `render`.
  let root = Document(doc: s)
  var state = State(references: initTable[string, Reference](), config: initCommonmarkConfig())
  state.parse(root)
  for child in root.children:
    if child of Heading:
      echo child

when isMainModule:
  main()
$ nim r foo.nim
<h1>Introduction</h1>
<h2>Some header</h2>

There are also performance killers like token.children.toSeq.map((t: Token) => $t).join("\n") everywhere in the critical path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't worry much about performance, as it is likely in the range of milliseconds? But the fact that you can't go back to markdown is quite annoying.

So do we want to support the below anyway?

I think most editors would use that comment format by default. But maybe nobody actually uses headings in comments? 🤷

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't worry much about performance

I meant more like "this is code smell to me". It'd push me further towards "just use libcmark", even if soasme/nim-markdown supported markdown -> AST -> markdown.

I think most editors would use that comment format by default.

Mmm. Probably true.

Supporting <!-- --> with the approach in this PR isn't much work. But do we want to support a line beginning with # inside any others from https://spec.commonmark.org/0.30/#html-blocks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed da1d695 to add support for HTML comment blocks

src/generate/generate.nim Outdated Show resolved Hide resolved
@angelikatyborska
Copy link

does introduction.md.tpl contain

# Introduction

## Binaries

%{concept:binaries}

or this?

# Introduction

%{concept:binaries}

And if the latter, should we set the contents of the inserted level-2 header from the concept's name in the track-level config.json file?

I always imagined that it would be the first option because it just seems more straight-forward (that merging markdown files doesn't require reading into the track's JSON config) and it's more flexible (you can add any header that you want). Anyway, either option is going to work for the Elixir track because we only ever nest single-concept introductions under a h2 that contains the concept's name. The second option might be nicer if you want to enforce this kind of pattern for all tracks.

@verdammelt
Copy link
Member

For the Common Lisp track it appears I'd favor the first - the track does not assume, nor necessarily want, a header with the name of the concept to be inserted - simply that the concept introductions "# Introduction" line be removed when inserting.

So I'd prefer what seems to be keeping the status quo. That being said I would be fine with the other style as well.

@SleeplessByte
Copy link
Member

I agree with the last two comments. I think for JS we kinda expect it too sometimes so things might break if option two is chosen.

@ErikSchierboom
Copy link
Member

Okay, then we should with option number 1. Could you update @ee7?

@ee7
Copy link
Member Author

ee7 commented Jun 21, 2022

So I'd prefer what seems to be keeping the status quo.

I don't know whether there's really a "status quo" currently, since the common-lisp track is currently the only track that has a .md.tpl file:

$ find . -name '*.md.tpl' | sort
./common-lisp/exercises/concept/character-study/.docs/introduction.md.tpl
./common-lisp/exercises/concept/gigasecond-anniversary/.docs/introduction.md.tpl
./common-lisp/exercises/concept/high-scores/.docs/introduction.md.tpl
./common-lisp/exercises/concept/key-comparison/.docs/introduction.md.tpl
./common-lisp/exercises/concept/larrys-winning-checker/.docs/introduction.md.tpl
./common-lisp/exercises/concept/leslies-lists/.docs/introduction.md.tpl
./common-lisp/exercises/concept/lillys-lasagna/.docs/introduction.md.tpl
./common-lisp/exercises/concept/lillys-lasagna-leftovers/.docs/introduction.md.tpl
./common-lisp/exercises/concept/logans-numeric-partition/.docs/introduction.md.tpl
./common-lisp/exercises/concept/log-levels/.docs/introduction.md.tpl
./common-lisp/exercises/concept/lucys-magnificent-mapper/.docs/introduction.md.tpl
./common-lisp/exercises/concept/pal-picker/.docs/introduction.md.tpl
./common-lisp/exercises/concept/pizza-pi/.docs/introduction.md.tpl
./common-lisp/exercises/concept/reporting-for-duty/.docs/introduction.md.tpl
./common-lisp/exercises/concept/socks-and-sexprs/.docs/introduction.md.tpl

Alternatively: https://github.com/search?q=org%3Aexercism+filename%3A*.md.tpl

It'd be slightly disruptive to the common-lisp track, which omits the h2 in the template file when only 1 concept is referenced. But is that a pattern we want?

The contents of those files (click to expand)

character-study/.docs/introduction.md.tpl

# Introduction

%{concept:characters}

gigasecond-anniversary/.docs/introduction.md.tpl

# Introduction

## Dates and Times

%{concept:date-time}

## Multiple Values

%{concept:multiple-values}

high-scores/.docs/introduction.md.tpl

# Introduction

%{concept:hash-tables}

key-comparison/.docs/introduction.md.tpl

# Introduction

%{concept:equality}

larrys-winning-checker/.docs/introduction.md.tpl

# Introduction

%{concept:arrays}

leslies-lists/.docs/introduction.md.tpl

# Introduction

%{concept:lists}

lillys-lasagna/.docs/introduction.md.tpl

# Introduction

%{concept:functions}

lillys-lasagna-leftovers/.docs/introduction.md.tpl

# Introduction

%{concept:lambda-list}

## Optional Parameters

%{concept:optional-parameters}

## Keyword Parameters

%{concept:keyword-parameters}

## Rest Parameters

%{concept:rest-parameters}

logans-numeric-partition/.docs/introduction.md.tpl

# Introduction

%{concept:reducing}

log-levels/.docs/introduction.md.tpl

# Introduction

%{concept:strings}

lucys-magnificent-mapper/.docs/introduction.md.tpl

# Introduction

## Mapping

%{concept:mapping}

## Filtering

%{concept:filtering}

pal-picker/.docs/introduction.md.tpl

# Introduction

## Truthy And Falsy

%{concept:truthy-and-falsy}

## Conditionals

%{concept:conditionals}

pizza-pi/.docs/introduction.md.tpl

# Introduction

In Common Lisp, like many languages, numbers come in a few of types – two of the most basic are:

## Integers

%{concept:integers}

## Floating Point Numbers

%{concept:floating-point-numbers}

## Arithmetic

%{concept:arithmetic}

reporting-for-duty/.docs/introduction.md.tpl

# Introduction

%{concept:format-basics}

socks-and-sexprs/.docs/introduction.md.tpl

# Introduction

## Comments

%{concept:comments}

## Cons

%{concept:cons}

## Expressions

%{concept:expressions}

## Symbols

%{concept:symbols}

@ee7
Copy link
Member Author

ee7 commented Jun 21, 2022

the track does not assume, nor necessarily want, a header with the name of the concept to be inserted - simply that the concept introductions "# Introduction" line be removed when inserting.

Then configlet generate must demote headers by some dynamic amount, right? Because if we don't hard-code inserting a h2, then configlet generate must not demote any headers for the below (to avoid going from h1 to h3):

# Introduction

%{concept:foo}

and presumably should demote the headers in foo by two levels for:

# Introduction

Blah blah

## h2

Blah blah.

### h3

%{concept:foo}

But I'm not sure whether we can support that use robustly with the current placeholder syntax. It's hard for configlet generate to guess the intended amount of demotion with something like:

# Introduction

Blah blah.

## h2

Blah blah.

### h3

Blah blah.

%{concept:foo}

(is %{concept:foo} supposed to be under the h3 or not?).

If we want to do that, I think we need to agree on some concise rule for determing the demotion level, and maybe lint the .tpl files to avoid content being placed under headers of unexpected level. But would it be simpler and more robust to hard-code a demotion level of 1?

Note that https://exercism.org/docs/building/markdown/markdown contains:

No heading may descend a level greater than one below the previous (e.g. ## may only be followed by ###, not ####).

@verdammelt
Copy link
Member

@ee7 common-lisp is the only track using this? did not know.

So by "status quo" I guess I mean: I would prefer it to continue to do whatever it may be doing now. But again, I can likely work with whatever y'all choose generate to do.

@ErikSchierboom
Copy link
Member

It'd be slightly disruptive to the common-lisp track, which omits the h2 in the template file when only 1 concept is referenced. But is that a pattern we want?

The C# and F# tracks also use this: https://github.com/exercism/csharp/blob/main/exercises/concept/log-levels/.docs/introduction.md I think it looks cleaner than an additional h2 heading, but 🤷

But would it be simpler and more robust to hard-code a demotion level of 1?

Would this depend on the concept placeholder always being place under an h2 heading? As in: what if the concept placeholder is directly under an h1 heading and the concept's introduction.md has h2 and h3 subheadings?

I think for JS we kinda expect it too sometimes so things might break if option two is chosen.

@SleeplessByte Do you perhaps have an example?

@ee7
Copy link
Member Author

ee7 commented Jun 22, 2022

The C# and F# tracks also use this: https://github.com/exercism/csharp/blob/main/exercises/concept/log-levels/.docs/introduction.md I think it looks cleaner than an additional h2 heading, but 🤷

So the main difference is in the right panel of:

Screenshots:

12

Would it help to have an Exercism-wide preference for one or the other? I think I prefer the Elixir way, with the h2. On the csharp link, there's no immediate signal that the entire introduction is about strings. The Elixir way is also more consistent between an exercise that uses a single placeholder, and an exercise that uses multiple placeholders.

Would this depend on the concept placeholder always being place under an h2 heading? As in: what if the concept placeholder is directly under an h1 heading and the concept's introduction.md has h2 and h3 subheadings?

One possibility is "always demote by 1, but if the placeholder is not directly under a h2, add an h2 with the name of the concept". But there are many options. I don't know if we can decide what's best without enumerating them.

@SleeplessByte
Copy link
Member

@ee7 I like the Elixir way more.

ee7 added 5 commits June 22, 2022 20:11
This is arguable. But I think this way is more readable overall by the
time we add handling of HTML comment blocks.
More readable, and easier to extend to support HTML comment blocks, but
probably marginally slower.
This is only one of many possible HTML blocks [1], but probably the most
important one.

[1] https://spec.commonmark.org/0.30/#html-blocks
Insert a h2 using the concept's name in the track `config.json`, and
demote the headers in the `introduction.md` for the `foo` concept by 1:

    # Introduction

    %{concept:foo}

Don't insert a h2, and demote by 1:

    # Introduction

    ## Foo

    %{concept:foo}

Don't insert a h2, and demote by 2:

    # Introduction

    ## Some header

    Blah blah.

    ### Foo

    %{concept:foo}
@ee7
Copy link
Member Author

ee7 commented Jun 22, 2022

common-lisp is the only track using this? did not know.

@verdammelt Yes, common-lisp is the early adopter :)

configlet generate hasn't really been fully usable yet, due to the deficiency that this PR tries to resolve.

For example, with these:

The last contains only level-2 headers, but should contain some level-3 headers.

configlet generate previously only produced acceptable output when there was only a top-level header in the .md.tpl file, like:

ee7 added 3 commits June 24, 2022 15:33
Consider this `.md.tpl` file:

    # Introduction

    ## Foo

    %{concept:foo}

    ## Bar

    %{concept:bar}

Before this commit, when more than one placeholder had a reference link,
`configlet generate` would produce something like

    # Introduction

    ## Foo

    Here is a line with a link [Foo][foo].

    [foo]: http://www.example.com

    ## Bar

    Here is another line with a link [Bar][bar].

    [bar]: http://www.example.com

With this commit, we place the reference links at the bottom, ordering
by first usage, and deduplicating them:

    # Introduction

    ## Foo

    Here is a line with a link [Foo][foo].

    ## Bar

    Here is another line with a link [Bar][bar].

    [foo]: http://www.example.com
    [bar]: http://www.example.com

But it's tricky to do this robustly without using a proper markdown
parser.

From the CommonMark Spec [1], the rules for reference links have some
complexity:

    A link reference definition consists of a link label, optionally
    preceded by up to three spaces of indentation, followed by a colon (:),
    optional spaces or tabs (including up to one line ending),
    a link destination, optional spaces or tabs (including up to one line
    ending), and an optional link title, which if it is present must be
    separated from the link destination by spaces or tabs. No further
    character may occur.

    ---

    A link label begins with a left bracket ([) and ends with the first
    right bracket (]) that is not backslash-escaped. Between these
    brackets there must be at least one character that is not a space,
    tab, or line ending. Unescaped square bracket characters are not
    allowed inside the opening and closing square brackets of link
    labels. A link label can have at most 999 characters inside the
    square brackets.

[1] https://spec.commonmark.org/0.30/#link-reference-definitions
@ErikSchierboom
Copy link
Member

It parses into an AST that doesn't contain nodes for reference link definitions. So markdown -> libcmark -> render as commonmark converts all reference links into inline links.

If I'm reading this correctly, we wouldn't be able to retain reference links if we were to attempt to format Markdown?

The AST doesn't store some other information, e.g. the delimiter kind used for emphasis or code blocks (so _ is later rendered as *, and an admonition like ~~~~exercism/note is later rendered as ``` exercism/note).

Not the best of ASTs then.

You can't customize the markdown output format.

:( Ruby's cmark wrapper allows this, so they must have custom-built it?

It would increase the configlet binary size by about 50% (roughly 487 KiB to 727 KiB on Linux).

I don't mind this much to be honest.

I don't know yet whether the Windows build is trivial.

Having a proper Windows build is important.

@ee7
Copy link
Member Author

ee7 commented Jun 24, 2022

we wouldn't be able to retain reference links if we were to attempt to format Markdown?

Not without an additional step to add them back in again. See in ee7/exercism-elixir@d05c1e7, in particular the transformations of reference links. That commit is the diff produced by a configlet generate that uses libcmark vs the output of configlet generate from this PR.

Not the best of ASTs then.

Mmm. I imagine that it's deliberate, though - John Macfarlane knows what he's doing.

:( Ruby's cmark wrapper allows this, so they must have custom-built it?

Do you mean commonmarker, the wrapper for libcmark-gfm?

My statement of "can't customize the markdown output format" was a bit too strong - I meant that the only options to customize rendering are these:

$ man 3 cmark
[...]
Options affecting rendering
     #define CMARK_OPT_SOURCEPOS (1 << 1)

     Include a data-sourcepos attribute on all block elements.

     #define CMARK_OPT_HARDBREAKS (1 << 2)

     Render softbreak elements as hard line breaks.

     #define CMARK_OPT_SAFE (1 << 3)

     CMARK_OPT_SAFE is defined here for API compatibility, but it no longer has
     any effect. "Safe" mode is now the default: set CMARK_OPT_UNSAFE to disable
     it.

     #define CMARK_OPT_UNSAFE (1 << 17)

     Render raw HTML and unsafe links (javascript:, vbscript:, file:, and data:,
     except for image/png, image/gif, image/jpeg, or image/webp mime types).
     By default, raw HTML is replaced by a placeholder HTML comment.
     Unsafe links are replaced by empty strings.

     #define CMARK_OPT_NOBREAKS (1 << 4)

     Render softbreak elements as spaces.
[...]

So I believe that you can't make it e.g. prefer _ over * for emphasis, or omit the space between a code fence and language name, or omit indentation before list items.

It looks like the render options for commonmarker are the same as the underlying ones for libcmark-gfm. But yeah, it looks like commonmarker exposes a way to use a custom renderer. Is that what you're talking about?

Edit:
Looks like "yes": exercism/website/app/commands/markdown/render_html.rb

@ErikSchierboom
Copy link
Member

I was indeed referring to that. But what do you think we should do?

@ee7
Copy link
Member Author

ee7 commented Jun 24, 2022

But what do you think we should do?

Well, this PR at least works for exercism/elixir#1151, including the combining of reference links at the bottom when multiple concepts have at least one reference link.

It's just that the current parsing is somewhat fragile, especially to whitespace and linebreaks.

I'm looking into patching libcmark at install time.

ee7 added 11 commits June 28, 2022 13:25
Clarify the kind of slug. In the future we could even consider:

    type
      ConceptSlug* = distinct string
      ExerciseSlug* = distinct string
The CommonMark spec [1] calls them headings, not headers.

[1] https://spec.commonmark.org/0.30/
Make declaration order match usage order.
The backticks look a bit strange, even if you know that `concept` is a
keyword. Sidestep the issue. We named it `con` elsewhere:

    $ git grep --heading --break 'for .* in concepts:'
    src/info/info.nim
    32:      for con in concepts:

    src/lint/track_config.nim
    303:  for con in concepts:
@ee7
Copy link
Member Author

ee7 commented Jun 28, 2022

I'll do the cmark change in a follow-up PR. This PR is already long, and a separate commit will help if we needed to revert to this non-cmark implementation for some reason.

Erik: the total diff since your last approval is cb47886...607f369

@ee7 ee7 requested a review from ErikSchierboom June 28, 2022 11:28
let concepts = TrackConfig.init(readFile(trackDir / "config.json")).concepts
collect:
for con in concepts:
{con.slug.Slug: con.name}
Copy link
Member Author

@ee7 ee7 Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .Slug is needed here because the concept's slug is not yet a Slug.

Concept* = object
name*: string
slug*: string
uuid*: string

We should handle changing to slug: Slug in a separate PR, because it's not just a refactoring (depending on which parsing hooks are visible to jsony at parse time, it would add a parse-time error if a slug value in the JSON file is not kebab-case).

But note that, for now, configlet generate does not validate that a slug value is kebab-case. I think that it should, but it doesn't matter much for now: the placeholder matching does enforce kebab-case for the concept value.

@ee7
Copy link
Member Author

ee7 commented Jun 30, 2022

@ErikSchierboom Building configlet from the latest commit in this PR, running configlet generate on the common-lisp track produces the penultimate state in exercism/common-lisp#659. Could you check that PR to see if you're happy with how configlet behaves? Please review individual commits in that PR, to see configlet's behavior when the placeholder is both under and not under a level-2 heading.

common-lisp is still the only track that uses a .tpl file.

@ErikSchierboom
Copy link
Member

That looks great to me. Thanks a lot!

@ee7 ee7 mentioned this pull request Jul 1, 2022
@ee7 ee7 merged commit ef65ef9 into exercism:main Jul 1, 2022
@ee7 ee7 deleted the generate-demote-headers branch July 1, 2022 22:51
@ee7 ee7 changed the title generate: demote remaining headers generate: demote headings, and group link reference definitions Jul 1, 2022
glennj pushed a commit to glennj/exercism-configlet that referenced this pull request Jul 8, 2022
…cism#620)

Before this commit, the implementation of `configlet generate` was
noticeably incomplete - it could produce files with incorrect heading
levels.

Given a Concept Exercise with an `introduction.md.tpl` file like:

    # Introduction

    ## Date and Time

    %{concept:date-time}

`configlet generate` would write an `introduction.md` file, inserting
the introduction for the `date-time` Concept, but without demoting its
headings:

    # Introduction

    ## Date and Time

    Blah blah.

    ## A heading from the date-time `introduction.md` file

    Blah blah.

With this commit, configlet will demote the level of an inserted heading
when appropriate:

    # Introduction

    ## Date and Time

    Blah blah.

    ### A heading from the date-time `introduction.md` file

    Blah blah.

`configlet generate` now also writes the same when given the template
structure of:

    # Introduction

    %{concept:date-time}

That is, it adds a level-2 heading when the placeholder is not already
under a level-2 heading - we concluded that this looks better on the
website. The heading's contents are taken from the concept's `name` in
the track-level `config.json` file, so you may prefer to omit such
level-2 headings in the template file (unless you want the heading
contents to be different).

This commit also fixes configlet's handling of link reference
definitions. Consider this `introduction.md.tpl` file:

    # Introduction

    ## Foo

    %{concept:foo}

    ## Bar

    %{concept:bar}

Before this commit, when more than one placeholder had a link reference
definition, `configlet generate` would produce something like:

    # Introduction

    ## Foo

    Here is a line with a link to [Foo][foo].

    [foo]: http://www.example.com

    ## Bar

    Here is another line with a link to [Bar][bar].

    [bar]: http://www.example.org

That is, it would not combine link reference definitions at the bottom.
With this commit, it does so - ordering by first usage, and
deduplicating them:

    # Introduction

    ## Foo

    Here is a line with a link to [Foo][foo].

    ## Bar

    Here is another line with a link to [Bar][bar].

    [foo]: http://www.example.com
    [bar]: http://www.example.org

The Markdown handling in this commit is deliberately rudimentary, and
has some limitations. We're doing it this way because there is no
pure-Nim Markdown parser and renderer that does what we want. The
plan is to use a wrapper for libcmark [1], the reference implementation
of the CommonMark Spec [2].

However, cmark is not designed foremost to be a Markdown formatter - for
example, it inlines a link destination from a matching link reference
definition. But we want to generate `introduction.md` files without that
inlining, so we need patch or workaround cmark's rendering. There's some
complexity there that's best left for another commit.

[1] https://github.com/commonmark/cmark
[2] https://spec.commonmark.org/0.30

Closes: exercism#328
Closes: exercism#626
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants