Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected escaping of characters since 2.1.1 #65

Closed
4 tasks done
tripodsan opened this issue Nov 12, 2024 · 7 comments
Closed
4 tasks done

Unexpected escaping of characters since 2.1.1 #65

tripodsan opened this issue Nov 12, 2024 · 7 comments
Labels
🙋 no/question This does not need any changes 👎 phase/no Post cannot or will not be acted on

Comments

@tripodsan
Copy link

tripodsan commented Nov 12, 2024

Initial checklist

Affected packages and versions

mdast-util-to-markdown@2.1.1

Link to runnable example

No response

Steps to reproduce

the changes in mdast-util-to-markdown@2.1.1 cause unexpected escaping of normal characters, for example:

├─8  paragraph[3]
│    ├─0 strong[1]
│    │   └─0 link[3]
│    │       │ url: "https://www.example.com"
│    │       │ title: ""
│    │       ├─0 text "Curves on"
│    │       ├─1 emphasis[1]
│    │       │   └─0 text "the"
│    │       └─2 text " ipad. "

generates:

      **[Curves on_the_ ipad. ](https://www.example.com)**

it used to generate:

      **[Curves on_the_ ipad. ](https://www.example.com)**

another example:

├─2 paragraph[1]
│   └─0 emphasis[2]
│       ├─0 text "first"
│       └─1 strong[2]
│           ├─0 text "second"
│           └─1 emphasis[1]
│               └─0 link[1]
│                   │ url: "about:blank"
│                   │ title: "title"
│                   └─0 text "important"

generates:

_first**second_[important](about:blank "title")_**_

before:

_first**second_[important](about:blank "title")_**_

Expected behavior

it should not escape characters that are not needed to be escaped.

Actual behavior

it escapes unexpected characters.

Affected runtime and version

node v20.18.0

Affected package manager and version

No response

Affected OS and version

No response

Build and bundle tools

No response

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Nov 12, 2024
@wooorm
Copy link
Member

wooorm commented Nov 12, 2024

Hi! You note that a change happened in a version. Did you see the release notes / commits that were made? The changes are intentional. There are commits, there is talk, there are tests. Do you have arguments?

@wooorm wooorm added the 🙉 open/needs-info This needs some more info label Nov 12, 2024

This comment has been minimized.

@tripodsan
Copy link
Author

tripodsan commented Nov 12, 2024

I looked at the changelog but afaiu, the fix should only affect "invalid" mdast trees that would produce whitespace inside formats (eg _ the _). but in my examples above, the escaping also happens for the valid cases, like:

**[Curves on_the_ ipad. ](https://www.example.com)**

Curves on_the_ ipad.

or

_first**second_[important](about:blank "title")_**_

first**secondimportant**

(well, the case above has problems with the **`, but this is another issue)

This comment has been minimized.

@wooorm
Copy link
Member

wooorm commented Nov 12, 2024

This also indeed touches on “invalidness”. Although it isn’t that: all of this is valid.

Asterisks are different than underscores in markdown. There rules are simpler. They can form emphasis/strong more often, because they are less often used in natural language. Underscores, in words, as you perhaps know, cannot form emphasis. That is why this escape happens here. That _ is inside on and the. take for example: on_the_ipad: on_the_ipad. See? The underscores show, no emphasis. Not with on*the*ipad: ontheipad.

I can recommend two things:
a) use asterisks
b) there’s a space missing in the example, you likely want on instead of on

Either is fine?

@wooorm wooorm added the 🙋 no/question This does not need any changes label Nov 12, 2024
@github-actions github-actions bot added 👎 phase/no Post cannot or will not be acted on and removed 🤞 phase/open Post is being triaged manually 🙉 open/needs-info This needs some more info labels Nov 12, 2024
@wooorm
Copy link
Member

wooorm commented Nov 12, 2024

I will add that it is very hard to get emphasis/strong right for mdast-util-to-markdown. The rules are 17 points, and that’s vague ideas instead of actual concrete code: https://spec.commonmark.org/0.31.2/#can-open-emphasis

@tripodsan
Copy link
Author

ok. thanks for the explanation. probably a good thing to switch to * everywhere :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🙋 no/question This does not need any changes 👎 phase/no Post cannot or will not be acted on
Development

No branches or pull requests

2 participants