Async marked #2474

UziTech · 2022-05-22T04:48:00Z

Description

The walkTokens function returns the values of the function so someone can await any Promises.

The way I see this working is anyone who wants to do anything async will do it by awaiting marked.walkTokens(tokens, callback)

Something like:

const tokens = marked.lexer(markdown)
await Promise.all(marked.walkTokens(tokens, asynFunction))
const html = marked.parser(tokens)

To make the walkTokens option be able to be async we have to introduce an async option so users can instead write:

const walkTokens = asyncFunction
marked.use({walkTokens, async: true})
const html = await marked.parse(markdown)

or in an extension:

const importUrl = {
  extensions: [{
    name: 'importUrl',
    level: 'block',
    start(src) { return src.indexOf('\n:'); },
    tokenizer(src) {
      const rule = /^:(https?:\/\/.+?):/;
      const match = rule.exec(src);
      if (match) {
        return {
          type: 'importUrl',
          raw: match[0],
          url: match[1],
          html: '' // will be replaced in walkTokens
        };
      }
    },
    renderer(token) {
      return token.html;
    }
  }],
  async: true, // needed to tell marked to return a promise
  async walkTokens(token) {
    if (token.type === 'importUrl') {
      const res = await fetch(token.url);
      token.html = await res.text();
    }
  }
};

marked.use(importUrl);
const markdown = `
# example.com

:[https://example.com:](https://example.com/)
`;

const html = await marked.parse(markdown);

Fixes async renderer support #458

Contributor

Test(s) exist to ensure functionality and minimize regression

Committer

In most cases, this should be a different person than the contributor.

CI is green (no forced merge required).
Squash and Merge PR following conventional commit guidelines.

vercel · 2022-05-22T04:48:06Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated
marked-website	✅ Ready (Inspect)	Visit Preview	Aug 25, 2022 at 2:01AM (UTC)

UziTech · 2022-05-22T17:20:57Z

@calculuschild @styfle @joshbruce @davisjam what do you think about this approach? Is it too much to maintain?

We could probably move more of the duplicate code to helpers.

joshbruce · 2022-05-22T17:46:23Z

@jimmywarting - Saw you had a PR about async and curious if you have thoughts.

@UziTech - Is the extra maintenance from this approach because there's the non-async and then the async here?

Is this a fairly standard approach for the JS community? (Making sure the Marked community can easily understand and update.)

jimmywarting · 2022-05-22T18:36:32Z

Saw you had a PR about async and curious if you have thoughts.

it's a pity that we need to have two version of marked... one being async and the other sync. either way, i think i might prefer to only use the async b/c i'm not that concerned about the performances b/c i would only use it on the client side and process one small chunk of markdown that would resolve pretty quickly, but i could understand if this would have to run on a server and be pretty fast to process lots of markdown data at once

The way i did it in my PR was in my opinion more stream friendly with iterators that yields chunks bits at a time.
so you could pretty much write something like this to get out data faster so you can start see a result quicker

for (let chunk of iterable) {
  if (typeof chunk !== string) chunk = await chunk
  res.write(chunk)
}
res.end()

iterable have the posibility to be turned into a stream as well using node:stream.Readable.from(iterable) and possible also for whatwg streams in the feature if it lands ReadableStream.from(iterable)

with my PR we could keep much of the logic the same.
We could also add Symbol.iterator to easily get a hold of a iterator the yields chunks.
And if the user which to use async stuff then they could just use for await so that it would call Symbol.asyncIterator instead of Symbol.iterator and promises would be resolved instead.
so that you can write this instead

for await (let chunk of iterable) res.write(chunk)
res.end()

UziTech · 2022-05-22T23:18:47Z

@joshbruce ya the reason for the extra maintenance is because we have to maintain files that are almost the same just not quite. For instance anytime we fix a bug in the tokenizer we will have to fix it in two files.

The async code is a little under half the speed of the sync version (bench sync 5000ms, bench async 11000ms)

@jimmywarting the iterable approach doesn't work for renderers that return false to fall back to original renderers. If that feature didn't exist the async code could be much easier.

styfle · 2022-05-23T00:46:20Z

we have to maintain files that are almost the same just not quite. For instance anytime we fix a bug in the tokenizer we will have to fix it in two files.

That sounds like it could be prone to error, especially for someone who is a new contributor to the project.

We could probably move more of the duplicate code to helpers.

That sounds desirable since it would help reduce the chance of needing to modify two files to fix a single bug.

Fixes #458

I'm still a little skeptical that this change is needed at all. It seems like the "fetch image from CMS" use case can be solved with the current lexer and parser. Something like this:

const tokens = marked.lexer(md);

for (const token of tokens) {
  if (token.type === 'image') {
    // mutate token
  }
}

const html = marked.parser(tokens);

UziTech · 2022-05-23T20:13:53Z

I'm still a little skeptical that this change is needed at all. It seems like the "fetch image from CMS" use case can be solved with the current lexer and parser.

I hadn't thought of that approach. That actually seems better. Perhaps all that we have to make async is the walkTokens function.

UziTech · 2022-05-28T15:16:23Z

after playing around with this a little bit it looks like we could just return the values from walkTokens and await those to do just about anything async.

The async code would look something like:

const tokens = marked.lexer(md);

await Promise.all(marked.walkTokens(tokens, (token) => {
  if (token.type === 'image') {
    return fetch(...)
      .then(res => res.json())
      .then(data => {
        token.href = ...
      });
  }
}));

const html = marked.parser(tokens);

We could do this inside marked to just allow an option. This will only slow down marked if they specify async: true.

const html = await marked(md, {
  async: true,
  walkTokens() {
    if (token.type === 'image') {
      return fetch(...)
        .then(res => res.json())
        .then(data => {
          token.href = ...
        });
    }
  },
})

src/Lexer.js

@@ -316,8 +316,9 @@ export class Lexer {
    return tokens;
  }

-  inline(src, tokens) {


vixalien · 2022-07-04T01:24:25Z

updates?

UziTech · 2022-07-04T03:42:47Z

@vixalien Thanks for the push! 😉👍I had this almost done for a while.

I changed this to use the suggestion by @styfle in #2474 (comment). I made the walkTokens function return the values of the function so someone can await any Promises.

The way I see this working is anyone who wants to do anything async will do it by awaiting marked.walkTokens(tokens, callback)

Something like:

const tokens = marked.lexer(markdown)
await Promise.all(marked.walkTokens(tokens, asynFunction))
const html = marked.parser(tokens)

To make the walkTokens option be able to be async we have to introduce an async option so users can instead write:

const walkTokens = asyncFunction
marked.use({walkTokens, async: true})
const html = await marked.parse(markdown)

or in an extension:

const importUrl = {
  extensions: [{
    name: 'importUrl',
    level: 'block',
    start(src) { return src.indexOf('\n:'); },
    tokenizer(src) {
      const rule = /^:(https?:\/\/.+?):/;
      const match = rule.exec(src);
      if (match) {
        return {
          type: 'importUrl',
          raw: match[0],
          url: match[1],
          html: '' // will be replaced in walkTokens
        };
      }
    },
    renderer(token) {
      return token.html;
    }
  }],
  async: true, // needed to tell marked to return a promise
  async walkTokens(token) {
    if (token.type === 'importUrl') {
      const res = await fetch(token.url);
      token.html = await res.text();
    }
  }
};

marked.use(importUrl);
const markdown = `
# example.com

:https://example.com:
`;

const html = await marked.parse(markdown);

TODO

Update docs
Add tests

bebraw · 2022-08-14T14:08:51Z

@UziTech I tried out your promising work and found a potential issue. It seems it breaks link (![]()) handling. I have steps to reproduce below:

Check out https://github.com/gustwindjs/gustwind/tree/bug/async-links
Run the project with deno task start (needs a recent version of Deno)
Head to http://localhost:3000/breezewind/ - Note the links

I have the code at https://github.com/gustwindjs/gustwind/blob/bug/async-links/site/transforms/markdown.ts . The interesting point is that when you remove that marked.use section with the async code, the links get rendered properly again so it probably has to do with the code that was added.

bebraw · 2022-08-14T14:15:06Z

@UziTech I managed to solve it. I had the following piece of code as per your example:

      start(src: string) {
        return src.indexOf(":");
      },

After removing, it fixed the links. Probably what happened is that it was trying to apply the extension on links since they often have :.

My apologies for the noise. 😄

bebraw · 2022-08-15T05:35:37Z

@UziTech Is the start check actually needed? It is there just for optimization purposes? Maybe it would be nice to have short comments at the doc portion related to this (i.e., why something exists). I might also extract importUrl as a const to group it better (nicer for maintenance as it's less brittle when changing things around).

In any case, I think it's a great addition to the API as it allows a lot of imaginative uses. It could be even worth mentioning some use cases at the docs to inspire people.

UziTech · 2022-08-15T13:51:23Z

@bebraw the start is needed to make sure the paragraph and text tokenizers doesn't consume too much.

hi
:https//example.com:

For example, without the start function the above markdown would be seen as one paragraph since there is no blank line between the lines and it doesn't know to stop at : to check the custom extension.

UziTech · 2022-08-21T18:02:15Z

@styfle @joshbruce @calculuschild @davisjam Any objections to this?

I think it will help allow a whole new type of async extensions for marked without any breaking changes. The only down side is an added option (async) to prevent anyone not using async from slowing down.

see the added documentation at https://marked-website-git-fork-uzitech-async-marked-markedjs.vercel.app/using_pro#async

docs/USING_ADVANCED.md

docs/USING_PRO.md

test/unit/marked-spec.js

test/bench.js

calculuschild

This all looks fine to me. I don't really have any feedback other than what @styfle has already mentioned.

Co-authored-by: Steven <steven@ceriously.com>

UziTech · 2022-08-25T02:17:48Z

What do you think about adding a parseAsync function to return the promise?

I know Jasmine ran into an issue when adding async expect where people would forget to await the expect call so they changed it to expectAsync for async expect calls so people wouldn't forget to add await before.

We could do something similar. If the async option is set (either by the user or an extension) than any calls to parse would fail with a message like Must use parseAsync for async marked so people don't forget the await.

styfle · 2022-08-28T18:41:49Z

Sounds good. Also the PR description no longer matches the content so we should probably update to avoid confusion to readers in the future.

UziTech · 2022-08-28T22:26:05Z

After working on the parseAsync way it looks like that will require some other small tradeoffs so we can change it in the future if it is needed.

For now I think this is good to go once you approve @styfle. 😁👍

styfle

Thanks!

# [4.1.0](v4.0.19...v4.1.0) (2022-08-30) ### Features * add async option ([#2474](#2474)) ([994b2e6](994b2e6))

github-actions · 2022-08-30T14:40:57Z

🎉 This PR is included in version 4.1.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

vixalien · 2022-08-31T10:25:34Z

yay!

metaskills · 2023-08-12T01:27:57Z

Thanks for the work on this. I was thinking of using Marked in my Express backend which streaming OpenAI responses. Those responses can be markdown but I'm not sure if this feature would support my use case. My goal is to simulate what OpenAI ChatGPT is doing in their client. I think I have two options as each OpenAI stream arrives.

Use Marked parse on a continually updated string representing the current total of the completion. Change Vue client side code to replace the DOM message's inner HTML with the updated Markdown -> HTML.
Use Marked in some async way that with each part of the stream from OpenAI. Honestly I would have no idea how I would handle these updates on the Vue side because parts will not be guaranteed to represent fully parsed HTML nodes.

So as I say this... I'm thinking #1 is my best option which means I do not need Async Marked, but thought I would put that question out for folks who may have wandered here like I did. Again, thanks for the work!

UziTech changed the title ~~move async to folder~~ Async marked May 22, 2022

vercel bot deployed to Preview May 22, 2022 04:48 View deployment

lolofofo056 approved these changes May 28, 2022

View reviewed changes

src/Lexer.js

@@ -316,8 +316,9 @@ export class Lexer {

return tokens;

}

inline(src, tokens) {

This comment was marked as spam.

Sign in to view

This comment was marked as spam.

Sign in to view

UziTech force-pushed the async-marked branch from 0d1496a to 67a2343 Compare July 4, 2022 01:36

vercel bot deployed to Preview July 4, 2022 01:36 View deployment

UziTech force-pushed the async-marked branch from 67a2343 to 29ecd06 Compare July 4, 2022 01:39

vercel bot deployed to Preview July 4, 2022 01:39 View deployment

UziTech force-pushed the async-marked branch from 29ecd06 to c4df435 Compare July 4, 2022 01:40

vercel bot deployed to Preview July 4, 2022 01:40 View deployment

UziTech force-pushed the async-marked branch from c4df435 to 6820c55 Compare July 4, 2022 03:40

vercel bot deployed to Preview July 4, 2022 03:40 View deployment

fix: return values from walkTokens

9e7ab80

UziTech force-pushed the async-marked branch from 6820c55 to 9e7ab80 Compare July 4, 2022 03:52

vercel bot deployed to Preview July 4, 2022 03:53 View deployment

UziTech requested review from calculuschild, davisjam, joshbruce and styfle July 4, 2022 04:02

UziTech marked this pull request as ready for review August 15, 2022 01:01

docs: add nav to async

3f36b41

vercel bot deployed to Preview August 15, 2022 01:10 View deployment

styfle reviewed Aug 24, 2022

View reviewed changes

docs/USING_ADVANCED.md Outdated Show resolved Hide resolved

styfle reviewed Aug 24, 2022

View reviewed changes

docs/USING_PRO.md Outdated Show resolved Hide resolved

styfle reviewed Aug 24, 2022

View reviewed changes

test/unit/marked-spec.js Outdated Show resolved Hide resolved

calculuschild reviewed Aug 24, 2022

View reviewed changes

test/bench.js Show resolved Hide resolved

calculuschild approved these changes Aug 24, 2022

View reviewed changes

Update docs/USING_PRO.md

b9d7c29

Co-authored-by: Steven <steven@ceriously.com>

vercel bot deployed to Preview August 25, 2022 01:26 View deployment

test: expect promise

13e805d

vercel bot deployed to Preview August 25, 2022 02:00 View deployment

Update docs/USING_ADVANCED.md

67ee2bc

Co-authored-by: Steven <steven@ceriously.com>

vercel bot deployed to Preview August 25, 2022 02:01 View deployment

styfle approved these changes Aug 30, 2022

View reviewed changes

UziTech merged commit 994b2e6 into markedjs:master Aug 30, 2022

github-actions bot pushed a commit that referenced this pull request Aug 30, 2022

chore(release): 4.1.0 [skip ci]

64b22d0

# [4.1.0](v4.0.19...v4.1.0) (2022-08-30) ### Features * add async option ([#2474](#2474)) ([994b2e6](994b2e6))

github-actions bot added the released label Aug 30, 2022

acarl005 mentioned this pull request Sep 2, 2022

marked - add async option to the MarkedExtension interface (new in version 4.1.0) DefinitelyTyped/DefinitelyTyped#62094

Merged

7 tasks

cyrus-and mentioned this pull request Oct 26, 2022

How to write an async renderer? #2626

Closed

UziTech deleted the async-marked branch November 21, 2022 06:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Async marked #2474

Async marked #2474

UziTech commented May 22, 2022 •

edited

Loading

vercel bot commented May 22, 2022 •

edited

Loading

UziTech commented May 22, 2022

joshbruce commented May 22, 2022

jimmywarting commented May 22, 2022 •

edited

Loading

UziTech commented May 22, 2022

styfle commented May 23, 2022

UziTech commented May 23, 2022

UziTech commented May 28, 2022 •

edited

Loading

This comment was marked as spam.

This comment was marked as spam.

vixalien commented Jul 4, 2022

UziTech commented Jul 4, 2022 •

edited

Loading

bebraw commented Aug 14, 2022

bebraw commented Aug 14, 2022 •

edited

Loading

bebraw commented Aug 15, 2022

UziTech commented Aug 15, 2022

UziTech commented Aug 21, 2022 •

edited

Loading

calculuschild left a comment

UziTech commented Aug 25, 2022

styfle commented Aug 28, 2022

UziTech commented Aug 28, 2022

styfle left a comment

github-actions bot commented Aug 30, 2022

vixalien commented Aug 31, 2022

metaskills commented Aug 12, 2023

Async marked #2474

Async marked #2474

Conversation

UziTech commented May 22, 2022 • edited Loading

Description

Contributor

Committer

vercel bot commented May 22, 2022 • edited Loading

UziTech commented May 22, 2022

joshbruce commented May 22, 2022

jimmywarting commented May 22, 2022 • edited Loading

UziTech commented May 22, 2022

styfle commented May 23, 2022

UziTech commented May 23, 2022

UziTech commented May 28, 2022 • edited Loading

This comment was marked as spam.

This comment was marked as spam.

vixalien commented Jul 4, 2022

UziTech commented Jul 4, 2022 • edited Loading

TODO

bebraw commented Aug 14, 2022

bebraw commented Aug 14, 2022 • edited Loading

bebraw commented Aug 15, 2022

UziTech commented Aug 15, 2022

UziTech commented Aug 21, 2022 • edited Loading

calculuschild left a comment

Choose a reason for hiding this comment

UziTech commented Aug 25, 2022

styfle commented Aug 28, 2022

UziTech commented Aug 28, 2022

styfle left a comment

Choose a reason for hiding this comment

github-actions bot commented Aug 30, 2022

vixalien commented Aug 31, 2022

metaskills commented Aug 12, 2023

UziTech commented May 22, 2022 •

edited

Loading

vercel bot commented May 22, 2022 •

edited

Loading

jimmywarting commented May 22, 2022 •

edited

Loading

UziTech commented May 28, 2022 •

edited

Loading

UziTech commented Jul 4, 2022 •

edited

Loading

bebraw commented Aug 14, 2022 •

edited

Loading

UziTech commented Aug 21, 2022 •

edited

Loading