improve performance of markdown file processing #694

Closed
meizy opened this issue May 24, 2024 · 2 comments · Fixed by #696

meizy commented May 24, 2024

When you have a lot of Markdown files to process with mmdc (using xargs or similar), and only some of them contain Mermaid diagrams, a relatively large amount of time is spent on the .md files that do not contain any diagram.

Regex processing is relatively heavy. It might improve performance if, before running the regex that extracts the diagrams from the file, the script did a simple check for whether the file contains any ```mermaid string.

It seems like a small change in index.js, but I'm not proficient enough in JS to write the code.

For your consideration.
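
The pre-check suggested above could look roughly like the sketch below. It is not mermaid-cli's actual code: `mightContainMermaid`, `extractDiagrams`, and the simplified `MERMAID_BLOCK` pattern are hypothetical names and stand-ins chosen for illustration, and the real extraction regex in index.js may differ.

```javascript
// Three backticks, built at runtime so this snippet can live inside a fence.
const FENCE = "`".repeat(3);

// Hypothetical helper: a plain substring scan that runs before any regex.
function mightContainMermaid(markdown) {
  return markdown.includes(FENCE + "mermaid");
}

// Simplified stand-in for the real extraction pattern (assumption):
// an opening mermaid fence, the diagram body, then a closing fence.
const MERMAID_BLOCK = new RegExp(
  "^" + FENCE + "mermaid[^\\n]*\\n([\\s\\S]*?)^" + FENCE,
  "gm"
);

// Returns the body of every mermaid code block found in the Markdown text.
function extractDiagrams(markdown) {
  if (!mightContainMermaid(markdown)) {
    return []; // fast path: the regex is never run for diagram-free files
  }
  return Array.from(markdown.matchAll(MERMAID_BLOCK), (match) => match[1]);
}
```

Whether this saves much time depends on where the time actually goes; the substring check only avoids the regex pass, not any later setup cost.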

@LeonKuhne

yeah that ^^^ and also a switch from puppeteer to playwright would allow processing multiple images (pages) at once -- or just skip that step and use a custom renderer (or any lib that doesn't require me to install chromium pleeeease!)

aloisklink added a commit to aloisklink/mermaid-cli that referenced this issue Jun 1, 2024
Skip launching puppeteer until it's actually needed.

Running mermaid-cli on a markdown file without any mermaid code blocks
shouldn't launch puppeteer, if it's unnecessary.

Fix: mermaid-js#694
@aloisklink
Member

> When you have a lot of Markdown files to process with mmdc (using xargs or similar), and only some of them contain Mermaid diagrams, a relatively large amount of time is spent on the .md files that do not contain any diagram.

Good idea! I think the slowest part is actually launching puppeteer! I've made a PR to lazy-load it so it only gets loaded if needed: #696
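
As a rough illustration of the lazy-loading idea (not the code in #696), the sketch below wraps an expensive async launch so that it runs at most once, and only when something actually asks for the browser. `lazy`, `getBrowser`, and `processFile` are hypothetical names, and a fake browser object stands in for the result of `puppeteer.launch()` so the snippet runs without Chromium.

```javascript
// Wraps an async factory so it runs at most once, on first use.
function lazy(factory) {
  let promise; // stays undefined until the first call
  return () => {
    if (promise === undefined) {
      promise = factory();
    }
    return promise; // later calls share the same promise
  };
}

// Stand-in for an expensive launch such as puppeteer.launch()
// (assumption: a fake object is returned here instead of a real browser).
let launches = 0;
const getBrowser = lazy(async () => {
  launches += 1;
  return { close: async () => {} };
});

// Hypothetical per-file routine: the browser is requested only when
// there is at least one diagram to render.
async function processFile(diagrams) {
  if (diagrams.length === 0) {
    return 0; // nothing to render, so the browser is never launched
  }
  const browser = await getBrowser();
  // ...rendering each diagram with `browser` would happen here...
  return diagrams.length;
}
```

With this shape, a Markdown file without diagrams returns before `getBrowser()` is ever called, which is the behaviour the commit message describes.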


> switch from puppeteer to playwright would allow processing multiple images (pages) at once

mermaid-cli and Puppeteer already support that! For example, if you have one Markdown file with multiple Mermaid diagrams in it, mermaid-cli will render them in parallel using a single browser instance.

However, it still needs to create a browser instance per .md file, which is slow if you have lots of .md files.
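
The single-browser, many-pages pattern described above can be sketched as follows. This is an illustration under assumptions rather than mermaid-cli's implementation: `renderAll`, `renderOne`, and `makeFakeBrowser` are hypothetical, and the fake browser only mimics the `newPage()` and `close()` methods that Puppeteer's browser and page objects provide.

```javascript
// Renders every diagram on its own page of one shared browser.
// `renderOne` is a hypothetical callback that does the actual drawing.
async function renderAll(browser, diagrams, renderOne) {
  return Promise.all(
    diagrams.map(async (definition) => {
      const page = await browser.newPage();
      try {
        return await renderOne(page, definition);
      } finally {
        await page.close(); // release the page even if rendering fails
      }
    })
  );
}

// Fake browser so the sketch runs without Chromium (assumption).
function makeFakeBrowser() {
  const stats = { pagesOpened: 0, pagesClosed: 0 };
  const browser = {
    newPage: async () => {
      stats.pagesOpened += 1;
      return {
        close: async () => {
          stats.pagesClosed += 1;
        },
      };
    },
  };
  return { browser, stats };
}
```

Reusing the same `browser` across several .md files, instead of launching one per file, is the part that tools processing many files in one run can exploit.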

You can use remark-mermaid-dataurl if you want to process multiple .md files. It's much faster, since it only uses a single browser instance.

There's also remark-mermaidjs, which is similar, but uses Playwright instead of Puppeteer.

> or just skip that step and use a custom renderer (or any lib that doesn't require me to install chromium pleeeease!)

Unfortunately, Mermaid needs a CSS layout engine to render properly, and as far as I'm aware, only browsers support this. See mermaid-js/mermaid#3650.

Although, maybe https://github.com/servo/servo will help in the far future 🤷

You can use Puppeteer with Firefox, though: https://pptr.dev/faq#q-what-is-the-status-of-cross-browser-support

MindaugasLaganeckas pushed a commit that referenced this issue Jun 4, 2024
Skip launching puppeteer until it's actually needed.

Running mermaid-cli on a markdown file without any mermaid code blocks
shouldn't launch puppeteer, if it's unnecessary.

Fix: #694