fix(build): check for bad_src flaws in markdown files #8133

Merged (17 commits) on Mar 1, 2024

Conversation

@yin1999 (Member) commented Feb 5, 2023

Summary

Our content is 100% Markdown now, but the image check only matches src attributes in HTML files (see the findMatchesInText function).

We already have a function, findMatchesInMarkdown, which is used to match links in Markdown. This PR reuses it to match image src values, so that bad_src flaws are reported for Markdown files.

This is also part of #7574, because the bad_image test would fail once the test files are converted to Markdown.

Problem

The bad_src flaws for images only work for HTML files, so we are not notified when a Markdown document contains a bad image src.

Solution

Check whether the document is a Markdown file. If so, use findMatchesInMarkdown to match the image's src.

To cover this with a test, convert the bad_image test file to Markdown (as we are 100% Markdown now).


Screenshots

Add an image with a bad src to the document files/zh-cn/mdn/at_ten/index.md:

image

Then, run yarn dev and open http://localhost:5042/zh-CN/docs/MDN/At_ten/index.json

Before

The bad_src flaws are not reported.

image

After

The bad_src flaws are reported.

image


How did you test this change?

Ran yarn dev and yarn test.

@github-actions github-actions bot added the flaw-system issues and feature requests related to the flaws system label Feb 5, 2023
Comment on lines 37 to 41
const matches = isMarkdown
  ? findMatchesInMarkdown(rawContent, "image", src)
  : findMatchesInText(src, rawContent, {
      attribute: "src",
    });
@yin1999 (Member, Author) commented Feb 6, 2023

We still have many HTML tables in the content. Is it necessary to match in both Markdown and text for Markdown files? (We may need to ignore the HTML code inside <pre>, though.)

example: mdn/content#24154

Contributor

Yes, indeed, the HTML table on the cursor page is not going away anytime soon, so we should also call findMatchesInText() for Markdown files.

Member Author

Hi @caugner, since we need to sort the matches, the code might be:

const matches = findMatchesInText(src, rawContent, {
  attribute: "src",
});
if (isMarkdown) {
  const hasMatchInHTML = matches.length > 0;
  matches.push(
    ...findMatchesInMarkdown(src, rawContent, { type: "image" })
  );
  // note that, we need to sort them by line and column
  if (hasMatchInHTML) {
    matches.sort((a, b) =>
      a.line === b.line ? a.column - b.column : a.line - b.line
    );
  }
}

I'm not sure whether we could call findMatchesInText inside findMatchesInMarkdown, so that it also handles broken HTML-format links in Markdown files.
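The merge-and-sort idea from the snippet above can be tried in isolation. This is a minimal sketch, not the code in the PR: the `PositionedMatch` shape and the `mergeMatches` helper are hypothetical names, and the two input arrays stand in for the results of findMatchesInText and findMatchesInMarkdown.

```typescript
// Hypothetical match shape; the real type in the codebase may carry more fields.
interface PositionedMatch {
  line: number;
  column: number;
}

// Order by line first, then by column within the same line.
function compareByPosition(a: PositionedMatch, b: PositionedMatch): number {
  return a.line === b.line ? a.column - b.column : a.line - b.line;
}

// Combine matches from an HTML pass and a Markdown pass into document order.
// concat() returns a new array, so neither input is mutated by the sort.
function mergeMatches(
  htmlMatches: PositionedMatch[],
  markdownMatches: PositionedMatch[]
): PositionedMatch[] {
  return htmlMatches.concat(markdownMatches).sort(compareByPosition);
}

const mergedMatches = mergeMatches(
  [{ line: 12, column: 5 }, { line: 3, column: 20 }],
  [{ line: 3, column: 1 }, { line: 7, column: 1 }]
);
console.log(mergedMatches);
```

Unlike the snippet above, this sketch always sorts, which costs a little extra work but does not rely on either pass returning its matches already in order.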

Member Author

I found that fromMarkdown can also parse HTML nodes: see AST.

Resolved this by also parsing HTML nodes in Markdown.
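To illustrate the idea of handling both node kinds in one tree walk, here is a rough sketch. It assumes a simplified, hand-written node shape modeled on a Markdown AST rather than the real library types, and `MdNode` and `collectImageSources` are hypothetical names. The regular expression is a naive stand-in for proper HTML parsing and is not what the PR uses.

```typescript
// Simplified, hypothetical node shapes; not the real AST library types.
type MdNode =
  | { type: "image"; url: string }
  | { type: "html"; value: string }
  | { type: "text"; value: string }
  | { type: "paragraph" | "root"; children: MdNode[] };

// Collect image sources from Markdown image nodes and from raw HTML nodes.
function collectImageSources(node: MdNode): string[] {
  switch (node.type) {
    case "image":
      return [node.url];
    case "html": {
      // Naive scan for <img ... src="...">; a real implementation would parse the HTML.
      const sources: string[] = [];
      const pattern = /<img\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1/gi;
      let match: RegExpExecArray | null;
      while ((match = pattern.exec(node.value)) !== null) {
        sources.push(match[2]);
      }
      return sources;
    }
    case "text":
      return [];
    default: {
      // Container nodes: recurse into the children in document order.
      const sources: string[] = [];
      for (const child of node.children) {
        sources.push(...collectImageSources(child));
      }
      return sources;
    }
  }
}

const sampleTree: MdNode = {
  type: "root",
  children: [
    { type: "paragraph", children: [{ type: "image", url: "missing.png" }] },
    {
      type: "html",
      value: '<table><tr><td><img src="also-missing.png" alt=""></td></tr></table>',
    },
  ],
};
const foundSources = collectImageSources(sampleTree);
console.log(foundSources);
```

A single walk keeps the results in document order, which avoids the separate sort that the two-pass approach needs.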

@caugner (Contributor) left a comment

LGTM, just some nits.

build/flaws/broken-links.ts (outdated, resolved)
build/flaws/broken-links.ts (outdated, resolved)
build/matches-in-text.ts (outdated, resolved)
build/matches-in-text.ts (outdated, resolved)
build/matches-in-text.ts (resolved)

testing/content/files/en-us/web/images/bad_src/index.md (outdated, resolved)
build/check-images.ts (outdated, resolved)
@@ -1070,6 +1070,21 @@ test("image flaws with bad images", () => {
"File not present on disk, an empty file, or not an image"
).length
).toBe(4);
// Check the line and column numbers for html <img> tags in markdown
Member Author

It seems that the start column of a Markdown image is the index of the ! (+1), so I only added line and column tests for HTML elements.
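As a small illustration of what "index of the ! (+1)" means, the sketch below converts a 0-based character offset into a 1-based line and column. The `positionAt` helper and the sample text are assumptions made for this example only, not code from the PR.

```typescript
// Convert a 0-based character offset into a 1-based line and column.
function positionAt(
  text: string,
  offset: number
): { line: number; column: number } {
  const before = text.slice(0, offset);
  const line = before.split("\n").length;
  // lastIndexOf returns -1 on the first line, which yields column = offset + 1.
  const column = offset - before.lastIndexOf("\n");
  return { line, column };
}

const sampleMarkdown = "# Title\n\nText ![alt](bad.png) more";
const bangOffset = sampleMarkdown.indexOf("![");
const imagePosition = positionAt(sampleMarkdown, bangOffset);
console.log(imagePosition); // { line: 3, column: 6 }
```

Here the image starts at the `!` on the third line, which is the sixth character of that line, so the reported column points at the `!` rather than at the URL inside the parentheses.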

@yin1999 (Member, Author) commented Feb 8, 2023

Trying to fix the testing error in mdn/content#24265

@yin1999 yin1999 requested a review from caugner February 8, 2023 05:05
@github-actions github-actions bot added the merge conflicts 🚧 Please rebase onto or merge the latest main. label Feb 8, 2023
github-actions bot (Contributor) commented Feb 8, 2023

This pull request has merge conflicts that must be resolved before it can be merged.

@caugner caugner removed their request for review May 2, 2023 08:19
@yin1999 yin1999 requested a review from caugner July 18, 2023 10:07
@caugner caugner self-assigned this Sep 4, 2023
@github-actions github-actions bot added the idle label Jan 24, 2024
@yin1999 yin1999 requested a review from a team as a code owner February 29, 2024 15:11
@caugner caugner changed the title fix(build): match bad_src flaws for markdown files fix(build): check for bad_src flaws in markdown files Feb 29, 2024
@caugner (Contributor) left a comment

LGTM.

@yin1999 Can you double-check that it still works as expected?

@github-actions github-actions bot removed the idle label Feb 29, 2024
@yin1999 yin1999 requested a review from mdn-bot as a code owner March 1, 2024 08:40
@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label Mar 1, 2024
@yin1999 (Member, Author) commented Mar 1, 2024

> LGTM.
>
> @yin1999 Can you double-check that it still works as expected?

Hey @caugner. This still works. The only problem I've found is that the fix-flaws functionality does not work for HTML code in Markdown files, as replaceMatchingLinksInMarkdown only replaces Markdown links. There is still a lot of work to be done in the future on flaw fixing :)

Addition: I've added the dev dependency "mdast-util-directive" to support type inference in 74621d2.

@caugner (Contributor) left a comment

LGTM, just tested it locally once more. Great work! 🎉

@caugner caugner merged commit a76cc0e into mdn:main Mar 1, 2024
11 checks passed
@yin1999 yin1999 deleted the fix-match-bad_src branch March 1, 2024 23:01
yin1999 added a commit to yin1999/yari that referenced this pull request Mar 3, 2024
According to the documentation [1], type imports can be registered by using `@types/mdast`.
And we have already added this dev-dependency, so remove the import of `mdast-util-directive`.

Removed the dependency added in mdn#8133.

[1]: https://www.npmjs.com/package/mdast-util-directive#types
caugner pushed a commit that referenced this pull request Mar 4, 2024
According to the documentation [1], type imports can be registered by using `@types/mdast`.
And we have already added this dev-dependency, so remove the import of `mdast-util-directive`.

Removed the dependency added in #8133.

[1]: https://www.npmjs.com/package/mdast-util-directive#types
Labels
dependencies Pull requests that update a dependency file flaw-system issues and feature requests related to the flaws system
2 participants