-
-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(content-blog): links in feed should be absolute #9151
Conversation
Hi @VinceCYLiao! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks! |
✅ [V2]
To edit notification comments on pull requests, go to your Netlify site configuration. |
⚡️ Lighthouse report for the deploy preview of this PR
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current solution is too error-prone because it includes too much custom logic. Plus it does not work with all relative links. Consider the following:
<a href="another-post">link</a>
If the page is at /blog/2020/09/13/current-post
, then the href should point to /blog/2020/09/13/another-post
, not /another-post
. I.e. the resolver needs to be aware of the current URL, not just the site's URL.
Also, I would prefer not manually joining URLs in any case. Why not elm.attribs.href = String(new URL(elm.attribs.href, currentPageURL))
?
Thanks for your comments. Sorry that I miss clicked the request review button. I'll try to provide a better solution. |
77fff54
to
b607956
Compare
After reading the Nodejs's doc for the URL class, I found that the URL class itself can just handle this issue. @Josh-Cena Please review at your convenience. Thank you! |
Sorry that I forgot to run yarn test. Didn't noticed that so many tests will fail due to the changes. I'll look into how to fix the tests if you find my solution fine. |
I think it looks good! In fact the test changes look expected to me. I reckon you don't need to resolve paths that are just anchor links, only ones that actually point to another page. We would also need test cases (either adding a new test post file, or adding a link to the existing post file) |
I have added a test case and updated the test plan. Please let me know if it's ok for you. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add the tests to https://github.com/facebook/docusaurus/tree/main/packages/docusaurus-plugin-content-blog/src/__tests__/__fixtures__/website instead? No one is going to look at the feed of the dogfooding blog.
…quashed commits) Squashed commits: [2db488373] chore: add a new file to test href resolving [6c18cea] docs: added to test if href resolved correctly in feed
6c18cea
to
a339fc9
Compare
I was thinking to parse the links from the feeds to check if they are correctly resolved, but found it's hard to so and maybe kind of meaningless since the links in feeds are from the object returned by the defaultCreateFeedItems function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
The implementation looks good. 👍
But the way it is tested looks surprisingly complex to me, and unit tests are not passing.
We also need to absolutize image URLs.
Let me know if you need help to figure out how to test that.
$(`div#${blogPostContainerID} a`).each((_, elm) => { | ||
const {href} = elm.attribs; | ||
if (href) { | ||
elm.attribs.href = String(new URL(href, link)); | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
We will also need to convert image links to absolute, see
#9136 (comment)
https://validator.w3.org/feed/docs/warning/ContainsRelRef.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Image links now are absolutized.
@@ -196,3 +228,95 @@ describe.each(['atom', 'rss', 'json'])('%s', (feedType) => { | |||
fsMock.mockClear(); | |||
}); | |||
}); | |||
|
|||
describe('Test defaultCreateFeedItems', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That test looks super complex to me and I don't understand why.
Just call createBlogFeedFiles
and take a snapshot: we'll review the snapshot and validate it contains what we expect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that I don't need to add new test case; instead I just need to update the snapshot. Am I correct ?
function isFullAbsolutePath(str: string) { | ||
const domain = 'https://domain.com'; | ||
const {origin} = new URL(str, domain); | ||
return origin !== domain; | ||
} | ||
|
||
async function generateLinksOfBlogPosts(outDir: string, blogPosts: BlogPost[]) { | ||
const linksOfBlogPosts: {[postId: string]: string[]} = {}; | ||
const pathOfFile = path.join(outDir, 'blog'); | ||
const promises = blogPosts.map(async (post) => { | ||
try { | ||
const content = await readOutputHTMLFile(post.id, pathOfFile, true); | ||
const $ = cheerioLoad(content); | ||
const anchorElements = $(`div#${blogPostContainerID} a`); | ||
if (anchorElements.length > 0) { | ||
const href = anchorElements.map((_, elm) => elm.attribs.href).toArray(); | ||
linksOfBlogPosts[post.id] = href; | ||
} | ||
} catch { | ||
// post is a draft | ||
} | ||
}); | ||
await Promise.all(promises); | ||
return linksOfBlogPosts; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need that complexity inr our tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test case is removed
Thanks! I'll make the image URLs also absolutized. Regarding the test, I'm thinking to create a new mock file which contains anchor elements with absolute/relative/anchor link, and image element with absolute/relative source URLs. And then in the test case just call createBlogFeedFiles and take a snapshot. |
Tested in local and all unit tests are passed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't undersand how it works anymore 😅
@Josh-Cena do you remember what updates the build-snap
folder exactly? Is this updated manually?
@VinceCYLiao how did this PR generate that new src/__tests__/__fixtures__/website/build-snap/blog/blog-with-links/index.html
file?
The CI is failing and snapshots are not easy to review 😓
Surprisingly unit tests are passing locally, but not on GitHub action 🤷♂️
@@ -0,0 +1,31 @@ | |||
<!doctype html> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How was this file generated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created the blog-with-links.mdx in the website/blog folder and ran yarn:build:website:blogOnly. If this is not the way how the files in fixtures created, please let me know the create way to do it.
import dino from "../static/img/docusaurus.png"; | ||
import useBaseUrl from '@docusaurus/useBaseUrl'; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ES imports are supposed to come after front matter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Order of imports are now correct. I also moved the front matter to the beginning of the file. Misplacing front matter seems to be the reason why tests failed.
elm.attribs.srcset = srcset | ||
.split(',') | ||
.map((s) => { | ||
const [imageURL, ...descriptors] = s.trim().split(/\s+/); | ||
const newImageURL = new URL(imageURL ?? '', link).href; | ||
return [newImageURL, ...descriptors].join(' '); | ||
}) | ||
.join(', '); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks a bit unsafe/risky, maybe introduce a dedicated lib to manipulate srcset reliably instead? see https://www.npmjs.com/package/srcset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'll look into it and revise my code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done and since latest version of srcset is pure ESM, so I have to use the previous version.
I think I added this part of test but I don't remember how it works either. My guess is it's manual. |
Just run the tests again and they are all passed. Sorry for the confusion and I'll look into why the tests are failing on github. |
0297631
to
646d22a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation looks great to me! Just one stylistic suggestion.
Co-authored-by: Joshua Chen <sidachen2003@gmail.com>
Co-authored-by: Joshua Chen <sidachen2003@gmail.com>
Hold on, I'm fixing the build-snap generation thing in another PR before merging this PR |
Note that you would also want to rebase to get rid of the extra commits |
… absolute
Pre-flight checklist
Motivation
Fix #9136
Test Plan
added two mdx files in dogfood docs
website/_dogfooding/_blog tests/2023-07-19-a.mdx
website/_dogfooding/_blog tests/2023-07-19-b.mdx
Inside 2023-07-19-a.mdx are three links
Visit /tests/blog/feed.json
1st link stays untouched
2nd link resolved as "https://docusaurus.io/tests/blog/2023/07/19/b"
3rd link also resolved as "https://docusaurus.io/tests/blog/2023/07/19/b"
Which are correct.
Test links
Deploy preview: https://deploy-preview-9151--docusaurus-2.netlify.app/blog/feed.json
Related issues/PRs
issue 9136