-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Yarn Berry: run git lfs pull before running yarn #5833
Conversation
Besides storing pre-built plugins using LFS. Another good use case for LFS with yarn projects is the offline cache: https://yarnpkg.com/features/offline-cache |
npm_and_yarn/lib/dependabot/npm_and_yarn/file_updater/yarn_lockfile_updater.rb
Outdated
Show resolved
Hide resolved
Thanks for opening a PR so we have a place to discuss the approach. I am curious if this setup is a common one, do you know of more projects following this approach of storing yarn assets in LFS? |
You can check out renovatebot/renovate#6842 for some other people who are interested in LFS support for bumping dependencies.
This is a good point. However, it seems like it would add a lot of complexity to this otherwise simple patch. Do you think it would make more sense to add a config option to dependabot.yml that allows the user to specify which paths need to be |
Not having a cache does seem to allow yarn to continue unlike not having the plugins. FWIW there are open issues on the yarn repo around installing plugins automatically: yarnpkg/berry#4464 but no PR has been accepted yet. |
No we're really trying to keep the config file as simple as possible and avoid any ecosystem specific configuration, especially if like in this case the setting can be inferred from the repo. We have a bit of code that checks the config file for the cache path already, so I think we could just run some similar code in the repo content path we just fetched, parse the yaml and check what the correct path to pull is?
That doesn't sound unreasonable, but it would be nice if we can make it work out of the box?
And plugins always go in Overall I think the approach here seems reasonable and I can't see it leading to any problems so I'd be inclined to accept this , it would be really nice if you could take a look at inferring the yarn cache path from the config file. Would also be great if we could add a test case for this somehow, but I'd have to dig into how those are set up for cloning currently Going to run CI to ensure this doesn't break things unexpectedly |
@jurre any further thoughts on this change? I've implemented some of your feedback, please let me know if there are any more changes you'd like to see. |
Hey, so first of all thanks for those changes, I think this should do the trick! One last thing I would like to request is that we add some tests around this, here is an example of the existing tests we have around cloning: dependabot-core/common/spec/dependabot/file_fetchers/base_spec.rb Lines 1323 to 1348 in 4bd7127
Let me know if you need any help setting that up! |
@jurre Yeah, some guidance around this would be helpful, since I am not familiar with rspec or this repo's test organization. For instance, are there any existing tests for file fetchers of specific package ecosystems? If I were to add this one would you imagine adding it to the base_spec.rb you linked, or making a new spec file specific to npm_and_yarn? Would you like to see a fixture that actually includes data stored in LFS or just a test that verifies that the |
Yep, they all have ecosystem specific file fetcher specs, they'll be in
I'd add it to the npm_and_yarn one: https://github.com/dependabot/dependabot-core/blob/8d60d39e19bca9ba738e497fae31fbe0323386f3/npm_and_yarn/spec/dependabot/npm_and_yarn/file_fetcher_spec.rb
I think the best way to go would be to set up a fixture that includes lfs data, and then verify that it is included in the fetched files Here's a snippet of useful context on when the file fetcher uses the cloning strategy: dependabot-core/common/lib/dependabot/file_fetchers/base.rb Lines 38 to 42 in 4bd7127
repo_contents_path , it'll clone the repo, otherwise use the API. In the case of Yarn Berry, we already make sure to always set that when actually running Dependabot, but for these tests you'll have to pass it a value (a tmpdir works best) manually
|
Thanks for the pointers, I will try it out tomorrow. Is it okay for the test to require network access? Otherwise I'm not sure if it will be possible to set up a repo that uses LFS but the files are hosted fully locally (probably possible, maybe not easy). |
If we can avoid it then that'll be preferred, but if there's no way around it we do have some other tests accessing the network. Our existing cloning specs do use a local repo I believe, so hopefully we can repurpose that! |
Since the whole point of LFS is that git is supposed to fetch the large files from a file server, making it work offline may require a custom adapter like this: https://github.com/sinbad/lfs-folderstore And setting this up requires some |
Ah I see, well, if it's not feasible to do it locally then we can pull from a remote. I'll move the repo into a dedicated org we have for these sort of fixtures once you have a working setup. Thanks for working on this by the way! 🎉 🙇 |
@jurre I've created a fixture repo at https://github.com/jtbandes/dependabot-yarn-lfs-fixture and added the test. Please critique! 😃 |
Looks good at a cursory glance! It's getting a bit late on my end so I'll give this a more thorough look tomorrow 🙇 |
I've also initiated a transfer of the fixture repo to your account so you can move it to the appropriate place. Feel free to update the PR with any changes necessary. |
@jurre just a quick ping on this, let me know if there is anything else I can do. Thanks for your repeated reviews! |
6f4a274
to
ae11a76
Compare
@jurre The new repo does not have the files properly tracked with Git LFS, so they're just text files: https://github.com/dependabot-fixtures/yarn-berry-lfs-fixture/blob/main/.yarn/releases/yarn-3.2.4.cjs (compare with https://github.com/jtbandes/dependabot-yarn-lfs-fixture/blob/main/.yarn/releases/yarn-3.2.4.cjs) Luckily the test caught it 😃 I'll re-initiate the transfer so you can just use my repo. |
Looks like the rest of the test failures are some intermittent network issues. |
d5e2e0a
to
5e50c4e
Compare
end | ||
path | ||
end | ||
rescue StandardError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to return the path and carry on for any error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What other options do you have in mind? I’m not sure what meaningful recovery could be done here…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me first better understand, which errors are you anticipating?
Here are the ones I can think of:
Dependabot::SharedHelpers::HelperSubprocessFailed
raised byrun_shell_command
insuper
or hereNoMethodError
if.yarnrc.yml
is empty, in which caseYAML.load_file
will return false, which does not respond tofetch
Psych::ParseError
if the YAML in the.yarnrc.yml
is invalid
If we encounter an exception outside of these, I think we'd want to let it fail? @jurre might know better 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's reasonable to continue without lfs pull
in those cases rather than fail the update entirely?
You do raise a few good error cases (the empty or invalid .yarnrc.yml for example) that we should probably handle?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related:
It took a bit to track down why this started failing on main
since the error gets swallowed... that means users who are hitting this failure scenario won't be told why...
@@ -32,6 +32,26 @@ def self.required_files_message | |||
|
|||
private | |||
|
|||
# Overridden to pull any yarn data or plugins which may be stored with Git LFS. | |||
def _clone_repo_contents(target_directory:) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least at one time in history, methods like this one in Dependabot::FileFetchers::Base
were not meant to be overridden:
dependabot-core/common/lib/dependabot/file_fetchers/base.rb
Lines 176 to 178 in 6a9c745
################################################# | |
# INTERNAL METHODS (not for use by sub-classes) # | |
################################################# |
This would be the first instance of a file fetcher subclass overriding it. Could we override public method #clone_repo_contents
instead?
2aaabad
to
1808542
Compare
1808542
to
d9f4a86
Compare
Co-authored-by: Roman Shtylman <shtylman@gmail.com>
As pointed out by @bdragon, `YAML.load_file` returns false if the file exists but is empty. This handles that, and wraps all the calls up in a little helper method to avoid duplicating the logic everywhere.
d9f4a86
to
e2a9631
Compare
Thanks, this is now live |
Thanks for all your help! |
cc @jurre @defunctzombie
Run
git lfs pull --include .yarn
before using yarn for dependency updating. Yarn plugins stored with LFS can now be used.Resolves LFS issues described in #1297, see example Dependabot PR at https://github.com/foxglove/studio/pull/4528
Tested with
bin/dry-run.rb npm_and_yarn foxglove/studio --clone --dep rc-tree
, which now updates yarn.lock instead of failing with an error.Ran unit test with
cd npm_and_yarn && bundle exec rspec spec/dependabot/npm_and_yarn/file_fetcher_spec.rb -e "with .yarn data stored in git-lfs"