Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pushing changes in progress for git-lfs support adding to entire ecosystem. #9766

Conversation

GarryHurleyJr
Copy link
Contributor

@GarryHurleyJr GarryHurleyJr commented May 18, 2024

Draft pull request. This is to keep my work in progress. The branch name will change before this PR is finalized. I will be adding code to this as time goes on.

@GarryHurleyJr GarryHurleyJr requested a review from a team as a code owner May 18, 2024 02:52
@GarryHurleyJr GarryHurleyJr changed the title Pushing changes in progress Pushing changes in progress for git-lfs support adding to entire ecosystem. May 21, 2024
Copy link
Contributor

@bdragon bdragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some inline suggestions and here's some general Ruby feedback. I've added links to the docs for our linter, RuboCop, which provide some examples and more detail. Not trying to nitpick, just trying to save you some headache with the linter and code reviews 🙂

  • Naming conventions: method names, method parameter names, block parameter names, and variable names use snake_case.
  • Redundant return. In Ruby the last expression in a block or method is the return value of that block or method. In these cases the explicit return should be omitted. It is common/idiomatic to use guard clauses in methods to reduce if/else nesting, so the return is obviously needed for those cases.
  • unless !!value is unnecessary. Just let the value evaluate as truthy/falsy. Hint: in Ruby false and nil are falsy and literally all other values are truthy. If you need to explicitly check for nil, use value.nil?.
  • You probably haven't gotten to this point in the implementation yet, but if you end up adding methods to a file that has the # typed: strict annotation you're going to want to add Sorbet sig blocks to the methods so the type checker passes. The syntax is weird so I would just look for other examples in the file/codebase and follow what they're doing.

Comment on lines 864 to 869
lfsEnabled = true
commandString = getCommandString(path,lfsEnabled)
/debugger
p commandString/
SharedHelpers.run_shell_command(commandString
).split("\n").filter_map do |line|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be simplified a bit without the helper methods and by using Dir.chdir

Suggested change
lfsEnabled = true
commandString = getCommandString(path,lfsEnabled)
/debugger
p commandString/
SharedHelpers.run_shell_command(commandString
).split("\n").filter_map do |line|
lfs_enabled = # ...
git_bin = lfs_enabled ? "git-lfs" : "git"
Dir.chdir(path) do
SharedHelpers.run_shell_command("#{git_bin} ls-files --stage")
end.split("\n").filter_map do |line|

#the HEREDOC command will see any stray spaces.
return "git -C #{path} ls-files --stage" unless lfsEnabled
Dependabot.logger.warn("LFS is enabled in this repo. Please use an LFS enabled client")
commandString = "CWD=\"#{__dir__}\";cd #{path};git-lfs ls-files --stage;cd $CWD"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested something a little simpler above, but just adding a couple notes:

  • __dir__ is the name of the directory that contains this source code file, not the current working directory of the process
  • Rather than setting an environment variable for the directory, this is a good use case for Dir.chdir. This method changes into the given directory, runs whatever is inside block, and then changes back to the original directory.

@GarryHurleyJr
Copy link
Contributor Author

With some discussion, I have a path towards getting this back on track.
detecting lfs

  • clone shallow
  • check for .gitattributes file
  • if not found, no lfs
  • if found and does not contain "filter=lfs", no lfs
  • else lfs

if lfs detected, use git-lfs
if no lfs use git

@GarryHurleyJr GarryHurleyJr force-pushed the GarryHurleyJr/6039-add-support-in-dependabot-core-for-lfs-as-an-option-for-repositories branch from fed7b21 to e610181 Compare May 29, 2024 15:58
…fs-as-an-option-for-repositories' of https://github.com/dependabot/dependabot-core into GarryHurleyJr/6039-add-support-in-dependabot-core-for-lfs-as-an-option-for-repositories
@@ -890,10 +890,10 @@ def find_submodules(path)
sig { params(path: String).returns(T::Boolean) }
def isLfsEnabled(path)
filepath = File.join(path,".gitattributes")
lfsEnabled = File.exist?(filepath) && File.readable?(filepath) && SharedHelpers.run_shell_command("cat #{filepath} | grep \"filter=lfs\"").include? "#{filepath}"
lfsEnabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) && SharedHelpers.run_shell_command("cat #{filepath} | grep \"filter=lfs\"").include? "filter=lfs"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason for copying the same code pattern in 3 different files? Can we centralize it in one location?

@@ -304,7 +304,7 @@ def fetch_files
before do
allow(file_fetcher_instance).to receive(:commit).and_return("sha")
end
#start of lfs testing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have unit tests covering these new scenarios?

Comment on lines 883 to 886
rescue Exception => e
# this should not be needed, but I don't trust 'should'
lfs_enabled = T.let(false, T::Boolean)
raise e.exception("Message: #{e.message}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're rescuing the exception, setting lfs_enabled and then immediately re-raising the exception. What's the point of doing all of that? The exception will halt the code from executing, so all of this can just be removed and the end result is the same: an exception is raised.

git -C #{path} ls-files --stage
CMD
).split("\n").filter_map do |line|
lfs_enabled = is_lfs_enabled(path) if lfs_enabled.nil?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lfs_enabled = is_lfs_enabled(path) if lfs_enabled.nil?
lfs_enabled = is_lfs_enabled(path) if lfs_enabled.nil?

Looks like we're calling .nil? on lfs_enabled before it is defined, what exactly are we trying to do here?

sig { params(path: String).returns(T.nilable(T::Boolean)) }
def is_lfs_enabled(path)
filepath = File.join(path, ".gitattributes")
lfs_enabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lfs_enabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&
T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&

No need to assign anything to this variable, we're just returning the results of the method calls

Comment on lines 878 to 887
sig { params(path: String).returns(T.nilable(T::Boolean)) }
def is_lfs_enabled(path)
filepath = File.join(path, ".gitattributes")
lfs_enabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&
SharedHelpers.run_shell_command("cat #{filepath} | grep \"filter=lfs\"").include?("filter=lfs")
rescue Exception => e
# this should not be needed, but I don't trust 'should'
lfs_enabled = T.let(false, T::Boolean)
raise e.exception("Message: #{e.message}")
end
Copy link
Member

@jurre jurre May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of this can be simplified to this:

Suggested change
sig { params(path: String).returns(T.nilable(T::Boolean)) }
def is_lfs_enabled(path)
filepath = File.join(path, ".gitattributes")
lfs_enabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&
SharedHelpers.run_shell_command("cat #{filepath} | grep \"filter=lfs\"").include?("filter=lfs")
rescue Exception => e
# this should not be needed, but I don't trust 'should'
lfs_enabled = T.let(false, T::Boolean)
raise e.exception("Message: #{e.message}")
end
sig { params(path: String).returns(T::Boolean) }
def lfs_enabled?(path)
gitattributes_path = File.join(path, '.gitattributes')
return false unless File.exist?(gitattributes_path)
File.readlines(gitattributes_path).each do |line|
return true if line =~ /filter=lfs/
end
false
end

Dir.chdir(path) do
run_shell_command("git reset HEAD --hard")
run_shell_command("git clean -fx")
if lfs_enabled?(path)
Copy link
Member

@jurre jurre May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need 4 different methods in 3 different places to check if lfs is enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we have three different files in common that each call the git command in a different way and none of them refer to the others. To make matters worse, some code never even bothers to use any of them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the could all still share a single, shared way to check if lfs is enabled?

T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&
SharedHelpers.run_shell_command("cat #{filepath} | grep \"filter=lfs\"")
.include?("filter=lfs")
rescue StandardError => e
Copy link
Contributor

@bdragon bdragon May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jurre's suggestion is more idiomatic and won't incur the cost of shelling out, so I think you should incorporate that.

Just to call attention to it, shelling out to grep like this will return a non-zero exit if there's no match. SharedHelpers.run_shell_command will raise a HelperSubprocessFailed exception on a non-zero exit, but here we're swallowing virtually any error that could occur and carrying on. The problem with this is it assumes the only thing that could go wrong is grep not finding a match, when there are other legitimate exceptions that could occur that we would want to propagate.

If you find yourself writing rescue StandardError, it might be a good idea to ask if it's really necessary for what you're trying to do. Doing so will match any subclass of StandardError, which is the vast majority of errors you'll encounter (the Ruby docs on Exception are good resource). If you truly want to check for the HelperSubprocessFailed exception, rescue that class specifically so that other, unexpected exceptions are propagated correctly.

@jeffwidman
Copy link
Member

Garry has moved on from the :dependabot: team. The underlying issue still exists, but I suspect whoever tackles this will likely start from a fresh PR, so I'm closing this.

Note that there's also this related PR:

@jeffwidman jeffwidman closed this Jul 3, 2024
@jeffwidman jeffwidman deleted the GarryHurleyJr/6039-add-support-in-dependabot-core-for-lfs-as-an-option-for-repositories branch July 3, 2024 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants