Extract content from git repository #62
Draft
This captures some unfinished early work of mine on extracting information about content contained within referenced git repos by fetching the latest commit of the default branch into a scratch repo.
What I ran into and didn't finish resolving before I stopped working on this was that some projects reference Git repos outside GitHub that are down and/or have malformed responses, and the `git` CLI is not good at telling us about this when we try to `fetch` or `ls-remote` on them, and just hangs indefinitely in some cases.

There are two things we could/should do next:
1. For repos hosted on GitHub, use the GitHub API instead of the `git` protocol to pull the latest commit. Even doing a shallow fetch of just the latest commit, pulling via `git` requires downloading a lot more content and dealing with a lot more failure modes than using the GitHub API. The vast majority of repos are on GitHub, so while we want to ultimately support all sorts of Git hosts, GitHub represents a worthwhile special case to optimize for by using their API instead.
2. Make our use of `git` more resilient for cases where we can't use the GitHub API. I was thinking we might either:
   - Use `child_process` to invoke the local `git` client to probe repositories
   - […] `git` that just quickly checks that a connection can be established and we see some fingerprint in the response that tells us a git server is responding on the other end
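
As a rough illustration of the `child_process` option, here is a minimal sketch of probing a remote with a hard timeout so that a dead or misbehaving host cannot hang the process. It is not code from this PR: the names `runWithTimeout`, `probeRemote`, and `parseHeadSha` are hypothetical, the 10 second default is an arbitrary assumption, and it uses the synchronous `spawnSync` only to keep the example short.

```javascript
const { spawnSync } = require("node:child_process");

// Run a command, but give up (and kill it) after timeoutMs milliseconds.
// Hypothetical helper for illustration; not part of this PR.
function runWithTimeout(command, args, timeoutMs) {
  const result = spawnSync(command, args, {
    encoding: "utf8",
    timeout: timeoutMs,
    // SIGKILL so a stuck process cannot ignore the signal.
    killSignal: "SIGKILL",
    // Stop git from waiting on an interactive credential prompt.
    env: { ...process.env, GIT_TERMINAL_PROMPT: "0" },
  });
  // spawnSync reports a timeout as an error with code ETIMEDOUT.
  const timedOut = Boolean(result.error && result.error.code === "ETIMEDOUT");
  return {
    ok: !result.error && result.status === 0,
    timedOut,
    stdout: result.stdout || "",
    stderr: result.stderr || "",
  };
}

// Ask the remote for its HEAD without fetching any objects.
// The 10s default is an assumption, not a measured value.
function probeRemote(url, timeoutMs = 10000) {
  return runWithTimeout("git", ["ls-remote", url, "HEAD"], timeoutMs);
}

// ls-remote prints "<object id>\t<ref>" per line; return the id for HEAD,
// or null if the output does not look like a ref advertisement.
function parseHeadSha(lsRemoteOutput) {
  for (const line of lsRemoteOutput.split("\n")) {
    const [sha, ref] = line.split("\t");
    if (ref === "HEAD" && /^(?:[0-9a-f]{40}|[0-9a-f]{64})$/.test(sha)) {
      return sha;
    }
  }
  return null;
}
```

A caller could run `probeRemote(url)` and treat `timedOut`, a non-zero exit, or a `null` result from `parseHeadSha(result.stdout)` as "this remote is not usable right now", then skip the fetch. An asynchronous variant would likely be preferable when probing many repositories at once.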