GitError: cannot locate local branch 'unlisted' #7351

Closed
muffinresearch opened this issue Feb 27, 2020 · 11 comments · Fixed by mozilla/addons-server#14060

Comments

@muffinresearch
Contributor

muffinresearch commented Feb 27, 2020

There are 90k events in Sentry for this; see https://sentry.prod.mozaws.net/share/issue/8ea46d4d182144cd8c972708a2ffd633/

GitError: cannot locate local branch 'unlisted'
  File "celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "celery/app/trace.py", line 650, in __protected_call__
    return self.run(*args, **kwargs)
  File "olympia/amo/decorators.py", line 108, in wrapper
    return f(*args, **kw)
  File "olympia/versions/tasks.py", line 130, in extract_version_to_git
    version=version, author=author, note=note)
  File "olympia/lib/git.py", line 329, in extract_and_commit_from_version
    branch = repo.find_or_create_branch(BRANCHES[version.channel])
  File "olympia/lib/git.py", line 394, in find_or_create_branch
    branch = self.git_repository.branches.get(name)
  File "pygit2/repository.py", line 1212, in get
    return self[key]
  File "pygit2/repository.py", line 1200, in __getitem__
    branch = self._repository.lookup_branch(name, GIT_BRANCH_LOCAL)

/cc @EnTeQuAk


For QA: please reach out to the AMO team to test this issue. There is some work to do on the -dev servers to simulate this error. The idea would be to corrupt the git repository of a given add-on on the -dev server and have you (QA) re-upload a new version. In the end, all the versions of this add-on, including the new one, should be successfully extracted.

Also note that this is behind a waffle switch.

@diox
Member

diox commented Apr 3, 2020

See also #6671

@willdurand
Member

Sentry issue: OLYMPIA-PROD-31G

@willdurand
Member

I started to look into this error. This is an error coming from libgit2 directly.

According to the pygit2 lib/docs, lookup_branch should only raise KeyError internally (this error is handled here) or InvalidSpecError (cf. libgit2/pygit2#832) when we use an invalid branch name.

In this case, this is a GitError, which is the most generic error we can have in pygit2. It is obviously not a branch name issue because the name of the branch is unlisted (without any special or forbidden characters).

The cannot locate local branch message comes from retrieve_branch_reference in the underlying libgit2 lib. It is definitely possible to have a GIT_ENOTFOUND error or even -1 returned by git_reference_lookup and not only a GIT_EINVALIDSPEC (according to this header file). The retrieve_branch_reference function is called in git_branch_lookup which is used in pygit2 here.

The pygit2 lib should handle GitError in addition to KeyError I guess but 🤷‍♂ Anyway, git_reference_lookup is defined here.

I believe the git error comes from a -1 value returned somewhere in git_reference_lookup because I cannot find any reference to GIT_ENOTFOUND in the callee functions. So many things can lead to -1 being returned, though. For instance, an OOM error can return -1 via git_buf_oom() (but unlikely in our case). Other than that, it can be related to some "backend API" errors. Ugh.

In https://sentry.prod.mozaws.net/share/issue/8ea46d4d182144cd8c972708a2ffd633/, the issue comes from the local lookup. It is possible that a newer version of libgit2 fixes this issue automagically.

Otherwise we should probably handle this GitError in find_or_create_branch().
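
As a rough illustration of that last option, here is a minimal sketch of a lookup wrapper that turns the generic error into something the caller can act on. It is not the actual find_or_create_branch() code: GitError below is a local stand-in for pygit2.GitError so the sketch runs without pygit2, and BranchLookupFailed and safe_branch_lookup are hypothetical names. The only assumption about the branches object is that it has a get(name) method that returns None for a missing branch, as the traceback suggests.

```python
class GitError(Exception):
    """Local stand-in for pygit2.GitError (assumption: swap in the real class)."""


class BranchLookupFailed(Exception):
    """Hypothetical error meaning the lookup itself failed, not 'branch missing'."""


def safe_branch_lookup(branches, name, git_error=GitError):
    """Return the branch, or None if it does not exist.

    A generic git error is re-raised as BranchLookupFailed so callers can
    tell "no such branch" apart from "the repository looks damaged".
    """
    try:
        return branches.get(name)
    except git_error as exc:
        raise BranchLookupFailed(
            f"cannot look up branch {name!r}: {exc}"
        ) from exc


class FakeBranches:
    """Tiny fake used only to exercise the wrapper in this sketch."""

    def __init__(self, existing, broken):
        self.existing = existing
        self.broken = broken

    def get(self, name):
        if name in self.broken:
            raise GitError(f"cannot locate local branch '{name}'")
        return self.existing.get(name)
```

With the real library, the caller would pass repo.branches and pygit2.GitError instead of the stand-ins.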

FWIW, the Sentry project for -dev does not list this particular error.

@diox
Member

diox commented Apr 8, 2020

It would be interesting to see if git command line works correctly in affected repositories on prod. Maybe something bad happened and there is data corruption on some/a lot of repositories on prod causing the issue? If that's the case, maybe we can automatically handle that separately, or just nuke the directories and try again.

@willdurand
Member

The problem seems to occur very frequently with the unlisted branch but there are very few events related to the listed branch. That might be a good hint!

@willdurand
Member

> It would be interesting to see if git command line works correctly in affected repositories on prod. Maybe something bad happened and there is data corruption on some/a lot of repositories on prod causing the issue? If that's the case, maybe we can automatically handle that separately, or just nuke the directories and try again.

How could we get access to the prod files?

@willdurand
Member

willdurand commented Apr 8, 2020

> How could we get access to the prod files?

Never mind, I was able to get files from our production server and I can reproduce the issue locally. The git repository is corrupted: its unlisted branch ref is invalid. To reproduce, run echo '' > .git/refs/heads/unlisted in any git repo created by our extraction task. This command leaves the ref file with no content other than a newline, which is not valid in git: files under refs/heads should contain commit hashes.

If you use git to list the branches, you can see that git detects the problem and ignores the invalid branch:

$ git branch
warning: ignoring broken ref refs/heads/unlisted
  listed
* master
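
The same condition can be checked without going through git or pygit2. The sketch below only looks at the loose ref file and tests whether it holds a hex object id; is_broken_loose_ref and demo are hypothetical names, and packed-refs and symbolic refs are deliberately out of scope, so treat it as an illustration of the corruption rather than a complete detector.

```python
import re
import tempfile
from pathlib import Path

# A loose branch ref normally holds a single hex object id:
# 40 characters for SHA-1 repositories, 64 for SHA-256 ones.
_OBJECT_ID = re.compile(r"(?:[0-9a-f]{40}|[0-9a-f]{64})")


def is_broken_loose_ref(git_dir, branch):
    """Return True if refs/heads/<branch> is a file that holds no object id.

    Assumption: only loose refs are inspected; packed-refs and symbolic
    refs are ignored in this sketch.
    """
    ref_file = Path(git_dir) / "refs" / "heads" / branch
    if not ref_file.is_file():
        return False  # a missing ref is "not found", not "broken"
    content = ref_file.read_text().strip()
    return _OBJECT_ID.fullmatch(content) is None


def demo():
    """Build a fake .git layout with one broken and one well-formed ref."""
    with tempfile.TemporaryDirectory() as tmp:
        heads = Path(tmp) / "refs" / "heads"
        heads.mkdir(parents=True)
        # Same content as `echo '' > .git/refs/heads/unlisted`.
        (heads / "unlisted").write_text("\n")
        # Placeholder id, well-formed but not a real commit.
        (heads / "listed").write_text("a" * 40 + "\n")
        return {
            name: is_broken_loose_ref(tmp, name)
            for name in ("unlisted", "listed", "missing")
        }
```

Calling demo() reports the unlisted ref as broken and the other two as not broken.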

We don't ignore this error in our code obviously but we can catch GitError and repair the broken branch probably.

@willdurand
Member

> We don't ignore this error in our code obviously but we can catch GitError and repair the broken branch probably.

This is not feasible because the branch ref might have become broken over time, not necessarily when we tried to create it for the first time. (I found lost/unreachable commits in the production repos I examined).

The new approach to fix this issue is to delete the repo and re-extract everything when we detect a broken branch ref.

@willdurand
Member

I am not sure I've mentioned this before: I don't think we can really find the root cause behind the broken ref. It could be caused by a libgit2/pygit2 update or a FS hiccup or anything else really.

@willdurand
Member

Once #7453 has landed, the plan is to have a BrokenRefError that we can gracefully handle in the celery error handler. We'll have to:

  1. detect this error
  2. delete the git repo
  3. submit a new git extraction request for this add-on
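
The three steps above could be sketched roughly as follows. This is not the code that landed: BrokenRefError here is a local placeholder for the planned exception, and extract_with_recovery, extract and resubmit are hypothetical names standing in for the celery task body and whatever re-queues the extraction.

```python
import shutil
import tempfile
from pathlib import Path


class BrokenRefError(RuntimeError):
    """Placeholder for the planned error raised on a broken branch ref."""


def extract_with_recovery(repo_path, extract, resubmit):
    """Run extract(); on BrokenRefError wipe the repo and ask for a re-run.

    Returns True when extraction succeeded, False when it was rescheduled.
    `extract` and `resubmit` are caller-supplied callables (assumptions).
    """
    try:
        extract()
    except BrokenRefError:                            # 1. detect this error
        shutil.rmtree(repo_path, ignore_errors=True)  # 2. delete the git repo
        resubmit()                                    # 3. request a new extraction
        return False
    return True
```

Deleting the whole repository instead of repairing the ref matches the reasoning earlier in the thread: the ref may have broken long after it was created, so the rest of the history cannot be trusted either.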

@ioanarusiczki

@willdurand This issue was tested on dev with this add-on: https://reviewers.addons-dev.allizom.org/en-US/reviewers/review/12roboform-12password-manager
After the git repo was corrupted and a new version was uploaded, the git cron ran again and all the versions were then displayed without problems.

@KevinMind KevinMind transferred this issue from mozilla/addons-server May 4, 2024