Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a remote URL to a local_only repository fails to set the upstream branch for master #424

Open
nyanpasu64 opened this issue Nov 6, 2021 · 3 comments

Comments

@nyanpasu64
Copy link

I tried setting up cargo crev as a reviewer, using 0.21.3 installed from cargo install, on Arch Linux. I was unable to cargo crev publish.

To reproduce this issue, I first removed/renamed ~/.config/crev, then ran the following commands:

cargo crev trust --level high https://github.com/dpc/crev-proofs created ~/.config/crev/proofs/local_only_….

nyanpasu64@ryzen ~> cargo crev trust --level high https://github.com/dpc/crev-proofs
Fetching https://github.com/dpc/crev-proofs... 
https://github.com/dpc/crev-proofs                           no updates
Found proofs from:
     185 FYlr8YoYGVvDwHQxqEIs89reKKDy-oWisoO0qXXEfHE (verified owner)
note: Setting up a default CrevID. Run `cargo crev id new` to customize it.
WARN: URL for … is not known yet
WARN: URL for … is not known yet

Then I tried setting up as a reviewer, using cargo crev id set-url https://github.com/nyanpasu64/crev-proofs (I think cargo crev id new --url https://github.com/nyanpasu64/crev-proofs does the same thing).

nyanpasu64@ryzen ~> cargo crev id set-url https://github.com/nyanpasu64/crev-proofs
Using existing repository `/home/nyanpasu64/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-…`
remote: Enumerating objects: 3, done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 3
Unpacking objects: 100% (3/3), 341 bytes | 341.00 KiB/s, done.
From github.com:nyanpasu64/crev-proofs
 * branch            HEAD       -> FETCH_HEAD
Successfully rebased and updated refs/heads/master.
…

Then I tried running cargo crev publish, but apparently /home/nyanpasu64/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-… was missing an upstream branch:

nyanpasu64@ryzen ~> cargo crev publish
From github.com:nyanpasu64/crev-proofs
 * branch            HEAD       -> FETCH_HEAD
Successfully rebased and updated refs/heads/master.
fatal: The current branch master has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream origin master

Investigation

I restarted the process, but ran strace -fs999 -o crev-set-url cargo crev id set-url https://github.com/nyanpasu64/crev-proofs instead. The log file shows that crev moves the local_only repository to ~/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-…, adds a remote by editing .git/config directly (not via a git remote child process), then runs git pull --rebase on the remote-tracking branch, but fails to set the upstream.

33003 execve("/usr/bin/cargo", ["cargo", "crev", "id", "set-url", "https://github.com/nyanpasu64/crev-proofs"], 0x7fff5f6d84c8 /* 63 vars */) = 0
…
33003 rename("/home/nyanpasu64/.config/crev/proofs/local_only_…", "/home/nyanpasu64/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-…") = 0
…
33003 rename("/home/nyanpasu64/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-…/.git/config.lock", "/home/nyanpasu64/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-…/.git/config") = 0
…
33003 clone(child_stack=0x7f591300bff0, flags=CLONE_VM|CLONE_VFORK|SIGCHLD <unfinished ...>
33004 execve("/usr/bin/git", ["git", "pull", "--rebase", "-Xours"], 0x7ffc8ca647e8 /* 71 vars */ <unfinished ...>
33003 <... clone resumed>)              = 33004

I think you added the remote unusually. If I look into .git/config, I see the following (missing a fetch tag):

[remote "origin"]
	url = git@github.com:nyanpasu64/crev-proofs.git

When I run git fetch [origin [master]], I get the following output, and no origin/master ref created, and cargo crev publish fails:

nyanpasu64@ryzen ~/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-… (master)> git fetch
From github.com:nyanpasu64/crev-proofs
 * branch            HEAD       -> FETCH_HEAD

nyanpasu64@ryzen ~/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-… (master)> cargo crev publish
From github.com:nyanpasu64/crev-proofs
 * branch            HEAD       -> FETCH_HEAD
Successfully rebased and updated refs/heads/master.
fatal: The current branch master has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream origin master

Is this deliberate?

Fixing the repo

If I recreate the remote using git remote remove origin && git remote add origin git@github.com:nyanpasu64/crev-proofs.git, it adds the following:

[remote "origin"]
	url = git@github.com:nyanpasu64/crev-proofs.git
	fetch = +refs/heads/*:refs/remotes/origin/*

And fetching creates an origin/master ref:

nyanpasu64@ryzen ~/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-… (master)> git fetch
From github.com:nyanpasu64/crev-proofs
 * [new branch]      master     -> origin/master

But then git pull --rebase -Xours doesn't work properly:

nyanpasu64@ryzen ~/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-… (master)> git pull --rebase -Xours
From github.com:nyanpasu64/crev-proofs
 * [new branch]      master     -> origin/master
There is no tracking information for the current branch.
Please specify which branch you want to rebase against.
See git-pull(1) for details.

    git pull <remote> <branch>

If you wish to set tracking information for this branch you can do so with:

    git branch --set-upstream-to=origin/<branch> master

If you instead run git pull --rebase -Xours --set-upstream origin master (I don't know what -Xours does but it's probably not important), it works and I can cargo crev publish:

nyanpasu64@ryzen ~/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-… (master)> git pull --rebase -Xours --set-upstream origin master
From github.com:nyanpasu64/crev-proofs
 * branch            master     -> FETCH_HEAD
 * [new branch]      master     -> origin/master
Successfully rebased and updated refs/heads/master.
nyanpasu64@ryzen ~/.config/crev/proofs/github_com_nyanpasu64_crev-proofs-… (master)> cargo crev publish
Successfully rebased and updated refs/heads/master.
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 12 threads
Compressing objects: 100% (5/5), done.
Writing objects: 100% (6/6), 951 bytes | 951.00 KiB/s, done.
Total 6 (delta 1), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (1/1), done.
To github.com:nyanpasu64/crev-proofs.git
   56fcc25..79bd3b2  master -> master

Fixing crev

I'm still looking into it.

@nyanpasu64
Copy link
Author

The code for cargo crev id new --url ... or cargo crev id set-url needs to work properly both if there's no user repo (clone the URL), a local-only repo (unsure if it may have commits), or an existing repo with (the same or different) URL. Handling all cases requires care, and handling a default branch name mismatch between the local repo (varies by user) and the forked remote template (currently master) has no clear right answer. yay complexity

Fixing the initial branch name of local repos

The local-only repo is created by

let _ = git2::Repository::init(&proof_dir)?;

and renamed to a URL-based name by
std::fs::rename(&old_path, &new_path)?;

If you run cargo crev id set-url … without having created a local repo, the branch name is taken from the URL (the template's branch name is master, and I assume forks will be as well).

If you run cargo crev trust --level high https://github.com/dpc/crev-proofs etc. first, it creates a local-only repo, taking the initial branch name from .gitconfig's init.defaultBranch (if unset, git currently defaults to master and warns the user). If the name isn't master like the template's branch, then after you run cargo crev id set-url …, git pull succeeds but git push complains that the local/remote branch names don't match.

Do you configure git push to push even when the local and upstream branch names don't match? Do you hard-code the process of creating a local-only repo to use master (which will break if the template repo changes default branch names)? Do you rename the local branch to match the remote branch, when adding a URL? (Which branch do we rename? The checked-out branch? Git has no concept of a "default" branch for local repos, aside from at creation time.)

I think hard-coding local-only repos to have a master branch (or perhaps when setting a URL, renaming the local branch to match the remote) is the best solution. If you instead keep mismatched branch names, then cargo crev id set-url … will produce a different local branch name based on whether you've run cargo crev trust --level high https://github.com/dpc/crev-proofs beforehand or not. And this kind of inconsistency caused by seemingly-unrelated past events (in my case, events from months ago) is a bad thing.

Fixing adding a remote URL

Currently crev adds the origin URL at

repo.remote_set_url("origin", &push_url)?;

Currently crev sets a refspec if it clones a URL-based repo, but not if it renames a pre-existing local-only repo into a URL-based repo. I think that creating a differently-configured repo based on prior state is a bug.

In the current code organization (which I didn't change), clone_proof_dir_from_git branches on proof_dir.exists() (whether it's the first time you ran any crev command) and if so git2::Repository::open(&proof_dir) (whether you deleted .git). (Is it really necessary to handle the case of a missing .git, and is this correct? I don't know if it comes up in practice.) Afterwards it calls git2::Repository::clone(git_https_url, &proof_dir) and checks for errors.

The proof_dir.exists() && git2::Repository::open(&proof_dir) == Ok(_) branch needs to work both if origin exists (update the URL) or not (set the URL and fetch refspec). This is tricky to setup properly, but I've come up with something that works:

diff --git a/crev-lib/src/local.rs b/crev-lib/src/local.rs
index 0cdd201..b3072ac 100644
--- a/crev-lib/src/local.rs
+++ b/crev-lib/src/local.rs
@@ -565,7 +565,13 @@ impl Local {
             info!("Using existing repository `{}`", proof_dir.display());
             match git2::Repository::open(&proof_dir) {
                 Ok(repo) => {
-                    repo.remote_set_url("origin", &push_url)?;
+                    match repo.remote("origin", &push_url) {
+                        Ok(_) => (),
+                        Err(e) if e.code() == git2::ErrorCode::Exists => {
+                            repo.remote_set_url("origin", &push_url)?;
+                        },
+                        Err(e) => return Err(e.into()),
+                    }
                 }
                 Err(_) => {
                     git2::Repository::init_opts(

Fixing the upstream branch and git pull

If you don't add an upstream branch name when adding a remote URL to a repo without a URL, then you need to set it when fn change_locked_id_url() runs git pull (when the URL is being changed, regardless if a remote existed before or not).

The git pull invocation at

let _ = self.run_git(vec!["pull".into(), "--rebase".into(), "-Xours".into()]);
currently fails if you set a refspec but not a remote-tracking branch. Perhaps it could be --set-upstream origin master (which should work, as long as the template continues to call its default branch master). Alternatively you can get the branch name from origin/HEAD. As mentioned in "Fixing the initial branch name of local repos", we may need to rename the local branch to match the template repo's default branch.

How should I obtain the remote's default branch name? Do I hard-code it as master (which works if everyone uses the template repo and the default branch name doesn't change)? Shell out to git rev-parse --abbrev-ref origin/HEAD (which fails if the remote repo isn't a template repo, but rather empty and has no branches)? Or try to figure out how to do it in libgit2 (and if the remote repo has no default branch, keep the local branch name)?

Is supporting non-master remote branches necessary?

I don't understand the web side of crev. Do you require that all repos are forks of the github/gitlab template, and use the fork list to for humans or programs to discover proof repos (meaning that self-hosted non-GitHub/GitLab repos can't be discovered)? Is the list of forks not used for the program to discover repos (but rather other people's repos)?

Are people allowed to create remote repos not based off https://github.com/crev-dev/crev-proofs and b82d8bb22935437787c715c6452b21865b27a3bd? Do remote repos whose default branch isn't master get properly fetched? (I didn't test.)

@Minoru
Copy link
Contributor

Minoru commented Nov 6, 2021

I admit I didn't read your entire posts, but I can answer the questions at the end. Please keep in mind that I'm not an author of the program, just a user of a few years — I might be wrong on some points.

The crev-dev/crev-proofs template exists for autodiscovery; https://gitlab.com/crev-dev/auto-crev-proofs collects its forks, so users can "subscribe" to that one repo and learn about other ones automatically.

It's not a requirement to create your proofs repo from a template, but if you create your own, you won't get auto-discovered by auto-crev-proofs.

Self-hosting is also allowed, but also won't get auto-discovered by auto-crev-proofs.

As I understand it, cargo-crev fetches all the repos of all known identities, so a non-template/self-hosted repo will get known to others once someone in the existing network adds a trust record for this repo (at whatever level).

I don't know if crev requires the main branch to be called master.

@dpc
Copy link
Collaborator

dpc commented Nov 8, 2021

Hi. Thanks for taking a look into it.

The current code is the result of me implementing some basic stuff with libgit2, and them some fixes here and there as the use-cases and problems were reported.

What you were trying to do is nowhere near what I would have anticipated. The supported use case is really: create a repo remotely first, cargo crev id new ... second.

I'm happy to have more use-cases supported, so I will gladly accept a PR.

As for the master branch: for fetching remote stuff, the default branch of the remote should be used. That's still a default, documented and recommended way to create proof repo, github defaults to main and so on. For supporting "local repo" creation, any sane behavior is better than not supporting it at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants