Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add initial support for Git protocol v2 #1244

Merged
merged 12 commits into from
Jun 28, 2024
Merged

Conversation

stspdotname
Copy link
Contributor

Implement Git protocol version negotiation and use Git protocol v2 for fetches if supported. For now, the observable behaviour of Dulwich is equivalent regardless of protocol version, except that two new features may be used if the server supports Git protocol v2.

The first feature is a reference prefix list which may be used to filter refs server-side. This can be used to reduce the size of the initial reference listing announced by the server. Reducing the size of this announcement was a major design goal for protocol v2 to avoid sending of very large announcements when a repository contains a lot of references. This feature is intended as an optimization which servers are free to ignore depending on available server-side resources. Therefore, users of Dulwich should still be prepared to filter redundant refs manually (this limitation also applies to Git itself).
A new --refspec porcelain option is provided in order to test this feature on the command line.

The second feature is an object filter specification, which corresponds to the --filter option of 'git clone'. This can be used to omit objects while cloning repositories. For instance, the following command will clone a given repsitory without fetching any blob objects:

dulwich clone --filter blob:none --bare REPO_URL

(In this example the --bare option is used because creation of a work tree would fail without any blobs present.)

The test suite now enables protocol v2 and keeps passing for me.

@stspdotname stspdotname requested a review from jelmer as a code owner December 12, 2023 15:47
@stspdotname
Copy link
Contributor Author

stspdotname commented Dec 12, 2023

Not sure what the "unexpected success" failures on Windows are about. Are there more detailed test logs available? I don't have a Windows machine to run tests on myself.

dulwich/porcelain.py Outdated Show resolved Hide resolved
Copy link
Owner

@jelmer jelmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! LGTM overall; I left some nits inline, but nothing major.

It would be good to have at least some basic tests; maybe there's a way of forcing the git server to force a particular protocol version and rely on the compat tests?

dulwich/porcelain.py Show resolved Hide resolved
dulwich/cli.py Outdated Show resolved Hide resolved
dulwich/cli.py Outdated
porcelain.pull(".", from_location)
parser = argparse.ArgumentParser()
parser.add_argument('--from_location', type=str)
parser.add_argument('--refspec', type=str, nargs='*')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
parser.add_argument('--refspec', type=str, nargs='*')
parser.add_argument('refspec', type=str, nargs='*')

for consistency with C Git

dulwich/client.py Show resolved Hide resolved
@jelmer
Copy link
Owner

jelmer commented Dec 14, 2023

Not sure what the "unexpected success" failures on Windows are about. Are there more detailed test logs available? I don't have a Windows machine to run tests on myself.

These are spurious - I've been meaning to look into them and possibly mark them as flapping.

@stspdotname
Copy link
Contributor Author

It would be good to have at least some basic tests; maybe there's a way of forcing the git server to force a particular protocol version and rely on the compat tests?

Reading Git's server code suggests that the server always uses the highest mutually supported version . See determine_protocol_version_server() in Git's protocol.c.

It doesn't make much sense to run dulwich compat tests against ancient versions of Git so in practice the protocol version used during dulwich tests depends entirely on what the client is requesting.

Therefore we should add a configuration knob to dulwich that controls the protocol version the client will request. Git clients already support such an option: protocol.version in git-config.

I will try to add such an option to dulwich as a command-line parameter, and/or environment variable, and/or via the existing git config option -- any preference?

Once this parameter exists, ideally the entire compat test suite would always run twice, once with the old protocol (v0/v1) and again with protocol v2. Any tests which only make sense in one case could be skipped in the other.

@stspdotname stspdotname force-pushed the protov2 branch 2 times, most recently from a615d01 to bddc28e Compare December 19, 2023 11:15
@stspdotname
Copy link
Contributor Author

I am not quite sure yet how to adjust the test suite to run compat tests across both protocol versions. I will look into this next. Meanwhile, feel free to merge this PR if you are happy with the changes so far. I can always open a new PR for the test suite changes later.

@stspdotname stspdotname force-pushed the protov2 branch 4 times, most recently from b572358 to dc11b97 Compare June 5, 2024 09:47
@@ -213,11 +214,21 @@ def nocompat_test_suite():
return result


@patch("dulwich.protocol.DEFAULT_GIT_PROTOCOL_VERSION_FETCH", new=0)
def compat_test_suite_git_protocol_v0():
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this running all compat tests twice, not just the client ones?

This should really just be specific to the client code.

@@ -30,6 +30,10 @@

TCP_GIT_PORT = 9418

GIT_PROTOCOL_VERSIONS = [0, 1, 2]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there actually a version 1? Can you add a comment about these versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I am adding comments in the next version.

They will appear in an earlier commit which adds this list.
I have rewritten history a bit, also to get rid of the 'ruff format' commit in the middle.

):
cmd, path = command.split(" ")
cmd = cmd.split("-", 1)
path = path.replace("'", "")
env = copy.deepcopy(os.environ)
if protocol_version is None or protocol_version == 2:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do the inverse here, i.e.

If protocol version is not set, set it to 2.

If protocol version != 0 then set this header.

That prepares us a bit better to future versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can do that. I don't anticipate a new protocol version in the foreseeable future, but it doesn't hurt to handle things that way.

tests/compat/test_client.py Outdated Show resolved Hide resolved
tests/__init__.py Outdated Show resolved Hide resolved
tests/__init__.py Outdated Show resolved Hide resolved
@stspdotname stspdotname force-pushed the protov2 branch 4 times, most recently from 0dfa109 to 83dde9f Compare June 11, 2024 14:48
@stspdotname stspdotname requested a review from jelmer June 11, 2024 14:48
Copy link
Owner

@jelmer jelmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take a look at the test failures.

Implement Git protocol version negotiation and use Git protocol v2 for
fetches if supported. For now, the observable behaviour of Dulwich is
equivalent regardless of protocol version, except that two new features
may be used if the server supports Git protocol v2.

The first feature is a reference prefix list which may be used to filter
refs server-side. This can be used to reduce the size of the initial
reference listing announced by the server. Reducing the size of this
announcement was a major design goal for protocol v2 to avoid sending of
very large announcements when a repository contains a lot of references.
This feature is intended as an optimization which servers are free to
ignore depending on available server-side resources. Therefore, users of
Dulwich should still be prepared to filter redundant refs manually (this
limitation also applies to Git itself).
A new --refspec porcelain option is provided in order to test this
feature on the command line.

The second feature is an object filter specification, which corresponds
to the --filter option of 'git clone'. This can be used to omit objects
while cloning repositories. For instance, the following command will
clone a given repsitory without fetching any blob objects:

   dulwich clone --filter blob:none --bare REPO_URL

(In this example the --bare option is used because creation of a work
tree would fail without any blobs present.)

The test suite now enables protocol v2 and keeps passing for me.
Git prints a warning and proceeds with an unfiltered clone/fetch operation.
Make dulwich behave the same way, for now.
@jelmer jelmer merged commit 101a80c into jelmer:master Jun 28, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants