Flags for find_similar #394

Closed
ArnoVanLumig opened this issue Jul 24, 2014 · 2 comments

@ArnoVanLumig

The libgit2 API lets you set options such as rename_threshold when calling git_diff_find_similar, but pygit2 doesn't seem to support this.

Would be great if this could be implemented!

I tried to change it myself, but I can't call it without getting a "ValueError: Invalid version 0 on git_diff_find_options" error. Probably just my lack of C skills. The diff of my changes is below.

diff --git a/src/diff.c b/src/diff.c
index 83fcb51..886f169 100644
--- a/src/diff.c
+++ b/src/diff.c
@@ -438,19 +438,27 @@ Diff_merge(Diff *self, PyObject *args)


 PyDoc_STRVAR(Diff_find_similar__doc__,
-  "find_similar([flags])\n"
+  "find_similar([flags, rename_threshold, copy_threshold, rename_from_rewrite_threshold, break_rewrite_threshold])\n"
   "\n"
   "Find renamed files in diff and updates them in-place in the diff itself.");

 PyObject *
-Diff_find_similar(Diff *self, PyObject *args)
+Diff_find_similar(Diff *self, PyObject *args, PyObject *kwds)
 {
     int err;
     git_diff_find_options opts = GIT_DIFF_FIND_OPTIONS_INIT;

-    if (!PyArg_ParseTuple(args, "|i", &opts.flags))
+    uint16_t rename_threshold, copy_threshold, rename_from_rewrite_threshold, break_rewrite_threshold;
+    char *keywords[] = {"flags", "rename_threshold", "copy_threshold", "rename_from_rewrite_threshold", "break_rewrite_threshold", NULL};
+
+    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|iIIII", keywords, &opts.flags, &rename_threshold, &copy_threshold, &rename_from_rewrite_threshold, &break_rewrite_threshold))
         return NULL;

+    opts.rename_threshold = rename_threshold;
+    opts.copy_threshold = copy_threshold;
+    opts.rename_from_rewrite_threshold = rename_from_rewrite_threshold;
+    opts.break_rewrite_threshold = break_rewrite_threshold;
+
     err = git_diff_find_similar(self->list, &opts);
     if (err < 0)
         return Error_set(err);
@@ -508,7 +516,7 @@ PyMappingMethods Diff_as_mapping = {

 static PyMethodDef Diff_methods[] = {
     METHOD(Diff, merge, METH_VARARGS),
-    METHOD(Diff, find_similar, METH_VARARGS),
+    METHOD(Diff, find_similar, METH_VARARGS | METH_KEYWORDS),
     METHOD(Diff, from_c, METH_STATIC | METH_VARARGS),
     {NULL}
 };
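
A possible cause of the "Invalid version 0" error, offered as a guess rather than a confirmed diagnosis: the "I" format unit of PyArg_ParseTupleAndKeywords stores an unsigned int, which is typically wider than the uint16_t locals in the diff, so each store may write past its target and clobber neighbouring stack data such as opts. The locals are also left uninitialized whenever the caller omits the optional arguments. One way to avoid both problems is to parse into unsigned int variables that start at the defaults already in opts, then narrow each one with a range check. The sketch below shows only the narrowing step; narrow_threshold is a hypothetical helper name, not part of pygit2 or libgit2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of pygit2 or libgit2): narrow a value that
 * was parsed into a full unsigned int down to the uint16_t used for the
 * threshold locals in the diff.
 * Returns 0 on success, -1 if the value does not fit; on failure *out is
 * left unchanged. */
int narrow_threshold(unsigned int parsed, uint16_t *out)
{
    assert(out != NULL);
    if (parsed > UINT16_MAX)
        return -1;
    *out = (uint16_t)parsed;
    return 0;
}
```

In Diff_find_similar this would mean declaring, for example, `unsigned int rename_threshold = opts.rename_threshold;` before parsing, and calling `narrow_threshold(rename_threshold, &opts.rename_threshold)` afterwards, raising a ValueError when it returns -1. Alternatively, the "H" format unit parses into an unsigned short, which is typically the same width as uint16_t, but it performs no overflow checking.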

@ArnoVanLumig

Created pull request #396, which relates to this issue but doesn't actually solve it.

@ArnoVanLumig

Ok, I've been debugging for a while now and I can't really seem to figure this out.

I checked out libgit2 and added a printf statement in the function that actually does the diffing (in file src/diff_tform.c, function git_diff_find_similar). The parameters printed here match the parameters I pass into pygit2. So my changes from the pull request definitely work.

However, it seems that the rewrite_threshold (and possibly other parameters; I haven't tested them yet) is completely ignored by libgit2: only files that match exactly are considered moves, while rewrite-moves are treated as a delete plus a create. For reference, command-line git does detect these rewrite-moves as moves when given a parameter such as -M50%.

Does anyone have an idea on what to check next?

Edit: Figured it out. The behaviour was correct all along, I was just testing it with files that were too small to compute a similarity score on.
