-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add max weight matching function #229
Conversation
953f364
to
481f99a
Compare
This commit adds a new python function max_weight_function() for computing the maximum-weighted matching of a PyGraph object. The implementation of this function is based on “Efficient Algorithms for Finding Maximum Matching in Graphs”, Zvi Galil, ACM Computing Surveys, 1986. [1] It is basically a porting of the networkx implementation of the algorithm [2] with some additional inspiration from the prototype for the networkx version (it's basically the same code but some aspects were a bit clearer to figure out from the prototype rather than networkx's copy of it). [3][4] Fixes Qiskit#216 [1] https://dl.acm.org/doi/10.1145/6462.6502 [2] https://github.com/networkx/networkx/blob/3351206a3ce5b3a39bb2fc451e93ef545b96c95b/networkx/algorithms/matching.py [3] http://jorisvr.nl/article/maximum-matching [4] http://jorisvr.nl/files/graphmatching/20130407/mwmatching.py
481f99a
to
98fa26e
Compare
This commit fixes all but 2 of the test panics. There were 2 classes of issues that his commit fixes (both were artifacts of sloppy porting from Python). The first is that in add_blossom() we were not updating the global blossom_children or the global blossom_endpoints while iterating over the list. Instead we were just setting a local. The second issue was negative indices for Vecs. The python code was iterating over elements in a vec in expand_blossom() and augment_blossom() by subtracting a length or step from the index which could result in a negative value. In Python this is fine since a negative index is just the from the last element. But, for rust we need to handle this behavior manually because the Index trait only deals with usize.
Pull Request Test Coverage Report for Build 593928485
💛 - Coveralls |
This is ready to review now. The only open question is on the return format. Right now it is returning a dict but networkx returns a set of the matching nodes. It's not hard to adjust it, but I'm not sure what people think is a better choice. |
This commit fixes an issue when running max_weight_matching on a graph with holes in the node indices (i.e. when 'node_count() != max(node_indices)'). The in_blossoms vec was being initialized from the contents of the node_indices iterator of the graph but the algorithm remaps the indices to be a contiguous list for the internals to work. When a hole was encountered accessing in_blossoms would panic because the index returned wouldn't match the expecated state. This is fixed and a test added to make sure we have coverage on input graphs with holes.
It might be good to add a unit test to compare results of networkx and retworkx with random graphs. |
Sure, I added a couple of tests to do that in: 308c82a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this great job. I've just started to read but not yet completed. I have one question so far. Is there any reason to use endpoints
data structure instead of vertex pairs? (Is it just for performance?)
Ok, sorry for the delay I've added a fallback check for the cases where networkx and retworkx return differing matchings to verify the retworkx matching is still correct (I hope I implemented it correctly). I also added a test that assigns a random integer weight to each edge. |
As an aside, these tests are the slowest in the suite (taking 3-4 seconds) so I was curious where the bottleneck was and ran a profile of the test class. The slowdown is from running networkx's
These numbers are over the entire test class including the test methods which do not call networkx (also they have cProfile overhead which will be a bit worse for networkx because it has a deeper python call stack). |
len(set(e1) & set(e2)) == 0 for e1, e2 in combinations(matching, 2)) | ||
|
||
|
||
def is_maximal_matching(graph, rx_matches): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_matching
and is_maximal_matching
might be useful if they are included as part of retworkx. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I agree, it'll be good thing to add as a retworkx function in a follow up PR (this one is already big enough :) ). I've opened an issue on this here: #255.
This commit adds new functions for checking a provided matching set is valid, is_matching(), and that a provided matching set is valid and maximal, is_maximal_set(). This pairs with Qiskit#229 and can be used to partially check the output from the max_weight_matching function added there. Equivalent functions were implemented in Qiskit#229 using Python for the tests in Qiskit#229 and those tests should be updated to use these functions instead. Fixes Qiskit#255
This commit adds new functions for checking a provided matching set is valid, is_matching(), and that a provided matching set is valid and maximal, is_maximal_set(). This pairs with Qiskit#229 and can be used to partially check the output from the max_weight_matching function added there. Equivalent functions were implemented in Qiskit#229 using Python for the tests in Qiskit#229 and those tests should be updated to use these functions instead. Fixes Qiskit#255
This commit adds new functions for checking a provided matching set is valid, is_matching(), and that a provided matching set is valid and maximal, is_maximal_set(). This pairs with Qiskit#229 and can be used to partially check the output from the max_weight_matching function added there. Equivalent functions were implemented in Qiskit#229 using Python for the tests in Qiskit#229 and those tests should be updated to use these functions instead. Fixes Qiskit#255
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for updating the tests! LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Great job!
* Add is_maximal_matching and is_matching functions This commit adds new functions for checking a provided matching set is valid, is_matching(), and that a provided matching set is valid and maximal, is_maximal_set(). This pairs with #229 and can be used to partially check the output from the max_weight_matching function added there. Equivalent functions were implemented in #229 using Python for the tests in #229 and those tests should be updated to use these functions instead. Fixes #255 * Use retworkx functions in max_weight_matching tests * Remove unused import * Simplify is_matching logic The original implementation was based on networkx's which was getting all the combinations of edges in the matching and checking for unique endpoints on all those pairs. However, doing the combinations adds extra complexity when we can just flatten the input matching to a set of endpoints and if aduplicate is found it's not a valid matching. This commit updates the inner is_matching function to make this change which simplifies the code and should be faster. Co-authored-by: Toshinari Itoko <itoko@jp.ibm.com> * Update docstring Co-authored-by: Toshinari Itoko <itoko@jp.ibm.com>
Since Qiskit#229 our test runs take significantly longer. This is mostly a function of the tests added in that function running over numerous random graphs and comparing the output against networkx to verify it works in all cases. However, this validation comes with a runtime cost. In an attempt to mitigate this partially this commit changes the test runner from stdlib's unittest runner to use stestr. [1] stestr provide a performance improvement by running tests on parallel workers and also provides an easier to use UI that enables test selection a history of previous runs, and many other features. The test suite in retworkx will still need to be strictly unittest compatible so anyone can run with stestr, stdlib unittest, pytest, or any other test runner of their choice. But the default in tox and CI will be to use stestr for the performance.
Since Qiskit#229 our test runs take significantly longer. This is mostly a function of the tests added in that function running over numerous random graphs and comparing the output against networkx to verify it works in all cases. However, this validation comes with a runtime cost. In an attempt to mitigate this partially this commit changes the test runner from stdlib's unittest runner to use stestr. [1][2][3] stestr provides a performance improvement by running tests on parallel workers and also provides an easier to use UI that enables test selection a history of previous runs, and many other features. The test suite in retworkx will still need to be strictly unittest compatible so anyone can run with stestr, stdlib unittest, pytest, or any other test runner of their choice. But the default in tox and CI will be to use stestr for the performance benefit. [1] https://pypi.org/project/stestr/ [2] https://github.com/mtreinish/stestr [3] https://stestr.readthedocs.io/en/stable/README.html
Since #229 our test runs take significantly longer. This is mostly a function of the tests added in that function running over numerous random graphs and comparing the output against networkx to verify it works in all cases. However, this validation comes with a runtime cost. In an attempt to mitigate this partially this commit changes the test runner from stdlib's unittest runner to use stestr. [1][2][3] stestr provides a performance improvement by running tests on parallel workers and also provides an easier to use UI that enables test selection a history of previous runs, and many other features. The test suite in retworkx will still need to be strictly unittest compatible so anyone can run with stestr, stdlib unittest, pytest, or any other test runner of their choice. But the default in tox and CI will be to use stestr for the performance benefit. [1] https://pypi.org/project/stestr/ [2] https://github.com/mtreinish/stestr [3] https://stestr.readthedocs.io/en/stable/README.html
This commit adds a new python function max_weight_matching() for
computing the maximum-weighted matching of a PyGraph object. The
implementation of this function is based on “Efficient Algorithms for
Finding Maximum Matching in Graphs”, Zvi Galil, ACM Computing Surveys,
1986. [1] It is basically a porting of the networkx implementation of
the algorithm [2] with some additional inspiration from the prototype for
the networkx version (it's basically the same code but some aspects were
a bit clearer to figure out from the prototype rather than networkx's
copy of it). [3][4]
Fixes #216
[1] https://dl.acm.org/doi/10.1145/6462.6502
[2] https://github.com/networkx/networkx/blob/3351206a3ce5b3a39bb2fc451e93ef545b96c95b/networkx/algorithms/matching.py
[3] http://jorisvr.nl/article/maximum-matching
[4] http://jorisvr.nl/files/graphmatching/20130407/mwmatching.py
TODO: