-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Filter holes from alpha shape returns #457
Conversation
Codecov Report
@@ Coverage Diff @@
## master #457 +/- ##
========================================
+ Coverage 78.7% 78.8% +0.1%
========================================
Files 122 122
Lines 12884 12910 +26
========================================
+ Hits 10142 10177 +35
+ Misses 2742 2733 -9
|
OK, I've switched to a zorder search method:
|
@cornhundred, this should be good to go, and solves your reference case. Is there any other known pattern you'd want to test on? |
Thanks @ljwolf! The random matrix with holes should cover our use case. Does it return a MultiPolygon or a list of Polygons? Also, I started to scope out how to allow alternate alpha shape metrics (radius, distance, etc) https://github.com/cornhundred/libpysal/tree/alt_alpha_metrics I'll make a pull request when I've cleaned it up more and meet the pull request requirements (testing etc). For our case we would like to set the threshold based on max triangle length. |
It keeps the return type as a GeoSeries. Would you prefer a single (multi)polygon? The |
Ok, keeping it the way it is should be fine. It is easy enough to calculate a MultiPolygon from it. We also want to calculate properties of the individual polygons (area, number of holes, etc) so that would be easier if it returns a list of discontiguous polygons. |
Hello, Thanks for this @ljwolf for the PR and @cornhundred for bringing it up and chipping in! I've looked over and it seems all fine, very clever trick! A couple of thoughts:
|
Hey @darribas, those are great ideas, I'm happy to try and help address them. We definitely care about the holes but I'm not sure if it makes sense to pass them as separate polygons rather than derive them later from polygons. For some background, we're working with single-cell spatial transcriptomics data - see mouse liver data release and example image below The holes in our case can be blood vessels (or artifacts from sectioning) and alpha shapes seem like a great way to quantify them. Also here's a Colab notebook from a brain dataset where we interactively visualize our data. |
That can still be a Polygon as long as it has a single part (with N holes in it but not another part within that hole). |
I've added a |
171f86f
to
e2ae58d
Compare
Hi @ljwolf, I agree with the always filtering logic. Since the holes are properties of specific polygons within the alpha shape we would want users to obtain holes with metadata about their 'parent' polygon, but that should be outside the scope of the alpha shape method. |
Hi, just checking on the status of this pull request. I can try to make an additional pull request to have the option for different alpha shape distance thresholds this week so maybe it makes sense to wait on that before making a potential new release. |
A determination about the proper behavior of the option from a reviewer (e.g. @darribas @martinfleis @sjsrey) would get this moving again. Happy to raise this on our monthly developer meeting friday if there is no assent by then! |
Great thanks @ljwolf appreciate it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @ljwolf. I believe that we should return the polygons that represent the alpha shape and not those that represent holes. As said, those are already included as interiors
so we would be just duplicating the same information.
Apart from that one note in the docstring, this seems to be ready to be merged.
libpysal/cg/alpha_shapes.py
Outdated
no CRS included in the object. Also note that holes are included in addition | ||
to exteriors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably not the case with _filter_holes
anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so, this is slightly confusing.
There are two functions:
alpha_shape(xys, alpha)
, which returns a GeoSeries for a givenalpha
and input pointsxys
.alpha_geoms(alpha, triangles, radii, xys)
, which does the same, but using the delaunay triangulation and its corresponding circumcircle radii.
I have only filtered the alpha_shape()
output. alpha_geoms()
still can return holes.
I didn't want to add the z-order search into alpha_geoms()
because it's used repeatedly in a performance-critical loop by alpha_shape_auto()
. That function constructs a single output Polygon that covers the input point set. For its particular case, the holes returned by alpha_geoms()
won't matter because of the search procedure used in alpha_shape_auto()
. So, if the z-order search were introduced in alpha_geoms
, it would introduce an unnecessary (but slight) performance penalty.
Thinking about this, the "right" solution would be to make alpha_geoms()
private to alpha_shape_auto()
, but we don't know if anyone's using that function.
I propose to solve this:
- make an unfiltered private version of
alpha_geoms()
to preservealpha_shape_auto()
speed - add a filter to
alpha_geoms()
, too. - deprecate the public
alpha_geoms()
method.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems reasonable to me to make alpha_geoms()
private since users will be directed to use alpha_shape()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alpha_geoms
is not a part of public API as far as I can tell. It is not mentioned in the API documentation and it is not listed in __all__
in this file. So I'd say that we do not have to deprecate it and can change as we wish.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome. Then I'll add an underscore to the name to be extra careful, and then merge this.
Expect a point release with these changes tomorrow UK time @cornhundred
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!!
tests are now failing in another part of the code. Looks like a networkx issue in Python 3.7. will investigate & merge if I can figure it out. |
OK, well, networkx dropped support for python 3.7; we'll need to drop that too, then, or skip the test. |
Or, I suppose we could pin to networkx 2.6 for Python 3.7. Will investigate. |
Thanks! |
No problem! 4.6.1 should be live on pypi! |
Noted by @cornhundred in #456, we've got some surprising behavior for alpha shapes with holes. This attempts to fix this in the
alpha_shape()
function by (1) checking if there are any "holes" in the shapely polygons and, if there are, (2) we only keep polygons that contain an input point.This solves the problem OK, but I think there's a bit of an issue:
Note that this results in two separate polygons-with-holes on the left, but on the right, we retain one of the holes because a point falls within it... I think this happens because the "correct" polygon for that hole would have a zero-area "cut in" that only includes that point. When we use
shapely.ops.polygonize
, we lose that cut in, and thus omit the vertex from the alpha shape.Consider this box with a "cut in" from [1,1] to [.5,.5]:
upon polygonization, we lose that cut-in:
geom
is now'POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))'
with no mention of the cut-in.This is good, since a polygon with this kind of cut-in is an invalid polygon, but it complicates our testing here... not all holes are empty.
The other way to do this would be to do an all-pairs
within
check for the polygons usinggeoms.sindex.query_bulk()
. I was hoping the point-in-poly woudl be faster, but... but you never know. I'll check that & re-submit.