Add function to remove near objects in nd-image #4165
Conversation
Force-pushed from 0ea6635 to b2bb144
Hello @lagru! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2022-04-10 15:22:21 UTC
@lagru thanks for this! Sorry that I don't have time for a full review just yet, but this is something I've wanted for a long time! =) My three comments from a brief overview:
Will do a proper review soon!
I think it might be possible to use `_flood_fill_equal`. However, there are two disadvantages compared to the current solution:
But I'll try it out and see what the benchmarks say so that we can make a more informed decision.
Good idea! Using the object size sounds more intuitive. Currently the priority actually corresponds to the label ID, which is assigned in C-order. I'll see what I can do.
That's something I'd like to know as well and a weak point of the current implementation.
However, because a KDTree is such a nice data structure for this problem 😍 the performance is already on par with the "brute force" solution in #4024 and the code is way shorter!
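For readers skimming this thread, here is a rough sketch of the KDTree idea being discussed. The helper name and the simplistic "first visited object wins" rule are inventions for illustration only; the PR's actual implementation is in Cython and handles removal priority properly.

```python
import numpy as np
from scipy import ndimage as ndi
from scipy.spatial import cKDTree

def remove_near_objects_sketch(binary, min_distance):
    # Label connected pixels and put every object pixel into one KD-tree.
    labels, _ = ndi.label(binary)
    coords = np.transpose(np.nonzero(labels))
    tree = cKDTree(coords)
    ids = labels[tuple(coords.T)]

    removed = set()
    for i in range(tree.n):
        # Skip pixels whose object was already removed.
        if ids[i] in removed:
            continue
        # Any other object's pixel within `min_distance` marks that
        # object for removal (here, simply "first visited wins").
        for j in tree.query_ball_point(tree.data[i], r=min_distance):
            if ids[j] != ids[i] and ids[j] not in removed:
                removed.add(ids[j])

    out = labels.copy()
    out[np.isin(out, list(removed))] = 0
    return out
```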
Force-pushed from 7adbb50 to a1a8de2
Note to self: Find out if there is an efficient way to find the surface of objects. That should make significant performance gains possible because only the surface between two objects needs to be evaluated to find their distance to each other (the inside of objects can be ignored)!
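For what it's worth, a minimal sketch of one way to get such surface pixels, assuming an erosion-based definition of "surface" (the PR later derives borders from maximum/minimum filters on the label image instead, see below):

```python
import numpy as np
from scipy import ndimage as ndi

def object_surface(binary):
    # A pixel lies on an object's surface if it disappears under erosion.
    binary = np.asarray(binary, dtype=bool)
    eroded = ndi.binary_erosion(binary)
    return binary & ~eroded
```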
This function iterates over all objects (connected pixels that are True) inside an image and removes neighboring objects until all remaining ones are at least a minimal Euclidean distance from each other.
_extrema_cy.pyx was cythonized twice for no reason.
Simplify construction of indices, use consistent name for structuring element (selem), some reformatting and improve documentation.
This trick significantly improves the performance by reducing the number of points inside the KDTree whose neighborhood needs to be evaluated for objects that are too close. The greater the checked object's size-to-surface ratio, the greater the improvement.
numpy.bincount seems to cast its input to dtype('intp') according to the rule 'safe'. Therefore, passing a dtype such as uint32 or int64 that can hold a larger number fails on 32-bit platforms.
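A small sketch of the failure mode and the work-around on a platform where `np.intp` is 32-bit (the exact error text may vary between NumPy versions):

```python
import numpy as np

labels = np.zeros(10, dtype=np.int64)

# On 32-bit platforms this raises, roughly:
#   TypeError: Cannot cast array data from dtype('int64') to dtype('int32')
#   according to the rule 'safe'
# counts = np.bincount(labels)

# Explicitly down-casting to intp side-steps the safe-cast check:
counts = np.bincount(labels.astype(np.intp, copy=False))
```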
Force-pushed from 15a2195 to b44c417
With recent changes the evaluation order of objects changed.
The latter was added in NumPy 1.15, which is not yet the minimum requirement for scikit-image.
for unity spacing and if p-norm is 1 or 2. Also refactor code structure a little bit.
```python
    continue
neighborhood = kdtree.query_ball_point(
    kdtree.data[i_indices, ...],
```
Just cross-reviewing from SciPy since there were some concerns expressed about the `list` return type here (which we need because of the ragged data structures that are possible) for performance reasons.
A few questions:
- Do you have any sample timings/examples demonstrating the bottleneck/% time spent on this line for a typical use case?
- Any chance you could use the `workers` argument to speed up via concurrency? This might be particularly viable if you could accumulate the query points externally first.
- How many points are you querying against in a typical use case? The reason I ask is that you seem to be querying the tree against itself, which may benefit from using `query_ball_tree` instead.
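To make the second and third suggestions concrete, a toy illustration against SciPy's `cKDTree` API (data and radius are arbitrary):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.random((10_000, 2))
tree = cKDTree(points)

# Batch all query points into one call and spread it over all cores:
neighborhoods = tree.query_ball_point(points, r=0.01, workers=-1)

# Or query the tree against itself in a single call:
neighborhoods = tree.query_ball_tree(tree, r=0.01)
```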
Thanks for reviewing!
> Do you have any sample timings/examples demonstrating the bottleneck/% time spent on this line for a typical use case?
Honestly, I haven't. Since the transition to Meson, I'm not sure how to profile, debug, or linetrace Cython files. Instead, I tried to linetrace via a notebook, without success. The approach suggested by Cython's docs
Details
```python
import pstats, cProfile
import pyximport

pyximport.install()

import matplotlib.pyplot as plt
import skimage as ski

# Extract foreground by thresholding an image taken by the Hubble Telescope
image = ski.color.rgb2gray(ski.data.hubble_deep_field())
foreground = image > ski.filters.threshold_li(image)
objects = ski.measure.label(foreground)

cProfile.runctx(
    "ski.morphology.remove_near_objects(objects, min_distance=5)",
    globals(),
    locals(),
    "Profile.prof",
)

s = pstats.Stats("Profile.prof")
s.strip_dirs().sort_stats("time").print_stats()
```
also doesn't work; it doesn't detect the `_remove_near_objects` stuff. Do you have a solution to profile / debug Meson + Cython on SciPy's side? 🙏 I took a quick look at your repo but didn't find anything.
> Any chance you could use the `workers` argument to speed up via concurrency? This might be particularly viable if you could accumulate the query points externally first.
Hmm, I thought about it, but with the current logic this is tricky to parallelize. It may work if the queries for a single object are combined. I'll look into it!
> How many points are you querying against in a typical use case? The reason I ask is that you seem to be querying the tree against itself, which may benefit from using `query_ball_tree` instead.
The current implementation has to query once for every pixel that is on an object's boundary / surface. Objects that were already removed in earlier iterations are skipped. For the case demonstrated in the new gallery example of this PR, I call `query_ball_point` 2,636 times (out of 20,505 samples, ~13%). This heavily depends on the `min_distance` used and on the objects' surface-to-total-size ratio.
I tried `query_ball_tree` and `sparse_distance_matrix`, but this approach quickly runs into memory issues.
For the following case

```python
import numpy as np
import skimage as ski

image = ski.color.rgb2gray(ski.data.hubble_deep_field())
foreground1 = image > ski.filters.threshold_li(image)  # many small objects
foreground2 = np.ones_like(foreground1)
foreground2[400, :] = 0  # two very large objects
foreground = np.concatenate([foreground1, foreground2])
objects = ski.measure.label(foreground)

%timeit ski.morphology.remove_near_objects(objects, min_distance=5)
%timeit ski.morphology.remove_near_objects(objects, min_distance=50)
%timeit ski.morphology.remove_near_objects(objects, min_distance=100)
%timeit ski.morphology.remove_near_objects(objects, min_distance=300)
```
querying an entire object's indices at once with `kdtree.query_ball_point(kdtree.data[start:stop], ...)`,
Patch
```diff
Subject: [PATCH] Query indices with the same object ID at once
---
Index: skimage/morphology/_misc_cy.pyx
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/skimage/morphology/_misc_cy.pyx b/skimage/morphology/_misc_cy.pyx
--- a/skimage/morphology/_misc_cy.pyx	(revision c8756cae67078c1cfb05763a503792e05e842aef)
+++ b/skimage/morphology/_misc_cy.pyx	(date 1713872591799)
@@ -46,34 +46,45 @@
         The shape of the unraveled `image`.
     """
     cdef:
-        Py_ssize_t i_indices, j_indices  # Loop variables to index `indices`
+        Py_ssize_t i_indices, j_indices, start, stop  # Loop variables to index `indices`
         Py_ssize_t i_out  # Loop variable to index `out`
-        np_anyint object_id, other_id
-        list neighborhood
+        np_anyint start_id, stop_id
         set remembered_ids
+        list n

     remembered_ids = set()
-    for i_indices in range(border_indices.shape[0]):
-        i_out = border_indices[i_indices]
-        object_id = out[i_out]
-        # Skip if sample is part of a removed object
-        if object_id == 0:
+    start = 0
+    start_id = out[border_indices[start]]
+    for stop in range(border_indices.shape[0]):
+        i_out = border_indices[stop]
+        stop_id = out[i_out]
+
+        if start_id == stop_id:
+            continue
+        elif start_id == 0:
+            start = stop
+            start_id = stop_id
             continue
         neighborhood = kdtree.query_ball_point(
-            kdtree.data[i_indices, ...],
+            kdtree.data[start:stop, ...],
             r=min_distance,
             p=p_norm,
         )
-        for j_indices in neighborhood:
-            # Check object IDs in neighborhood
-            other_id = out[border_indices[j_indices]]
-            if other_id != 0 and other_id != object_id:
-                # If neighbor ID wasn't already removed or is the current one
-                # remove the boundary and remember the ID
-                _remove_object(out, border_indices, j_indices)
-                remembered_ids.add(other_id)
+
+        for n in neighborhood:
+            for j_indices in n:
+                # Check object IDs in neighborhood
+                other_id = out[border_indices[j_indices]]
+                if other_id != 0 and other_id != start_id:
+                    # If neighbor ID wasn't already removed or is the current one
+                    # remove the boundary and remember the ID
+                    _remove_object(out, border_indices, j_indices)
+                    remembered_ids.add(other_id)
+        start = stop
+        start_id = out[border_indices[start]]
+
     # Delete inner parts of remembered objects
     for j_indices in range(inner_indices.shape[0]):
         object_id = out[inner_indices[j_indices]]
```
is faster for `min_distance=5` and `50`, but a lot slower once `min_distance` gets larger. Without profiling I'm not really sure about the reason. The implementation that queries single points only still seems to be the fastest overall.
Using `workers` seems to introduce an overhead that makes it slower in most cases, regardless of the specific implementation.
Nice implementation!
I left mostly comments around documentation. And a general question: would it make sense to combine object removal by size into a more general `filter_objects`?
Emscripten failure: https://github.com/scikit-image/scikit-image/actions/runs/9453937550/job/26040373784?pr=4165#step:8:548 That is NumPy 1.26, which I think we also test on in other targets 🤔
Co-authored-by: Stefan van der Walt <sjvdwalt@gmail.com>
Co-authored-by: Stefan van der Walt <sjvdwalt@gmail.com>
@stefanv, I think I addressed most of your concerns. Sadly I don't have a clue what the Pyodide-specific failures are about. I also don't know an easy way to debug this locally yet.
Hey @agriyakhetarpal, could I perhaps summon you in case you have an idea at first glance? I/we can't figure out why NumPy is acting up for the Emscripten job but nowhere else.
Hi, @lagru – thanks for mentioning me, and please feel free to mention me on further issues that arise; I fully intend to continue maintenance and help out after establishing this workflow. :) I had a cursory look and the issue seems to be something that I have encountered before. A stopgap-like fix is to convert the data to …
@seberg I believe this is a bug in NumPy on 32-bit platforms? In the meantime, we can work around it with a special …
Not sure it is a bug, but …
Thanks for looking at this, Sebastian. The fix isn't hard, but NumPy does not document the `intp` requirement as far as I can tell, nor does it throw a helpful error message.
Tried to handle this in a more general sense. Not sure if this is the best option though.
skimage/morphology/misc.py
Outdated
```python
if out_raveled.dtype.itemsize > np.intp().itemsize:
    # bincount expects intp (e.g. 32-bit on WASM), so down-cast to that
    priority = np.bincount(out_raveled.astype(np.intp, copy=False))
else:
    priority = np.bincount(out_raveled)
```
Okay, I tried to address the WASM bincount failure in e6446d6. Not sure if this is the best approach though. I wanted to avoid unnecessary up-casting of int8 etc.
Thanks for chiming in @agriyakhetarpal & @seberg.
That didn't work. d749dd3 works though.
Out of curiosity, what if the values are too large to be cast down to int32?
I'm confused — aren't there non-wasm platforms where this would be an issue? Or is it because we no longer support 32-bit win or linux that we can get away with just is_wasm?
Yeah, this just force-casts, which is unfortunately something that indexing-like operations do in NumPy (they use same-kind), so if you manage to end up with a too-large value, that could silently do the wrong thing (although I guess it's likely that even after the force cast you have huge values that cause an out-of-memory error).
If floats might come around, using `casting="same_kind"` would be better; you can do this unconditionally really, that is what the `copy=False` is for.
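If I read that correctly, the suggestion would look roughly like this (a sketch, not the committed fix):

```python
import numpy as np

out_raveled = np.arange(10, dtype=np.int64)
# "same_kind" still allows the integer down-cast int64 -> int32, but would
# reject e.g. float64 -> intp, so accidental float input fails loudly:
priority = np.bincount(
    out_raveled.astype(np.intp, casting="same_kind", copy=False)
)
```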
```python
if (spacing is None or np.all(spacing[0] == spacing)) and p_norm <= 2:
    # For unity spacing we can make the borders more sparse by using a
    # lower connectivity
    footprint = ndi.generate_binary_structure(out.ndim, 1)
else:
    footprint = ndi.generate_binary_structure(out.ndim, out.ndim)
border = (
    ndi.maximum_filter(out, footprint=footprint)
    != ndi.minimum_filter(out, footprint=footprint)
).ravel()[indices]
```
Can you explain why the connectivity is determined by the spacing and p-norm? I actually would have thought that the connectivity should be a user parameter and should default to 1 regardless of spacing...
I can't yet. I wasn't sure if there are edge cases for `spacing` and `p_norm` where using a higher-connected footprint would return the wrong results. So I opted to only use the lower connectivity for cases where I was sure that it was correct.
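For context, this is what the two footprints look like in 2D:

```python
from scipy import ndimage as ndi

ndi.generate_binary_structure(2, 1)   # connectivity 1, "cross"
# array([[False,  True, False],
#        [ True,  True,  True],
#        [False,  True, False]])

ndi.generate_binary_structure(2, 2)   # full connectivity, "square"
# array([[ True,  True,  True],
#        [ True,  True,  True],
#        [ True,  True,  True]])
```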
```python
if object_id != 0 and object_id in remembered_ids:
    _remove_object(out, inner_indices, j_indices)
```
Is this fast?
I have an alternate implementation for this section that may be worth exploring:
(1) build a coo matrix -> csr matrix mapping object ids (rows) to raveled indices (columns). This can be done with:

```python
sz = label_image.size
coo = sparse.coo_matrix(
    (np.ones(sz), (label_image.ravel(), np.arange(sz))),
    shape=(np.max(label_image), sz + 1),
)
csr = coo.tocsr()
```

(Bonus: `csr.sum(axis=1)` is fast and gives you the bincount)

(2) Finding all the indices belonging to label i is then O(1) and they are adjacent in memory: `idxs = csr.indices[csr.indptr[i]:csr.indptr[i+1]]`
(Indeed I expect you should be able to iterate over removed indices in Python rather than Cython without much performance cost, since typically the size of the indices is much greater than the number of labels to be removed.)
Indeed, re-reading the code, it seems you are iterating over all indices and checking whether they are in the remembered objects; even with a set and O(1) lookup, this seems less efficient than only iterating over remembered object IDs and instantly recalling the relevant indices. I think?
Sounds interesting! Let me explore this once I find the time!
Hmm, is there a typo in the code you are suggesting? Running
Details
```python
import numpy as np
import scipy as sp

label_image = np.array(
    [[8, 0, 0, 0, 0, 0, 0, 0, 0, 9, 9],
     [8, 8, 8, 0, 0, 0, 0, 0, 0, 9, 9],
     [0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0],
     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
     [0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 0],
     [2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 7]]
)
sz = label_image.size
coo = sp.sparse.coo_matrix(
    (np.ones(sz), (label_image.ravel(), np.arange(sz))),
    shape=(np.max(label_image), sz + 1),
)
csr = coo.tocsr()
```
gives

```
ValueError: axis 0 index 9 exceeds matrix dimension 9
```

when calling `coo_matrix`...?
Thanks, I managed to get it working with
Details
```diff
Subject: [PATCH] Use sparse matrix to remove objects
---
Index: skimage/morphology/_misc_cy.pyx
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/skimage/morphology/_misc_cy.pyx b/skimage/morphology/_misc_cy.pyx
--- a/skimage/morphology/_misc_cy.pyx	(revision d749dd316d40fd7d1ca71c74d170fd16b2c5143d)
+++ b/skimage/morphology/_misc_cy.pyx	(date 1718287031000)
@@ -18,6 +18,7 @@
     cnp.float64_t p_norm,
     cnp.float64_t min_distance,
     tuple shape,
+    csr,
 ):
     """Remove objects, in specified order, until remaining are a minimum distance apart.
@@ -70,15 +71,9 @@
             other_id = out[border_indices[j_indices]]
             if other_id != 0 and other_id != object_id:
                 # If neighbor ID wasn't already removed or is the current one
-                # remove the boundary and remember the ID
-                _remove_object(out, border_indices, j_indices)
-                remembered_ids.add(other_id)
+                # remove the boundary
+                out.base[csr.indices[csr.indptr[other_id]:csr.indptr[other_id + 1]]] = 0

-    # Delete inner parts of remembered objects
-    for j_indices in range(inner_indices.shape[0]):
-        object_id = out[inner_indices[j_indices]]
-        if object_id != 0 and object_id in remembered_ids:
-            _remove_object(out, inner_indices, j_indices)

 cdef inline _remove_object(
Index: skimage/morphology/misc.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/skimage/morphology/misc.py b/skimage/morphology/misc.py
--- a/skimage/morphology/misc.py	(revision d749dd316d40fd7d1ca71c74d170fd16b2c5143d)
+++ b/skimage/morphology/misc.py	(date 1718287524197)
@@ -2,6 +2,7 @@

 import numpy as np
 import functools
+import scipy as sp
 from scipy import ndimage as ndi
 from scipy.spatial import cKDTree
@@ -401,18 +402,22 @@
     border_indices = border_indices[np.argsort(out_raveled[border_indices])]
     inner_indices = inner_indices[np.argsort(out_raveled[inner_indices])]

-    if priority is None:
-        if is_wasm:
-            # bincount expects intp (32-bit) on WASM, so down-cast to that
-            priority = np.bincount(out_raveled.astype(np.intp, copy=False))
-        else:
-            priority = np.bincount(out_raveled)
     # `priority` can only be indexed by positive object IDs,
     # `border_indices` contains all unique sorted IDs so check the lowest / first
     smallest_id = out_raveled[border_indices[0]]
     if smallest_id < 0:
         raise ValueError(f"found object with negative ID {smallest_id!r}")
+    largest_id = out_raveled[border_indices[-1]]
+    coo = sp.sparse.coo_matrix(
+        (np.ones(out.size), (out_raveled, np.arange(out.size))),
+        shape=(largest_id + 1, out.size)
+    )
+    csr = coo.tocsr()
+
+    if priority is None:
+        priority = csr.sum(axis=1).base.ravel()
+
     try:
         # Sort by priority second using a stable sort to preserve the contiguous
         # sorting of objects. Because each pixel in an object has the same
@@ -447,6 +452,7 @@
         min_distance=min_distance,
         p_norm=p_norm,
         shape=label_image.shape,
+        csr=csr,
     )
     if out_raveled.base is not out:
```
However, benchmarking with

```python
import skimage as ski
import scipy as sp

image = ski.data.hubble_deep_field()
image = ski.color.rgb2gray(image)
objects = image > 0.18
label_image, _ = sp.ndimage.label(objects)

%timeit ski.morphology.remove_objects_by_distance(label_image, 1)
%timeit ski.morphology.remove_objects_by_distance(label_image, 5)
%timeit ski.morphology.remove_objects_by_distance(label_image, 100)
%timeit ski.morphology.remove_objects_by_distance(label_image, 500)
```
shows that the current Cython-loop-based approach is still faster:

- Current Cython loop
  - 364 ms ± 12.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  - 346 ms ± 5.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  - 198 ms ± 2.32 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
  - 236 ms ± 2.83 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
- Sparse-matrix approach
  - 372 ms ± 2.24 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  - 351 ms ± 2.46 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  - 215 ms ± 3.62 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  - 245 ms ± 2.47 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

🤔
Might still be worth the switch though, as the logic should be way easier to follow...?
> the logic should be way easier to follow...?
Yes, and you're currently doing redundant work, because the csr approach is sorting the indices for you. So you should be able to get rid of the argsort calls. (But it might be a bit tricky to deal with border/inner. Should be doable, e.g. with a mask, I think.)
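A toy check of that property, using the corrected `shape` from the patch above (labels chosen arbitrarily):

```python
import numpy as np
from scipy import sparse

label_image = np.array([[0, 1, 1],
                        [2, 0, 2]])
sz = label_image.size
coo = sparse.coo_matrix(
    (np.ones(sz), (label_image.ravel(), np.arange(sz))),
    shape=(label_image.max() + 1, sz),
)
csr = coo.tocsr()

# Raveled indices of label 2, already grouped and sorted:
idxs = csr.indices[csr.indptr[2]:csr.indptr[3]]
print(idxs)  # [3 5]
```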
Okay, this may take me some time to figure out. Do you mind if we add that optimization in another PR? That way we can get this into the next release, which we want to do soon. This PR also has kind of a bad track record of getting stalled when attention shifts. 😅
Sure! Might be worth adding a `# TODO` comment to the code pointing to this discussion, but whatever works for you. 😊
Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
I am merging this and making a 0.24rc0 release. Thanks!
Description
Replaces #4024 with an implementation based on SciPy's cKDTree.
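For anyone landing here from the release notes, a minimal usage sketch of the merged function (mirroring the benchmark snippets above; the keyword name follows the calls shown there):

```python
import scipy as sp
import skimage as ski

image = ski.color.rgb2gray(ski.data.hubble_deep_field())
foreground = image > ski.filters.threshold_li(image)
label_image, _ = sp.ndimage.label(foreground)

# Remove the lower-priority (by default: smaller) of any two objects that
# lie closer than 10 pixels to each other.
result = ski.morphology.remove_objects_by_distance(label_image, min_distance=10)
```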
I propose to add a function that iterates over objects (connected pixels that are not zero) inside a ~~binary~~ image and removes neighboring objects until all remaining ones are more than a minimal Euclidean distance from each other.

~~Furthermore it moves the functions `_resolve_neighborhood`, `_set_edge_values_inplace` and `_fast_pad` into `morphology._util`. They are used for several nd-algorithms throughout the submodule so that seems like a sensible place to keep them.~~ Done in #4209

This function might be used to implement a distance "filtering" for functions like `local_maxima` as well (closes #3816).

Checklist
- `./doc/examples` for new features

Release note
For maintainers and optionally contributors, please refer to the instructions on how to document this PR for the release notes.