[WIP] Image analysis workflow #801

Draft
wants to merge 3 commits into
base: main

Conversation

jrbourbeau
Member

This is an initial pass at the example outlined in #751 (comment). It could still use some finishing steps, but I think it contains the bulk of the logic. Pushing it up now for early feedback.

cc @GenevieveBuckley @mrocklin for visibility

Comment on lines +8 to +10
images = da.from_zarr(
    "s3://coiled-datasets/BBBC039", storage_options={"anon": True}
)
Member Author

I converted the original .tif dataset to Zarr format and uploaded it to a public S3 bucket. This had the side effect of bypassing dask/dask-image#84.

label_images, num_features = ndmeasure.label(binary_images)
index = np.arange(num_features)
# FIXME: Only selecting the first few images due to cluster idle timeout.
# Maybe sending large graph? Need to investigate a bit.

@GenevieveBuckley Apr 20, 2023

I have no useful suggestions, but it's great* that this has possibly already identified some kind of problem your users might occasionally run into.

*You know, great for your users, but admittedly not so great for the person who now needs to investigate it 😆 Have fun with that, James!

from dask_image import ndfilters, ndmeasure, ndmorph


def test_BBBC039(small_client):

Suggest adding a link to those talk slides and/or the repo. That way, if anyone needs to work on this benchmark later, they can look there for context and answers before quizzing James about it.

smoothed = ndfilters.gaussian_filter(images, sigma=[0, 1, 1])
thresh = ndfilters.threshold_local(smoothed, block_size=images.chunksize)
threshold_images = smoothed > thresh
structuring_element = np.array(

Suggest adding a short comment to this line about why we're not using the default structuring element. E.g.:

# Since this image stack appears to be 3-dimensional,
# we sandwich a 2d structuring element in between zeros
# so that each 2d image slice has the binary closing applied independently

Apparently I only ever said that verbally during the talk, so we may as well write it down.
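
For anyone reading this later, here is a minimal sketch of the "sandwich" idea described in the suggested comment. It uses `scipy.ndimage` on a small in-memory array as a stand-in for `dask_image.ndmorph`, and the toy stack and the cross-shaped 2D element are assumptions made for illustration only; the actual structuring element in this PR is not shown in the hunk above. The sketch checks that a 3D binary closing with zero planes on either side of a 2D element gives the same result as closing each 2D image on its own.

```python
import numpy as np
from scipy import ndimage

# Hypothetical toy stack: 4 binary images of 16x16 pixels.
# Axis 0 indexes the images, axes 1 and 2 are the image plane.
rng = np.random.default_rng(0)
stack = rng.random((4, 16, 16)) > 0.6

# Assumed 2D structuring element (a cross), chosen only for illustration.
element_2d = np.array(
    [[0, 1, 0],
     [1, 1, 1],
     [0, 1, 0]],
    dtype=bool,
)

# Sandwich the 2D element between two all-zero planes, so the 3D element
# never reaches into the previous or next image along axis 0.
zeros = np.zeros_like(element_2d)
element_3d = np.stack([zeros, element_2d, zeros])

# Closing the whole stack with the sandwiched element ...
closed_3d = ndimage.binary_closing(stack, structure=element_3d)

# ... compared with closing every 2D image separately.
closed_per_slice = np.stack(
    [ndimage.binary_closing(img, structure=element_2d) for img in stack]
)

slices_independent = np.array_equal(closed_3d, closed_per_slice)
print("each slice closed independently:", slices_independent)

# For contrast: the default 3D element has non-zero entries in its outer
# planes, so it would mix information between neighbouring images.
default_3d = ndimage.generate_binary_structure(3, 1)
print("default element touches neighbouring slices:", bool(default_3d[0].any()))
```

The same element shape should carry over to the Dask version, since the zero planes are what keep the operation from crossing image boundaries, not anything specific to SciPy.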

@crusaderky marked this pull request as draft May 2, 2023 10:38
@jrbourbeau added the workflows label (Related to representative Dask user workflows) May 4, 2023