
Add support for dask distributed scheduler in quantum detector reader #267

Merged: 2 commits into hyperspy:main, Jun 5, 2024

Conversation

@ericpre ericpre (Member) commented May 29, 2024

Progress of the PR

@ericpre ericpre (Member, Author) commented May 29, 2024

Thank you @CSSFrancis, as you suggested in #266 (comment), #11 was useful to figure out the details with the structured dtype - pretty much copy & paste the relevant bits! 😄
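The structured-dtype trick mentioned above can be sketched as follows. This is an illustrative example only: the field names, header size (384 bytes) and frame shape (256×256 `uint16`) are placeholder values, not the actual quantum detector (.mib) layout used by the reader.

```python
import numpy as np

# Hypothetical layout: each record interleaves a raw header block with
# the frame data. A structured dtype lets np.memmap (or dask) view just
# the "data" field lazily, without copying the headers.
frame_dtype = np.dtype([
    ("header", np.uint8, (384,)),     # placeholder header size
    ("data", np.uint16, (256, 256)),  # placeholder frame shape
])

# Stand-in for np.memmap(filename, dtype=frame_dtype, mode="r")
buf = np.zeros(4, dtype=frame_dtype)

data = buf["data"]        # view of only the frames
headers = buf["header"]   # view of only the headers
print(data.shape)  # (4, 256, 256)
```

Because `buf["data"]` is a view into the same buffer, slicing out the frames costs no extra memory until the values are actually read.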

@CSSFrancis CSSFrancis (Member) left a comment

This is actually a much smaller change than I thought it would be :). It looks good to me, other than that it might be nice to return the header as well, although a warning that this could increase the loading time would be useful.

@@ -344,6 +355,10 @@ def load_mib_data(
data = data.rechunk(chunks)

if return_headers:
if distributed:
Member

You can still return the header by just setting key="header" for a second memmap_distributed call. It will add some time onto the saving of the dataset, as the entire dataset might get loaded into RAM with most of it thrown away.

Really what we should do is add things to a to_store context manager and then call:

da.store(data, dset)

Only once. That will merge task graphs as necessary and might reduce the time for saving certain signals. I've thought about it for things like saving lazy markers, or possibly creating a hs.save() function for handling multiple signals if you wanted to save multiple parts of some analysis efficiently. This is a fairly abstract/higher-level concept, so maybe it would be seldom used.
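The graph-merging idea above can be sketched with plain dask. Instead of two separate `da.store` calls (one for the data, one for the headers), the delayed stores are collected and computed together, so dask can merge the task graphs and read each source chunk only once. The NumPy target arrays here are stand-ins for HDF5/zarr datasets; the array contents are arbitrary.

```python
import numpy as np
import dask.array as da

# Two dask arrays that would normally be stored in separate passes.
data = da.arange(16, chunks=4)
headers = da.ones(4, chunks=2)

# Writable targets standing in for on-disk datasets.
data_target = np.empty(16, dtype=data.dtype)
header_target = np.empty(4, dtype=headers.dtype)

# compute=False returns a single Delayed wrapping both store operations,
# so one .compute() executes the merged graph in a single pass.
stored = da.store(
    [data, headers], [data_target, header_target], compute=False
)
stored.compute()
print(data_target[:4])  # [0 1 2 3]
```

When the two sources share upstream tasks (e.g. chunks read from the same file), merging the graphs avoids reading those chunks twice, which is the saving being discussed here.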

ericpre (Member, Author)

Yes, this will most likely need to be done at some point! I opened #269 to track it / add more use cases.

@ericpre ericpre merged commit 32ac8fc into hyperspy:main Jun 5, 2024
30 checks passed