Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deepcell upload loop cleaning #1023

Merged
merged 26 commits into from
Jul 25, 2023
Merged

Deepcell upload loop cleaning #1023

merged 26 commits into from
Jul 25, 2023

Conversation

camisowers
Copy link
Contributor

@camisowers camisowers commented Jul 11, 2023

If you haven't already, please read through our contributing guidelines before opening your PR

What is the purpose of this PR?

Closes #628. Changes how we attempt to re-upload failed fov batches to the deepcell server.
Also refactors create_deepcell_output() with cleanlier helper functions. The code will now also check for previously processed batches, to avoid re-zipping and re-uploading already segmented batches to deepcell.

How did you implement your changes

Currently, we attempt to re-upload a failed fov batches until success or 5 retry attempts are reached. A cleaner, more long term fix is to attempt to process each batch once, check if any batch .zip files are missing, and re-run the failed batches.
The re-running continues until either a successful output or the function times out.

  • Re-run friendly additions. For each batch of fovs:
  1. Check if fovs_batch_1.zip exists, if not generate the input zip file
  2. Check if deepcell_output_fovs_batch_1.zip exists, if not upload input file to deepcell and extract segmentation tiffs

Remaining issues

  • Set a time limit for reprocessing attempts.
  • Tweak the code so it does not re-upload processed fov batches when the cell is re-run

@camisowers camisowers added the enhancement New feature or request label Jul 11, 2023
@camisowers camisowers self-assigned this Jul 11, 2023
@camisowers camisowers marked this pull request as ready for review July 17, 2023 17:49
Copy link
Contributor

@srivarra srivarra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jus a couple of small things.

For the Retry call at line 245 num_retries is not defined.

src/ark/utils/deepcell_service_utils.py Show resolved Hide resolved
src/ark/utils/deepcell_service_utils.py Outdated Show resolved Hide resolved
@camisowers
Copy link
Contributor Author

Jus a couple of small things.

For the Retry call at line 245 num_retries is not defined.

This PR was actually meant to scrap the retry a certain amount of times logic and instead implement recall run_deepcell_direct() in a loop until success or timeout. Looks like I missed some code that needed to be removed.

Copy link
Contributor

@alex-l-kong alex-l-kong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few additional comments.

src/ark/utils/deepcell_service_utils.py Outdated Show resolved Hide resolved
src/ark/utils/deepcell_service_utils.py Outdated Show resolved Hide resolved
@camisowers camisowers requested a review from srivarra July 24, 2023 20:58
Copy link
Contributor

@alex-l-kong alex-l-kong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of control-flow comments

src/ark/utils/deepcell_service_utils.py Show resolved Hide resolved
src/ark/utils/deepcell_service_utils.py Show resolved Hide resolved
Copy link
Contributor

@srivarra srivarra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks super clean!

Copy link
Contributor

@alex-l-kong alex-l-kong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one side question.

src/ark/utils/deepcell_service_utils.py Outdated Show resolved Hide resolved
@ngreenwald ngreenwald merged commit b09107e into main Jul 25, 2023
16 checks passed
@ngreenwald ngreenwald deleted the retry_deepcell_upload branch July 25, 2023 20:41
srivarra added a commit that referenced this pull request Jul 28, 2023
Next Release v0.6.4 (#1012)

* 0.6.4

* updated environment.yml

* Update README.md

* updated windows setup docs

* added repo verison in git clone

* file fix

Deepcell upload loop cleaning (#1023)

* loop check for successful deepcell upload

* update test zip paths

* helper functions pls

* zip input, upload, and extract per batch

* cleaning

* docstring indent

* fov -> fovs file name

* remove overwrite warning

* new dir for each test call

* fix bad logic warning testing

* remove arg from docstring

* add helper function test

* batch_num start at 1

* add previously processed warning

* set timeouts and scrap retry

* correct timeout errors and update tests

* timeout 5 mins

* add to unprocessed list after loop closes

* continue loop after extraction

* break instead of continue

* skipped processing print fixes

* adjust print statements

* 3 seconds before second zip call

update
ngreenwald pushed a commit that referenced this pull request Sep 18, 2023
* plot_cluster

Next Release v0.6.4 (#1012)

* 0.6.4

* updated environment.yml

* Update README.md

* updated windows setup docs

* added repo verison in git clone

* file fix

Deepcell upload loop cleaning (#1023)

* loop check for successful deepcell upload

* update test zip paths

* helper functions pls

* zip input, upload, and extract per batch

* cleaning

* docstring indent

* fov -> fovs file name

* remove overwrite warning

* new dir for each test call

* fix bad logic warning testing

* remove arg from docstring

* add helper function test

* batch_num start at 1

* add previously processed warning

* set timeouts and scrap retry

* correct timeout errors and update tests

* timeout 5 mins

* add to unprocessed list after loop closes

* continue loop after extraction

* break instead of continue

* skipped processing print fixes

* adjust print statements

* 3 seconds before second zip call

update

* updated readthedocs python version

* updated plot_utils docstring

* docstrings v2

* use GS [1,1] if cbar_visible is False, otherwise use GS [1, 2] for plot + colorbar

* added cbar_visible T/F for tests

* pycodestyle

* added numpy list to cmap / norm function

* Remove cells that don't have any pixel clusters expressed prior to Pixie (#1051)

* Drop cells without any SOM cluster expressions

* Add testing for dropping non-expressed cells

* Include `img_sub_folder` as parameter in Mantis calls (#1050)

* Ensure mantis can read the image sub folder specified

* Test on example dataset

* Default img_sub_folder to None in create_mantis_dir (standard used elsewhere)

* Doc fix and updated example dataset

* Update segmentation labels again

* Clarify img_sub_folder (mostly for refreshing tests)

* Force .github/get_example_dataset.py to use updated revision

* Update the file names to look for

* Update HuggingFace version (#1053)

* v0.1.10 (#1057)

* Pin to scikit-image to v0.19.3 (#1060)

* pin to 0.19.3

* requirement less than 0.19.3

* Make sure overwrite functionality for pixel clustering propagates into helper functions (#1058)

* Propagate overwrite functionality for pixel clustering

* Ensure re-normalization prevented on overwrite for pixel SOM re-assignment

* normalize_data should be the opposite of overwrite

* Single mantis directory (#1061)

* single mantis dir, no recopying

* post clustering mask suffix

* generic clustering adjustments

* mask prefix

* remove mask prefix

* Mass plotting notebook (#1052)

* added plotting notebook

* updated readmed

* git merge issues

* git merge fixes v2 - docs/landing.md

* type

* Remove cells that don't have any pixel clusters expressed prior to Pixie (#1051)

* Drop cells without any SOM cluster expressions

* Add testing for dropping non-expressed cells

* Include `img_sub_folder` as parameter in Mantis calls (#1050)

* Ensure mantis can read the image sub folder specified

* Test on example dataset

* Default img_sub_folder to None in create_mantis_dir (standard used elsewhere)

* Doc fix and updated example dataset

* Update segmentation labels again

* Clarify img_sub_folder (mostly for refreshing tests)

* Force .github/get_example_dataset.py to use updated revision

* Update the file names to look for

* fixed clipped plots in the notebooks.

* Update HuggingFace version (#1053)

* v0.1.10 (#1057)

* Added continuous variable segmentation plots

* Pin to scikit-image to v0.19.3 (#1060)

* pin to 0.19.3

* requirement less than 0.19.3

* Make sure overwrite functionality for pixel clustering propagates into helper functions (#1058)

* Propagate overwrite functionality for pixel clustering

* Ensure re-normalization prevented on overwrite for pixel SOM re-assignment

* normalize_data should be the opposite of overwrite

* Single mantis directory (#1061)

* single mantis dir, no recopying

* post clustering mask suffix

* generic clustering adjustments

* mask prefix

* remove mask prefix

* strange merge issue

* removed branch filters on CI, so if you branch off of a branch and run GHA, the CI workflow should work

* changed ci.yml, filter push on main branch only

* python 3.9 doesn't support | for Union type overloading

* added fovs parameter for continuous variable plots

* updated notebook to include plots

* updated cell clustering notebooks to add erosion (default to True), cleaned up some notebook outputs

* fixed style imports

* re-added test_generate_summary_stats. Was accidently deleted in a refactoring step.

* newline at eof fiber_segmentation_test.py

* seaborn-paper -> seaborn-v0_8-paper

* added cell to view all plotting styles i mask gen nb

* pycodestyle

---------

Co-authored-by: alex-l-kong <31424707+alex-l-kong@users.noreply.github.com>
Co-authored-by: camisowers <38049893+camisowers@users.noreply.github.com>

* default ero

* added erode=True for generate_cluster_mask

---------

Co-authored-by: alex-l-kong <31424707+alex-l-kong@users.noreply.github.com>
Co-authored-by: camisowers <38049893+camisowers@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Retry missing fov_group zip files in case of DeepCell response error
4 participants