major rework - see mini changelog inside vsi2tif.py #1

Merged
merged 1 commit into from
Dec 7, 2020

sekro
Contributor

@sekro sekro commented Dec 7, 2020

I'm a bit unsure how to add a contribution to the LICENSE correctly.

@andreped andreped merged commit d7f7f9f into andreped:main Dec 7, 2020
@andreped
Owner

andreped commented Dec 7, 2020

LICENSE is probably fine.

Overall it looks good. I'm a little too busy to go fully into the code. I was expecting some small edits :P But I can take your word for it ;) I will test it as soon as I have the chance.

I'm assuming you have tested this on your workstation, right? Are you using Ubuntu 18.04?

Also, did you benchmark the CLI vs. Python solution btw? I was interested in testing something similar, but went with the "simple" approach for convenience. It would be good to test on an open WSI. If you have time, test with OS-1 from here:
http://openslide.cs.cmu.edu/download/openslide-testdata/Olympus/

@sekro
Contributor Author

sekro commented Dec 7, 2020

Yeah, well, I started with small changes and then suddenly... :D
Tested on HUNT cloud with Ubuntu 16.04.7 LTS:
libvips-tools 8.2.2-1
python 3.8.3
pyvips 2.1.13
bfconvert 6.5.1

I have not performed a proper benchmark yet. I stumbled across something about speed here: https://libvips.github.io/libvips/API/current/using-cli.html but on second read it's apparently only relevant when chaining commands, because of the filesystem operations in between. Anyway, the step from vsi to btf is much slower, so parallel processing is probably the way to go. I just found the pyvips module simpler to use :).
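For anyone who wants to run the comparison asked about above, here is a rough timing sketch. It is not taken from vsi2tif.py: the save options (tiled, pyramidal, JPEG-compressed BigTIFF) and the file names in the usage note are assumptions chosen for illustration, and `best_time` is a hypothetical helper.

```python
import subprocess
import time


def save_with_cli(src, dst):
    # Assumed options; adjust them to whatever the script actually uses.
    subprocess.run(
        ["vips", "tiffsave", src, dst,
         "--tile", "--pyramid", "--compression=jpeg", "--bigtiff"],
        check=True,
    )


def save_with_pyvips(src, dst):
    # Imported lazily so the timing helper below works without pyvips installed.
    import pyvips

    image = pyvips.Image.new_from_file(src)
    image.tiffsave(dst, tile=True, pyramid=True,
                   compression="jpeg", bigtiff=True)


def best_time(func, *args, repeats=3):
    """Call func(*args) `repeats` times and return the fastest wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)
```

Usage would look something like `best_time(save_with_cli, "OS-1.btf", "cli.tif")` versus `best_time(save_with_pyvips, "OS-1.btf", "py.tif")`, where `OS-1.btf` is assumed to be the intermediate file already produced by bfconvert. Since the vsi to btf step dominates the runtime, any difference measured here is likely to be small in the overall pipeline.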

@andreped
Owner

andreped commented Dec 8, 2020

Yes, I found something similar regarding pyvips. I have actually tried to use it in one of my pipelines for reading WSIs on the fly during training.

But yes, that is what I experience as well: the first step is the bottleneck. However, I think memory usage became a challenge when I tried to do WSI-level parallel processing. Don't recall. Haven't tried this in about a year :P

For multiprocessing I think I used multiprocessing.Pool.map, or something similar like .imap/.imap_unordered. Quite easy to use, requiring no real understanding of pooling/queues and stuff.
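A minimal sketch of that idea, assuming one bfconvert call per slide and a small pool to keep memory usage in check. The function names and the bare `bfconvert input output` invocation are illustrative assumptions, not the options used in vsi2tif.py.

```python
import multiprocessing
import subprocess
from pathlib import Path


def build_bfconvert_cmd(src, out_dir):
    """Return (command, target path) for converting one .vsi file to .btf."""
    target = Path(out_dir) / (Path(src).stem + ".btf")
    # Bare invocation; tiling/series flags are left out on purpose.
    return ["bfconvert", str(src), str(target)], target


def convert_one(job):
    """Worker: run bfconvert for a (source, output directory) pair."""
    src, out_dir = job
    cmd, _target = build_bfconvert_cmd(src, out_dir)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return str(src), result.returncode


def convert_all(jobs, worker=convert_one, processes=2):
    """Run the worker over all jobs in a pool and collect exit codes per slide."""
    results = {}
    with multiprocessing.Pool(processes=processes) as pool:
        # imap_unordered yields each result as soon as its slide is finished.
        for src, status in pool.imap_unordered(worker, jobs):
            results[src] = status
    return results
```

Calling `convert_all([(p, "converted") for p in Path("slides").glob("*.vsi")])` would convert every slide in a hypothetical `slides` folder. On platforms that spawn rather than fork worker processes, that call should sit under an `if __name__ == "__main__":` guard. Keeping `processes` low is a cheap way to limit the memory pressure mentioned above, at the cost of less parallelism.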
