-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PY-611] Silently ignore "empty" masks during import #741
Conversation
darwin/importer/importer.py
Outdated
_annotation_mask_reference_i = mask_annotation_ids_mapping[_annotation_id] | ||
|
||
if dense_rle is not None: | ||
for i in range(0, len(dense_rle), 2): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this loop may take few seconds, in the end this in extreme case is as long as many pixels we have on the image. Meanwhile, for each mask we're doing this search, so quite high time complexity. Maybe I'm a bit too paranoid, but wdyt about building first some kind of hash lookup table for those IDs instead of looping though same dense_rle
so many times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Optimised the code in a new commit - introduced 2 variables (so that we don't have to iterate annotations every single time):
rl
: stores raster_layer annotationrl_dense_rle_ids
: stores a simple hash-set of dense_rle - only pair[0] unique keys. Example:{0, 2}
Iteration on dense_rle is optimised as well, was iterating on every single element in the list before, now moved to a simple set conversion like rl_dense_rle_ids = set(rl.data["dense_rle"][::2])
- this will collect all unique pair[0] keys that we need.
Please have a look and let me know your thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is something we should test. Not here, but e2e should include an import raster mask with/without empty layers. Can you add tickets to linear under Py Grooming and Backlog for this.
Problem
Silently ignore "empty" masks (i.e. masks that do not have a corresponding raster layer) during import
Solution
During import, checking if the mask annotation is present in
raster_layer
- then further checking the same in the wholedense_rle
flat array, if not present, don't import the mask (i.e., remove the mask annotation from import request data and also remove it from raster_layer mask-mapping)Changelog
Silently ignore "empty" masks (i.e. masks that do not have a corresponding raster layer) during import