-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PLA-585][external] Allow imports where some annotation frames are missing a main type #788
Conversation
…id E2E test flakiness
removed_in="0.8.0", | ||
current_version=__version__, | ||
details="The api_url parameter will be removed.", | ||
) | ||
def download_all_images_from_annotations( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is used by pull()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the deprecation warning points to a api_url
parameter:
details="The api_url parameter will be removed.",
If we're removing the deprecation warning - Should we remove the api_url
parameter from the function download_all_images_from_annotations
?
I found a PR #335 which introduced the deprecation 👀
darwin/exporter/formats/coco.py
Outdated
@@ -401,7 +82,7 @@ def _calculate_tag_categories( | |||
categories[annotation_class.name] = _calculate_category_id( | |||
annotation_class | |||
) | |||
return categories | |||
return dict(sorted(categories.items(), key=lambda item: item[1])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes are to Ensure COCO categories are always in the same, ascending order to avoid E2E flakiness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for fixing this!! 🥇
We could also replace the lambda function here and use python's operator.itemgetter (it can be a little faster than lambda function, especially for large dictionaries), by something like:
from operator import itemgetter # importing it
- return dict(sorted(categories.items(), key=lambda item: item[1]))
+ return dict(sorted(categories.items(), key=itemgetter(1)))
darwin/importer/importer.py
Outdated
removed_in="0.8.0", | ||
current_version=__version__, | ||
details=DEPRECATION_MESSAGE, | ||
) | ||
def build_main_annotations_lookup_table( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used when importing annotations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what was the goal behind deprecating this? 🤔
Ah found it in this PR e034b37#diff-c59e878e89aaf69bda12258e348fc25d69750c2841667a84da53f2d4f5bf60feR26-R28
Should we make it private instead by prefixing it with _
?
darwin/importer/importer.py
Outdated
removed_in="0.8.0", | ||
current_version=__version__, | ||
details=DEPRECATION_MESSAGE, | ||
) | ||
def find_and_parse( # noqa: C901 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used when importing annotations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RE:
I wonder what was the goal behind deprecating this? 🤔
Ah found it in this PR e034b37#diff-c59e878e89aaf69bda12258e348fc25d69750c2841667a84da53f2d4f5bf60feR26-R28
Should we make it private instead by prefixing it with _
?
darwin/importer/importer.py
Outdated
removed_in="0.8.0", | ||
current_version=__version__, | ||
details=DEPRECATION_MESSAGE, | ||
) | ||
def build_attribute_lookup(dataset: "RemoteDataset") -> Dict[str, Unknown]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used when importing annotations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RE:
I wonder what was the goal behind deprecating this? 🤔
Ah found it in this PR e034b37#diff-c59e878e89aaf69bda12258e348fc25d69750c2841667a84da53f2d4f5bf60feR26-R28
Should we make it private instead by prefixing it with _
?
darwin/importer/importer.py
Outdated
removed_in="0.8.0", | ||
current_version=__version__, | ||
details=DEPRECATION_MESSAGE, | ||
) | ||
def get_remote_files( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used when importing annotations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RE:
I wonder what was the goal behind deprecating this? 🤔
Ah found it in this PR e034b37#diff-c59e878e89aaf69bda12258e348fc25d69750c2841667a84da53f2d4f5bf60feR26-R28
Should we make it private instead by prefixing it with _
?
if "polygon" in data: | ||
if len(annotation.data["paths"]) > 1: | ||
data["polygon"] = { | ||
"path": annotation.data["paths"][0], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When importing polygons, they have to be structured as a primary path, and any subsequent paths have to be passed in additional_paths
in the payload
@@ -128,7 +106,6 @@ def is_image_extension_allowed_by_filename(filename: str) -> bool: | |||
return any(filename.lower().endswith(ext) for ext in SUPPORTED_IMAGE_EXTENSIONS) | |||
|
|||
|
|||
@deprecation.deprecated(deprecated_in="0.8.4", current_version=__version__) | |||
def is_image_extension_allowed(extension: str) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used when pulling exports
removed_in="0.8.0", | ||
current_version=__version__, | ||
details="The api_url parameter will be removed.", | ||
) | ||
def download_all_images_from_annotations( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the deprecation warning points to a api_url
parameter:
details="The api_url parameter will be removed.",
If we're removing the deprecation warning - Should we remove the api_url
parameter from the function download_all_images_from_annotations
?
I found a PR #335 which introduced the deprecation 👀
darwin/exporter/exporter.py
Outdated
@@ -36,6 +36,7 @@ def darwin_to_dt_gen( | |||
for d in split_video_annotation(data): | |||
d.seq = count | |||
count += 1 | |||
print(count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the print for debugging? 🤔
darwin/exporter/formats/coco.py
Outdated
@@ -401,7 +82,7 @@ def _calculate_tag_categories( | |||
categories[annotation_class.name] = _calculate_category_id( | |||
annotation_class | |||
) | |||
return categories | |||
return dict(sorted(categories.items(), key=lambda item: item[1])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for fixing this!! 🥇
We could also replace the lambda function here and use python's operator.itemgetter (it can be a little faster than lambda function, especially for large dictionaries), by something like:
from operator import itemgetter # importing it
- return dict(sorted(categories.items(), key=lambda item: item[1]))
+ return dict(sorted(categories.items(), key=itemgetter(1)))
darwin/exporter/formats/coco.py
Outdated
@@ -476,6 +157,7 @@ def _build_annotations( | |||
) -> Iterator[Optional[Dict[str, Any]]]: | |||
annotation_id = 0 | |||
for annotation_file in annotation_files: | |||
print(annotation_file.filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the print for debugging? 🤔
darwin/importer/importer.py
Outdated
removed_in="0.8.0", | ||
current_version=__version__, | ||
details=DEPRECATION_MESSAGE, | ||
) | ||
def build_main_annotations_lookup_table( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what was the goal behind deprecating this? 🤔
Ah found it in this PR e034b37#diff-c59e878e89aaf69bda12258e348fc25d69750c2841667a84da53f2d4f5bf60feR26-R28
Should we make it private instead by prefixing it with _
?
darwin/importer/importer.py
Outdated
removed_in="0.8.0", | ||
current_version=__version__, | ||
details=DEPRECATION_MESSAGE, | ||
) | ||
def find_and_parse( # noqa: C901 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RE:
I wonder what was the goal behind deprecating this? 🤔
Ah found it in this PR e034b37#diff-c59e878e89aaf69bda12258e348fc25d69750c2841667a84da53f2d4f5bf60feR26-R28
Should we make it private instead by prefixing it with _
?
darwin/importer/importer.py
Outdated
removed_in="0.8.0", | ||
current_version=__version__, | ||
details=DEPRECATION_MESSAGE, | ||
) | ||
def build_attribute_lookup(dataset: "RemoteDataset") -> Dict[str, Unknown]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RE:
I wonder what was the goal behind deprecating this? 🤔
Ah found it in this PR e034b37#diff-c59e878e89aaf69bda12258e348fc25d69750c2841667a84da53f2d4f5bf60feR26-R28
Should we make it private instead by prefixing it with _
?
darwin/importer/importer.py
Outdated
removed_in="0.8.0", | ||
current_version=__version__, | ||
details=DEPRECATION_MESSAGE, | ||
) | ||
def get_remote_files( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RE:
I wonder what was the goal behind deprecating this? 🤔
Ah found it in this PR e034b37#diff-c59e878e89aaf69bda12258e348fc25d69750c2841667a84da53f2d4f5bf60feR26-R28
Should we make it private instead by prefixing it with _
?
Co-authored-by: saurbhc <sc@saurabhchopra.co.uk>
@saurbhc I've actioned all points of feedback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you 🙏
Problem
This PR solves one main problem (1) but includes lots of clean up and refactoring of old V1 code and concepts as well (2 & 3). It's a very large PR, so I've annotated it heavily to help with reviewing it
only_keyframes
field in the JSON to reflect this. Unfortunately, darwin-py import logic was built on the assumption that every frame has a main annotation type, which this change violates, breaking these importscomplex_polygon
annotation type. In V1, we stored polygons as complex polygons if they had >1 path. We removed this distinction in V2, instead referring to all polygons as typepolygon
. Therefore we should update darwin-py to reflect thisSolution
only_keyframes
set to True, get the first frame with a main type. Store this type & annotation data. Then, iterate through each frame of the annotation. If the main type is missing, use the stored value to populate it for import. Otherwise if the main type is present, update the stored annotation data. Repeat this for the entire annotationpolygon
to reflect how we store polygons in our DBChangelog