[DAR-1679][External] Made darwin-py ignore the .v7/metadata.json properties manifest when reading JSON annotation files #823
Since we're doing the same
darwin/dataset/local_dataset.py
return (
    str(e)
    for e in sorted(annotations_dir.glob("**/*.json"))
    if "/.v7/" not in str(e)
if "/.v7/" not in str(e)
Do we need this part here? 🤔 Isn't glob("**/*.json") fetching only the JSON files anyway?
I think we do, sadly, and I can reproduce this by instantiating a LocalDataset. The ** in glob() recursively searches the annotations directory. Since that directory will now contain .v7/metadata.json, glob() picks this file up unless we tell it to ignore file paths containing /.v7/
The code I used to reproduce this is below. Before running it, a dataset release has to be pulled using 0.8.59 so that the metadata file is present in the annotations directory:
from pathlib import Path

from darwin.dataset import LocalDataset

my_dataset = LocalDataset(
    dataset_path=Path("/path/to/dataset"),
    annotation_type="bounding_box",
)
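The glob behaviour described in this thread can also be checked without darwin-py or a pulled release. The following is only a minimal sketch using the standard library: the directory layout, the file names 1.json and nested/2.json, and the helpers build_demo_tree and list_json_files are made up for illustration and are not part of darwin-py.

```python
import tempfile
from pathlib import Path
from typing import List


def build_demo_tree(root: Path) -> Path:
    """Create a fake annotations directory that contains a .v7/metadata.json manifest."""
    annotations_dir = root / "annotations"
    (annotations_dir / ".v7").mkdir(parents=True)
    (annotations_dir / "nested").mkdir()
    for relative in ("1.json", "nested/2.json", ".v7/metadata.json"):
        (annotations_dir / relative).write_text("{}")
    return annotations_dir


def list_json_files(annotations_dir: Path, skip_manifest: bool) -> List[str]:
    """List JSON files relative to annotations_dir, optionally skipping anything under .v7/."""
    found = []
    for e in sorted(annotations_dir.glob("**/*.json")):
        # as_posix() yields forward slashes, so the "/.v7/" check works on any OS
        if skip_manifest and "/.v7/" in e.as_posix():
            continue
        found.append(e.relative_to(annotations_dir).as_posix())
    return found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_dir = build_demo_tree(Path(tmp))
        print("without filter:", list_json_files(demo_dir, skip_manifest=False))
        print("with filter:   ", list_json_files(demo_dir, skip_manifest=True))
```

Without the filter, the recursive pattern also returns the manifest under .v7/, because pathlib's glob() does not skip dot-directories. With the filter, only the annotation files remain.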
Problem
In a previous bug fix (DAR-1609), it was discovered that pulling a dataset release did not include the properties metadata file when one needed to be downloaded. This was fixed in that ticket. However, several functions in darwin-py assume that every JSON file in the annotations directory is an annotation file. Once a properties metadata file is present, this is no longer true, and those functions run into issues.
Solution
Ignore the .v7/metadata.json properties manifest in each case
Changelog
Made darwin-py ignore the .v7/metadata.json properties manifest when reading JSON annotation files
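One possible variation on the "/.v7/" substring check, offered as a suggestion and not as something this PR does: str() of a path uses backslashes on Windows, so a substring test written with forward slashes would not match there. A sketch that compares path components instead is shown below. The helper name is_inside_v7_dir and the example paths are hypothetical.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def is_inside_v7_dir(path: PurePath) -> bool:
    """Return True if any directory component of the path is exactly '.v7' (hypothetical helper)."""
    return ".v7" in path.parent.parts


if __name__ == "__main__":
    posix_manifest = PurePosixPath("/data/annotations/.v7/metadata.json")
    windows_manifest = PureWindowsPath(r"C:\data\annotations\.v7\metadata.json")
    annotation = PurePosixPath("/data/annotations/1.json")

    for candidate in (posix_manifest, windows_manifest, annotation):
        print(candidate, is_inside_v7_dir(candidate))
```

Used as a filter, this would read: if not is_inside_v7_dir(e). It behaves the same regardless of the path separator, at the cost of a small helper.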