[IO-1445][external] Changes to LocalDataset() & get_annotations() to account for local releases pulled with folders #678
… pulled with folders
IO-1445 BUG: Classification model failed
BUG submission from: Patryk Gronostajski. A potential customer (trial) contacted us saying that their classification model failed. I tried running one in their team and it failed as well. They want to use the "Formal" and "Informal" tags, and all items in the dataset.
Failing tests, need to investigate
"Merge master into branch due to divergent flow"
… JSON in some scenarios
LGTM. Revisions seem fine with the QA and testing done.
Problem
When local releases are pulled with folders, constructing LocalDataset objects and calling get_annotations() both fail because annotation file names do not match item names. A more in-depth breakdown is available in the IO-1445 Linear ticket.
Solution
Instead of comparing JSON file names against item names, parse each JSON file so that every item is matched against its correct local path. Fully parsing large JSON files significantly increases runtime, so LocalDataset and get_annotations() now read them with a streaming library to keep that cost down.
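
To illustrate the idea, here is a minimal sketch of matching items by the path recorded inside each annotation file rather than by the file's name. It is not the code from this PR: the helper names (`item_relative_path`, `build_index`, `demo`) are hypothetical, the `{"item": {"name": ..., "path": ...}}` layout is an assumption about the annotation JSON, and the standard-library `json.load` stands in for the streaming parser, so it does not show the runtime benefit.

```python
import json
import tempfile
from pathlib import Path, PurePosixPath


def item_relative_path(annotation_file: Path) -> PurePosixPath:
    """Return the folder-aware item path recorded inside an annotation file."""
    with annotation_file.open(encoding="utf-8") as fh:
        # Stand-in for the streaming parser: this reads the whole file.
        data = json.load(fh)
    # Assumed layout: {"item": {"name": "cat.jpg", "path": "/dir1"}, ...}
    item = data["item"]
    folder = item.get("path", "/").strip("/")
    return PurePosixPath(folder) / item["name"]


def build_index(annotations_dir: Path) -> dict:
    """Map each item's relative path to the annotation file describing it."""
    return {
        item_relative_path(path): path
        for path in sorted(annotations_dir.rglob("*.json"))
    }


def demo() -> dict:
    """Build an index for a release whose file name differs from its item name."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "dir1").mkdir()
        # The file stem ("a1") says nothing about the item ("dir1/cat.jpg").
        payload = {"item": {"name": "cat.jpg", "path": "/dir1"}, "annotations": []}
        (root / "dir1" / "a1.json").write_text(json.dumps(payload), encoding="utf-8")
        return {str(key): value.name for key, value in build_index(root).items()}
```

Calling `demo()` yields a lookup keyed by the item's folder-aware path, so a file whose name differs from its item name still resolves to the right item.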
Changelog