-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Export dataset in CVAT format misses frames in tasks with non-default… #8662
Conversation
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis update to the CVAT (Computer Vision Annotation Tool) addresses issues related to dataset exports, specifically ensuring that all frames are included when a non-default frame step is used. Key modifications include improvements in the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant ExportFunction
participant CacheManager
participant FrameProvider
User->>ExportFunction: Request dataset export
ExportFunction->>FrameProvider: Retrieve frames
FrameProvider->>FrameProvider: Iterate frames without skipping
FrameProvider-->>ExportFunction: Return frames
ExportFunction->>CacheManager: Manage export cache
CacheManager-->>ExportFunction: Handle cache errors
ExportFunction-->>User: Return exported dataset
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
cvat/apps/dataset_manager/views.py (2)
Line range hint
213-214
: Enhance the FileIsBeingUsedError exception.While introducing a custom exception is good practice, consider making it more informative by:
- Adding a docstring explaining when this exception is raised
- Including file details in the error message
Consider this enhancement:
class FileIsBeingUsedError(Exception): - pass + """Raised when attempting to clear an export cache file that is still in use.""" + def __init__(self, file_path: str, ttl: timedelta): + self.file_path = file_path + self.ttl = ttl + super().__init__( + f"Export cache file '{file_path}' is still in use. " + f"Will be available for cleanup after {ttl}" + )
Line range hint
31-33
: Consider addressing the TODO comment regarding RQ scheduler.The comment indicates a potential race condition where two jobs with the same name could run in parallel during retries. Consider:
- Implementing proper retries using RQ's built-in scheduler
- Adding a unique suffix to retried job IDs to prevent parallel execution
Would you like me to help design a more robust retry mechanism that prevents this race condition?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
changelog.d/20241107_165701_sekachev.bs_fixed_export.md
(1 hunks)cvat/apps/dataset_manager/views.py
(1 hunks)cvat/apps/engine/frame_provider.py
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- changelog.d/20241107_165701_sekachev.bs_fixed_export.md
🔇 Additional comments (2)
cvat/apps/dataset_manager/views.py (1)
125-126
: Efficient database query optimization.
The change to use values_list('updated_date', flat=True)
instead of fetching full task instances is a good optimization that reduces memory usage and database load.
Let's verify if this change fully addresses the frame step issue mentioned in the PR objectives:
cvat/apps/engine/frame_provider.py (1)
372-372
: LGTM! The change fixes frame skipping during iteration.
The modification ensures that all frames are processed during iteration, addressing the issue of missing frames in exports with non-default frame steps.
Let's verify that this change doesn't affect other functionality that might depend on frame steps:
✅ Verification successful
Frame step is still properly handled in other code paths
The verification shows that frame steps are correctly maintained throughout the codebase:
- Frame step is used for absolute/relative frame number conversions
- Frame ranges in segments and jobs respect the step
- Frame step is properly applied in data import/export paths
- The change in
TaskFrameProvider.iterate_frames()
only affects the iteration order, while the frame mapping logic remains intact
The removal of frame step from itertools.count()
is safe as it only changes how frames are iterated internally within the provider, while all frame number translations still respect the configured step.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other references to frame steps to ensure they're not affected
rg "get_frame_step" -A 5
Length of output: 16932
Co-authored-by: Maxim Zhiltsov <zhiltsov.max35@gmail.com>
@@ -369,7 +369,7 @@ def iterate_frames( | |||
quality: FrameQuality = FrameQuality.ORIGINAL, | |||
out_type: FrameOutputType = FrameOutputType.BUFFER, | |||
) -> Iterator[DataWithMeta[AnyFrame]]: | |||
frame_range = itertools.count(start_frame, self._db_task.data.get_frame_step()) | |||
frame_range = itertools.count(start_frame) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, db_segment_frame_set
below has to be converted to relative ids. There is the get_rel_frame_number()
method for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converted idx
to absolute instead as it done in another iterate_frames
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, please call dev/format_python_code.sh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not to add it to git pre-commit hook?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem with determining the right interpreter in a cross-platform manner.
It would be nice to add a test for this to avoid regressions in future. |
Yes, it would be. Hovewer I do not have enough time. Let's keep the card on the agile board. |
Quality Gate passedIssues Measures |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #8662 +/- ##
========================================
Coverage 74.24% 74.25%
========================================
Files 401 401
Lines 43465 43465
Branches 3950 3950
========================================
+ Hits 32270 32273 +3
+ Misses 11195 11192 -3
|
… frame step
Motivation and context
How has this been tested?
Checklist
develop
branch(cvat-canvas,
cvat-core,
cvat-data and
cvat-ui)
License
Feel free to contact the maintainers if that's a concern.
Summary by CodeRabbit
Bug Fixes
Improvements
Changes