Synthetic and real video dataset with temporal logic annotation
Explore the docs »
NSVS-TL Project Webpage
·
NSVS-TL Source Code
The Temporal Logic Video (TLV) Dataset addresses the scarcity of video datasets annotated for long-horizon, temporally extended activity and object detection. It comprises two main components:
- Synthetic datasets: Generated by concatenating static images from established computer vision datasets (COCO and ImageNet), allowing for the introduction of a wide range of Temporal Logic (TL) specifications.
- Real-world datasets: Based on open-source autonomous vehicle (AV) driving datasets, specifically NuScenes and Waymo.
- Dataset Composition
- Dataset (Release)
- Installation
- Usage
- Data Generation
- Contribution Guidelines
- License
- Acknowledgments
Synthetic dataset:
- Source: COCO and ImageNet
- Purpose: Introduce artificial Temporal Logic specifications
- Generation method: Image stitching from static datasets
Real-world dataset:
- Sources: NuScenes and Waymo
- Purpose: Provide real-world autonomous vehicle scenarios
- Annotation: Temporal Logic specifications added to existing data
Although we provide source code to generate datasets from different types of data sources, we release dataset v1 as a proof of concept. The data is offered as serialized objects, each containing a set of frames with annotations. You can download the dataset from our dataset repository on Hugging Face.
`<tlv_data_type>:source:<datasource>-number_of_frames:<number_of_frames>-<uuid>.pkl`
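This naming convention can be unpacked programmatically. The sketch below is illustrative (the parser and the example filename are ours, not part of the released tooling):

```python
import re

def parse_tlv_filename(name: str) -> dict:
    """Parse the TLV naming convention:
    <tlv_data_type>:source:<datasource>-number_of_frames:<n>-<uuid>.pkl
    """
    pattern = (
        r"(?P<tlv_data_type>[^:]+):source:(?P<datasource>[^-]+)"
        r"-number_of_frames:(?P<number_of_frames>\d+)"
        r"-(?P<uuid>.+)\.pkl"
    )
    m = re.match(pattern, name)
    if m is None:
        raise ValueError(f"not a TLV filename: {name}")
    info = m.groupdict()
    info["number_of_frames"] = int(info["number_of_frames"])
    return info

# Hypothetical example filename following the convention above:
info = parse_tlv_filename("Fprop1:source:coco-number_of_frames:25-0b3f2d1c.pkl")
```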
Each serialized object contains the following attributes:
- `ground_truth`: Boolean indicating whether the dataset contains ground-truth labels
- `ltl_formula`: Temporal logic formula applied to the dataset
- `proposition`: Set of propositions for `ltl_formula`
- `number_of_frame`: Total number of frames in the dataset
- `frames_of_interest`: Frames of interest that satisfy `ltl_formula`
- `labels_of_frames`: Labels for each frame
- `images_of_frames`: Image data for each frame
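As a sanity check of the format, these fields can be round-tripped with Python's `pickle`. This sketch uses a dictionary with illustrative values; the released objects may expose the same names as object attributes rather than dictionary keys:

```python
import os
import pickle
import tempfile

# Illustrative record mirroring the annotation fields listed above.
record = {
    "ground_truth": True,
    "ltl_formula": "F prop1",
    "proposition": ["prop1"],
    "number_of_frame": 25,
    "frames_of_interest": [3, 4, 5],
    "labels_of_frames": ["prop1"] * 25,
    "images_of_frames": [],
}

path = os.path.join(tempfile.mkdtemp(), "demo.pkl")
with open(path, "wb") as f:
    pickle.dump(record, f)

# Loading mirrors how a downloaded .pkl file would be read.
with open(path, "rb") as f:
    data = pickle.load(f)
```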
You can download the dataset from our Hugging Face repository. The structure of the dataset is as follows:
tlv-dataset-v1/
├── tlv_real_dataset/
├──── prop1Uprop2/
├──── (prop1&prop2)Uprop3/
├── tlv_synthetic_dataset/
├──── Fprop1/
├──── Gprop1/
├──── prop1&prop2/
├──── prop1Uprop2/
└──── (prop1&prop2)Uprop3/
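Given this layout, the released files can be indexed by their TL-specification folder. A small sketch using only the standard library (the helper name is ours):

```python
from pathlib import Path

def list_tlv_files(root: str) -> dict[str, list[str]]:
    """Map each TL-spec folder (e.g. 'Fprop1') to its .pkl file names."""
    files: dict[str, list[str]] = {}
    for pkl in sorted(Path(root).rglob("*.pkl")):
        files.setdefault(pkl.parent.name, []).append(pkl.name)
    return files
```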
- Total number of frames

| Ground-truth TL specification | COCO (synthetic) | ImageNet (synthetic) | Waymo (real) | NuScenes (real) |
|---|---|---|---|---|
| Eventually Event A | - | 15,750 | - | - |
| Always Event A | - | 15,750 | - | - |
| Event A And Event B | 31,500 | - | - | - |
| Event A Until Event B | 15,750 | 15,750 | 8,736 | 19,808 |
| (Event A And Event B) Until Event C | 5,789 | - | 7,459 | 7,459 |
- Total number of datasets

| Ground-truth TL specification | COCO (synthetic) | ImageNet (synthetic) | Waymo (real) | NuScenes (real) |
|---|---|---|---|---|
| Eventually Event A | - | 60 | - | - |
| Always Event A | - | 60 | - | - |
| Event A And Event B | 120 | - | - | - |
| Event A Until Event B | 60 | 60 | 45 | 494 |
| (Event A And Event B) Until Event C | 97 | - | 30 | 186 |
```shell
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build
python -m pip install --editable ".[dev,test]"
```
- ImageNet (ILSVRC 2017):

  ILSVRC/
  ├── Annotations/
  ├── Data/
  ├── ImageSets/
  └── LOC_synset_mapping.txt

- COCO (2017):

  COCO/
  └── 2017/
      ├── annotations/
      ├── train2017/
      └── val2017/
Detailed usage instructions for data loading and processing.
- `data_root_dir`: Root directory of the dataset
- `mapping_to`: Label mapping scheme (default: "coco")
- `save_dir`: Output directory for processed data
- `initial_number_of_frame`: Starting frame count per video
- `max_number_frame`: Maximum frame count per video
- `number_video_per_set_of_frame`: Number of videos to generate per frame-count setting
- `increase_rate`: Frame-count increment rate
- `ltl_logic`: Temporal Logic specification (e.g., "F prop1", "G prop1")
- `save_images`: Boolean flag for saving individual frames
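Taken together, `initial_number_of_frame`, `max_number_frame`, and `increase_rate` define a sweep over video lengths, with `number_video_per_set_of_frame` videos generated at each length. The helper below is an illustrative reconstruction of that schedule, not code from the repository:

```python
def frame_count_schedule(initial_number_of_frame: int,
                         max_number_frame: int,
                         increase_rate: int) -> list[int]:
    """Frame counts swept during generation (illustrative helper)."""
    counts = []
    n = initial_number_of_frame
    while n <= max_number_frame:
        counts.append(n)
        n += increase_rate
    return counts

# e.g. start at 25 frames and grow by 25 up to 100:
schedule = frame_count_schedule(25, 100, 25)  # → [25, 50, 75, 100]
```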
```shell
python3 run_scripts/run_synthetic_tlv_coco.py --data_root_dir "../COCO/2017" --save_dir "<output_dir>"
python3 run_synthetic_tlv_imagenet.py --data_root_dir "../ILSVRC" --save_dir "<output_dir>"
```
Note: the ImageNet generator does not support the '&' (conjunction) operator in LTL formulae.
This project is licensed under the MIT License. See the LICENSE file for details.
If you find this repo useful, please cite our paper:
@inproceedings{Choi_2024_ECCV,
author={Choi, Minkyu and Goel, Harsh and Omama, Mohammad and Yang, Yunhao and Shah, Sahil and Chinchali, Sandeep},
title={Towards Neuro-Symbolic Video Understanding},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
month={September},
year={2024}
}