[NeurIPS’23] A Multi-modal Global Instance Tracking Benchmark (MGIT): Better Locating Target in Complex Spatio-temporal and Causal Relationship

VideoCube & MGIT Python Toolkit

UPDATE:
[2024.04.01] Update several missing files for the MGIT train set, based on issue #6.
[2023.12.14] We have recently proposed a new multi-modal global instance tracking benchmark named MGIT, which consists of 150 long video sequences (the same sequences as VideoCube-Tiny, with additional semantic information). This toolkit has been updated to support MGIT; you can find a demo in test_mgit.py.
[2023.02.08] To make it easier for users to do research with VideoCube, we have selected 150 representative sequences from the original version (500 sequences) to form VideoCube-Tiny. This toolkit has been updated to support VideoCube-Tiny; a demo for selecting the tiny or full version is provided in test_videocube.py.
[2022.06.20] Update the SiamFC based on issue #2 and issue #3.
[2022.04.25] Update the time calculation method based on issue #1.
[2022.03.03] Update the toolkit installation, dataset download instructions, and a concise example. The basic functions of this toolkit are now complete.

This repository contains the official Python toolkit for running experiments and evaluating performance on the VideoCube and MGIT benchmarks. The code is written in pure Python and is compile-free.

VideoCube is a high-quality, large-scale benchmark that creates a challenging real-world experimental environment for the Global Instance Tracking (GIT) task. If you use the VideoCube database or toolkits for a research publication, please consider citing:

@ARTICLE{9720246,
  author={Hu, Shiyu and Zhao, Xin and Huang, Lianghua and Huang, Kaiqi},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={Global Instance Tracking: Locating Target More Like Humans}, 
  year={2023},
  volume={45},
  number={1},
  pages={576-592},
  doi={10.1109/TPAMI.2022.3153312}}

 [Project][PDF]

MGIT is a multi-modal global instance tracking benchmark designed to fully represent the complex spatio-temporal and causal relationships coupled in longer narrative content. If you use the MGIT database or toolkits for a research publication, please consider citing:

@inproceedings{hu2023multi,
  title={A Multi-modal Global Instance Tracking Benchmark (MGIT): Better Locating Target in Complex Spatio-temporal and Causal Relationship},
  author={Hu, Shiyu and Zhang, Dailing and Wu, Meiqi and Feng, Xiaokun and Li, Xuchen and Zhao, Xin and Huang, Kaiqi},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2023}
}

 [Project][PDF]

Table of Contents

  • Toolkit Installation
  • Dataset Download
  • A Concise Example
  • Issues
  • Contributors

Toolkit Installation

Clone the repository and install dependencies:

git clone https://github.com/huuuuusy/videocube-toolkit.git
pip install -r requirements.txt

Then copy the videocube folder directly into your workspace to use it.
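
To quickly confirm the toolkit is reachable from your workspace, the following minimal sketch only checks that the package imports used later in this README resolve (run it from the directory containing the copied videocube folder):

# quick import check; these are the same imports used by the examples below
from videocube.trackers import Tracker
from videocube.experiments import ExperimentVideoCube

print('videocube toolkit imported successfully')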

Dataset Download

Please see the Download page on the project website.

The VideoCube dataset includes 500 sequences, divided into three subsets (train/val/test). The content distribution in each subset still follows the 6D principle proposed in the GIT paper.

The MGIT dataset includes 150 sequences, divided into three subsets (train/val/test). All sequences in MGIT are the same as those in VideoCube-Tiny, but with additional semantic information.

The dataset download and file organization process is as follows:

  • Download the three subsets (train/val/test) and the info data via the Download page on the project website.

  • Note that we have released a tiny version named VideoCube-Tiny. The original full version includes 500 sequences (1.4 TB), while the tiny version includes 150 sequences (344 GB). Based on VideoCube-Tiny, MGIT includes 150 sequences (344 GB) as well.

  • Check the number of files in each subset and run the unzipping script (a small verification sketch for this check is given after the folder layouts below). Before unzipping:

    • the train subset of the full version should include 456 files (455 data files and an unzip_train bash script); the train subset of the tiny version should include 129 files (128 data files and an unzip_train bash script)

    • the val subset of the full version should include 69 files (68 data files and an unzip_val bash script); the val subset of the tiny version should include 22 files (21 data files and an unzip_val bash script)

    • the test subset of the full version should include 140 files (139 data files and an unzip_test bash script); the test subset of the tiny version should include 41 files (40 data files and an unzip_test bash script)

  • Run the unzipping script in each subset folder, and delete the script after decompression.

  • Taking the val subset of the full version as an example, the folder structure should be:

|-- val/
|  |-- 005/
|  |  |-- frame_005/
|  |  |  |-- 000000.jpg
|  |  |      ......
|  |  |  |-- 016891.jpg
|  |-- 008/
|  |   ......
|  |-- 486/
|  |-- 493/
  • Unzip attribute.zip in the info data. Note that we only provide property files for the train and val subsets. For ground-truth files, we only offer a small number of annotations (restart frames) for sequences in the test subset; please upload your final results to the server for evaluation.

  • Rename and organize the folders as follows (this example illustrates the full version; the tiny version has a similar structure):

|-- VideoCube/
|  |-- data/
|  |  |-- train/
|  |  |  |-- 002/
|  |  |  |   ......
|  |  |  |-- 499/
|  |  |-- val/
|  |  |  |-- 005/
|  |  |  |   ......
|  |  |  |-- 493/
|  |  |-- test/
|  |  |  |-- 001/
|  |  |  |   ......
|  |  |  |-- 500/
|  |  |-- train_list.txt
|  |  |-- val_list.txt
|  |  |-- test_list.txt
|  |-- attribute/
|  |  |-- absent/
|  |  |-- color_constancy_tran/
|  |  |   ......
|  |  |-- shotcut/

|-- MGIT/
|  |-- data/
|  |  |-- train/
|  |  |  |-- 002/
|  |  |  |   ......
|  |  |  |-- 480/
|  |  |-- val/
|  |  |  |-- 005/
|  |  |  |   ......
|  |  |  |-- 362/
|  |  |-- test/
|  |  |  |-- 001/
|  |  |  |   ......
|  |  |  |-- 498/
|  |-- attribute/
|  |  |-- absent/
|  |  |-- color_constancy_tran/
|  |  |   ......
|  |  |-- shotcut/
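
As referenced in the checklist above, here is a minimal sketch for sanity-checking the file counts of a downloaded subset before unzipping. It assumes only the counts listed earlier; the path below is a hypothetical example and may need adjusting to your download location:

import os

# expected number of files per subset before unzipping (data files + unzip script),
# taken from the counts listed above
EXPECTED_COUNTS = {
    'full': {'train': 456, 'val': 69, 'test': 140},
    'tiny': {'train': 129, 'val': 22, 'test': 41},
}

def check_subset(subset_dir, subset, version='full'):
    # count everything in the downloaded (not yet unzipped) subset folder
    n_files = len(os.listdir(subset_dir))
    expected = EXPECTED_COUNTS[version][subset]
    status = 'OK' if n_files == expected else 'mismatch, please re-check the download'
    print(f'{subset} ({version}): {n_files} files, expected {expected} -> {status}')

# example usage (hypothetical path):
# check_subset('/path_to_downloads/val', 'val', version='full')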

A Concise Example

VideoCube

test_videocube.py is a simple example of how to use the toolkit to define a tracker, run experiments on VideoCube, and evaluate performance.

How to Define a Tracker?

To define a tracker using the toolkit, simply inherit the Tracker class and override its init and update methods. You can find an example on this page. Here is a simple example:

from videocube.trackers import Tracker

class IdentityTracker(Tracker):
    def __init__(self):
        super(IdentityTracker, self).__init__(
            name='IdentityTracker',  # tracker name
        )

    def init(self, image, box):
        # store the initial bounding box of the target
        self.box = box

    def update(self, image):
        # always return the initial box (an identity tracker never updates it)
        return self.box

How to Run Experiments on VideoCube?

Instantiate an ExperimentVideoCube object, and leave all experiment pipelines to its run method:

import os

from videocube.experiments import ExperimentVideoCube

# ... tracker definition ...

# instantiate a tracker
tracker = IdentityTracker()

# setup experiment (validation subset)
root_dir = 'SOT/VideoCube'  # VideoCube's root directory
experiment = ExperimentVideoCube(
  root_dir=root_dir,
  save_dir=os.path.join(root_dir, 'result'),  # path to save the experiment results
  subset='val',  # 'train' | 'val' | 'test'
  repetition=1,
  version='tiny'  # set the version to 'tiny' or 'full'
)
experiment.run(
  tracker,
  visualize=False,
  save_img=False,
  method='restart'  # evaluation mechanism: None for the original mode, 'restart' for the restart (R-OPE) mode
)

How to Evaluate Performance?

For evaluation with the OPE mechanism, please use the report method of ExperimentVideoCube:

# ... run experiments on VideoCube ...

# report tracking performance
experiment.report([tracker.name], attribute_name)
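
Here, attribute_name selects the attribute the report is computed for. As a hedged example (the value 'shotcut' is only an assumption, matching one of the attribute folders in the dataset layout above; check the toolkit source for the accepted values):

# hypothetical attribute choice; 'shotcut' is one of the folders under attribute/
attribute_name = 'shotcut'
experiment.report([tracker.name], attribute_name)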

For evaluation with the R-OPE mechanism, please use the report and report_robust methods of ExperimentVideoCube:

# ... run experiments on VideoCube ...

# report tracking performance
experiment.report([tracker.name], attribute_name)
experiment.report_robust([tracker.name])

Note that when evaluating on the test subset, you must submit your results to the evaluation server. The report function will generate a .zip file that can be directly uploaded for submission. For more instructions, see the submission instruction.

See public evaluation results on VideoCube's leaderboard (OPE Mechanism) and VideoCube's leaderboard (R-OPE Mechanism).

MGIT

test_mgit.py is a simple example of how to use the toolkit to convert the tracking results of algorithms built on the pytracking framework (e.g., JointNLT) into MGIT format and evaluate performance.

How to Convert Tracking Results?

Most existing multi-modal tracking algorithms (e.g., JointNLT) are built on the pytracking framework, and you can easily combine MGIT with them for training and testing with the help of mgit.py, without spending much time refactoring the algorithm for MGIT. To reduce development effort, researchers can convert the MGIT results of any algorithm built on the pytracking framework into a format that is easy to evaluate.

Instantiate an ExperimentMGIT object, and leave all experiment pipelines to its convert_results method:

import os

from mgit.experiments import ExperimentMGIT

# ... path setting ...

# setup experiment (validation subset)
dataset_dir = '/path_to_MGIT'           # MGIT's root directory
root_dir = '/path_to_original_results'  # path of the original tracker results, e.g. JointNLT
tracker_name = 'JointNLT'               # tracker name
original_results_folder = 'original_results'  # original results folder name (example placeholder)
experiment = ExperimentMGIT(
  dataset_dir=dataset_dir,
  save_dir=os.path.join(root_dir, 'result'),  # path to save the converted results
  subset='val',  # 'train' | 'val' | 'test'
  repetition=1,
  version='tiny'  # temporarily, the toolkit only supports the tiny version of MGIT
)
# the original results are read from "root_dir/tracker_name/original_results_folder",
# and the converted results are saved to "root_dir/result"
experiment.convert_results(
  root_dir=root_dir,
  tracker_name=tracker_name,
  original_results_folder=original_results_folder
)

How to Evaluate Performance?

For evaluation with the OPE mechanism, please use the report method of ExperimentMGIT:

# ... convert tracking results to MGIT format ...

# report tracking performance
experiment.report([tracker.name], attribute_name)

Note that when evaluating on the test subset, you must submit your results to the evaluation server. The report function will generate a .zip file that can be directly uploaded for submission. For more instructions, see the submission instruction.

See public evaluation results on MGIT's leaderboard (OPE Mechanism with Tiny Version, Action Granularity), MGIT's leaderboard (OPE Mechanism with Tiny Version, Activity Granularity) and MGIT's leaderboard (OPE Mechanism with Tiny Version, Story Granularity).

Issues

Please report any problems or suggestions on the Issues page.

Contributors
