...Repository still under construction...

Predicting the Best of the N Visual Trackers

  • This repository provides the code and resources for our work on predicting the best of N trackers (BofN).

Structure of the Proposed BofN

Abstract

We observe that the performance of SOTA visual trackers varies surprisingly strongly across different video attributes and datasets: no single tracker remains the best performer across all tracking attributes and datasets. To bridge this gap, for a given video sequence, we predict the "Best of the N Trackers", called the BofN meta-tracker. At its core, a Tracking Performance Prediction Network (TP2N) selects a predicted best performing visual tracker for the given video sequence using only a few initial frames. We also introduce a frame-level BofN meta-tracker, which keeps predicting the best performer at regular temporal intervals. The TP2N is based on the self-supervised learning architectures MocoV2, SwAv, BT, and DINO; experiments show that DINO with a ViT-S backbone performs best. The video-level BofN meta-tracker outperforms, by a large margin, existing SOTA trackers on nine standard benchmarks: LaSOT, TrackingNet, GOT-10K, VOT2019, VOT2021, VOT2022, UAV123, OTB100, and WebUAV-3M. Further improvement is achieved by the frame-level BofN meta-tracker, which effectively handles variations in the tracking scenarios within long sequences. For instance, on GOT-10k, the BofN meta-tracker average overlap is 88.7% and 91.1% in the video- and frame-level settings respectively, while the best performing individual tracker, RTS, achieves 85.20% AO. On VOT2022, the BofN expected average overlap is 67.88% and 70.98% in the video- and frame-level settings, compared to 64.12% for the best performing ARTrack. This work also presents an extensive evaluation of competitive tracking methods on all commonly used benchmarks, following their protocols.

Methodology

Please find details in our paper, which can be accessed here.

This work utilizes the following trackers, among others:

ARDiMP | KeepTrack | STMTrack | TransT | ToMP | RTS | SparseTT
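
For intuition, the video-level BofN inference loop can be sketched as follows. This is a minimal illustration only; the names used here (tp2n, initialize, track) are assumptions for the sketch, not this repository's API.

def bofn_track(frames, tp2n, trackers, init_box, n_init=5):
    """Sketch: pick the predicted-best tracker from a few initial frames,
    then run it on the rest of the sequence (illustrative names only)."""
    scores = tp2n(frames[:n_init])        # TP2N scores each base tracker
    best = trackers[scores.argmax()]      # e.g. ARDiMP, KeepTrack, ...
    best.initialize(frames[0], init_box)  # usual tracker initialization on frame 0
    return [best.track(frame) for frame in frames[1:]]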

Results


Environment Setup

  1. Create the Python environment

conda create -y --name n_trackers python==3.7.16
conda activate n_trackers

  2. Install PyTorch and torchvision

pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

  3. Install the other required packages

pip install -r requirements.txt
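
Optionally, verify that the CUDA build installed correctly (a quick sanity check, not one of the repository's scripts):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"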

Training the Tracker Predictor

  1. The following datasets were used to train the tracker predictor:
  • GOT-10k train set
  • LaSOT train set
  • TrackingNet train set
  2. Download the datasets above and put them in the testing_datasets folder. For ease of training, some of the video folders have been renamed, especially for the TrackingNet dataset (see the Excel file below).

  3. The Excel file all_train_LASOT_GOT10k_TrackingNet_new.xlsx contains the tracking success rate of each tracker on the videos in the datasets. It is used to generate the predictor's training targets: for each video, the tracker with the highest success rate is labeled 1 and the others 0 (see the sketch after this list).

  4. Train the predictor with a ResNet backbone:

python classifier_all_data_1.py

  5. Train the predictor with a Vision Transformer (ViT) backbone:

python classifier_all_data_2.py
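
For reference, here is a minimal sketch of how such one-hot targets can be derived from the spreadsheet mentioned in step 3; the column layout assumed below is illustrative, so adjust it to the actual file.

import pandas as pd

# Assumed layout: one row per video, one success-rate column per tracker.
df = pd.read_excel("all_train_LASOT_GOT10k_TrackingNet_new.xlsx")
tracker_cols = ["ARDiMP", "KeepTrack", "STMTrack", "TransT", "ToMP", "RTS", "SparseTT"]

best = df[tracker_cols].idxmax(axis=1)  # best tracker per video
labels = pd.get_dummies(best).reindex(columns=tracker_cols, fill_value=0)  # 1 for best, 0 for the rest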

Testing/Tracking with the Predicted Tracker

  1. To track, first download the pre-trained models of the 7 base trackers from the links below and put them in the trained trackers folder.

    1. ARDiMP
    2. KeepTrack
    3. STMTrack
    4. TransT
    5. ToMP
    6. RTS
    7. SparseTT
  2. Then run the main_eval.py file; tracking results will be saved in the tracker_results folder.

python main_eval.py

NOTE: This will run the base trackers and also run the predicted best of them on the videos, using both the ResNet and ViT backbones.
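
For intuition, the frame-level setting re-predicts the best tracker at regular intervals instead of once per video. A minimal sketch, with the same illustrative names as above (not the script's actual API):

def bofn_track_frame_level(frames, tp2n, trackers, init_box, interval=50, n_ctx=5):
    """Sketch: re-select the best tracker every `interval` frames and
    hand the latest predicted box over to it (illustrative names only)."""
    current = trackers[tp2n(frames[:n_ctx]).argmax()]
    current.initialize(frames[0], init_box)
    box, results = init_box, []
    for i, frame in enumerate(frames[1:], start=1):
        if i % interval == 0:                               # periodic re-selection
            new = trackers[tp2n(frames[i - n_ctx:i]).argmax()]
            if new is not current:
                new.initialize(frame, box)                  # reinitialize with the last box
                current = new
        box = current.track(frame)
        results.append(box)
    return results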

Citation

If you find our work useful for your research, please consider citing:

@article{Alawode2024,
    archivePrefix = {arXiv},
    arxivId = {2407.15707},
    author = {Alawode, Basit and Javed, Sajid and Mahmood, Arif and Matas, Jiri},
    eprint = {2407.15707},
    number = {8},
    pages = {1--12},
    title = {{Predicting the Best of N Visual Trackers}},
    url = {http://arxiv.org/abs/2407.15707},
    volume = {14},
    year = {2024}
}
