Depth Attention
The official implementation of the ACCV 2024 paper "Depth Attention for Robust RGB Tracking"
Yu Liu, Arif Mahmood, Muhammad Haris Khan
Abstract
RGB video object tracking is a fundamental task in computer vision. Its effectiveness can be improved using depth information, particularly for handling motion-blurred targets. However, depth information is often missing in commonly used tracking benchmarks. In this work, we propose a new framework that leverages monocular depth estimation to counter the challenges of tracking targets that are out of view or affected by motion blur in RGB video sequences.
News
:fire::fire::fire:
2024-10 :tada: Our new challenging dataset NT-VOT211 is available now! <sub>Click the link on the right :rewind: to access our full tutorial for benchmarking on this new dataset.</sub>
:fire::fire::fire:
Qualitative Results
Demo
Setting Up Our Algorithm
Environment Setup
To reproduce the results reported in the paper, it is important to match the software environment closely. Please configure your system with the following versions:
- Python: Version 3.8.10
- PyTorch: Version 1.11.0, built with CUDA 11.3 support
- CUDA: Version 11.3
- NumPy: Version 1.22.3
- OpenCV: Version 4.8.0
Using these versions keeps your setup consistent with the experimental configuration described in the paper.
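As a quick sanity check, the snippet below prints the installed versions so you can compare them against the list above. It is a minimal sketch and only assumes the standard torch, numpy, and opencv-python packages:

```python
# Print installed versions to compare against the versions listed above.
import sys

import cv2
import numpy as np
import torch

print("Python :", sys.version.split()[0])    # expect 3.8.10
print("PyTorch:", torch.__version__)          # expect 1.11.0 (cu113 build)
print("CUDA   :", torch.version.cuda)         # expect 11.3
print("NumPy  :", np.__version__)             # expect 1.22.3
print("OpenCV :", cv2.__version__)            # expect 4.8.0
print("CUDA available:", torch.cuda.is_available())
```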
Pretrained Model Download
To reproduce the results reported in our paper, use a pretrained depth estimator that we have tested and validated with our algorithm. The following models are compatible and effective:
- Lite-Mono: This is the primary pretrained model we have used in our research. You can download it from the Lite-Mono GitHub repository.
- FastDepth: In addition to Lite-Mono, we have also confirmed that the FastDepth model can be used with our algorithm.
- Monodepth2: Another option that has been tested is the Monodepth2 model.
We recommend starting with the Lite-Mono model, as it has been extensively used in our experiments.
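After downloading a checkpoint, it is worth verifying that it loads cleanly before wiring it into a tracker. The snippet below is an illustrative sketch: the directory layout and file names (weights/lite-mono/encoder.pth and depth.pth) are assumptions following the Monodepth2-style convention, and the actual network classes and loading API come from the depth estimator's own repository.

```python
# Illustrative sketch of checking a downloaded monocular-depth checkpoint.
# File names and paths below are assumptions; consult the depth estimator's
# own README for the exact model classes and loading procedure.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

encoder_weights = torch.load("weights/lite-mono/encoder.pth", map_location=device)
decoder_weights = torch.load("weights/lite-mono/depth.pth", map_location=device)

# Inspect a few checkpoint keys to confirm the download is intact before
# plugging the estimator into the tracking pipeline.
print(sorted(encoder_weights.keys())[:5])
print(sorted(decoder_weights.keys())[:5])
```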
Setting Up the Trackers
We evaluate our approach with a diverse set of baseline trackers, each with its own strengths:
- RTS: A segmentation-based tracker that represents the target with a segmentation mask rather than a bounding box alone. Dive deeper
- AiATrack: A transformer tracker built around an attention-in-attention module. Discover more
- ARTrack: An autoregressive tracker that predicts target coordinates as a sequence of tokens. Find out more
- KeepTrack: A distractor-aware tracker that associates target candidates across frames to keep track of the target. Get the details
- MixFormer: A transformer tracker built on iterative mixed attention for joint feature extraction and target integration. Check it out
- Neighbor: A tracker that leverages information from neighboring objects to improve association accuracy. Explore here
- ODTrack: A video-level tracker that propagates temporal tokens densely across frames. Learn about it
- STMTrack: A template-free tracker built on a space-time memory network. Read more
Together, these trackers cover a wide range of tracking tasks, settings, and scenarios.
To set up these trackers, please refer to the comprehensive tutorial.
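For intuition on how a monocular depth map can interact with these RGB trackers, the sketch below converts a depth prediction into an attention weight over a tracker's response map. This is only an illustration under assumed tensor shapes and a simple Gaussian depth-similarity weighting, not the exact formulation used in the paper; see the tutorial and the paper for the actual module.

```python
# Illustrative sketch only: one simple way to let a monocular depth map
# modulate a tracker's response map. Shapes and the Gaussian weighting are
# assumptions, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def depth_attention_weight(depth_map, target_depth, sigma=0.1):
    """Weight locations whose depth is close to the target's last known depth."""
    # depth_map: (H, W) predicted depth; target_depth: scalar from the previous frame.
    return torch.exp(-((depth_map - target_depth) ** 2) / (2 * sigma ** 2))

def fuse_response(score_map, depth_map, target_depth, alpha=0.5):
    """Blend the RGB tracker's score map with a depth-based attention map."""
    # Resize the depth map to the score-map resolution before weighting.
    depth_resized = F.interpolate(
        depth_map[None, None], size=score_map.shape[-2:], mode="bilinear",
        align_corners=False,
    )[0, 0]
    attn = depth_attention_weight(depth_resized, target_depth)
    return (1 - alpha) * score_map + alpha * score_map * attn

# Example with random tensors standing in for real tracker/depth outputs.
score = torch.rand(31, 31)        # baseline tracker response
depth = torch.rand(192, 640)      # monocular depth prediction
fused = fuse_response(score, depth, target_depth=torch.tensor(0.4))
print(fused.shape)  # torch.Size([31, 31])
```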
Citing Us
If you find our work valuable, we kindly ask you to consider citing our paper and starring ⭐ our repository. Our implementation includes multiple trackers, and we hope it makes life easier for the VOT and depth estimation research communities.
@inproceedings{liu2024depth,
title={Depth Attention for Robust RGB Tracking},
author={Yu Liu and Arif Mahmood and Muhammad Haris Khan},
booktitle={Proceedings of the Asian Conference on Computer Vision (ACCV)},
pages={to be announced},
year={2024},
organization={Springer}
}
Maintenance
Please open a GitHub issue for any help. If you have any questions regarding the technical details, feel free to contact us.