A Comprehensive Survey for Multi-Modal Visual Object Tracking

<p align="center"> <img src="https://github.com/Zhangyong-Tang/MultiModal-Visual-Object-tracking/blob/main/mindmap-1114.png" width="800"> </p>

We appreciate any effort committed to the development of Multi-Modal Visual Object Tracking. Please feel free to contact us (zhangyong_tang_jnu@163.com, tianyang.xu@jiangnan.edu.cn) to discuss the survey or to point out works we have missed.

💥: Update Log

🏃: Find Results Here

:point_right: : Our contributions to the MMVOT community

:punch: : Find what you want quickly

RGBT Tracking Papers

RGBD Tracking Papers

RGBE Tracking Papers

RGBL Tracking Papers

RGBNIR Tracking Papers

RGBS Tracking Papers

RGB+Multi Tracking Papers

:punch: : Datasets and Benchmarks

RGB-Multi Datasets

| Dataset | Publish | GitHub | Introduction |
| --- | --- | --- | --- |
| UniMod1K | IJCV'2024 | UniMod1K | RGB+D+L: UniMod1K: Towards a More Universal Large-Scale Dataset and Benchmark for Multi-modal Learning |
| WebUAV-3M | TPAMI'2023 | WebUAV-3M | RGB+L+Audio: WebUAV-3M: A Benchmark for Unveiling the Power of Million-Scale Deep UAV Tracking |

RGBT Datasets

| Dataset | Publish | GitHub | Introduction |
| --- | --- | --- | --- |
| MV-RGBT | Arxiv'2024 | MV-RGBT | Revisiting RGBT Tracking Benchmarks from the Perspective of Modality Validity: A New Benchmark, Problem, and Method |
| VTUAV | CVPR'2022 | VTUAV | Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline |
| LasHeR | TIP'2021 | LasHeR | LasHeR: A Large-scale High-diversity Benchmark for RGBT Tracking |
| VOT-RGBT20 | VOT Community: 2020 | VOT-RGBT2020 CODE:TZYD | The Eighth Visual Object Tracking VOT2020 Challenge Results |
| VOT-RGBT19 | VOT Community: 2019 | VOT-RGBT2019 CODE:TZYD | The Seventh Visual Object Tracking VOT2019 Challenge Results |
| RGBT234 | PR'2018 | RGBT234 | RGB-T object tracking: Benchmark and baseline |
| RGBT210 | ACM MM'2017 | RGBT210 | Weighted Sparse Representation Regularized Graph Learning for RGB-T Object Tracking |
| GTOT | TIP'2016 | GTOT | Learning Collaborative Sparse Representation for Grayscale-Thermal Tracking |
| LITIV | CVIU'2012 | LITIV | An iterative integrated framework for thermal–visible image registration, sensor fusion, and people tracking for video surveillance applications |
| OTCBVS | CVIU'2007 | OTCBVS | Background-subtraction using contour-based fusion of thermal and visible imagery |
| LSS-Dataset (from RGB) | TCSVT'2021 | LSS-Dataset, code(Ye5Q) | SiamCDA: Complementarity- and distractor-aware RGB-T tracking based on Siamese network |
| LSS-Dataset (from TIR) | TCSVT'2021 | LSS-Dataset, code(IHws) | SiamCDA: Complementarity- and distractor-aware RGB-T tracking based on Siamese network |

RGBD Datasets

| Dataset | Publish | GitHub | Introduction |
| --- | --- | --- | --- |
| D2CUBE | CVPR'2023 | D2CUBE | Resource-Efficient RGBD Aerial Tracking |
| ARKitTrack | CVPR'2023 | ARKitTrack | ARKitTrack: A New Diverse Dataset for Tracking Using Mobile RGB-D Data |
| RGBD1K | AAAI'2023 | RGBD1K | RGBD1K: A Large-Scale Dataset and Benchmark for RGB-D Object Tracking |
| VOT-RGBD2022 | VOT Community | VOT-RGBD2022 | The Tenth Visual Object Tracking VOT2022 Challenge Results |
| DepthTrack | ICCV'2021 | DepthTrack | DepthTrack: Unveiling the Power of RGBD Tracking |
| CDTB | ICCV'2019 | CDTB | CDTB: A Color and Depth Visual Object Tracking Dataset and Benchmark |
| STC | ICCV'2019 | STC code:TZYD | Robust Fusion of Color and Depth Data for RGB-D Target Tracking Using Adaptive Range-Invariant Depth Models and Spatio-Temporal Consistency Constraints |
| PTB | ICCV'2013 | PTB | Tracking Revisited using RGBD Camera: Unified Benchmark and Baselines |
| BoBoT | - | BoBoT | BoBot - Bonn benchmark on tracking |

RGBE Datasets

| Dataset | Publish | GitHub | Introduction |
| --- | --- | --- | --- |
| CRSOT | Arxiv'2024 | CRSOT | CRSOT: Cross-Resolution Object Tracking using Unaligned Frame and Event Cameras |
| FELT | Arxiv'2024 | FELT | Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline |
| COESOT | Arxiv'2022 | COESOT | Revisiting Color-Event based Tracking: A Unified Network, Dataset, and Metric |
| VisEvent | TCYB'2023 | VisEvent | VisEvent: Reliable Object Tracking via Collaboration of Frame and Event Flows |
| FE108/FE240hz | ICCV'2021 | FE108/FE240hz | Object Tracking by Jointly Exploiting Frame and Event Domain |
| EED | Arxiv'2018 | EED | Event-based Moving Object Detection and Tracking |

RGBL Datasets

| Dataset | Publish | GitHub | Introduction |
| --- | --- | --- | --- |
| DTVLT | Arxiv'2024 | DTVLT | DTVLT: A Multi-Modal Diverse Text Benchmark For Visual Language Tracking Based On LLM |
| VLT-MI | Arxiv'2024 | VLT-MI | Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark |
| ElysiumTrack-1M | Arxiv'2024 | ElysiumTrack-1M | Elysium: Exploring Object-level Perception in Videos via MLLM |
| WebUOT-1M | Arxiv'2024 | WebUOT-1M | WebUOT-1M: Advancing Deep Underwater Object Tracking with A Million-Scale Benchmark |
| VastTrack | Arxiv'2024 | VastTrack | VastTrack: Vast Category Visual Object Tracking |
| MGIT | NIPS'2023 | MGIT | A Multi-modal Global Instance Tracking Benchmark (MGIT): Better Locating Target in Complex Spatio-temporal and Causal Relationship |
| TNL2K | CVPR'2021 | TNL2K | Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark |
| LaSOT_EXT | IJCV'2021 | LaSOT_EXT | LaSOT: A High-quality Large-scale Single Object Tracking Benchmark |
| LaSOT | CVPR'2019 | LaSOT | LaSOT: A High-quality Benchmark for Large-scale Single Object Tracking |
| OTB99-L | CVPR'2017 | OTB99-L | Tracking by Natural Language Specification |

RGBNIR Datasets

| Dataset | Publish | GitHub | Introduction |
| --- | --- | --- | --- |
| CMOTB | TNNLS'2024 | CMOTB | Cross-Modal Object Tracking via Modality-Aware Fusion Network and A Large-Scale Dataset |

RGBS Datasets

| Dataset | Publish | GitHub | Introduction |
| --- | --- | --- | --- |
| RGBS50 | Arxiv'2024 | RGBS50 | RGB-Sonar Tracking Benchmark and Spatial Cross-Attention Transformer Tracker |

:star2: : Surveys and Reports

:star: : Regular Papers

Unified (Model or Architecture for) Multi-Modal Tracking

RGB-T Tracking

2024

2023

2022

2021

2020

2019

2018

2017

2016

2012

2011

2008

2007

2006

RGB-D Tracking

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

RGB-E Tracking

2024

2023

2022

2021

RGB-L Tracking

2024

2023

2022

2021

2020

2019

2017

RGB-NIR Tracking

RGB-S Tracking

RGB-Multi Tracking

🥇 Competition

  1. AntiUAV 1st: The first AntiUAV challenge, held in 2020, was a multi-modal challenge. Its website has since been replaced by that of the latest edition. The winner was team xiaobaibai, whose members include Tianyang Xu.
  2. VOT-RGBT2019: The first competition in the RGBT tracking community, held in 2019. The winner was SiamDW.
  3. VOT-RGBT2020: A competition in the RGBT tracking community, held in 2020. The winner was DFAT, whose members include Zhangyong Tang.

⚓ Awesome Repositories

🧑‍⚕️ Acknowledgements

This work is supported by PRCI-Lab, an outstanding and fast-developing group. More information is available on its home page.

🫰 Questions

If you have any questions, please feel free to open an issue or contact me at zhangyong_tang_jnu@163.com (WeChat: Tzy18861871359 is also welcome).