# Awesome Visual-Language Tracking Paper List

## Papers

### NeurIPS 2024
- MemVLT: Xiaokun Feng, Xuchen Li, Shiyu Hu, Dailing Zhang, Meiqi Wu, Jing Zhang, Xiaotang Chen, Kaiqi Huang<br> "MemVLT: Visual-Language Tracking with Adaptive Memory-based Prompts" NeurIPS 2024
### CVPR 2024
- OneTracker: Lingyi Hong, Shilin Yan, Renrui Zhang, Wanyun Li, Xinyu Zhou, Pinxue Guo, Kaixun Jiang, Yiting Cheng, Jinglun Li, Zhaoyu Chen, Wenqiang Zhang<br> "OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning" CVPR 2024<br> [paper]
- QueryNLT: Yanyan Shao, Shuting He, Qi Ye, Yuchao Feng, Wenhan Luo, Jiming Chen<br> "Context-Aware Integration of Language and Visual References for Natural Language Tracking" CVPR 2024<br> [paper] <br> [code]
### ECCV 2024
- Elysium: Han Wang, Yanjie Wang, Yongjie Ye, Yuxiang Nie, Can Huang<br> "Elysium: Exploring Object-level Perception in Videos via MLLM" ECCV 2024<br> [paper] <br> [code]
### AAAI 2024
- UVLTrack: Yinchao Ma, Yuyang Tang, Wenfei Yang, Tianzhu Zhang, Jinpeng Zhang, Mengxue Kang<br> "Unifying Visual and Vision-Language Tracking via Contrastive Learning" AAAI 2024<br> [paper] <br> [code]
### ACM MM 2024
- ATTrack: Jiawei Ge, Jiuxin Cao, Xuelin Zhu, Xinyu Zhang, Chang Liu, Kun Wang, Bo Liu<br> "Consistencies are All You Need for Semi-supervised Vision-Language Tracking" ACM MM 2024<br> [paper]
### IJCAI 2024
- DMTrack: Guangtong Zhang, Bineng Zhong, Qihua Liang, Zhiyi Mo, Shuxiang Song<br> "Diffusion Mask-Driven Visual-language Tracking" IJCAI 2024<br> [paper]
### TCSVT 2024
- OSDT: Guangtong Zhang, Bineng Zhong, Qihua Liang, Zhiyi Mo, Ning Li, Shuxiang Song<br> "One-Stream Stepwise Decreasing for Vision-Language Tracking" TCSVT 2024<br> [paper]
### ICASSP 2024
- TTCTrack: Zhongjie Mao, Yucheng Wang, Xi Chen, Jia Yan<br> "Textual Tokens Classification for Multi-Modal Alignment in Vision-Language Tracking" ICASSP 2024<br> [paper]
### CVPRW 2024
- DTLLM-VLT: Xuchen Li, Xiaokun Feng, Shiyu Hu, Meiqi Wu, Dailing Zhang, Jing Zhang, Kaiqi Huang<br> "DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM" CVPRW 2024<br> [paper]
### ArXiv 2024
- SATracker: Jiawei Ge, Xiangmei Chen, Jiuxin Cao, Xuelin Zhu, Weijia Liu, Bo Liu<br> "Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking" ArXiv 2024<br> [paper]
- VLT-MI: Xuchen Li, Shiyu Hu, Xiaokun Feng, Dailing Zhang, Meiqi Wu, Jing Zhang, Kaiqi Huang<br> "Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark" ArXiv 2024<br> [paper]
### CVPR 2023
- JointNLT: Li Zhou, Zikun Zhou, Kaige Mao, Zhenyu He<br> "Joint Visual Grounding and Tracking with Natural Language Specification" CVPR 2023 <br> [paper]<br> [code]
### ICCV 2023
- DecoupleTNL: Ding Ma, Xiangqian Wu<br> "Tracking by Natural Language Specification with Long Short-term Context Decoupling" ICCV 2023<br> [paper]
### NeurIPS 2023
- MGIT: Shiyu Hu, Dailing Zhang, Meiqi Wu, Xiaokun Feng, Xuchen Li, Xin Zhao, Kaiqi Huang<br> "A Multi-modal Global Instance Tracking Benchmark (MGIT): Better Locating Target in Complex Spatio-temporal and Causal Relationship" NeurIPS 2023<br> [paper]<br> [platform]
### ACM MM 2023
- All in One: Chunhui Zhang, Xin Sun, Li Liu, Yiqian Yang, Qiong Liu, Xi Zhou, Yanfeng Wang <br> "All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment" ACM MM 2023<br> [paper] <br> [code]
### TMM 2023
- OVLM: Huanlong Zhang, Jingchao Wang, Jianwei Zhang, Tianzhu Zhang, Bineng Zhong <br> "One-stream Vision-Language Memory Network for Object Tracking" TMM 2023<br> [paper]
### TCSVT 2023
- MMTrack: Yaozong Zheng, Bineng Zhong, Qihua Liang, Guorong Li, Rongrong Ji, Xianxian Li<br> "Towards Unified Token Learning for Vision-Language Tracking" TCSVT 2023<br> [paper] <br> [code]
- TransNLT: Rong Wang, Zongheng Tang, Qianli Zhou, Xiaoqian Liu, Tianrui Hui, Quange Tan, Si Liu<br> "Unified Transformer With Isomorphic Branches for Natural Language Tracking" TCSVT 2023<br> [paper]
### PRL 2023
- TransVLT: Haojie Zhao, Xiao Wang, Dong Wang, Huchuan Lu, Xiang Ruan<br> "Transformer vision-language tracking via proxy token guided cross-modal fusion" PRL 2023<br> [paper]
### ArXiv 2023
- VLT_OST: Mingzhe Guo, Zhipeng Zhang, Liping Jing, Haibin Ling, Heng Fan<br> "Divert More Attention to Vision-Language Object Tracking" ArXiv 2023<br> [paper]<br> [code]
### NeurIPS 2022
- VLT_TT: Mingzhe Guo, Zhipeng Zhang, Heng Fan, Liping Jing<br> "Divert More Attention to Vision-Language Tracking" NeurIPS 2022<br> [paper] <br> [code]
### CVPRW 2022
- CTRTNL: Yihao Li, Jun Yu, Zhongpeng Cai, Yuwen Pan<br> "Cross-Modal Target Retrieval for Tracking by Natural Language" CVPRW 2022<br> [paper]
### CVPR 2021
- SNLT: Qi Feng, Vitaly Ablavsky, Qinxun Bai, Stan Sclaroff<br> "Siamese Natural Language Tracker: Tracking by Natural Language Descriptions With Siamese Trackers" CVPR 2021<br> [paper]<br> [code]
- TNL2K: Xiao Wang, Xiujun Shu, Zhipeng Zhang, Bo Jiang, Yaowei Wang, Yonghong Tian, Feng Wu<br> "Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark" CVPR 2021<br> [paper]<br> [platform]
### ACM MM 2021
- CapsuleTNL: Ding Ma, Xiangqian Wu<br> "Capsule-based Object Tracking with Natural Language Specification" ACM MM 2021<br> [paper]
### TCSVT 2021
- GTI: Zhengyuan Yang, Tushar Kumar, Tianlang Chen, Jinsong Su, Jiebo Luo<br> "Grounding-Tracking-Integration" TCSVT 2021<br> [paper]
### WACV 2020
- RTTNLD: Qi Feng, Vitaly Ablavsky, Qinxun Bai, Guorong Li, Stan Sclaroff<br> "Real-time Visual Object Tracking with Natural Language Description" WACV 2020<br> [paper]
### ArXiv 2019
- NLRPN: Qi Feng, Vitaly Ablavsky, Qinxun Bai, Stan Sclaroff<br> "Robust Visual Object Tracking with Natural Language Region Proposal Network" ArXiv 2019<br> [paper]
### ArXiv 2018
- DAT: Xiao Wang, Chenglong Li, Rui Yang, Tianzhu Zhang, Jin Tang, Bin Luo<br> "Describe and Attend to Track: Learning Natural Language guided Structural Representation and Visual Attention for Object Tracking" ArXiv 2018<br> [paper]