Awesome Video Domain Adaptation
This repo is a comprehensive collection of awesome research (papers, code, etc.) and other resources on video domain adaptation.
Our comprehensive survey on Video Unsupervised Domain Adaptation with Deep Learning is now available. Please check our paper on arXiv.
- 2024/09 This survey on Video Unsupervised Domain Adaptation has been accepted and published by ACM Computing Surveys (ACM CSUR, IF 23.8)! Congrats to all the collaborators! The paper is available with Open Access! Click this DOI for details.
Domain adaptation has long been a focus of transfer learning research: it improves model robustness, which is crucial for applying models in real-world settings. Despite the long history of domain adaptation research, there has been limited discussion of video domain adaptation. This repo aims to present a collection of research on video domain adaptation, including papers, code, etc.
Feel free to star, fork, or raise an issue to include your research or to add more categories! Discussion is most welcome!
Contents
<!-- - [Survey](#survey) --> <!-- - [Multi-Target VDA](#multi-target-vda) --> <!-- - [Universal VDA](#universal-vda) --> <!-- - [Zero-shot or Few-shot VDA](#zero-shot-or-few-shot-vda) --> <!-- - [Black-box VDA](#black-box-vda) --> <!-- - [Active VDA](#active-vda) -->
Explanatory Notes
This repository categorizes video domain adaptation (VDA) papers by domain adaptation scenario (closed-set, partial-set, source-free, etc.), sorted by date of publication or first public appearance. The list covers semi-supervised, weakly-supervised, and unsupervised DA. By default, VDA research focuses on action recognition. Entries that address other tasks are annotated with the corresponding task.
Note: This repository is inspired by the ADA repository, a repository with awesome domain adaptation papers. For more research on domain adaptation (with images, point clouds, etc.), you may check out that repository.
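Several of the scenario names used below (closed-set, partial-set, open-set) differ mainly in how the source and target label sets relate. The snippet is a minimal sketch of that distinction, following common usage in the DA literature; individual papers may define these settings slightly differently. The function name `label_space_scenario` and the class names are hypothetical placeholders, not taken from any listed paper or dataset.

```python
def label_space_scenario(source_classes, target_classes):
    """Name the scenario implied by two label sets (illustrative only)."""
    source, target = set(source_classes), set(target_classes)
    if source == target:
        # Both domains share exactly the same classes.
        return "closed-set"
    if target < source:
        # Target classes are a strict subset of the source classes.
        return "partial-set"
    if source < target:
        # Target contains classes that never appear in the source.
        return "open-set"
    # Each domain has classes the other lacks.
    return "mixed"


# Placeholder class names, chosen only for illustration.
print(label_space_scenario(["run", "jump"], ["jump", "run"]))   # closed-set
print(label_space_scenario(["run", "jump", "walk"], ["run"]))   # partial-set
print(label_space_scenario(["run"], ["run", "climb"]))          # open-set
```

Other categories in this list, such as source-free or multi-source VDA, describe which data is available during adaptation rather than the label-set relation, so they are not covered by this sketch.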
Papers
Closed-set VDA
Conference
- Unsupervised Video Domain Adaptation: A Disentanglement Perspective Conference on Neural Information Processing Systems (NeurIPS) (2023) [Code-PyTorch] [Project Page]
- Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances IEEE International Conference on Robotics and Automation (ICRA) (2023) [Project Page] [ArXiv]
- Recur, Attend or Convolve? On Whether Temporal Modeling Matters for Cross-Domain Robustness in Action Recognition IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2023)
- Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2023) (Video Action Detection)
- Domain Adaptive Video Semantic Segmentation via Cross-Domain Moving Object Mixing IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2023) (Video Semantic Segmentation)
- Domain Adaptive Video Segmentation via Temporal Pseudo Supervision European Conference on Computer Vision (ECCV) (2022) [Code-PyTorch] (Video Segmentation)
- Audio-Adaptive Activity Recognition Across Video Domains IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2022) [Code-PyTorch] [Project Page]
- Interact before Align: Leveraging Cross-Modal Knowledge for Domain Adaptive Action Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2022) [Project Page]
- Multi-Level Attentive Adversarial Learning With Temporal Dilation for Unsupervised Video Domain Adaptation IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2022) [Code-PyTorch]
- Dual-Head Contrastive Domain Adaptation for Video Action Recognition IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2022) [Code-PyTorch]
- Contrast and mix: Temporal contrastive video domain adaptation with background mixing Conference on Neural Information Processing Systems (NeurIPS) (2021) [Code-PyTorch] [Project Page]
- Learning Cross-Modal Contrastive Features for Video Domain Adaptation IEEE/CVF International Conference on Computer Vision (ICCV) (2021)
- Domain Adaptive Video Segmentation via Temporal Consistency Regularization IEEE/CVF International Conference on Computer Vision (ICCV) (2021) [Code-PyTorch] (Video Segmentation)
- Unsupervised Curriculum Domain Adaptation for No-Reference Video Quality Assessment IEEE/CVF International Conference on Computer Vision (ICCV) (2021) [Code-PyTorch] (Video Quality Assessment (VQA))
- Spatio-temporal Contrastive Domain Adaptation for Action Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2021)
- Unsupervised Domain Adaptation for Spatio-Temporal Action Localization British Machine Vision Conference (BMVC) (2020) (Video Action Localization)
- Adversarial Bipartite Graph Learning for Video Domain Adaptation ACM Multimedia (ACM MM) (2020) [Code-PyTorch]
- Shuffle and Attend: Video Domain Adaptation European Conference on Computer Vision (ECCV) (2020)
- Multi-Modal Domain Adaptation for Fine-Grained Action Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2020) Oral [Code-TensorFlow] [Project Page]
- Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2020) [Code-PyTorch] [Project Page] (Action Segmentation)
- Transferring Cross-domain Knowledge for Video Sign Language Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2020) Oral (Video Sign Language Recognition)
- Adversarial Cross-Domain Action Recognition with Co-Attention AAAI Conference on Artificial Intelligence (AAAI) (2020)
- Generative Adversarial Networks for Video-to-Video Domain Adaptation AAAI Conference on Artificial Intelligence (AAAI) (2020) (Video Translation (Generation))
- Action Segmentation with Mixed Temporal Domain Adaptation IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2020) (Action Segmentation)
- Unsupervised and Semi-Supervised Domain Adaptation for Action Recognition from Drones IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2020)
- Temporal Attentive Alignment for Large-Scale Video Domain Adaptation IEEE/CVF International Conference on Computer Vision (ICCV) (2019) Oral [Code-PyTorch] [Project Page]
- Actor and Observer: Joint Modeling of First and Third-Person Videos IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2018) [Code-PyTorch]
- Deep Domain Adaptation in Action Space British Machine Vision Conference (BMVC) (2018)
Journal
- Aligning Correlation Information for Domain Adaptation in Action Recognition IEEE Transactions on Neural Networks and Learning Systems, Early Access (2022) [Project Page]
- Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100 International Journal of Computer Vision, Volume 130, Pages 33–55 (2022) (Open Access)
- Dynamic video mix-up for cross-domain action recognition Neurocomputing, Volume 471, Pages 358-368 (2022)
- A Novel Multiple-View Adversarial Learning Network for Unsupervised Domain Adaptation Action Recognition IEEE Transactions on Cybernetics, Early Access (2021)
- A Pairwise Attentive Adversarial Spatiotemporal Network for Cross-Domain Few-Shot Action Recognition-R2 IEEE Transactions on Image Processing, Volume 30, Pages 767-782 (2020)
- Pairwise Two-Stream ConvNets for Cross-Domain Action Recognition With Small Data IEEE Transactions on Neural Networks and Learning Systems, Volume 33, Pages 1147-1161 (2020)
- Evaluation of local spatial–temporal features for cross-view action recognition Neurocomputing, Volume 173, Pages 110-117 (2016)
ArXiv and Workshops
- Memory Efficient Temporal & Visual Graph Model for Unsupervised Video Domain Adaptation ArXiv 2208.06554
- Channel-Temporal Attention for First-Person Video Domain Adaptation ArXiv 2108.07846v2
- Unsupervised Domain Adaptation for Video Semantic Segmentation ArXiv 2107.11052
- Temporal Attentive Alignment for Video Domain Adaptation IEEE/CVF Computer Vision and Pattern Recognition Conference Workshop (CVPRW) (2019) [Code-PyTorch] (Highly related to TA3N)
Partial-set VDA
Conference
- Partial Video Domain Adaptation With Partial Adversarial Temporal Attentive Network IEEE/CVF International Conference on Computer Vision (ICCV) (2021) Oral [Code-PyTorch] [Project Page]
- Calibrating Class Weights with Multi-Modal Information for Partial Video Domain Adaptation ACM Multimedia (ACM MM) (2022)
Open-set VDA
Conference
- Conditional Extreme Value Theory for Open Set Video Domain Adaptation ACM Multimedia Asia (MMAsia) (2021) [Code-PyTorch]
- Dual Metric Discriminator for Open Set Video Domain Adaptation IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2021)
Journal
- Open Set Domain Adaptation for Image and Action Recognition IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 42, Issue 2 (2018)
Multi-Source VDA
ArXiv and Workshops
- Multi-Source Video Domain Adaptation with Temporal Attentive Moment Alignment ArXiv 2109.09964 [Project Page]
Source-Free or Test-time VDA
Conference
- The Unreasonable Effectiveness of Large Language-Vision Models for Source-free Video Domain Adaptation IEEE/CVF International Conference on Computer Vision (ICCV) (2023) [Code-PyTorch]
- Video Test-Time Adaptation for Action Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2023) [Code-PyTorch] [Supplementary]
- Source-Free Video Domain Adaptation with Spatial-Temporal-Historical Consistency Learning IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2023)
- Overcoming Label Noise for Source-free Unsupervised Video Domain Adaptation Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP) (2022)
- Source-free Video Domain Adaptation by Learning Temporal Consistency for Action Recognition European Conference on Computer Vision (ECCV) (2022) [Code-PyTorch] [Project Page]
- Self-supervised Test-time Adaptation on Video Data IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2022)
- Relative Alignment Network for Source-Free Multimodal Video Domain Adaptation ACM Multimedia (ACM MM) (2022)
Target-Free VDA
Conference
- Cross-Domain Video Anomaly Detection without Target Domain Adaptation IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2023) (Video Anomaly Detection)
Few-shot VDA
Conference
- Augmenting and Aligning Snippets for Few-Shot Video Domain Adaptation IEEE/CVF International Conference on Computer Vision (ICCV) (2023)
Continual VDA
Conference
- Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation IEEE/CVF International Conference on Computer Vision (ICCV) (2023)
ArXiv and Workshops
- Confidence Attention and Generalization Enhanced Distillation for Continuous Video Domain Adaptation ArXiv 2303.10452
Zero-shot VDA (Video Domain Generalization)
Conference
- Domain Generalization through Audio-Visual Relative Norm Alignment in First Person Action Recognition IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2022)
Journal
- VideoDG: generalizing temporal relations in videos to novel domains IEEE Transactions on Pattern Analysis and Machine Intelligence (2021)
Multi-Modal VDA
The different modalities are listed for each listing.
Conference
- Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation (Modalities: RGB + 3D Point Cloud) IEEE/CVF International Conference on Computer Vision (ICCV) (2023)
Other Topics in Video Transfer Learning
Conference
- CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video European Conference on Computer Vision (ECCV) (2022)
- Benchmarking the robustness of Spatial-Temporal Models The Conference on Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track (2021) [Code-TensorFlow] (Video Robustness)
- Spatial-temporal causal inference for partial image-to-video adaptation AAAI Conference on Artificial Intelligence (AAAI) (2021) [Code-PyTorch] (Partial-Set Image-to-Video)
- Image to Video Domain Adaptation Using Web Supervision IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2020) (Image-to-Video)
- DistInit: Learning Video Representations Without a Single Labeled Video IEEE/CVF International Conference on Computer Vision (ICCV) (2019) (Image-to-Video)
Journal
- Multi-Domain and Multi-Task Learning for Human Action Recognition IEEE Transactions on Image Processing, Volume 28, Pages 853-867 (2019)
- Deep Image-to-Video Adaptation and Fusion Networks for Action Recognition IEEE Transactions on Image Processing, Volume 29, Pages 3168-3182 (2019) [Project Page] (Image-to-Video)
ArXiv
- Recur, Attend or Convolve? On Whether Temporal Modeling Matters for Cross-Domain Robustness in Action Recognition ArXiv 2112.12175
Datasets and Benchmarks
We collect relevant datasets designed for video domain adaptation. By default, datasets are designed for closed-set video domain adaptation and address action recognition. Note that downloading some datasets may require permission. You are advised to download the common action recognition datasets (e.g., HMDB51, UCF101, and Kinetics), which are commonly used in these cross-domain video datasets.
2024
2023
2021-2022
- Sports-DA
- Daily-DA
- HMDB-ARID (The Full ARID (v1.0))
- ActorShift
- Mixamo→Kinetics
- UCF-HMDB <sub>partial</sub> (Partial-set)
- HMDB-ARID <sub>partial</sub> (Partial-set)
- MiniKinetics-UCF (Partial-set)
- VIPER→Cityscapes-Seq (Video Segmentation)
- SYNTHIA-Seq→Cityscapes-Seq (Video Segmentation)
2018-2020
- Epic-Kitchens (Epic-Kitchens Splits) (The Full Epic-Kitchens)
- UCF-HMDB <sub>full</sub>
- Kinetics-Gameplay (Permission Needed)
- Charades-Ego
- Kinetics→NEC-Drone
Before 2015
Useful Tools and Other Resources
Challenges for Video Domain Adaptation
Note: These are the latest editions of the respective challenges. Please check previous editions through their respective websites.
- 5<sup>th</sup> UG2<sup>+</sup> Prize Challenge: Bridging the Gap Between Computational Photography and Visual Recognition (Track 2) IEEE/CVF Computer Vision and Pattern Recognition Conference Workshop (CVPRW) (2022) 2022 UG2+ Workshop
- Epic-Kitchens-100 2022 Challenge: Domain Adaptation for Action Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference Workshop (CVPRW) (2022) EPIC@CVPR2022 Workshop