Awesome Video Domain Adaptation

This repo is a comprehensive collection of awesome research (papers, code, etc.) and other resources on video domain adaptation.

Our comprehensive survey on Video Unsupervised Domain Adaptation with Deep Learning is now available. Please check our paper on arXiv.

2024/09: This survey on Video Unsupervised Domain Adaptation has been accepted and published in ACM Computing Surveys (ACM CSUR, IF 23.8)! Congrats to all the collaborators! The paper is available with Open Access! Click this DOI for details.

Domain adaptation has long been a focus of transfer learning research. It improves model robustness, which is crucial for applying models to real-world applications. Despite the long history of domain adaptation research, there has been limited discussion of video domain adaptation. This repo aims to present a collection of research on video domain adaptation, including papers, code, etc.

Feel free to star, fork, or raise an issue to include your research or to add more categories! Discussions are most welcome!

Contents
Explanatory Notes
Papers
Datasets and Benchmarks
Useful Tools and Other Resources

Explanatory Notes

This repository categorizes video domain adaptation papers according to the domain adaptation scenario (e.g., closed-set, partial-set, source-free), sorted by date of publication or first public appearance. The list covers semi-supervised, weakly-supervised, and unsupervised DA. By default, VDA research focuses on action recognition. For other tasks, the task is annotated on the individual entry.

Note: This repository is inspired by the ADA repository, a repository of awesome domain adaptation papers. For more research on domain adaptation (with images, point clouds, etc.), you may check out that repository.

Papers

Closed-set VDA

Conference

Unsupervised Video Domain Adaptation: A Disentanglement Perspective Conference on Neural Information Processing Systems (NeurIPS) (2023) [Code-PyTorch] [Project Page]
Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances IEEE International Conference on Robotics and Automation (ICRA) (2023) [Project Page] [ArXiv]
Recur, Attend or Convolve? On Whether Temporal Modeling Matters for Cross-Domain Robustness in Action Recognition IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2023)
Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2023) (Video Action Detection)
Domain Adaptive Video Semantic Segmentation via Cross-Domain Moving Object Mixing IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2023) (Video Semantic Segmentation)
Domain Adaptive Video Segmentation via Temporal Pseudo Supervision European Conference on Computer Vision (ECCV) (2022) [Code-PyTorch] (Video Segmentation)
Audio-Adaptive Activity Recognition Across Video Domains IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2022) [Code-PyTorch] [Project Page]
Interact before Align: Leveraging Cross-Modal Knowledge for Domain Adaptive Action Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2022) [Project Page]
Multi-Level Attentive Adversarial Learning With Temporal Dilation for Unsupervised Video Domain Adaptation IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2022) [Code-PyTorch]
Dual-Head Contrastive Domain Adaptation for Video Action Recognition IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2022) [Code-PyTorch]
Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing Conference on Neural Information Processing Systems (NeurIPS) (2021) [Code-PyTorch] [Project Page]
Learning Cross-Modal Contrastive Features for Video Domain Adaptation IEEE/CVF International Conference on Computer Vision (ICCV) (2021)
Domain Adaptive Video Segmentation via Temporal Consistency Regularization IEEE/CVF International Conference on Computer Vision (ICCV) (2021) [Code-PyTorch] (Video Segmentation)
Unsupervised Curriculum Domain Adaptation for No-Reference Video Quality Assessment IEEE/CVF International Conference on Computer Vision (ICCV) (2021) [Code-PyTorch] (Video Quality Assessment (VQA))
Spatio-temporal Contrastive Domain Adaptation for Action Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2021)
Unsupervised Domain Adaptation for Spatio-Temporal Action Localization British Machine Vision Conference (BMVC) (2020) (Video Action Localization)
Adversarial Bipartite Graph Learning for Video Domain Adaptation ACM Multimedia (ACM MM) (2020) [Code-PyTorch]
Shuffle and Attend: Video Domain Adaptation European Conference on Computer Vision (ECCV) (2020)
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2020) Oral [Code-TensorFlow] [Project Page]
Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2020) [Code-PyTorch] [Project Page] (Action Segmentation)
Transferring Cross-domain Knowledge for Video Sign Language Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2020) Oral (Video Sign Language Recognition)
Adversarial Cross-Domain Action Recognition with Co-Attention AAAI Conference on Artificial Intelligence (AAAI) (2020)
Generative Adversarial Networks for Video-to-Video Domain Adaptation AAAI Conference on Artificial Intelligence (AAAI) (2020) (Video Translation (Generation))
Action Segmentation with Mixed Temporal Domain Adaptation IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2020) (Action Segmentation)
Unsupervised and Semi-Supervised Domain Adaptation for Action Recognition from Drones IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2020)
Temporal Attentive Alignment for Large-Scale Video Domain Adaptation IEEE/CVF International Conference on Computer Vision (ICCV) (2019) Oral [Code-PyTorch] [Project Page]
Actor and Observer: Joint Modeling of First and Third-Person Videos IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2018) [Code-PyTorch]
Deep Domain Adaptation in Action Space British Machine Vision Conference (BMVC) (2018)

Journal

Aligning Correlation Information for Domain Adaptation in Action Recognition IEEE Transactions on Neural Networks and Learning Systems, Early Access (2022) [Project Page]
Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100 International Journal of Computer Vision, Volume 130, Pages 33–55 (2022) (Open Access)
Dynamic video mix-up for cross-domain action recognition Neurocomputing, Volume 471, Pages 358-368 (2022)
A Novel Multiple-View Adversarial Learning Network for Unsupervised Domain Adaptation Action Recognition IEEE Transactions on Cybernetics, Early Access (2021)
A Pairwise Attentive Adversarial Spatiotemporal Network for Cross-Domain Few-Shot Action Recognition-R2 IEEE Transactions on Image Processing, Volume 30, Pages 767-782 (2020)
Pairwise Two-Stream ConvNets for Cross-Domain Action Recognition With Small Data IEEE Transactions on Neural Networks and Learning Systems, Volume 33, Pages 1147-1161 (2020)
Evaluation of local spatial–temporal features for cross-view action recognition Neurocomputing, Volume 173, Pages 110-117 (2016)

ArXiv and Workshops

Memory Efficient Temporal & Visual Graph Model for Unsupervised Video Domain Adaptation ArXiv 2208.06554
Channel-Temporal Attention for First-Person Video Domain Adaptation ArXiv 2108.07846v2
Unsupervised Domain Adaptation for Video Semantic Segmentation ArXiv 2107.11052
Temporal Attentive Alignment for Video Domain Adaptation IEEE/CVF Computer Vision and Pattern Recognition Conference Workshop (CVPRW) (2019) [Code-PyTorch] (Highly related to TA3N)

Partial-set VDA

Conference

Partial Video Domain Adaptation With Partial Adversarial Temporal Attentive Network IEEE/CVF International Conference on Computer Vision (ICCV) (2021) Oral [Code-PyTorch] [Project Page]
Calibrating Class Weights with Multi-Modal Information for Partial Video Domain Adaptation ACM Multimedia (ACM MM) (2022)

Open-set VDA

Conference

Conditional Extreme Value Theory for Open Set Video Domain Adaptation ACM Multimedia Asia (MMAsia) (2021) [Code-PyTorch]
Dual Metric Discriminator for Open Set Video Domain Adaptation IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2021)

Journal

Open Set Domain Adaptation for Image and Action Recognition IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 42, Issue 2 (2018)

Multi-Source VDA

ArXiv and Workshops

Multi-Source Video Domain Adaptation with Temporal Attentive Moment Alignment ArXiv 2109.09964 [Project Page]

Source-Free or Test-time VDA

Conference

The Unreasonable Effectiveness of Large Language-Vision Models for Source-free Video Domain Adaptation IEEE/CVF International Conference on Computer Vision (ICCV) (2023) [Code-PyTorch]
Video Test-Time Adaptation for Action Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2023) [Code-PyTorch] [Supplementary]
Source-Free Video Domain Adaptation with Spatial-Temporal-Historical Consistency Learning IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) (2023)
Overcoming Label Noise for Source-free Unsupervised Video Domain Adaptation Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP) (2022)
Source-free Video Domain Adaptation by Learning Temporal Consistency for Action Recognition European Conference on Computer Vision (ECCV) (2022) [Code-PyTorch] [Project Page]
Self-supervised Test-time Adaptation on Video Data IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2022)
Relative Alignment Network for Source-Free Multimodal Video Domain Adaptation ACM Multimedia (ACM MM) (2022)

Target-Free VDA

Conference

Cross-Domain Video Anomaly Detection without Target Domain Adaptation IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2023) (Video Anomaly Detection)

Few-shot VDA

Conference

Augmenting and Aligning Snippets for Few-Shot Video Domain Adaptation IEEE/CVF International Conference on Computer Vision (ICCV) (2023)

Continual VDA

Conference

Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation IEEE/CVF International Conference on Computer Vision (ICCV) (2023)

ArXiv and Workshops

Confidence Attention and Generalization Enhanced Distillation for Continuous Video Domain Adaptation ArXiv 2303.10452

Zero-shot VDA (Video Domain Generalization)

Conference

Domain Generalization through Audio-Visual Relative Norm Alignment in First Person Action Recognition IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2022)

Journal

VideoDG: generalizing temporal relations in videos to novel domains IEEE Transactions on Pattern Analysis and Machine Intelligence (2021)

Multi-Modal VDA

The modalities used are listed for each entry.

Conference

Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation (Modalities: RGB + 3D Point Cloud) IEEE/CVF International Conference on Computer Vision (ICCV) (2023)

Datasets and Benchmarks

We collect relevant datasets designed for video domain adaptation. By default, datasets are designed for closed-set video domain adaptation for action recognition. Note that downloading some datasets may require permission. You are advised to download the common action recognition datasets (e.g., HMDB51, UCF101, Kinetics), which are commonly used to build these cross-domain video datasets.

2024

XOV-Action

2023

RoCoG-v2

2021-2022

Sports-DA
Daily-DA
HMDB-ARID (The Full ARID (v1.0))
ActorShift
Mixamo→Kinetics
UCF-HMDB partial (Partial-set)
HMDB-ARID partial (Partial-set)
MiniKinetics-UCF (Partial-set)
VIPER→Cityscapes-Seq (Video Segmentation)
SYNTHIA-Seq→Cityscapes-Seq (Video Segmentation)

2018-2020

Epic-Kitchens (Epic-Kitchens Splits) (The Full Epic-Kitchens)
UCF-HMDB full
Kinetics-Gameplay (Permission Needed)
Charades-Ego
Kinetics→NEC-Drone

Before 2015

Useful Tools and Other Resources

Challenges for Video Domain Adaptation

Note: These are the latest editions of the respective challenges. Please check previous editions through their respective websites.

5th UG2+ Prize Challenge: Bridging the Gap Between Computational Photography and Visual Recognition (Track 2) IEEE/CVF Computer Vision and Pattern Recognition Conference Workshop (CVPRW) (2022) 2022 UG2+ Workshop
Epic-Kitchens-100 2022 Challenge: Domain Adaptation for Action Recognition IEEE/CVF Computer Vision and Pattern Recognition Conference Workshop (CVPRW) (2022) EPIC@CVPR2022 Workshop
