awesome-neural-adaptation-in-NLP
A curated list of awesome work on neural unsupervised domain adaptation (DA) in Natural Language Processing, including links to papers. The focus is on unsupervised neural DA methods and, for now, on work outside Machine Translation. Feel free to contribute by creating a pull request as outlined in contributing.md.
Please cite our [survey paper] ([Ramponi and Plank, COLING 2020]) if you find it useful in your research:
@inproceedings{ramponi-plank-2020-neural,
title = "Neural Unsupervised Domain Adaptation in {NLP}{---}{A} Survey",
author = "Ramponi, Alan and
Plank, Barbara",
booktitle = "Proceedings of the 28th International Conference on Computational Linguistics",
month = dec,
year = "2020",
address = "Barcelona, Spain (Online)",
publisher = "International Committee on Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.coling-main.603",
doi = "10.18653/v1/2020.coling-main.603",
pages = "6838--6855"
}
Contents
- Surveys
- Collections
- Unsupervised DA
Surveys
Overview of surveys in other areas and related topics.
Machine Learning
- A Survey on Transfer Learning [Pan & Yang, 2010, IEEE]
Vision
- Deep Visual Domain Adaptation: A Survey [Wang & Deng, Neurocomputing 2018]
- Generalizing to Unseen Domains: A Survey on Domain Generalization [Wang et al., arXiv 2021; technical report]
Machine Translation
- A Survey of Domain Adaptation for Neural Machine Translation [Chu & Wang, COLING 2018]
Pre-neural surveys on Domain Adaptation in NLP
- Domain Adaptation in Natural Language Processing [Jiang, 2008; PhD dissertation, chapter 2]
- A Literature Survey on Domain Adaptation of Statistical Classifiers [Jiang, 2008, technical report]
- Domain Adaptation for Parsing [Plank, 2011; PhD dissertation, chapter 3]
Transfer Learning for NLP
- Neural Transfer Learning for NLP [Ruder, 2019; PhD dissertation]
Low-resource NLP
- A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios [Hedderich et al., 2020]
Cross-Lingual Learning
- A Survey of Cross-lingual Word Embedding Models [Ruder, Vulić, Søgaard, JAIR 2019]
Multi-task Learning
- An Overview of Multi-Task Learning in Deep Neural Networks [Ruder, arXiv 2017]
Domain Divergence
- Domain Divergences: A Survey and Empirical Analysis [Kashyap et al., 2020]
Collections
Related awesome collections:
- awesome-domain-adaptation
- NLP progress page for domain adaptation
- awesome-datascience
- The-NLP-Pandect
Unsupervised DA
By methods
A list of papers categorized by methods.
Model centric methods
- Neural Structural Correspondence Learning for Domain Adaptation [Ziser and Reichart, CoNLL 2017]
- Deep Pivot-Based Modeling for Cross-language Cross-Domain Transfer with Minimal Guidance [Ziser and Reichart, EMNLP 2018a]
- Pivot Based Language Modeling for Improved Neural Domain Adaptation [Ziser and Reichart, NAACL-HLT 2018b]
- Task Refinement Learning for Improved Accuracy and Stability of Unsupervised Domain Adaptation [Ziser and Reichart, ACL 2019]
- Simplified Neural Unsupervised Domain Adaptation [Miller, NAACL-HLT 2019]
- Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach [Glorot et al., ICML 2011]
- Marginalized Denoising Autoencoders for Domain Adaptation [Chen et al., ICML 2012]
- Fast Easy Unsupervised Domain Adaptation with Marginalized Structured Dropout [Yang and Eisenstein, ACL 2014]
- A Domain Adaptation Regularization for Denoising Autoencoders [Clinchant et al., ACL 2016]
- Domain-Adversarial Training of Neural Networks [Ganin et al., JMLR 2016]
- End-to-End Adversarial Memory Network for Cross-domain Sentiment Classification [Li et al., IJCAI 2017]
- Adversarial Adaptation of Synthetic or Stale Data [Kim et al., ACL 2017]
- Adversarial Training for Cross-Domain Universal Dependency Parsing [Sato et al., CoNLL 2017]
- Adversarial Training for Relation Extraction [Wu et al., EMNLP 2017]
- Robust Multilingual Part-of-Speech Tagging via Adversarial Training [Yasunaga et al., NAACL-HLT 2018]
- Wasserstein Distance Guided Representation Learning for Domain Adaptation [Shen et al., AAAI 2018]
- What's in a Domain? Learning Domain-Robust Text Representations using Adversarial Training [Li et al., NAACL-HLT 2018]
- Domain Adaptation with Adversarial Training and Graph Embeddings [Alam et al., ACL 2018]
- Adversarial Domain Adaptation for Machine Reading Comprehension [Wang et al., EMNLP-IJCNLP 2019]
- Adversarial Domain Adaptation for Duplicate Question Detection [Shah et al., EMNLP 2018]
- Domain Adaptation for Relation Extraction with Domain Adversarial Neural Network [Fu et al., IJCNLP 2017]
- Generalizing Biomedical Relation Classification with Neural Adversarial Domain Adaptation [Rios et al., Bioinf. 2018]
- Adversarial Domain Adaptation for Stance Detection [Xu et al., arXiv 2019]
- Genre Separation Network with Adversarial Training for Cross-genre Relation Extraction [Shi et al., EMNLP 2018]
- A Comparative Analysis of Unsupervised Language Adaptation Methods [Rocha, Lopes Cardoso, DeepLo 2019]
- KinGDOM: Knowledge-Guided DOMain Adaptation for Sentiment Analysis [Ghosal et al., ACL 2020]
- Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation [Naik, Rose, ACL 2020]
- Transformer Based Multi-Source Domain Adaptation [Wright and Augenstein, EMNLP 2020]
- Unsupervised Cross-Lingual Adaptation of Dependency Parsers Using CRF Autoencoders [Li and Tu, Findings of EMNLP 2020]
- Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models [Vu et al., EMNLP 2020]
- Inexpensive Domain Adaptation of Pretrained Language Models: Case Studies on Biomedical NER and Covid-19 QA [Poerner et al., Findings of EMNLP 2020]
- Adapting Event Extractors to Medical Data: Bridging the Covariate Shift [Naik et al., EACL 2021]
Data centric methods
- Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling [Han, Eisenstein, EMNLP-IJCNLP 2019]
- Semi-supervised Domain Adaptation for Dependency Parsing [Li et al., ACL 2019]
- Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis [Desai et al., EMNLP-IJCNLP 2019]
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [Gururangan et al., ACL 2020]
- End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems [Shakeri et al., EMNLP 2020]
- Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training [Ye et al., EMNLP 2020]
- UDALM: Unsupervised Domain Adaptation through Language Modeling [Karouzos et al., NAACL 2021] (N.B. similar to aux-MLM, which was contemporaneously proposed for cross-lingual learning by [van der Goot et al., NAACL 2021])
- Adapting Event Extractors to Medical Data: Bridging the Covariate Shift [Naik et al., EACL 2021]
Hybrid methods
- Asymmetric Tri-training for Unsupervised Domain Adaptation [Saito et al., ICML 2017]
- Strong Baselines for Neural Semi-Supervised Learning under Domain Shift [Ruder, Plank, ACL 2018]
- Cross-Domain NER using Cross-Domain Language Modeling [Jia et al., ACL 2019]
- Deep Contextualized Self-Training for Low Resource Dependency Parsing [Rotman, Reichart, TACL 2019]
- Self-Adaptation for Unsupervised Domain Adaptation [Cui, Bollegala, RANLP 2019]
- Multi-Task Domain Adaptation for Sequence Tagging [Peng, Dredze, RepL4NLP 2017]
- Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits [Guo et al., AAAI 2020]
- PERL: Pivot-Based Domain Adaptation for Pre-Trained Deep Contextualized Embedding Models [Ben-David et al., TACL 2020]
- Unsupervised Adaptation of Question Answering Systems via Generative Self-training [Rennie et al., EMNLP 2020]
By tasks
A list of papers categorized by tasks.
Classification/inference
All papers pertaining to classification and inference tasks are indicated in the respective sections below.
Sentiment analysis
- Neural Structural Correspondence Learning for Domain Adaptation [Ziser and Reichart, CoNLL 2017]
- Deep Pivot-Based Modeling for Cross-language Cross-Domain Transfer with Minimal Guidance [Ziser and Reichart, EMNLP 2018a]
- Pivot Based Language Modeling for Improved Neural Domain Adaptation [Ziser and Reichart, NAACL-HLT 2018b]
- Task Refinement Learning for Improved Accuracy and Stability of Unsupervised Domain Adaptation [Ziser and Reichart, ACL 2019]
- Simplified Neural Unsupervised Domain Adaptation [Miller, NAACL-HLT 2019]
- Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach [Glorot et al., ICML 2011]
- Marginalized Denoising Autoencoders for Domain Adaptation [Chen et al., ICML 2012]
- A Domain Adaptation Regularization for Denoising Autoencoders [Clinchant et al., ACL 2016]
- Domain-Adversarial Training of Neural Networks [Ganin et al., JMLR 2016]
- End-to-End Adversarial Memory Network for Cross-domain Sentiment Classification [Li et al., IJCAI 2017]
- Wasserstein Distance Guided Representation Learning for Domain Adaptation [Shen et al., AAAI 2018]
- What's in a Domain? Learning Domain-Robust Text Representations using Adversarial Training [Li et al., NAACL-HLT 2018]
- A Comparative Analysis of Unsupervised Language Adaptation Methods [Rocha, Lopes Cardoso, DeepLo 2019]
- KinGDOM: Knowledge-Guided DOMain Adaptation for Sentiment Analysis [Ghosal et al., ACL 2020]
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [Gururangan et al., ACL 2020]
- Asymmetric Tri-training for Unsupervised Domain Adaptation [Saito et al., ICML 2017]
- Strong Baselines for Neural Semi-Supervised Learning under Domain Shift [Ruder, Plank, ACL 2018]
- Self-Adaptation for Unsupervised Domain Adaptation [Cui, Bollegala, RANLP 2019]
- Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits [Guo et al., AAAI 2020]
- PERL: Pivot-Based Domain Adaptation for Pre-Trained Deep Contextualized Embedding Models [Ben-David et al., TACL 2020]
- Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training [Ye et al., EMNLP 2020]
- UDALM: Unsupervised Domain Adaptation through Language Modeling [Karouzos et al., NAACL 2021]
Language identification
- What's in a Domain? Learning Domain-Robust Text Representations using Adversarial Training [Li et al., NAACL-HLT 2018]
Binary text classification
- A Domain Adaptation Regularization for Denoising Autoencoders [Clinchant et al., ACL 2016]
- Adversarial Adaptation of Synthetic or Stale Data [Kim et al., ACL 2017]
- Domain Adaptation with Adversarial Training and Graph Embeddings [Alam et al., ACL 2018]
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [Gururangan et al., ACL 2020]
- Transformer Based Multi-Source Domain Adaptation [Wright and Augenstein, EMNLP 2020]
Machine reading
- Adversarial Domain Adaptation for Machine Reading Comprehension [Wang et al., EMNLP-IJCNLP 2019]
Duplicate question detection
- Adversarial Domain Adaptation for Duplicate Question Detection [Shah et al., EMNLP 2018]
Stance detection
- Adversarial Domain Adaptation for Stance Detection [Xu et al., arXiv 2019]
Political data identification
- Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis [Desai et al., EMNLP-IJCNLP 2019]
Question answering
- Unsupervised Adaptation of Question Answering Systems via Generative Self-training [Rennie et al., EMNLP 2020]
- End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems [Shakeri et al., EMNLP 2020]
Summarization
- AdaptSum: Towards Low-Resource Domain Adaptation for Abstractive Summarization [Yu et al., NAACL 2021]
Structured prediction
All papers pertaining to structured prediction tasks are indicated in the respective sections below.
Natural language inference
- A Comparative Analysis of Unsupervised Language Adaptation Methods [Rocha, Lopes Cardoso, DeepLo 2019]
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [Gururangan et al., ACL 2020]
Part-of-speech tagging
- Fast Easy Unsupervised Domain Adaptation with Marginalized Structured Dropout [Yang and Eisenstein, ACL 2014]
- Robust Multilingual Part-of-Speech Tagging via Adversarial Training [Yasunaga et al., NAACL-HLT 2018]
- Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling [Han, Eisenstein, EMNLP-IJCNLP 2019]
- Strong Baselines for Neural Semi-Supervised Learning under Domain Shift [Ruder, Plank, ACL 2018]
- Multi-Task Domain Adaptation for Sequence Tagging [Peng, Dredze, RepL4NLP 2017]
Dependency parsing
- Adversarial Training for Cross-Domain Universal Dependency Parsing [Sato et al., CoNLL 2017]
- Semi-supervised Domain Adaptation for Dependency Parsing [Li et al., ACL 2019]
- Deep Contextualized Self-Training for Low Resource Dependency Parsing [Rotman, Reichart, TACL 2019]
- Unsupervised Cross-Lingual Adaptation of Dependency Parsers Using CRF Autoencoders [Li and Tu, Findings of EMNLP 2020]
Named entity recognition
- Adversarial Adaptation of Synthetic or Stale Data [Kim et al., ACL 2017]
- Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation [Naik, Rose, ACL 2020]
- Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling [Han, Eisenstein, EMNLP-IJCNLP 2019]
- Cross-Domain NER using Cross-Domain Language Modeling [Jia et al., ACL 2019]
- Multi-Task Domain Adaptation for Sequence Tagging [Peng, Dredze, RepL4NLP 2017]
- Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models [Vu et al., EMNLP 2020]
- Inexpensive Domain Adaptation of Pretrained Language Models: Case Studies on Biomedical NER and Covid-19 QA [Poerner et al., Findings of EMNLP 2020]
Event extraction
- Adapting Event Extractors to Medical Data: Bridging the Covariate Shift [Naik et al., EACL 2021]
Relation extraction
- Adversarial Training for Relation Extraction [Wu et al., EMNLP 2017]
- Domain Adaptation for Relation Extraction with Domain Adversarial Neural Network [Fu et al., IJCNLP 2017]
- Generalizing Biomedical Relation Classification with Neural Adversarial Domain Adaptation [Rios et al., Bioinf. 2018]
- Genre Separation Network with Adversarial Training for Cross-genre Relation Extraction [Shi et al., EMNLP 2018]
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [Gururangan et al., ACL 2020]