Awesome

nlp journey

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. paper
GPT-2: Language Models are Unsupervised Multitask Learners. paper
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. paper
XLNet: Generalized Autoregressive Pretraining for Language Understanding. paper
RoBERTa: A Robustly Optimized BERT Pretraining Approach. paper
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. paper
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. paper
T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. paper
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. paper
GPT-3: Language Models are Few-Shot Learners. paper

LSTM (Long Short-Term Memory). paper
Sequence to Sequence Learning with Neural Networks. paper
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. paper
Residual Network (Deep Residual Learning for Image Recognition). paper
Dropout (Improving neural networks by preventing co-adaptation of feature detectors). paper
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. paper

An overview of gradient descent optimization algorithms. paper
Analysis Methods in Neural Language Processing: A Survey. paper
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. paper
A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications. paper
A Gentle Introduction to Deep Learning for Graphs. paper
A Survey on Deep Learning for Named Entity Recognition. paper
More Data, More Relations, More Context and More Openness: A Review and Outlook for Relation Extraction. paper
Deep Learning Based Text Classification: A Comprehensive Review. paper
Pre-trained Models for Natural Language Processing: A Survey. paper
A Survey on Contextual Embeddings. paper
A Survey on Knowledge Graphs: Representation, Acquisition and Applications. paper
Knowledge Graphs. paper

Bag of Tricks for Efficient Text Classification (FastText). paper
Convolutional Neural Networks for Sentence Classification. paper
Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. paper

A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation. paper
SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. paper

Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks. paper
Learning Text Similarity with Siamese Recurrent Networks. paper
A Deep Architecture for Matching Short Texts. paper

A Question-Focused Multi-Factor Attention Network for Question Answering. paper
The Design and Implementation of XiaoIce, an Empathetic Social Chatbot. paper
A Knowledge-Grounded Neural Conversation Model. paper
Neural Generative Question Answering. paper
Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots. paper
Modeling Multi-turn Conversation with Deep Utterance Aggregation. paper
Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. paper
Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes. paper

Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. paper
Neural Machine Translation by Jointly Learning to Align and Translate. paper
Transformer (Attention Is All You Need). paper

Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. paper
Neural Relation Extraction with Multi-lingual Attention. paper
FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation. paper
End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures. paper