Natural Language Processing Tutorial

A Chinese version of this tutorial can be found at mofanpy.com.

This repo includes many simple implementations of models in Natural Language Processing (NLP).

All code implementations in this tutorial are organized as follows:

  1. Search Engine
  1. Understand Word (W2V)
  1. Understand Sentence (Seq2Seq)
  1. All about Attention
  1. Pretrained Models

Thanks to @W1Fl for contributing a simplified Keras implementation in simple_realize, and to @ruifanxu for a PyTorch version of this tutorial.

Installation

$ git clone https://github.com/MorvanZhou/NLP-Tutorials
$ cd NLP-Tutorials/
$ sudo pip3 install -r requirements.txt

TF-IDF

TF-IDF numpy code

TF-IDF short sklearn code

<a target="_blank" href="https://mofanpy.com/static/results/nlp/tfidf_matrix.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/tfidf_matrix.png" height="250px" alt="image"> </a>
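The idea behind TF-IDF can be sketched in a few lines of NumPy: count term frequencies per document, then down-weight words that appear in many documents. This is a minimal illustration with a made-up toy corpus (not the repo's code); the `+ 1` smoothing on IDF is one common variant among several.

```python
import numpy as np

# Toy corpus; vocabulary and counts are built by hand for clarity.
docs = [
    "it is a good day",
    "it is a bad day",
    "how about a walk today",
]
vocab = sorted({w for d in docs for w in d.split()})
idx = {w: i for i, w in enumerate(vocab)}

# Term-frequency matrix: tf[d, w] = count of word w in document d.
tf = np.zeros((len(docs), len(vocab)))
for d, doc in enumerate(docs):
    for w in doc.split():
        tf[d, idx[w]] += 1

# Inverse document frequency: words in fewer documents get larger weights.
df = np.count_nonzero(tf, axis=0)    # number of docs containing each word
idf = np.log(len(docs) / df) + 1     # a common smoothed variant
tfidf = tf * idf                     # shape (n_docs, n_vocab)

# A word shared by all docs ("a") scores lower than a distinctive one ("walk").
print(tfidf[2, idx["walk"]] > tfidf[2, idx["a"]])  # True
```

Each row of `tfidf` is a document vector; cosine similarity between rows is the basis of the search-engine example.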

Word2Vec

Efficient Estimation of Word Representations in Vector Space

Skip-Gram code

CBOW code

<a target="_blank" href="https://mofanpy.com/static/results/nlp/cbow_illustration.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/cbow_illustration.png" height="250px" alt="image"> </a> <a target="_blank" href="https://mofanpy.com/static/results/nlp/skip_gram_illustration.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/skip_gram_illustration.png" height="250px" alt="image"> </a> <a target="_blank" href="https://mofanpy.com/static/results/nlp/cbow_code_result.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/cbow_code_result.png" height="250px" alt="image"> </a>
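The core difference between the two Word2Vec variants is how training pairs are built: Skip-Gram predicts each context word from the center word, while CBOW predicts the center word from its context. A minimal sketch, using an illustrative five-word corpus and a window size of 1 (both are arbitrary choices, not the repo's settings):

```python
# Build (input, target) training pairs for Skip-Gram vs CBOW.
corpus = ["we", "love", "natural", "language", "processing"]
window = 1

skip_gram_pairs = []   # (center word, one context word) per pair
cbow_pairs = []        # (all context words, center word)
for i, center in enumerate(corpus):
    context = corpus[max(0, i - window):i] + corpus[i + 1:i + 1 + window]
    for c in context:
        skip_gram_pairs.append((center, c))
    cbow_pairs.append((tuple(context), center))

print(skip_gram_pairs[0])  # ('we', 'love')
print(cbow_pairs[1])       # (('we', 'natural'), 'love')
```

Training a softmax classifier over these pairs forces the embedding layer to place words with similar contexts near each other in vector space.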

Seq2Seq

Sequence to Sequence Learning with Neural Networks

Seq2Seq code

<a target="_blank" href="https://mofanpy.com/static/results/nlp/seq2seq_illustration.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/seq2seq_illustration.png" height="250px" alt="image"> </a>

CNNLanguageModel

Convolutional Neural Networks for Sentence Classification

CNN language model code

<a target="_blank" href="https://mofanpy.com/static/results/nlp/cnn-ml_sentence_embedding.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/cnn-ml_sentence_embedding.png" height="250px" alt="image"> </a>

Seq2SeqAttention

Effective Approaches to Attention-based Neural Machine Translation

Seq2Seq Attention code

<a target="_blank" href="https://mofanpy.com/static/results/nlp/luong_attention.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/luong_attention.png" height="250px" alt="image"> </a> <a target="_blank" href="https://mofanpy.com/static/results/nlp/seq2seq_attention_res.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/seq2seq_attention_res.png" height="250px" alt="image"> </a>

Transformer

Attention Is All You Need

Transformer code

<a target="_blank" href="https://mofanpy.com/static/results/nlp/transformer_encoder_decoder.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/transformer_encoder_decoder.png" height="250px" alt="image"> </a> <a target="_blank" href="https://mofanpy.com/static/results/nlp/transformer0_decoder_encoder_attention.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/transformer0_decoder_encoder_attention.png" height="250px" alt="image"> </a> <a target="_blank" href="https://mofanpy.com/static/results/nlp/transformer0_encoder_decoder_attention_line.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/transformer0_encoder_decoder_attention_line.png" height="250px" alt="image"> </a>
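The building block of the Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch (shapes and the `-1e9` masking constant are illustrative conventions, not the repo's exact code):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)  # (n_q, n_k) similarities
    if mask is not None:
        scores = np.where(mask, -1e9, scores)       # block masked positions
    weights = softmax(scores)                       # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))  # 4 query positions, d_k = 8
k = rng.normal(size=(6, 8))  # 6 key positions
v = rng.normal(size=(6, 8))
out, w = scaled_dot_product_attention(q, k, v)
print(out.shape)             # (4, 8): one weighted value vector per query
```

Multi-head attention runs several of these in parallel on learned projections of Q, K, V and concatenates the results.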

ELMO

Deep contextualized word representations

ELMO code

<a target="_blank" href="https://mofanpy.com/static/results/nlp/elmo_training.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/elmo_training.png" height="250px" alt="image"> </a> <a target="_blank" href="https://mofanpy.com/static/results/nlp/elmo_word_emb.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/elmo_word_emb.png" height="250px" alt="image"> </a>

GPT

Improving Language Understanding by Generative Pre-Training

GPT code

<a target="_blank" href="https://mofanpy.com/static/results/nlp/gpt_structure.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/gpt_structure.png" height="250px" alt="image"> </a> <a target="_blank" href="https://mofanpy.com/static/results/nlp/gpt7_self_attention_line.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/gpt7_self_attention_line.png" height="250px" alt="image"> </a>
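What makes GPT a left-to-right language model is its causal attention mask: each position may attend only to itself and earlier positions. A small sketch of how such a mask is typically built (the boolean convention `True` = visible is an illustrative choice):

```python
import numpy as np

# Lower-triangular mask: row i can see columns 0..i, nothing to its right.
seq_len = 5
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))
print(causal_mask.astype(int))
# [[1 0 0 0 0]
#  [1 1 0 0 0]
#  [1 1 1 0 0]
#  [1 1 1 1 0]
#  [1 1 1 1 1]]
```

Applied inside attention, this mask zeroes out weights on future tokens, so the model can be trained to predict the next word at every position simultaneously.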

BERT

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

BERT code

My new attempt: BERT with a window mask

<a target="_blank" href="https://mofanpy.com/static/results/nlp/bert_gpt_comparison.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/bert_gpt_comparison.png" height="250px" alt="image"> </a> <a target="_blank" href="https://mofanpy.com/static/results/nlp/bert_self_mask4_self_attention_line.png" style="text-align: center"> <img src="https://mofanpy.com/static/results/nlp/bert_self_mask4_self_attention_line.png" height="250px" alt="image"> </a>
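BERT's pretraining corrupts the input with a masked-language-model scheme: roughly 15% of tokens are selected, and of those, 80% become `[MASK]`, 10% are replaced by a random token, and 10% are left unchanged, while the model must predict the originals. The rates follow the BERT paper; the toy vocabulary and helper name below are illustrative, not the repo's code.

```python
import random

def mask_tokens(tokens, vocab, mask_rate=0.15, seed=0):
    """Return corrupted inputs and per-position prediction targets."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_rate:     # token selected for prediction
            r = rng.random()
            if r < 0.8:
                inputs.append("[MASK]")          # 80%: mask it
            elif r < 0.9:
                inputs.append(rng.choice(vocab)) # 10%: random token
            else:
                inputs.append(tok)               # 10%: keep as-is
            targets.append(tok)          # model must recover the original
        else:
            inputs.append(tok)
            targets.append(None)         # no loss at this position
    return inputs, targets

vocab = ["we", "love", "nlp", "models", "and", "attention"]
inp, tgt = mask_tokens(["we", "love", "nlp", "and", "attention"] * 4, vocab)
print(sum(t is not None for t in tgt), "positions selected for prediction")
```

Because some selected tokens stay unchanged, the model cannot assume an unmasked token is correct, which keeps its representations of every position useful downstream.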