Awesome
nlp-tutorial
<p align="center"><img width="100" src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/11/TensorFlowLogo.svg/225px-TensorFlowLogo.svg.png" /> <img width="100" src="https://media-thumbs.golden.com/OLqzmrmwAzY1P7Sl29k2T9WjJdM=/200x200/smart/golden-storage-production.s3.amazonaws.com/topic_images/e08914afa10a4179893eeb07cb5e4713.png" /></p>nlp-tutorial
is a tutorial for who is studying NLP(Natural Language Processing) using Pytorch. Most of the models in NLP were implemented with less than 100 lines of code.(except comments or blank lines)
- [08-14-2020] Old TensorFlow v1 code is archived in the archive folder. For beginner readability, only pytorch version 1.0 or higher is supported.
Curriculum - (Example Purpose)
1. Basic Embedding Model
- 1-1. NNLM(Neural Network Language Model) - Predict Next Word
- Paper - A Neural Probabilistic Language Model(2003)
- Colab - NNLM.ipynb
- 1-2. Word2Vec(Skip-gram) - Embedding Words and Show Graph
- 1-3. FastText(Application Level) - Sentence Classification
- Paper - Bag of Tricks for Efficient Text Classification(2016)
- Colab - FastText.ipynb
2. CNN(Convolutional Neural Network)
- 2-1. TextCNN - Binary Sentiment Classification
3. RNN(Recurrent Neural Network)
- 3-1. TextRNN - Predict Next Step
- Paper - Finding Structure in Time(1990)
- Colab - TextRNN.ipynb
- 3-2. TextLSTM - Autocomplete
- Paper - LONG SHORT-TERM MEMORY(1997)
- Colab - TextLSTM.ipynb
- 3-3. Bi-LSTM - Predict Next Word in Long Sentence
- Colab - Bi_LSTM.ipynb
4. Attention Mechanism
- 4-1. Seq2Seq - Change Word
- 4-2. Seq2Seq with Attention - Translate
- 4-3. Bi-LSTM with Attention - Binary Sentiment Classification
- Colab - Bi_LSTM(Attention).ipynb
5. Model based on Transformer
- 5-1. The Transformer - Translate
- 5-2. BERT - Classification Next Sentence & Predict Masked Tokens
Dependencies
- Python 3.5+
- Pytorch 1.0.0+
Author
- Tae Hwan Jung(Jeff Jung) @graykode
- Author Email : nlkey2022@gmail.com
- Acknowledgements to mojitok as NLP Research Internship.