# BERT Pretrained Token Embeddings
BERT (from *BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding*) provides pretrained token (i.e., subword) embeddings. Let's extract them and save them in the word2vec format so that they can be used for downstream tasks.
## Requirements
- pytorch_pretrained_bert
- NumPy
- tqdm
## Extraction
- Check `extract.py`. A minimal sketch of what it does is shown below.
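The core idea looks roughly like the following. This is a minimal sketch (not necessarily identical to `extract.py`), assuming `pytorch_pretrained_bert` is installed and using `bert-base-uncased` as an example model name:

```python
from pytorch_pretrained_bert import BertModel, BertTokenizer
from tqdm import tqdm

MODEL = "bert-base-uncased"  # any model name from the table below

# Load the pretrained model and its WordPiece vocabulary.
tokenizer = BertTokenizer.from_pretrained(MODEL)
model = BertModel.from_pretrained(MODEL)

# The (sub)word embedding matrix: shape (vocab_size, hidden_dim).
emb = model.embeddings.word_embeddings.weight.detach().numpy()

# Write in the word2vec text format: a "vocab_size dim" header line,
# then one line per token: "<token> <v1> <v2> ... <vd>".
with open(f"{MODEL}.vec", "w", encoding="utf-8") as fout:
    fout.write(f"{emb.shape[0]} {emb.shape[1]}\n")
    for token, idx in tqdm(tokenizer.vocab.items()):
        vector = " ".join(str(x) for x in emb[idx])
        fout.write(f"{token} {vector}\n")
```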
## BERT (Pretrained) Token Embeddings in word2vec Format
Models | # Vocab | # Dim | Notes |
---|---|---|---|
bert-base-uncased | 30,522 | 768 | |
bert-large-uncased | 30,522 | 1024 | |
bert-base-cased | 28,996 | 768 | |
bert-large-cased | 28,996 | 1024 | |
bert-base-multilingual-cased | 119,547 | 768 | Recommended |
bert-base-multilingual-uncased | 30,522 | 768 | Not recommended |
bert-base-chinese | 21,128 | 768 | |
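
Because the files are in the standard word2vec text format, they can be loaded with the usual tooling. For example, with gensim (not listed in the requirements above; the file name is just an example):

```python
from gensim.models import KeyedVectors

# Load the exported embeddings (text word2vec format); the path is hypothetical.
vectors = KeyedVectors.load_word2vec_format("bert-base-uncased.vec", binary=False)

# Subword lookup and nearest neighbours work as with ordinary word vectors.
print(vectors["##ing"][:5])
print(vectors.most_similar("##ing", topn=3))
```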