Awesome
NLP Data Augmentation
Paper
Overview
Methods
- General
- Random insertion, deletion, and word or sentence shuffling
- Replacing words with synonyms
- Replacing words with words drawn from a dictionary of the same label
- Perturbations (letter, word, or sentence level)
- Language model
- Back translation
- Round-trip translation
- Leverage External Data
- Using external data derived from Wikipedia: linking Wikipedia articles to arbitrary input text. The idea is that if the input text were on Wikipedia, it would link to other Wikipedia articles that are semantically related and provide additional information.
- Break the input text into n-grams
- Check whether each n-gram exists as a Wikipedia article to create a set of ‘candidate links’
- Prune the candidate links by computing the similarity between the input text and the abstract of each candidate
- Conversational Systems
- Reading Comprehension
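The three Wikipedia-linking steps listed under "Leverage External Data" can be sketched roughly as follows. This is only an illustration under stated assumptions: `ARTICLES` is a hypothetical in-memory stand-in for a Wikipedia title-to-abstract lookup, the function names are made up for this example, and Jaccard token overlap is an assumed similarity measure (the list above does not say which one is used).

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def ngrams(tokens, max_n=3):
    """Yield every contiguous n-gram (joined by spaces) for n = 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def jaccard(a, b):
    """Jaccard overlap between two token lists (assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def candidate_links(text, articles, max_n=3):
    """Steps 1 and 2: n-grams that match an article title become candidates."""
    return {g for g in ngrams(tokenize(text), max_n) if g in articles}

def link_articles(text, articles, max_n=3, threshold=0.1):
    """Step 3: keep candidates whose abstract is similar enough to the input."""
    tokens = tokenize(text)
    return sorted(
        title
        for title in candidate_links(text, articles, max_n)
        if jaccard(tokens, tokenize(articles[title])) >= threshold
    )

# Hypothetical toy lookup: lowercase article title -> abstract.
ARTICLES = {
    "machine translation": (
        "machine translation is the use of software to translate text "
        "from one language to another"
    ),
    "python": (
        "python is a genus of constricting snakes found in africa asia "
        "and australia"
    ),
}

text = (
    "We augment text with machine translation, using Python to translate "
    "each sentence to another language."
)
print(link_articles(text, ARTICLES))  # ['machine translation']
```

Both "machine translation" and "python" are candidate links here, but the snake article's abstract shares almost no vocabulary with the input, so pruning drops it. The threshold of 0.1 is an arbitrary value chosen for this toy data; a real pipeline would query Wikipedia for titles and abstracts and would likely use a stronger similarity measure.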
Library