Home

Awesome

Word Prediction using Convolutional Neural Networks—can you do better than iPhone™ Keyboard?

In this project, we examine how well neural networks can predict the current or next word. Language modeling is one of the most important nlp tasks, and you can easily find deep learning approaches to it. Our contribution is threefold. First, we want to make a model that simulates a mobile environment, rather than having general modeling purposes. Therefore, instead of assessing perplexity, we try to save the keystrokes that the user need to type. To this end, we manually typed 64 English paragraphs with a iPhone 7 for comparison. It was super boring, but hopefully it will be useful for others. Next, we use CNNs instead of RNNs, which are more widely used in language modeling tasks. RNNs—even improved types such as LSTM or GRU—suffer from short term memory. Deep layers of CNNs are expected to overcome the limitation. Finally, we employ a character-to-word model here. Concretely, we predict the current or next word, seeing the preceding 50 characters. Because we need to make a prediction at every time step of typing, the word-to-word model dont't fit well. And the char-to-char model has limitations in that it depends on the autoregressive assumption. Our current belief is the character-to-word model is best for this task. Although our relatively simple model is still behind a few steps iPhone 7 Keyboard, we observed its potential.

Requirements

Background / Glossary / Metric

<img src="image/word_prediction.gif" width="200" align="right">

Data

Model Architecture / Hyper-parameters

Work Flow

if you want to use the pretrained model,

Updates

Results

The training took 4-5 2-3 days on my single GPU (gtx 1060). As can be seen below, our model is lower than iPhone in KSR by 8 5 percent points. Details are available in results.csv.

| #FK | #RK: Ours | #RK: iPhone 7 | |--- |--- |--- |--- |--- | | 40,787 | 24,727 (=0.39 ksr) <br>->23,753 (=0.42 ksr) | 21,535 (=0.47 ksr)|

Conclusions

Cited By