Home

Awesome

ID-CNN-CWS

Source codes and corpora of paper "Iterated Dilated Convolutions for Chinese Word Segmentation" published in NNW journal.

2017-10-20_13-23-31

It implements the following 4 models for CWS:

Dependencies

Both CPU and GPU are supported. GPU training is 10 times faster.

Preparation

Run following script to convert corpus to TensorFlow dataset.

$ ./scripts/make.sh

Train and Test

Quick Start

$ ./scripts/run.sh $dataset $model

For example:

$ ./scripts/run.sh pku cnn

It will train a cnn model on pku dataset, then evaluate performance on test set.

CRF Layer

To enable CRF layer, simply append --viterbi to your command, e.g.

$ ./scripts/run.sh pku cnn --viterbi

Accuracy

2017-10-20_13-25-11

Speed

2017-10-20_11-44-42

Acknowledgments