Speech-to-Text-WaveNet: End-to-end sentence-level Chinese speech recognition using DeepMind's WaveNet

A TensorFlow implementation of Chinese speech recognition based on DeepMind's paper "WaveNet: A Generative Model for Raw Audio" (hereafter "the Paper").

Version

Current version: 0.0.1

Dependencies

python == 3.5
tensorflow == 1.0.0
librosa == 0.5.0

Dataset

Tsinghua University 30-hour Chinese speech dataset

Directories

cache: stores the extracted data features and the word dictionary
data: WAV files and their labels
model: stores the saved models

Network model

Random shuffling of the training data at each epoch
Xavier initialization
Adam optimization algorithm
Batch normalization
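
The four techniques listed above can be illustrated with a minimal, standard-library-only sketch. It is not taken from train.py: every function name here (xavier_uniform, epoch_batches, batch_norm, adam_step) is hypothetical, the hyperparameter defaults are the commonly used Adam values rather than this repository's settings, and the batch normalization omits the learned scale and shift. The actual implementation presumably relies on the corresponding TensorFlow operations.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Xavier (Glorot) uniform init: weights drawn from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def epoch_batches(samples, batch_size, rng):
    """Yield mini-batches from a freshly shuffled copy of the samples."""
    order = list(samples)
    rng.shuffle(order)  # call once per epoch so each epoch sees a new order
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


def batch_norm(values, eps=1e-5):
    """Normalize one feature over a batch to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    return param - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


if __name__ == "__main__":
    rng = random.Random(0)
    weights = xavier_uniform(4, 6, rng)
    for epoch in range(2):
        for batch in epoch_batches(range(10), 4, rng):
            print(epoch, batch)
    print(batch_norm([1.0, 2.0, 3.0]))
    print(adam_step(1.0, 0.5, 0.0, 0.0, 1))
```

Passing an explicit random.Random instance keeps the shuffling and initialization reproducible, which is convenient when comparing training runs.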

Train the network

python3 train.py

Test the network

python3 test.py

Other resources