Deep Voice 3
Work In Progress
To check the current status, see this.
This is a TensorFlow implementation of DEEP VOICE 3: 2000-SPEAKER NEURAL TEXT-TO-SPEECH. For now I'm focusing on single-speaker synthesis.
Data
I'm experimenting with Nick Offerman's audiobook files, for fun, and with The LJ Speech Dataset, which is in the public domain.
File Description
- hyperparams.py: hyperparameters
- prepro.py: creates inputs and targets, i.e., mel spectrograms, magnitudes, and dones.
- data_load.py
- utils.py: several custom operational functions.
- modules.py: building blocks for the networks.
- networks.py: encoder, decoder, and converter
- train.py: training
- synthesize.py: inference
- test_sents.txt: some test sentences from the paper.
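
To give a rough idea of the kind of targets prepro.py is described as creating (mel spectrogram, magnitude, and dones), here is a minimal NumPy-only sketch. It is not the code in this repo: the function names (`mel_filterbank`, `make_targets`), the parameter values (`sr`, `n_fft`, `hop`, `n_mels`), and the reading of "dones" as a 0/1 flag that marks the last frame are all assumptions made for illustration.

```python
import numpy as np


def hz_to_mel(hz):
    # Common HTK-style mel scale formula.
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters with shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, center, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def make_targets(wav, sr=22050, n_fft=1024, hop=256, n_mels=80):
    """Return (mel, mag, dones) for a 1-D waveform (hypothetical helper)."""
    wav = np.asarray(wav, dtype=np.float64)
    if len(wav) < n_fft:
        # Zero-pad short clips so that at least one full frame exists.
        wav = np.pad(wav, (0, n_fft - len(wav)))
    n_frames = 1 + (len(wav) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack(
        [wav[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    # Linear-frequency magnitude spectrogram, shape (T, n_fft // 2 + 1).
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Mel spectrogram, shape (T, n_mels).
    mel = mag @ mel_filterbank(sr, n_fft, n_mels).T
    # ASSUMPTION: "dones" is 0 for every frame except the final one.
    dones = np.zeros(n_frames, dtype=np.int32)
    dones[-1] = 1
    return mel, mag, dones
```

Calling `make_targets` on one second of audio at 22050 Hz with these defaults yields arrays with 83 frames each. The actual preprocessing may differ in windowing, scaling, and normalization, so check hyperparams.py and prepro.py for the real settings.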