CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages

Abstract

We describe our development of CSS10, a collection of single speaker speech datasets for ten languages. It is composed of short audio clips from LibriVox audiobooks and their aligned texts. To validate its quality we train two neural text-to-speech models on each dataset. Subsequently, we conduct Mean Opinion Score tests on the synthesized speech samples. We make our datasets, pretrained models, and test resources publicly available. We hope they will be used for future speech tasks.

For details, check our paper. Kyubyong gave a talk on this paper at the 2018 workshop of the Korean Society of Speech Sciences.

Environments & Dependencies

Audiobooks & Datasets

| Code | Language | Audiobook | Running Time | Reader | Dataset |
|---|---|---|---|---|---|
| de | German | 1. Meister Floh <br>2. Die acht Gesichter am Biwasee <br>3. Auswahl aus Die Serapionsbrüder | 16:42:45 | Hokuspokus | CSS German |
| el | Greek | Παραμύθι χωρίς όνομα (Tale Without Name) | 04:08:14 | Rapunzelina | CSS Greek |
| es | Spanish | 1. Bailén <br>2. El 19 de Marzo y el 2 de Mayo <br>3. La Batalla de los Arapiles | 23:49:49 | Tux | CSS Spanish |
| fi | Finnish | 1. Gulliverin matkat kaukaisilla mailla <br>2. Ensimmäiset novellit <br>3. Kaleri-orja <br>4. Salmelan heinätalkoot | 10:32:03 | Harri Tapani Ylilammi | CSS Finnish |
| fr | French | 1. Les Misérables - tome 5 <br>2. Arsène Lupin contre Herlock Sholmès | 19:09:03 | Gilles G. Le Blanc | CSS French |
| hu | Hungarian | Egri csillagok | 10:00:25 | Diana Majlinger | CSS Hungarian |
| ja | Japanese | 明暗 (Meian) | 14:55:36 | ekzemplaro | CSS Japanese |
| nl | Dutch | 20.000 Mijlen onder Zee | 14:06:40 | Bart de Leeuw | CSS Dutch |
| ru | Russian | 1. Ice March - Ледяной поход <br>2. Early Short Stories <br>3. Short Stories for Children and Adults | 21:22:10 | Mark Chulsky | CSS Russian |
| zh | Chinese | 1. 朝花夕拾 (Chao Hua Si She) <br>2. 呐喊 (Call to Arms) | 06:27:04 | Jing Li | CSS Chinese |
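
The running times listed above can be added up to get a rough sense of the overall size of the collection. The following is a minimal sketch, not part of the released code: the `RUNNING_TIMES` dictionary simply copies the HH:MM:SS values from the table, and the helper names `to_seconds` and `format_hms` are hypothetical.

```python
# Running times copied from the dataset table (HH:MM:SS), keyed by language code.
RUNNING_TIMES = {
    "de": "16:42:45",
    "el": "04:08:14",
    "es": "23:49:49",
    "fi": "10:32:03",
    "fr": "19:09:03",
    "hu": "10:00:25",
    "ja": "14:55:36",
    "nl": "14:06:40",
    "ru": "21:22:10",
    "zh": "06:27:04",
}


def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string into a number of seconds (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_hms(total_seconds: int) -> str:
    """Format a number of seconds as H:MM:SS (hypothetical helper)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# Sum the per-language running times and print the combined duration.
total = sum(to_seconds(t) for t in RUNNING_TIMES.values())
print(format_hms(total))  # prints 141:13:49
```

Under these figures the ten datasets add up to a little over 141 hours of audio, with Spanish the largest and Greek the smallest.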

Pretrained Models & Audio Samples

| Code | Language | Pretrained Models | Audio Samples |
|---|---|---|---|
| de | German | DCTTS / TACOTRON | DCTTS / TACOTRON |
| el | Greek | DCTTS | DCTTS |
| es | Spanish | DCTTS / TACOTRON | DCTTS / TACOTRON |
| fi | Finnish | DCTTS / TACOTRON | DCTTS / TACOTRON |
| fr | French | DCTTS / TACOTRON | DCTTS / TACOTRON |
| hu | Hungarian | DCTTS / TACOTRON | DCTTS / TACOTRON |
| ja | Japanese | DCTTS / TACOTRON | DCTTS / TACOTRON |
| nl | Dutch | DCTTS / TACOTRON | DCTTS / TACOTRON |
| ru | Russian | DCTTS / TACOTRON | DCTTS / TACOTRON |
| zh | Chinese | DCTTS / TACOTRON | DCTTS / TACOTRON |

Cite

@article{park2019css10,
  title={CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages},
  author={Park, Kyubyong and Mulc, Thomas},
  journal={Interspeech},
  year={2019}
}

By Kyubyong Park, Tommy Mulc