Home

Vāksañcayaḥ is a Sanskrit speech corpus with more than 78 hours of audio: recordings of 45,953 sentences sampled at 22 kHz. The content consists mainly of readings of texts spanning many Śāstras of Saṃskṛt literature, and also includes contemporary stories, radio programs, extempore discourses, etc. A summary datasheet is available for this corpus. The corpus can be downloaded from https://www.cse.iitb.ac.in/~asr/.

Environments

Recipe

This Kaldi recipe is built on two kinds of subword units: vowel split and byte pair encoding (BPE). For the word-based models, we used the Wall Street Journal (WSJ) recipe.
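As a rough illustration of what a vowel-split unit can look like, here is a minimal sketch in Python. It is not the vowel splitter used by this recipe: the function name `vowel_split`, the vowel inventory, and the rule of cutting right after each vowel are all assumptions made for this example, and the actual tool may segment differently.

```python
# Hypothetical sketch: NOT the recipe's vowel splitter.
# Assumed SLP1 vowel inventory (short and long vowels, diphthongs).
SLP1_VOWELS = set("aAiIuUfFxXeEoO")


def vowel_split(word):
    """Cut an SLP1 word after every vowel.

    Consonants (and marks such as M or H) left over after the last vowel
    are attached to the final unit, so joining the units gives the word back.
    """
    units = []
    current = ""
    for ch in word:
        current += ch
        if ch in SLP1_VOWELS:
            # A vowel closes the current unit.
            units.append(current)
            current = ""
    if current:
        if units:
            units[-1] += current  # trailing non-vowels join the last unit
        else:
            units.append(current)  # word without any vowel: keep it whole
    return units


if __name__ == "__main__":
    print(vowel_split("rAmaH"))  # ['rA', 'maH']
```

Because the units concatenate back to the original word, a word-level transcript can be recovered from subword output by joining units, provided word boundaries are marked in some way.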

Training

Download the vowel splitter (it requires the input text to be in SLP1 format)

Download the pre-trained model

Download the processed dataset

Evaluate

From pre-trained model (SLP vowel split)

./decode.sh test
# | WER : 18.12
./decode.sh truetest
# | WER : 34.88
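
The WER figures above are word error rates in percent. For readers who want to sanity-check a number on their own transcripts, here is a minimal sketch of the standard definition (substitutions + deletions + insertions, divided by the number of reference words). It is not the scoring code behind `decode.sh`, and the function name `word_error_rate` is made up for this example. Kaldi ships its own `compute-wer` tool for this purpose.

```python
# Hypothetical sketch of the standard WER definition, not the recipe's scoring.
def word_error_rate(reference, hypothesis):
    """Return WER in percent between two whitespace-tokenized strings."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words (Levenshtein, row by row).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))  # 50.0
```

Note that WER can exceed 100 when the hypothesis contains many insertions.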

Publications

Devaraja Adiga, Rishabh Kumar, Amrith Krishna, Preethi Jyothi, Ganesh Ramakrishnan, and Pawan Goyal. Automatic Speech Recognition in Sanskrit: A New Speech Corpus and Modelling Insights. In ACL 2021.