sphinxtrain-ruby
Toolkit for training/adapting CMU Sphinx acoustic models.
The main goal is to help adapt existing acoustic models to a specific speaker or accent. Currently only the English Voxforge model is supported as a base; more precisely, the base is an adapted version created by Peter Grasch in 2013 using the most up-to-date training data available at that time. I can confirm his results: a performance increase of a few percent over Voxforge 0.4, at least for my accent (British English).
Installation
Please note that Ruby >= 2.1 is required for this gem.
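To check whether a given interpreter meets this requirement, something like the following can be run with the Ruby you intend to use. It is only a sketch: the helper name ruby_supported? is hypothetical and is not part of the gem.

```ruby
# Hypothetical helper (not part of sphinxtrain-ruby): returns true when the
# given Ruby version string satisfies the ">= 2.1" requirement stated above.
def ruby_supported?(version_string)
  Gem::Version.new(version_string) >= Gem::Version.new('2.1')
end

if ruby_supported?(RUBY_VERSION)
  puts "Ruby #{RUBY_VERSION} meets the >= 2.1 requirement"
else
  puts "Ruby #{RUBY_VERSION} is too old; 2.1 or newer is required"
end
```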
Add this line to your application's Gemfile:
gem 'sphinxtrain-ruby'
And then execute:
$ bundle
Or install it yourself as:
$ gem install sphinxtrain-ruby
Usage
Run sphinxtrain-ruby from the command line and follow the instructions. It will:
- Download and extract the Grasch Voxforge English 0.4 acoustic model (on first run)
- Download the CMU ARCTIC example sentences (on first run)
- Record the 20 example sentences. Press enter to record, speak the sentence, then wait.
- Decode the sentences using the base acoustic model, giving an overall score.
- Duplicate and adapt the base acoustic model using the recorded sentences.
- Decode the sentences using the adapted acoustic model, giving an overall score.
See some example output here. All data is saved in ~/.sphinxtrain-ruby.
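This README does not say how the overall score is calculated. One common way to compare a base and an adapted model is word error rate (WER): the word-level edit distance between each reference sentence and its decoded hypothesis, divided by the total number of reference words. The sketch below illustrates that idea only; the method names and sample sentences are hypothetical, and it should not be read as the gem's own scoring code.

```ruby
# Word-level Levenshtein distance between a reference sentence and a
# decoded hypothesis (counts substitutions, insertions and deletions).
def word_edit_distance(reference, hypothesis)
  ref = reference.downcase.split
  hyp = hypothesis.downcase.split

  # previous[j] is the distance between the reference words seen so far
  # and the first j hypothesis words.
  previous = (0..hyp.length).to_a

  ref.each_with_index do |ref_word, i|
    current = [i + 1]
    hyp.each_with_index do |hyp_word, j|
      cost = ref_word == hyp_word ? 0 : 1
      current << [
        previous[j + 1] + 1, # deletion of a reference word
        current[j] + 1,      # insertion of a hypothesis word
        previous[j] + cost   # match or substitution
      ].min
    end
    previous = current
  end

  previous.last
end

# Overall WER across [reference, hypothesis] pairs: total word errors
# divided by total reference words. Lower is better.
def word_error_rate(pairs)
  errors = pairs.inject(0) { |total, (ref, hyp)| total + word_edit_distance(ref, hyp) }
  words  = pairs.inject(0) { |total, (ref, _hyp)| total + ref.split.length }
  errors.to_f / words
end

# Hypothetical sample data, not taken from the CMU ARCTIC sentences.
pairs = [
  ['the cat sat', 'the cat sat'],
  ['hello world', 'hello word']
]
puts format('WER: %.1f%%', word_error_rate(pairs) * 100)
```

Running the same calculation on the base and adapted decodings of the recorded sentences would give two comparable figures.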
To Do
- Add support for different data sets, not just the example from CMU Sphinx
- Allow re-recording when mistakes are made
- Refactor code and add specs
- Consider using actual libs rather than command line tools for adapting model
- Make command line wget downloads less verbose
Contributing
- Fork it ( https://github.com/watsonbox/sphinxtrain-ruby/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request