Speaker embeddings for Text-independent speaker verification using TensorFlow, with Kaldi

This is a slightly modified TensorFlow implementation of the model presented by David Snyder et al. in "Deep Neural Network Embeddings for Text-Independent Speaker Verification".
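As a rough illustration of the central idea in the paper, the network turns a variable-length sequence of frame-level outputs into one fixed-length vector by pooling the mean and standard deviation over time. The NumPy sketch below shows only that pooling step. The function name, array sizes, and random input are assumptions for illustration, and the code in this repository may be organized differently.

```python
import numpy as np


def statistics_pooling(frame_outputs):
    """Pool a (num_frames, dim) array into one (2 * dim,) vector.

    The result is the per-dimension mean followed by the per-dimension
    standard deviation, both computed over the time (frame) axis.
    """
    frame_outputs = np.asarray(frame_outputs, dtype=float)
    mean = frame_outputs.mean(axis=0)
    std = frame_outputs.std(axis=0)
    return np.concatenate([mean, std])


# Hypothetical input: 200 frames of 8-dimensional frame-level activations.
rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 8))

pooled = statistics_pooling(frames)
print(pooled.shape)  # (16,)
```

Because the pooled vector has the same size for any number of frames, the layers after it can work on whole utterances of different lengths.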

In the paper, this algorithm performs slightly worse than the i-vector baseline. My tests show similar results. In my tests, a shallow network was also only marginally worse than a deep network (this depends on the database).

This code contains many hard-coded values, such as folder locations and some database-related parameters. If I had a well-known speaker recognition database, I would test on it, but I only have a private database.

I hope this code helps researchers.

Credits

Original paper:

@unknown{unknown,
author = {Snyder, David and Garcia-Romero, Daniel and Povey, Daniel and Khudanpur, Sanjeev},
title = {Deep Neural Network Embeddings for Text-Independent Speaker Verification},
year = {2017}
}

This project also uses parts of code from:

Features

Requirements

Usage

Preparation:

  1. Clone the repository recursively to get all folders and subfolders.
  2. Prepare the database. (I use a private DB. The scripts need to be modified to work with yours.)
  3. Use the Kaldi recipe in SRE10/v1/run.sh to extract MFCC features and VAD decisions.
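To make step 3 concrete, the sketch below shows how per-frame VAD decisions are typically applied to an MFCC matrix: frames marked as non-speech are dropped before training. It assumes the features and VAD decisions have already been loaded into NumPy arrays (reading Kaldi archives is not shown), and the function name and toy values are hypothetical rather than taken from this repository.

```python
import numpy as np


def select_voiced_frames(mfcc, vad):
    """Keep only the rows of `mfcc` whose VAD decision marks speech.

    mfcc: (num_frames, num_coeffs) feature matrix.
    vad:  (num_frames,) vector of 0/1 decisions, one per frame.
    """
    mfcc = np.asarray(mfcc)
    mask = np.asarray(vad) > 0.5
    if mask.shape[0] != mfcc.shape[0]:
        raise ValueError("MFCC and VAD must have the same number of frames")
    return mfcc[mask]


# Toy example: 6 frames with 2 coefficients each; frames 1, 2 and 4 are speech.
mfcc = np.arange(12, dtype=float).reshape(6, 2)
vad = np.array([0, 1, 1, 0, 1, 0])

voiced = select_voiced_frames(mfcc, vad)
print(voiced.shape)  # (3, 2)
```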

Running:

  1. Run the Training_kaldi function in make_dvec.py, then run the embedding_kaldi function. (Some functions contain hard-coded paths. Change them to your file locations.)
  2. Use the Kaldi recipe to compute the mean vectors and run PLDA scoring. You probably only need to run the steps after /local/extract_ivectors.sh --stage 2 for each folder.
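The mean-vector part of step 2 can be pictured with the small NumPy sketch below: embeddings from the same speaker are averaged into one vector, and a test embedding is compared against each mean. Note that the comparison here uses cosine similarity as a simplified stand-in; it is not PLDA, which the Kaldi recipe handles. All names and values are made up for illustration.

```python
import numpy as np


def speaker_means(embeddings, speaker_ids):
    """Average the embeddings of each speaker into one mean vector."""
    grouped = {}
    for emb, spk in zip(embeddings, speaker_ids):
        grouped.setdefault(spk, []).append(emb)
    return {spk: np.mean(vecs, axis=0) for spk, vecs in grouped.items()}


def cosine_score(a, b):
    """Cosine similarity between two vectors (stand-in for PLDA scoring)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy enrollment data: two embeddings for each of two hypothetical speakers.
embeddings = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]])
speaker_ids = ["spk_a", "spk_a", "spk_b", "spk_b"]
means = speaker_means(embeddings, speaker_ids)

# Score a test embedding against every enrolled speaker mean.
test_embedding = np.array([0.9, 0.1])
scores = {spk: cosine_score(mean, test_embedding) for spk, mean in means.items()}
best = max(scores, key=scores.get)
print(best)  # spk_a
```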

Authors

qqueing@gmail.com (or kindsinu@naver.com)