Adversarial Reprogramming on Speech Command Recognition

Environment

TensorFlow 2.2 (CUDA=10.0) and Kapre 0.2.0.

Option 1 (from yml)

conda env create -f repr-scr.yml
source activate repr-scr

Option 2 (from clean Python 3.6)

pip install tensorflow-gpu==2.1.0
pip install kapre==0.2.0
pip install h5py==2.10.0
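
To confirm that an environment matches the pins above, a small check along the following lines may help. It is only a sketch: the `PINNED` table simply copies the versions from the pip commands in option 2 (an environment created from the yml file may differ), and the helper names `installed_version` and `find_mismatches` are hypothetical, not part of this repository.

```python
# Versions copied from the pip commands above (assumption: option 2 is used).
PINNED = {"tensorflow-gpu": "2.1.0", "kapre": "0.2.0", "h5py": "2.10.0"}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for Python 3.6/3.7 (needs setuptools)
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print("{}: expected {}, found {}".format(name, wanted, found))
```

Running the script prints one line per package that is missing or at a different version, and prints nothing when everything matches.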

Dataset

Arabic Speech Commands dataset

Please download the Arabic Speech Commands dataset here.

./prepare_ar_data.sh

Lithuanian Speech Commands dataset

Please download the Lithuanian Speech Commands dataset here.

./prepare_lt_data.sh

Dysarthric Speech Commands dataset

Please download the Dysarthric Speech Commands dataset here.

./prepare_dm_data.sh

Training

To train and evaluate the three speech command recognition tasks, run:

./run_ar.sh
./run_lt.sh
./run_dm.sh

For more details, please refer to AR-SCR, LT-SCR, and DM-SCR.

(Optional) Our default setting uses the random mapping strategy. To enable similarity mapping, modify utils.py as follows:

def multi_mapping(prob, source_num, mapping_num, target_num):
    
    similarity_mapping = True

Then choose lable_map according to your task. You can also inspect the mapping results for each task by running the following commands:

python AR-SCR/source_target_pairing.py
python LT-SCR/source_target_pairing.py
python DM-SCR/source_target_pairing.py
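
For readers unfamiliar with label mapping, the random strategy mentioned above can be pictured roughly as follows. This is a minimal sketch and not the repository's `multi_mapping`: it assumes a many-to-one mapping in which each target class is assigned `mapping_num` distinct source classes and the source probabilities in each group are averaged. The argument names are borrowed from the signature shown above, but their meaning here is an assumption, and `random_label_map` and `aggregate` are hypothetical helpers.

```python
import random


def random_label_map(source_num, mapping_num, target_num, seed=0):
    """Randomly assign `mapping_num` distinct source classes to each target class."""
    needed = mapping_num * target_num
    if needed > source_num:
        raise ValueError("not enough source classes for a disjoint mapping")
    rng = random.Random(seed)
    chosen = rng.sample(range(source_num), needed)  # sampled without replacement
    return [chosen[t * mapping_num:(t + 1) * mapping_num] for t in range(target_num)]


def aggregate(prob, label_map):
    """Average the source-class probabilities mapped to each target class."""
    return [sum(prob[s] for s in group) / len(group) for group in label_map]


if __name__ == "__main__":
    # Hypothetical sizes: 35 source classes, 3 per target class, 10 target classes.
    label_map = random_label_map(source_num=35, mapping_num=3, target_num=10)
    uniform = [1.0 / 35] * 35
    print(label_map)
    print(aggregate(uniform, label_map))
```

A similarity-based strategy would replace the random sampling step with a choice driven by how close source and target classes are; the averaging step could stay the same under the assumptions stated here.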

Please consider citing this work if you use the provided code or find the idea relevant to your research. Thank you!

A Study of Low-Resource Speech Commands Recognition Based on Adversarial Reprogramming Paper


@article{yen2023neural,
  title={Neural model reprogramming with similarity based mapping for low-resource spoken command classification},
  author={Yen, Hao and Ku, Pin-Jui and Yang, Chao-Han Huck and Hu, Hu and Siniscalchi, Sabato Marco and Chen, Pin-Yu and Tsao, Yu},
  journal={Proc. of Interspeech},
  year={2023}
}

Related References

Voice2Series: Reprogramming Acoustic Models for Time Series Classification Paper


@InProceedings{pmlr-v139-yang21j,
  title = 	 {Voice2Series: Reprogramming Acoustic Models for Time Series Classification},
  author =       {Yang, Chao-Han Huck and Tsai, Yun-Yun and Chen, Pin-Yu},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {11808--11819},
  year = 	 {2021},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
}

Database for Arabic Speech Commands Recognition Paper
Voice Activation for Low-Resource Languages Paper
Unsupervised Pre-Training for Voice Activation Paper
A Speech Command Control-Based Recognition System for Dysarthric Patients Based on Deep Learning Technology Paper