Adversarial Reprogramming on Speech Command Recognition
<img src="https://github.com/dodohow1011/SpeechAdvReprogram/blob/main/illustration.png" width="500">
Environment
TensorFlow 2.2 (CUDA=10.0) and Kapre 0.2.0.
- option 1 (from yml)
conda env create -f repr-scr.yml
source activate repr-scr
- option 2 (from clean python 3.6)
pip install tensorflow-gpu==2.1.0
pip install kapre==0.2.0
pip install h5py==2.10.0
Dataset
Arabic Speech Commands dataset
- Please download the Arabic Speech Commands dataset here.
./prepare_ar_data.sh
Lithuanian Speech Commands dataset
- Please download the Lithuanian Speech Commands dataset here.
./prepare_lt_data.sh
Dysarthric Speech Commands dataset
- Please download the Dysarthric Speech Commands dataset here.
./prepare_dm_data.sh
Training
To train and evaluate the three speech command recognition tasks, run:
./run_ar.sh
./run_lt.sh
./run_dm.sh
For more details, please refer to AR-SCR, LT-SCR, and DM-SCR.
(Optional) The default setting uses the random mapping strategy. To enable similarity mapping, modify multi_mapping in utils.py as follows:
def multi_mapping(prob, source_num, mapping_num, target_num):
    similarity_mapping = True
Then choose lable_map according to your task. To inspect the mapping results for each task, run the following commands:
python AR-SCR/source_target_pairing.py
python LT-SCR/source_target_pairing.py
python DM-SCR/source_target_pairing.py
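As a rough illustration of what a random many-to-one label mapping involves, the sketch below assigns a few source classes to each target class at random and then pools the source-model scores per target class. It is not the code in utils.py. The function names random_label_map and aggregate_scores, the averaging rule, and the class counts are assumptions chosen for illustration; only the parameter names source_num, mapping_num, and target_num are borrowed from the multi_mapping signature above.

```python
import random


def random_label_map(source_num, mapping_num, target_num, seed=0):
    """Assign `mapping_num` distinct source classes to each target class.

    Hypothetical helper: groups are disjoint and drawn at random.
    """
    needed = mapping_num * target_num
    if needed > source_num:
        raise ValueError("not enough source classes for a disjoint mapping")
    rng = random.Random(seed)
    shuffled = rng.sample(range(source_num), needed)
    # Target class t owns the t-th consecutive chunk of the shuffled indices.
    return [
        shuffled[t * mapping_num:(t + 1) * mapping_num]
        for t in range(target_num)
    ]


def aggregate_scores(prob, label_map):
    """Average the source-class probabilities mapped to each target class.

    Averaging is an assumption here; other pooling rules are possible.
    """
    return [sum(prob[s] for s in group) / len(group) for group in label_map]


# Illustrative sizes only: 35 source classes, 3 per target, 6 target classes.
label_map = random_label_map(source_num=35, mapping_num=3, target_num=6)

# Uniform source-model output, used purely as placeholder input.
prob = [1.0 / 35] * 35
target_scores = aggregate_scores(prob, label_map)

# The predicted target class is the one with the highest pooled score.
predicted = max(range(len(target_scores)), key=target_scores.__getitem__)
```

A similarity-based mapping would replace the random draw in random_label_map with a choice driven by how closely each source class matches each target class, while the pooling step could stay the same.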
Please consider citing this work if you use the provided code or find the idea related to your research. Thank you!
- A Study of Low-Resource Speech Commands Recognition Based on Adversarial Reprogramming Paper
@article{yen2023neural,
title={Neural model reprogramming with similarity based mapping for low-resource spoken command classification},
author={Yen, Hao and Ku, Pin-Jui and Yang, Chao-Han Huck and Hu, Hu and Siniscalchi, Sabato Marco and Chen, Pin-Yu and Tsao, Yu},
journal={Proc. of Interspeech},
year={2023}
}
Related References
- Voice2Series: Reprogramming Acoustic Models for Time Series Classification Paper
@InProceedings{pmlr-v139-yang21j,
title = {Voice2Series: Reprogramming Acoustic Models for Time Series Classification},
author = {Yang, Chao-Han Huck and Tsai, Yun-Yun and Chen, Pin-Yu},
booktitle = {Proceedings of the 38th International Conference on Machine Learning},
pages = {11808--11819},
year = {2021},
volume = {139},
series = {Proceedings of Machine Learning Research},
month = {18--24 Jul},
publisher = {PMLR},
}