BC learning for sounds

Implementation of Learning from Between-class Examples for Deep Sound Recognition by Yuji Tokozume, Yoshitaka Ushiku, and Tatsuya Harada (ICLR 2018).

This repository also includes training code for EnvNet: Learning Environmental Sounds with End-to-end Convolutional Neural Network (Yuji Tokozume and Tatsuya Harada, ICASSP 2017).<sup>1</sup>
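For readers new to BC learning, the core idea in the ICLR 2018 paper is to mix two sounds from different classes at a random ratio and train the network to output that ratio. The sketch below illustrates the mixing step only and is not taken from this repository: the function names (`rms_gain_db`, `mix_between_class`) are hypothetical, and the gain is approximated by whole-clip RMS level, whereas the paper uses an A-weighted sound pressure level.

```python
import numpy as np


def rms_gain_db(sound, eps=1e-12):
    """Approximate loudness in dB from whole-clip RMS (a simplifying assumption)."""
    rms = np.sqrt(np.mean(np.square(sound)))
    return 20.0 * np.log10(max(rms, eps))


def mix_between_class(sound1, label1, sound2, label2, n_classes, rng):
    """Mix two equal-length clips and build the matching soft label."""
    # Random mixing ratio; the tiny margin avoids division by zero below.
    r = rng.uniform(1e-6, 1.0 - 1e-6)

    # Compensate for the loudness difference so the perceived ratio is about r.
    g1, g2 = rms_gain_db(sound1), rms_gain_db(sound2)
    p = 1.0 / (1.0 + 10.0 ** ((g1 - g2) / 20.0) * (1.0 - r) / r)

    # Weighted sum, rescaled so the mix keeps a comparable energy level.
    mixed = (p * sound1 + (1.0 - p) * sound2) / np.sqrt(p ** 2 + (1.0 - p) ** 2)

    # Soft target: r for the first class, 1 - r for the second.
    target = np.zeros(n_classes, dtype=np.float64)
    target[label1] += r
    target[label2] += 1.0 - r
    return mixed.astype(np.float32), target.astype(np.float32)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 16000)
    clip_a = np.sin(2 * np.pi * 440 * t)        # stand-in for a class-3 sound
    clip_b = 0.3 * np.sin(2 * np.pi * 880 * t)  # stand-in for a class-7 sound
    mixed, target = mix_between_class(clip_a, 3, clip_b, 7, n_classes=10, rng=rng)
    print(mixed.shape, target)
```

In the paper, the network is then trained on such mixed inputs with a KL-divergence loss between the soft target and the softmax output.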

News

Contents

Setup

Training

Results

Error rate (Standard learning → BC learning)

| Model | ESC-50 | ESC-10 | UrbanSound8K |
|:--|:-:|:-:|:-:|
| EnvNet | 29.2 → 24.1 | 12.8 → 11.3 | 33.7 → 28.9 |
| EnvNet-v2 | 25.6 → 18.2 | 14.2 → 10.6 | 30.9 → 23.4 |
| EnvNet-v2 + <br> strong augment | 21.2 → 15.1 | 10.9 → 8.6 | 24.9 → 21.7 |
| Humans [1] | 18.7 | 4.3 | - |
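To compare the gains across models, the error rates above can be converted into relative reductions. The snippet below is a small illustrative calculation on the ESC-50 column only; the dictionary and the `relative_reduction` helper are hypothetical names, not part of this repository.

```python
# ESC-50 error rates (%) from the results table: (standard learning, BC learning).
esc50_errors = {
    "EnvNet": (29.2, 24.1),
    "EnvNet-v2": (25.6, 18.2),
    "EnvNet-v2 + strong augment": (21.2, 15.1),
}


def relative_reduction(standard, bc):
    """Relative error reduction in percent when moving from standard to BC learning."""
    return 100.0 * (standard - bc) / standard


for model, (standard, bc) in esc50_errors.items():
    print(f"{model}: {standard - bc:.1f} points absolute, "
          f"{relative_reduction(standard, bc):.1f}% relative")
```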

See also

Between-class Learning for Image Classification (github)


<i id=1></i><sup>1</sup> Training/testing schemes are simplified from those in the ICASSP paper.

<i id=2></i><sup>2</sup> It is higher than that reported in the ICASSP paper (36% error), mainly because here we use 4 out of 5 folds for training, whereas we used only 3 folds in the ICASSP paper.

Reference

<i id=1></i>[1] Karol J. Piczak. ESC: Dataset for environmental sound classification. In ACM Multimedia, 2015.

<i id=2></i>[2] Justin Salamon, Christopher Jacoby, and Juan Pablo Bello. A dataset and taxonomy for urban sound research. In ACM Multimedia, 2014.