Wasserstein Barycenter Transport for Multi-Source Domain Adaptation

This repository contains the implementation of the Wasserstein Barycenter Transport (WBT) algorithm, presented in the following publications:

Eduardo F. Montesuma, Fred-Maurice Ngolè Mboula, "Wasserstein Barycenter for Multi-Source Domain Adaptation", IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2021. [Paper] [Supplementary]

Eduardo F. Montesuma, Fred-Maurice Ngolè Mboula, "Wasserstein Barycenter Transport for Acoustic Adaptation", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021. [IEEE Xplore]

Modules

This repo provides a single package that implements all of the domain adaptation algorithms we tested. In particular, TCA and KMM are implemented with the libtlda toolbox, and the OT-related methods are implemented with the POT toolbox. The implementations can be found in the ./msda folder.
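As a rough intuition for what a "barycenter, then transport" pipeline does, here is a toy one-dimensional sketch in pure Python. It is not the code from ./msda and does not use POT; the function names `barycenter_1d` and `monotone_map` are hypothetical, and the sketch assumes equal-size samples with uniform weights, where the Wasserstein-2 barycenter reduces to averaging order statistics and optimal transport reduces to matching points by rank.

```python
def barycenter_1d(sources, weights=None):
    """Wasserstein-2 barycenter of equal-size 1D samples (uniform sample weights).

    In 1D the barycenter's i-th order statistic is the weighted average
    of the i-th order statistics of the source samples.
    """
    n = len(sources[0])
    if any(len(s) != n for s in sources):
        raise ValueError("this sketch assumes equal sample sizes")
    if weights is None:
        weights = [1.0 / len(sources)] * len(sources)
    sorted_sources = [sorted(s) for s in sources]
    return [
        sum(w * s[i] for w, s in zip(weights, sorted_sources))
        for i in range(n)
    ]


def monotone_map(points, target):
    """Optimal 1D transport between equal-size samples: match points by rank."""
    if len(points) != len(target):
        raise ValueError("this sketch assumes equal sample sizes")
    order = sorted(range(len(points)), key=lambda i: points[i])
    sorted_target = sorted(target)
    mapped = [0.0] * len(points)
    for rank, idx in enumerate(order):
        mapped[idx] = sorted_target[rank]
    return mapped


# Two toy "source domains" and one "target domain" (made-up numbers).
sources = [[0.0, 1.0, 2.0], [4.0, 6.0, 5.0]]
target = [10.0, 12.0, 11.0]

bary = barycenter_1d(sources)        # intermediate domain between the sources
moved = monotone_map(bary, target)   # barycenter samples moved onto the target
```

In higher dimensions neither step has a closed form like this, which is where an OT solver such as POT comes in.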

Data

You can either use the pre-extracted features (available in the ./data folder) or download the raw samples and run the feature-generation scripts provided in this repo.

Music-Speech Discrimination

  1. Music Speech Recognition [Source] [Direct Link]
  2. Noise Dataset [Source]

Music Genre Recognition

  1. GTZAN Music Genre Recognition [Source] [Direct Link]
  2. Noise Dataset [Source]

Face Recognition

  1. PIE Dataset [Source]

Object Recognition

  1. Caltech-Office Decaf features [Source] [Direct Link]

NOTE: the ICASSP publication covers only Music-Speech Discrimination and Music Genre Recognition. The CVPR publication covers all four tasks.

Results

Results for Music Genre Recognition (MGR)

| Method | Buccaneer2 | Destroyerengine | F16 | Factory2 |
|---|---|---|---|---|
| Baseline | 22.90 ± 0.84 | 38.25 ± 0.91 | 51.57 ± 1.11 | 47.80 ± 0.34 |
| KMM | 21.75 ± 0.99 | 39.25 ± 0.66 | 49.81 ± 1.69 | 47.37 ± 0.71 |
| TCA | <ins>58.95 ± 1.27</ins> | 60.67 ± 2.07 | <ins>68.75 ± 2.11</ins> | 59.82 ± 0.50 |
| SinT | 56.35 ± 0.84 | <ins>61.92 ± 1.64</ins> | 66.72 ± 1.86 | 61.77 ± 1.65 |
| SinT<sub>reg</sub> | 58.02 ± 1.45 | 60.47 ± 1.75 | 66.55 ± 1.60 | <ins>63.87 ± 1.51</ins> |
| JCPOT | 35.87 ± 0.41 | 48.47 ± 2.97 | 51.92 ± 3.25 | 51.95 ± 1.75 |
| JCPOT-LP | 36.40 ± 0.39 | 52.92 ± 1.32 | 56.30 ± 0.37 | 51.52 ± 2.28 |
| WBT | 21.37 ± 2.25 | 24.30 ± 2.71 | 25.30 ± 6.02 | 22.70 ± 2.25 |
| WBT<sub>reg</sub> | 70.60 ± 1.33 | 83.10 ± 1.64 | 83.92 ± 1.01 | 90.00 ± 0.86 |
| Target-only | 67.43 ± 1.43 | 67.96 ± 2.91 | 66.86 ± 2.00 | 68.37 ± 1.87 |

Results for Music-Speech Discrimination (MSD)

| Method | Buccaneer2 | Destroyerengine | F16 | Factory2 |
|---|---|---|---|---|
| Baseline | 82.43 ± 1.75 | 51.57 ± 2.56 | 88.89 ± 2.72 | 50.02 ± 2.21 |
| KMM | 87.12 ± 2.79 | 52.35 ± 2.94 | 74.86 ± 5.58 | 50.41 ± 2.17 |
| TCA | 90.43 ± 1.40 | 87.14 ± 4.99 | 95.12 ± 2.02 | 84.76 ± 3.30 |
| SinT | 89.26 ± 1.56 | 82.84 ± 2.78 | 84.97 ± 3.09 | <ins>91.21 ± 2.04</ins> |
| SinT<sub>reg</sub> | 87.28 ± 2.97 | 84.38 ± 1.76 | 86.14 ± 2.79 | 90.61 ± 1.68 |
| JCPOT | <ins>92.55 ± 2.11</ins> | <ins>87.89 ± 1.39</ins> | 88.67 ± 1.67 | 82.41 ± 2.22 |
| JCPOT-LP | 89.06 ± 1.38 | 84.97 ± 3.23 | 90.24 ± 1.71 | 86.13 ± 1.88 |
| WBT | 56.88 ± 9.54 | 56.63 ± 6.88 | 56.63 ± 6.56 | 59.38 ± 2.61 |
| WBT<sub>reg</sub> | 96.42 ± 1.48 | 92.79 ± 2.95 | <ins>93.75 ± 0.97</ins> | 95.31 ± 1.11 |
| Target-only | 90.51 ± 3.98 | 93.07 ± 3.81 | 89.23 ± 4.25 | 92.30 ± 3.62 |

Citation

If you find this work useful in your research, please consider citing us using the BibTeX entries below:

CVPR

@InProceedings{montesuma2021cvpr,
    author    = {Montesuma, Eduardo Fernandes and Mboula, Fred Maurice Ngole},
    title     = {Wasserstein Barycenter for Multi-Source Domain Adaptation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {16785-16793}
}

ICASSP

@INPROCEEDINGS{montesuma2021icassp,
  author={Montesuma, Eduardo F. and Ngolè Mboula, Fred-Maurice},
  booktitle={ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
  title={Wasserstein Barycenter Transport for Acoustic Adaptation}, 
  year={2021},
  volume={},
  number={},
  pages={3405-3409},
  doi={10.1109/ICASSP39728.2021.9414199}}