MuLANN

Code and data for Multi-domain adversarial learning by Schoenauer-Sebag A., Heinrich L., Schoenauer M., Sebag M., Wu L. and Altschuler S., accepted at ICLR 2019. The manuscript and reviews are available on OpenReview.

Multi-domain learning (MDL) aims at obtaining a model with minimal average risk across multiple domains. Our empirical motivation is automated microscopy data, where cultured cells are imaged after being exposed to known and unknown chemical perturbations, and each dataset displays significant experimental bias. This paper presents a multi-domain adversarial learning approach, MuLANN, to leverage multiple datasets with overlapping but distinct class sets, in a semi-supervised setting. Our contributions include: i) a bound on the average- and worst-domain risk in MDL, obtained using the H-divergence; ii) a new loss to accommodate semi-supervised multi-domain learning and domain adaptation; iii) the experimental validation of the approach, improving on the state-of-the-art on three standard image benchmarks, and a novel bioimage dataset, Cell.
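
For intuition, here is a minimal PyTorch sketch of the gradient-reversal mechanism that underlies domain-adversarial training (Ganin & Lempitsky, 2015), which MuLANN builds on. It is an illustration only, not the exact MuLANN loss, and all names below are hypothetical.

import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    # Identity on the forward pass; multiplies gradients by -lambda on the
    # backward pass, so the feature extractor learns to fool the domain
    # discriminator while the discriminator learns to tell domains apart.
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def domain_adversarial_loss(features, domain_labels, discriminator, lambd=0.1):
    # Cross-entropy of a domain discriminator applied to reversed features.
    reversed_feats = GradReverse.apply(features, lambd)
    return nn.functional.cross_entropy(discriminator(reversed_feats), domain_labels)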

Note

I'm currently translating this repo from Torch7 to PyTorch. PyTorch code is currently available for the MNIST <> MNIST-M shift only. If your favorite run is not yet in PyTorch, it should be available soon. In the meantime, feel free to open an issue so that I can make it a priority.

Table of contents

  1. MNIST runs
  2. Office runs
  3. Cell dataset and runs

MNIST runs

Dependencies

You need PyTorch and Python 2 or 3. Specific packages:

Get MNIST and MNIST-M

The datasets are downloaded automatically to the <data_dir> of your choice when you launch training.
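
If you prefer to pre-fetch the data yourself, here is a hedged sketch of the kind of call issued at launch time. torchvision ships an MNIST downloader; MNIST-M has no torchvision equivalent, so the repo's own download logic handles that half.

from torchvision import datasets

data_dir = "/path/to/data_dir"  # your <data_dir>
# Downloads MNIST into <data_dir> if it is not already there.
datasets.MNIST(root=data_dir, train=True, download=True)
datasets.MNIST(root=data_dir, train=False, download=True)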

Launch a run

$ cd <code_folder>
$ python pysrc/mnist_exp.py --data_folder <data_dir> --result_folder <result_dir>

Other options:

Office runs

Dependencies

You need Torch7 and Python 2 or 3. Specific packages:

Get the Office data and VGG-16 model

  1. Download Office from here. Let us call <office_dir> the folder where you unpacked it.

  2. Prepare everything (down-scaling images to 3x256x256, creating folders, and downloading VGG-16):

$ cd <code_folder>
$ python pysrc/data/preparation_office.py --data_folder <office_dir>

If you already have a down-scaled copy of the Office dataset somewhere, just add office_folder=<office_folder> at the end of luasrc/settings.lua. You still need to download VGG-16 one way or another, e.g. by commenting out line 76 of preparation_office.py and running it anyway.
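
For reference, a rough sketch of the down-scaling step only, assuming Pillow; preparation_office.py is the authoritative version, and it also builds the folder layout and fetches VGG-16.

import os
from PIL import Image

def downscale_to_256(src_path, dst_path):
    # 3x256x256 means RGB at 256x256 pixels.
    img = Image.open(src_path).convert("RGB")
    img = img.resize((256, 256), Image.BILINEAR)
    dst_dir = os.path.dirname(dst_path)
    if dst_dir:
        os.makedirs(dst_dir, exist_ok=True)
    img.save(dst_path)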

Launch a run

$ cd <code_folder>/luasrc
$ THC_CACHING_ALLOCATOR=0 th office_script.lua eta0 source target lambda fold fully_transductive method zeta p

where

For example, to reproduce the paper results for MuLANN on A>W for the first fold:

$ cd <code_folder>/luasrc
$ THC_CACHING_ALLOCATOR=0 th office_script.lua 0.0001 amazon webcam 0.1 0 true MuLANN 0.1 0.7

Launch a run with class asymmetry

$ cd <code_folder>/luasrc
$ THC_CACHING_ALLOCATOR=0 th office_script.lua eta0 source target lambda fold fully_transductive method zeta p asymmetry_type

where the arguments are the same as above, and

For example, to reproduce the paper results for MuLANN on A>W for the first fold, in the case of full asymmetry:

$ cd <code_folder>/luasrc
$ THC_CACHING_ALLOCATOR=0 th office_script.lua 0.0001 amazon webcam 0.1 0 true MuLANN 0.1 0.7 full

Cell dataset and runs

Dependencies

You need Torch7, Fiji (if you will use the England dataset) and Python 2 or 3. Specific packages:

Get the data and VGG-16 model, pre-process the data

  1. Download California and Texas from here. Let us call <cell_dir> the folder where you unpacked it.

  2. Prepare the data (creating folders, and downloading VGG-16):

$ cd <code_folder>
$ python pysrc/data/preparation_Texas_California.py --data_folder <cell_dir>
  3. If you just want Texas and California, you can stop here. If you also want England, read on: we will download the dataset from the database where it is stored [Ljosa et al., 2012], stitch the images together, and finally down-scale them to the same size and scale as the others.
$ cd <code_folder>
$ python pysrc/data/dl_England.py --data_folder <cell_dir>
$ python pysrc/data/stitch_England.py --data_folder <cell_dir> --fiji <location of your Fiji executable, e.g. the path to ImageJ-linux64>
$ python pysrc/data/scale_England.py  --data_folder <cell_dir>
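
For the curious: stitch_England.py drives Fiji from Python, and a headless invocation of roughly this shape is what to expect. The macro name and arguments below are illustrative, not the script's actual ones.

import subprocess

fiji = "/path/to/Fiji.app/ImageJ-linux64"  # the --fiji argument above
# Run Fiji without a GUI and execute a stitching macro.
subprocess.check_call([fiji, "--headless", "--console", "-macro", "stitch.ijm"])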

Launch a run

$ cd <code_folder>/luasrc
$ THC_CACHING_ALLOCATOR=0 th bio_script.lua eta0 source target lambda fold method zeta p

where

License

Copyright 2018-2021, University of California, San Francisco

Author: Alice Schoenauer Sebag for the Altschuler and Wu Lab

All rights reserved. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3 of the License.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.