Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation [Paper]
A PyTorch suite to systematically evaluate different domain adaptation methods.
Requirements:
- Python3
- PyTorch==1.7
- Numpy==1.20.1
- scikit-learn==0.24.1
- Pandas==1.2.4
- skorch==0.10.0 (For DEV risk calculations)
- openpyxl==3.0.7 (for classification reports)
- Wandb==0.12.7
- Hydra==1.2.0
- OmegaConf==2.2.3
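To confirm that an environment matches the pinned versions, a small check along the following lines may help. It is a sketch that uses only the standard library; the PyPI distribution names used as keys (for example torch and hydra-core) and the helper names matches and check_versions are assumptions of this example, not part of the repository.

```python
from importlib import metadata

# Pinned versions from the list above, keyed by PyPI distribution name
# (the distribution names are an assumption of this sketch).
PINS = {
    "torch": "1.7",
    "numpy": "1.20.1",
    "scikit-learn": "0.24.1",
    "pandas": "1.2.4",
    "skorch": "0.10.0",
    "openpyxl": "3.0.7",
    "wandb": "0.12.7",
    "hydra-core": "1.2.0",
    "omegaconf": "2.2.3",
}


def matches(installed, pinned):
    """True if `installed` equals `pinned` or extends it (e.g. 1.7 -> 1.7.1+cu110)."""
    return installed == pinned or installed.startswith((pinned + ".", pinned + "+"))


def check_versions(pins):
    """Return {name: (pinned, installed_or_None)} for every package that deviates."""
    problems = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None or not matches(installed, pinned):
            problems[name] = (pinned, installed)
    return problems


if __name__ == "__main__":
    for name, (pinned, installed) in check_versions(PINS).items():
        print(f"{name}: expected {pinned}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at the pinned version.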
Installing
- Clone repository
git clone git@github.com:<repo>
cd bpda
- Create a Python 3 conda environment
conda env create -f environment.yml
- Ensure that all required temp directories are available
data
Datasets
Available Datasets
We used four public datasets in this study. We also provide the preprocessed versions as follows:
Adding New Dataset
Structure of data
To add a new dataset (e.g., NewData), place it in a folder named NewData in the datasets directory.
Since NewData has several domains, each domain should be split into train and test files named train_x.pt and test_x.pt, where x is the domain identifier.
Each data file should contain a dictionary of the following form:
train_x.pt = {"samples": data, "labels": labels}
and similarly for test_x.pt.
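The file layout described above can be produced with torch.save. The following is a minimal sketch under stated assumptions: the helper save_domain, the integer domain identifiers, the tensor shapes, and the temporary output directory are all illustrative; in a real setup the files would go into the NewData folder of the datasets directory.

```python
import tempfile
from pathlib import Path

import torch


def save_domain(root, domain_id, train_x, train_y, test_x, test_y):
    """Write train_<id>.pt and test_<id>.pt as {"samples", "labels"} dictionaries."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    torch.save({"samples": train_x, "labels": train_y}, root / f"train_{domain_id}.pt")
    torch.save({"samples": test_x, "labels": test_y}, root / f"test_{domain_id}.pt")


# Placeholder data: 3 domains, 20 train / 5 test samples each,
# 8 features per sample, 4 classes (all sizes are illustrative).
dataset_dir = Path(tempfile.mkdtemp()) / "NewData"
for domain_id in range(3):
    save_domain(
        dataset_dir, domain_id,
        torch.randn(20, 8), torch.randint(0, 4, (20,)),
        torch.randn(5, 8), torch.randint(0, 4, (5,)),
    )

# Reload one split to confirm the dictionary structure.
loaded = torch.load(dataset_dir / "train_0.pt")
print(sorted(loaded), tuple(loaded["samples"].shape))
```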
Configurations
Next, you have to add a class with the name NewData in the configs/data_model_configs.py file.
You can find similar classes for existing datasets as guidelines.
Also, you have to specify the cross-domain scenarios in the self.scenarios variable.
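As a rough illustration, a dataset class of this kind might look like the sketch below. Only the self.scenarios attribute is taken from the description above; encoding each scenario as a (source, target) pair of domain identifiers, and the attributes num_classes and input_dim, are assumptions. The existing dataset classes in the same file are the authoritative template.

```python
class NewData:
    """Hypothetical dataset configuration for configs/data_model_configs.py."""

    def __init__(self):
        # The cross-domain scenarios live in self.scenarios. Encoding each one
        # as a (source, target) pair of domain identifiers is an assumption.
        self.scenarios = [("0", "1"), ("1", "2"), ("2", "0")]

        # Everything below is illustrative; copy the real attribute names
        # from an existing dataset class in the same file.
        self.num_classes = 4
        self.input_dim = 8


config = NewData()
for src, trg in config.scenarios:
    print(f"{src} -> {trg}")
```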
Last, you have to add another class with the name NewData in the configs/hparams.py file to specify the training parameters.
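A training-parameter class could be organized as in the following sketch. The attribute names train_params and alg_hparams, the parameter names and values, and the helper get_hparams are all hypothetical; the existing classes in configs/hparams.py show the structure the trainer actually expects.

```python
class NewData:
    """Hypothetical training configuration for configs/hparams.py."""

    def __init__(self):
        # Settings shared by every algorithm (names and values are placeholders).
        self.train_params = {"num_epochs": 40, "batch_size": 32, "weight_decay": 1e-4}

        # Per-algorithm settings, keyed by algorithm name (also placeholders).
        self.alg_hparams = {
            "NewAlgorithm": {"learning_rate": 1e-3, "trade_off": 1.0},
        }


def get_hparams(dataset_config, algorithm_name):
    """Merge shared and algorithm-specific parameters into one dictionary."""
    return {**dataset_config.train_params, **dataset_config.alg_hparams[algorithm_name]}


hparams = get_hparams(NewData(), "NewAlgorithm")
print(hparams)
```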
Domain Adaptation Algorithms
Existing Algorithms
Adding New Algorithm
To add a new algorithm, place it in the algorithms/algorithms.py file.
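For orientation, a new algorithm might be shaped like the skeleton below. This is a sketch, not the repository's interface: the class name NewAlgorithm, the update method and its arguments, and the constructor parameters are assumptions, and the alignment term is a plain squared distance between mean source and target features, used here only as a simple stand-in for a domain adaptation loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NewAlgorithm(nn.Module):
    """Hypothetical skeleton; the base class and method names used in
    algorithms/algorithms.py may differ."""

    def __init__(self, input_dim, num_classes, hidden_dim=32, trade_off=1.0, lr=1e-3):
        super().__init__()
        self.feature_extractor = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.classifier = nn.Linear(hidden_dim, num_classes)
        self.trade_off = trade_off
        self.optimizer = torch.optim.Adam(self.parameters(), lr=lr)

    def update(self, src_x, src_y, trg_x):
        """One optimization step on a labeled source batch and an unlabeled target batch."""
        src_feat = self.feature_extractor(src_x)
        trg_feat = self.feature_extractor(trg_x)

        # Supervised loss on the source domain only (target labels are unavailable).
        cls_loss = F.cross_entropy(self.classifier(src_feat), src_y)
        # Stand-in alignment term: squared distance between mean features.
        align_loss = (src_feat.mean(dim=0) - trg_feat.mean(dim=0)).pow(2).sum()
        loss = cls_loss + self.trade_off * align_loss

        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()
        return {
            "cls_loss": cls_loss.item(),
            "align_loss": align_loss.item(),
            "total_loss": loss.item(),
        }


torch.manual_seed(0)
algo = NewAlgorithm(input_dim=8, num_classes=3)
losses = algo.update(torch.randn(16, 8), torch.randint(0, 3, (16,)), torch.randn(16, 8))
print(losses)
```

The trade_off weight is the kind of parameter whose choice the paper addresses, which is why the sketch exposes it explicitly.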
Training procedure
To train the models run:
./run.sh
To collect the results run:
./collect_results.sh
Upper and Lower bounds
The main trainer file is trainer.py; when executed, it also produces the source-only results.
Results
- Each run stores the results of all cross-domain scenarios in the format runx_src_to_trg, where x is the run_id.
- Under each directory, you will find the classification report, a log file, a checkpoint, and the different risk scores.
- At the end of all the runs, you will find the overall average and std results in the run directory.
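The averaging over runs described above can be sketched as follows. The regular expression assumes directory names of the exact form run<id>_<src>_to_<trg>, the scores are made-up numbers, and using the sample standard deviation is an assumption; the repository's own collection script may differ on each of these points.

```python
import re
import statistics
from collections import defaultdict

# Assumed directory naming: run<id>_<src>_to_<trg>, e.g. run0_2_to_11.
RUN_DIR = re.compile(r"^run(?P<run_id>\d+)_(?P<src>.+?)_to_(?P<trg>.+)$")


def aggregate(scores):
    """Map {directory name: score} to {(src, trg): (mean, std)} across runs."""
    per_scenario = defaultdict(list)
    for name, score in scores.items():
        match = RUN_DIR.match(name)
        if match is None:
            continue  # skip anything that is not a run directory
        per_scenario[(match["src"], match["trg"])].append(score)
    return {
        scenario: (statistics.mean(vals), statistics.stdev(vals) if len(vals) > 1 else 0.0)
        for scenario, vals in per_scenario.items()
    }


# Made-up accuracies purely for illustration.
example = {
    "run0_2_to_11": 0.80,
    "run1_2_to_11": 0.90,
    "run0_6_to_23": 0.70,
    "run1_6_to_23": 0.60,
}
summary = aggregate(example)
for (src, trg), (mean, std) in summary.items():
    print(f"{src} -> {trg}: {mean:.3f} +/- {std:.3f}")
```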
References
- Moment Matching for Multi-Source Domain Adaptation
- Amazon product data
- Unsupervised Domain Adaptation by Backpropagation
- Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation
- The balancing principle for parameter choice in distance-regularized domain adaptation
Citation
@inproceedings{
IWA23,
title={Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation},
author={Dinu, Marius-Constantin and Beck, Maximilian and Nguyen, Duc Hoan and Huber, Andrea and Eghbal-zadeh, Hamid and Moser, Bernhard A. and Pereverzyev, Sergei V. and Hochreiter, Sepp and Zellinger, Werner},
booktitle={Submitted to The Eleventh International Conference on Learning Representations },
year={2023},
url={https://openreview.net/forum?id=M95oDwJXayG},
note={under review}
}