Optimal Off-Policy Evaluation from Multiple Logging Policies
Overview
This repository contains the code for replicating the experiments of the paper "Optimal Off-Policy Evaluation from Multiple Logging Policies" (ICML 2021, proceedings.mlr.press/v139/kallus21a.html).
If you find this code useful in your research, please cite:
@inproceedings{kallus2021optimal,
title={Optimal Off-Policy Evaluation from Multiple Logging Policies},
author={Kallus, Nathan and Saito, Yuta and Uehara, Masatoshi},
booktitle = {Proceedings of the 38th International Conference on Machine Learning},
pages={5247-5256},
year={2021},
volume = {139},
publisher={PMLR},
}
Dependencies
- python==3.7.3
- numpy==1.18.1
- pandas==0.25.1
- scikit-learn==0.23.1
- tensorflow==1.15.4
- pyyaml==5.1
- seaborn==0.10.1
- matplotlib==3.2.2
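As a convenience, the pinned versions above can be compared against the active environment before running anything. The following is a minimal sketch, not part of the repository: the names `PINS`, `installed_version`, and `find_mismatches` are hypothetical, and the lookup falls back to `pkg_resources` (from setuptools) on Python 3.7, where `importlib.metadata` is not available.

```python
try:
    from importlib import metadata as _metadata  # standard library on Python 3.8+
except ImportError:
    _metadata = None  # Python 3.7: fall back to pkg_resources below

# Pinned versions copied from the dependency list (hypothetical constant name).
PINS = {
    "numpy": "1.18.1",
    "pandas": "0.25.1",
    "scikit-learn": "0.23.1",
    "tensorflow": "1.15.4",
    "pyyaml": "5.1",
    "seaborn": "0.10.1",
    "matplotlib": "3.2.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    if _metadata is not None:
        try:
            return _metadata.version(name)
        except _metadata.PackageNotFoundError:
            return None
    import pkg_resources  # assumption: setuptools is installed on Python 3.7

    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in sorted(find_mismatches(PINS).items()):
        print(f"{name}: wanted {wanted}, found {found}")
```

Passing the lookup as an argument keeps the comparison logic independent of the environment, so it can be exercised with a plain dictionary.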
Running the code
To run the simulations with the multi-class classification datasets, run the following commands in the ./src/ directory:
for data in optdigits pendigits
do
python run_sims.py --num_sims 200 --data $data --is_estimate_pi_b
done
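On systems without a POSIX shell, the same loop can be driven from Python. The sketch below is an assumption-laden convenience wrapper, not part of the repository: `build_command` and `run_all` are hypothetical names, and the script is assumed to be started from the ./src/ directory. Only the flags shown in the shell loop above are used.

```python
import subprocess
import sys

# Datasets taken from the shell loop above.
DATASETS = ["optdigits", "pendigits"]


def build_command(data, num_sims=200):
    """Build the argument list for one run_sims.py invocation."""
    return [
        sys.executable,
        "run_sims.py",
        "--num_sims",
        str(num_sims),
        "--data",
        data,
        "--is_estimate_pi_b",
    ]


def run_all(datasets=DATASETS, dry_run=True):
    """Return the commands for all datasets; execute them unless dry_run is True."""
    commands = [build_command(data) for data in datasets]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failing simulation run.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in run_all(dry_run=True):
        print(" ".join(cmd))
```

With `dry_run=True` the commands are only printed; call `run_all(dry_run=False)` to actually launch the simulations one dataset at a time.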
Note that the configurations used in the experiments can be found in ./conf/policy_params.yaml.
Once the simulations have finished running, the summarized results can be found in the ../log/{data} directory for each dataset.
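To see what a run produced, the per-dataset log directories can be listed with a few lines of Python. This is a minimal sketch under stated assumptions: `list_results` is a hypothetical helper, and the names and formats of the files written under ../log/{data} are not specified here, so the function only reports whichever files it finds.

```python
from pathlib import Path


def list_results(log_root, datasets):
    """Map each dataset name to the sorted file names in log_root/<dataset>."""
    results = {}
    for data in datasets:
        data_dir = Path(log_root) / data
        if data_dir.is_dir():
            results[data] = sorted(p.name for p in data_dir.iterdir() if p.is_file())
        else:
            # The directory is missing, e.g. the simulation has not been run yet.
            results[data] = []
    return results


if __name__ == "__main__":
    # Assumes the script is started from ./src/, as in the commands above.
    for data, files in list_results("../log", ["optdigits", "pendigits"]).items():
        print(data, files)
```

An empty list for a dataset indicates that its log directory does not exist or contains no files.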