


This repository contains the code for evaluating the Traversal Perceptual Length (TPL), a model selection method introduced in the paper Where and What? Examining Interpretable Disentangled Representations, on the 1,800 pretrained checkpoints provided in disentanglement_lib for the dSprites dataset.



Put the pretrained dSprites checkpoints of a model (e.g. BetaVAE) into a directory (e.g. pretrained_models/beta_vae). Each model has 300 checkpoints, which differ in hyper-parameters and random seeds. Each checkpoint is identified by a number (from 0 to 1799).
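As a quick sanity check on the layout described above, the following minimal sketch lists the checkpoint numbers found in a model directory. It assumes each checkpoint lives in a subdirectory named after its number; the helper name `list_checkpoint_ids` is hypothetical and not part of this repository.

```python
from pathlib import Path


def list_checkpoint_ids(parent_dir):
    """Return the sorted numeric IDs of checkpoint subdirectories in parent_dir."""
    parent = Path(parent_dir)
    if not parent.is_dir():
        return []
    # Assumption: each checkpoint is a subdirectory whose name is its number.
    return sorted(
        int(p.name) for p in parent.iterdir() if p.is_dir() and p.name.isdecimal()
    )


if __name__ == "__main__":
    ids = list_checkpoint_ids("pretrained_models/beta_vae")
    print(f"found {len(ids)} checkpoints")
```

For a complete model directory, the reported count should be 300.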

To evaluate the TPL for a model (taking beta_vae as an example), use the following script:

    python sweep_evaluate.py \
    --parent_dir pretrained_models/beta_vae \
    --overwrite False

This computes the TPL score for each checkpoint in the pretrained_models/beta_vae folder and writes the results to the pretrained_models/beta_vae/N/metrics/tpl directory, where N is the checkpoint number.
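To see which checkpoints still lack results after a sweep, a sketch along these lines may help. It assumes only the N/metrics/tpl layout mentioned above and treats a missing or empty directory as "not yet evaluated"; the format of the result files themselves is not inspected, and `checkpoints_missing_tpl` is a hypothetical helper.

```python
from pathlib import Path


def checkpoints_missing_tpl(parent_dir):
    """Return IDs of checkpoints that have no files under <id>/metrics/tpl."""
    parent = Path(parent_dir)
    if not parent.is_dir():
        return []
    missing = []
    for ckpt in parent.iterdir():
        # Skip anything that is not a numbered checkpoint directory.
        if not (ckpt.is_dir() and ckpt.name.isdecimal()):
            continue
        tpl_dir = ckpt / "metrics" / "tpl"
        # Treat a missing or empty directory as "not yet evaluated".
        if not tpl_dir.is_dir() or not any(tpl_dir.iterdir()):
            missing.append(int(ckpt.name))
    return sorted(missing)


if __name__ == "__main__":
    print(checkpoints_missing_tpl("pretrained_models/beta_vae"))
```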

Use the same method to compute TPL for other models like factor_vae, dip-i, dip-ii, beta_tc_vae, and annealed_vae.
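Repeating the command for every model can be scripted. The sketch below only builds and prints the commands, reusing the two flags shown above. It assumes each model's checkpoints sit in pretrained_models/<model_name> with directory names matching the model names listed here, which may not match your local layout.

```python
import sys

# Model names as listed in this README; the matching directory names
# under pretrained_models/ are an assumption.
MODELS = ["beta_vae", "factor_vae", "dip-i", "dip-ii", "beta_tc_vae", "annealed_vae"]


def build_sweep_command(model, root="pretrained_models", overwrite=False):
    """Build the sweep_evaluate.py argument list for one model directory."""
    return [
        sys.executable,
        "sweep_evaluate.py",
        "--parent_dir", f"{root}/{model}",
        "--overwrite", str(overwrite),
    ]


if __name__ == "__main__":
    for model in MODELS:
        # Print each command instead of running it.
        print(" ".join(build_sweep_command(model)))
```

To actually run the sweeps, each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root.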

Use the following script to collect the results of a metric (e.g. TPL) for a model (e.g. beta_vae):

    python collect_results.py \
    --results_dir pretrained_models/beta_vae \
    --metric tpl

Use the following script to plot the correlation figures between TPL and the other metrics:

    python collect_stats.py --parent_parent_dir pretrained_models


author={Xinqi Zhu and Chang Xu and Dacheng Tao},
title={Where and What? Examining Interpretable Disentangled Representations},