Firth Bias Reduction with Standard Feature Backbones

This repository contains the core experiments with the standard ResNet feature backbones conducted in our paper "On the Importance of Firth Bias Reduction in Few-Shot Classification". This is one of the three code repositories of our paper, and is a sub-module of the main "Firth Bias Reduction in Few-Shot Learning" repository.

For a concise and informal description of our work, check out our paper's website: https://ehsansaleh.github.io/firthfsl

Here is the effect of Firth bias reduction compared to typical L2 regularization in 16-way few-shot classification tasks, using basic feature backbones and 1-layer logistic classifiers.

<img src="./opt/static_figures/dacc_vs_nshots_firth_1layer_mini.svg" alt="drawing" width="48%"/> <img src="./opt/static_figures/dacc_vs_nshots_l2_1layer_mini.svg" alt="drawing" width="46%"/>

Similar results can also be achieved using 3-layer logistic classifiers:

<img src="./opt/static_figures/dacc_vs_nshots_firth_3layer_mini.svg" alt="drawing" width="48%"/> <img src="./opt/static_figures/dacc_vs_nshots_l2_3layer_mini.svg" alt="drawing" width="46%"/>

<details> <summary><h2>Quick Q&A Rounds</h2></summary>
  1. Question: Can you give me a quick-start script to reproduce the paper trainings on a GPU?
    git clone --recursive https://github.com/ehsansaleh/firth_bias_reduction.git
    cd ./firth_bias_reduction/code_firth
    ./features/download.sh
    ./main.sh
    

  1. Question: Can you give me a simple python command to run?

    python main.py --device cuda:0 --configid "01_firth_1layer/firth_1layer"
    
    <details> <summary><strong>More Information</strong></summary>

    This will run the experiment configured at ./configs/01_firth_1layer/firth_1layer.json on the cuda:0 device, and store the resulting CSV files under the ./results directory.

    </details>

  1. Question: How can I reproduce the paper figures/tables?

    make summary
    make figures
    make tables
    
    <details> <summary><strong>More Information</strong></summary>
    1. If you have run new classifier trainings (by either ./main.sh or python main.py), run make summary; otherwise, skip this step. This command collects the csv files from the ./results directory and processes them into a single summarized file at ./summary/val2test.csv (a quick way to peek at this file is sketched after this box).
    2. Run make figures to refresh the PDF figures in the ./figures directory.
    3. Run make tables in case you're interested in the raw numbers in the ./tables directory.
    </details>
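    In case it helps, here is a hypothetical way to inspect the summarized file with pandas once make summary has produced it; the column names are not guaranteed, so treat this as a sketch rather than the repository's API:

    import pandas as pd

    # peek at the summarized results (assumes make summary has already run;
    # the actual column names may differ)
    df = pd.read_csv("./summary/val2test.csv")
    print(df.head())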

  1. Question: I have my own code and I do not want to use your code. How can I apply the Firth bias reduction to my own loss?

    import torch
    import torch.nn as nn

    # the standard cross-entropy term (logits and target are your own tensors)
    ce_loss = nn.CrossEntropyLoss()
    ce_term = ce_loss(logits, target)

    # the Firth bias-reduction term: the negative mean of the log-probabilities
    # over all classes and all samples in the batch
    log_probs = logits - torch.logsumexp(logits, dim=-1, keepdim=True)
    firth_term = -log_probs.mean()

    # lam is the scalar Firth regularization coefficient
    loss = ce_term + lam * firth_term
    loss.backward()
    
    • Alternatively, you can use the label_smoothing keyword argument in nn.CrossEntropyLoss (a sketch follows below these notes).

    • Remember that this Firth formulation only holds for 1-layer logistic and cosine classifiers. For more complex networks, the log-determinant of the Fisher information matrix (FIM) must be worked out.
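    To make the first note concrete, here is a minimal, hypothetical sketch of the label_smoothing route; the tensor shapes and the 0.1 value are made up for illustration, and (up to an overall scale) a smoothing factor of eps plays the role of a Firth coefficient of lam = eps / (1 - eps):

    import torch
    import torch.nn as nn

    # hypothetical logits and labels, just for illustration
    logits = torch.randn(32, 16, requires_grad=True)   # (batch, ways)
    target = torch.randint(0, 16, (32,))

    # the label_smoothing argument requires PyTorch >= 1.10
    ce_loss = nn.CrossEntropyLoss(label_smoothing=0.1)
    loss = ce_loss(logits, target)
    loss.backward()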


<details> <summary>5. <strong>Question:</strong> You seem to have too many directories, files, and a fancy structure. Can you explain the whole thing as simply as possible?</summary>
                 ./main.sh or
                python main.py                  make summary                  make figures    
configs/*.json ================> results/*.csv =============> summary/*.csv ===============> figures/*.pdf
                      /\                                                       make tables      tables/*
                      ||
        (the parts    ||
         below are    ||
          optional)   ||
                      ||                      python utils/datasets.py                   
                      ======= features/*.pt <========================== backbones/*.pt
                                                                              /\
                                                                              ||
                                                                              ||
                                           python opt/train_backbone.py ============== datasets/*
                                            or ./opt/train_backbone.sh

The top horizontal line is the important one.

</details>
<details> <summary>6. <strong>Question:</strong> What are the python environment package requirements?</summary> </details> </details>

<details open> <summary><h2>Step-by-Step Guide to the Code</h2></summary> </details>

<details> <summary><h2>Configurations and Arguments</h2></summary> </details>

<details> <summary><h2>Extra Inner-working Details</h2></summary> </details>

<details> <summary><h2>Nice Code Features</h2></summary>

We tried to structure the code to be as user-friendly as possible. The following features are worth considering:

  1. GPU Acceleration: Even the 1-layer classifier trainings are batched along the RNG seed dimension and accelerated to run on GPUs (see the sketch after this list).

  2. Reproducibility and Random Effects Matching: All the randomization effects (such as the batch ordering, the parameter initializations, etc.) are controlled through rigorous seeding of the random generators. The results are tested to be deterministically reproducible (i.e., running the same code 10 times will give you the same exact result every time). This can be useful if you want to make a slight algorithmic change, and observe the difference; all the randomized effects will be matched between the two runs.

  3. De-coupled Configurations from the Code: You don't need to specify long lines of argparse argument specifications in a bash file. Instead, just take a quick look at ./configs/01_firth_1layer/firth_1layer.json for an example. The running settings are specified in json files in the configs directory. You won't need to personally keep track of the arguments you passed to generate different results, since the settings will be permanently stored in the configs directory.

  4. Code Quality: We have used and tested this code rigorously in our work. There is even code that computes the maximum number of seeds per batch for each setting, to avoid CUDA out-of-memory errors. All of this is done automatically behind the scenes.

  5. Data Inclusion: All the data needed to produce the figures and tables, including

    1. the extracted features,
    2. the feature backbone parameters,
    3. the datasets,
    4. the experimental results and data,
    5. the generated figures and tables, etc.

    are either included in the repository themselves, or provided through Google Drive links with automated downloading scripts.

  6. Download Automation: Manually downloading heavy feature files, datasets, or backbone parameters, and then transferring them to a cluster's storage, can be difficult and time-consuming. To alleviate this, we have included automated downloading scripts for each of these elements. Just take a look at ./features/download.sh; all the Google Drive links are included, and the script will download the files for you and verify their correctness using the md5 checksums included in the repo. These scripts were tested multiple times at the time of writing, and if a breaking change happens to the Google Drive API in the future, we will fix the download code as soon as you let us know!

  7. Python Environment Specification: Not only do we provide our exact python library dependencies and versions in the requirements.txt file, we also offer automated helper scripts to create virtual environments. If you'd rather run the code in an environment of your choosing, that is totally fine as well.
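To make the seed-batching in the first feature concrete, here is a minimal sketch; it is not the repository's actual implementation, and all the sizes, names, and hyper-parameters below are made up for illustration:

    import torch

    # hypothetical sizes: 8 RNG seeds, 80 support samples, 512-d features, 16 ways
    num_seeds, n_samples, n_dims, n_ways = 8, 80, 512, 16
    device = "cuda" if torch.cuda.is_available() else "cpu"

    torch.manual_seed(0)  # seeding the generator keeps reruns reproducible
    feats = torch.randn(num_seeds, n_samples, n_dims, device=device)
    labels = torch.randint(0, n_ways, (num_seeds, n_samples), device=device)

    # one weight matrix per seed, all trained simultaneously on the GPU
    W = torch.zeros(num_seeds, n_dims, n_ways, device=device, requires_grad=True)
    opt = torch.optim.SGD([W], lr=0.1)

    for _ in range(100):
        logits = torch.bmm(feats, W)  # (num_seeds, n_samples, n_ways) in one matmul
        loss = torch.nn.functional.cross_entropy(
            logits.reshape(-1, n_ways), labels.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

Since each seed's classifier lives in one slice of W, a single optimizer step advances all of the trainings at once.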

Just give this code a try; it won't take much of your time to set up. You may even find it a good starting point for your own FSL projects :)

</details>

References

@inproceedings{ghaffari2022fslfirth,
    title={On the Importance of Firth Bias Reduction in Few-Shot Classification},
    author={Saba Ghaffari and Ehsan Saleh and David Forsyth and Yu-Xiong Wang},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=DNRADop4ksB}
}