
Firth Bias Reduction on Cosine Classifiers with S2M2R Features

This repository contains the Firth bias reduction experiments with S2M2R feature backbones and cosine classifiers. The theoretical derivation of the Firth bias reduction term for cosine classifiers is given in our paper "On the Importance of Firth Bias Reduction in Few-Shot Classification". This is one of the three code repositories accompanying our paper, and a sub-module of the main "Firth Bias Reduction in Few-Shot Learning" repository.

For a concise and informal description of our work, check out our paper's website: https://ehsansaleh.github.io/firthfsl

Here is the effect of Firth bias reduction on cosine classifiers and S2M2R feature backbones.

<img src="./figures/dacc_vs_nways_miniImagenet.svg" alt="drawing" width="47.2%"/> <img src="./figures/dacc_vs_nways_cifar.svg" alt="drawing" width="46.8%"/>

<img src="./figures/dacc_vs_nways_tieredImagenet.svg" alt="drawing" width="94%"/> <details> <summary><h2>Quick Q&A Rounds</h2></summary>
  1. Question: What is a quick-start snippet for reproducing the paper's trainings on a GPU?
    git clone --recursive https://github.com/ehsansaleh/firth_bias_reduction.git
    cd ./firth_bias_reduction/code_s2m2rf
    ./features/download.sh
    ./main_firth.sh
    

  1. Question: What is a simple Python command I can run?

    python main_firth.py --proc_rank 0 --proc_size 1 --configid "1_mini_co_part0"
    
    <details> <summary><strong>More Information</strong></summary>
    • This will run the configuration specified at ./configs/1_mini_co_part0.json.
    • The generated outputs will be stored periodically at ./results/1_mini_co_part0/1_mini_co_part0_r0.csv.
    • Alternatively, you can launch N parallel processes and distribute the workload among them:
      python main_firth.py --proc_rank 0 --proc_size 4 --configid "1_mini_co_part0" &
      python main_firth.py --proc_rank 1 --proc_size 4 --configid "1_mini_co_part0" &
      python main_firth.py --proc_rank 2 --proc_size 4 --configid "1_mini_co_part0" &
      python main_firth.py --proc_rank 3 --proc_size 4 --configid "1_mini_co_part0" &
      wait
      
    </details>
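As a side note, here is a minimal sketch of how rank-based work sharding of this kind is typically done. The function name and the task list are illustrative (they are not taken from main_firth.py); the idea is simply that each process takes every proc_size-th task, starting at its own rank.

```python
# Hypothetical sketch of sharding a task list across --proc_rank/--proc_size
# processes. Names here are illustrative, not from main_firth.py.
def shard_tasks(tasks, proc_rank, proc_size):
    """Give each process every proc_size-th task, starting at its rank."""
    return tasks[proc_rank::proc_size]

episodes = list(range(10))  # e.g., 10 few-shot episodes to evaluate
shards = [shard_tasks(episodes, r, 4) for r in range(4)]
# Every episode is covered exactly once across the 4 processes.
assert sorted(e for s in shards for e in s) == episodes
```

Because the shards are disjoint and cover the whole list, the per-rank csv outputs can later be concatenated without duplication.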

  1. Question: How can I reproduce the paper figures/tables?

    make summary
    make figures
    make tables
    
    <details> <summary><strong>More Information</strong></summary>
    1. If you have run new classifier trainings (via either ./main_firth.sh or python main_firth.py), run make summary; otherwise, skip this step. This command collects the csv files from the ./results directory and processes them into a single summarized file at ./summary/val2test.csv.
    2. Run make figures to refresh the PDF figures in the ./figures directory.
    3. Run make tables if you are interested in the raw numbers in the ./tables directory.
    </details>

  1. Question: I have my own code and I do not want to use your code. How can I apply the Firth bias reduction to my own loss?

    import torch
    import torch.nn as nn

    # Standard cross-entropy term.
    ce_loss = nn.CrossEntropyLoss()
    ce_term = ce_loss(logits, target)

    # Firth term: negative log-probability averaged over samples and classes.
    log_probs = logits - torch.logsumexp(logits, dim=-1, keepdim=True)
    firth_term = -log_probs.mean()

    # lam is the Firth coefficient (a hyper-parameter).
    loss = ce_term + lam * firth_term
    loss.backward()
    
    • Alternatively, you can use the label_smoothing keyword argument in nn.CrossEntropyLoss.

    • Remember that this Firth formulation is only valid for 1-layer logistic and cosine classifiers. For more complex networks, the log-determinant of the Fisher information matrix (FIM) must be worked out.
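As a sanity check on the label-smoothing remark above, here is a self-contained sketch (not taken from the repo) showing that, with firth_term defined as the negative mean log-probability, the loss ce_term + lam * firth_term equals CrossEntropyLoss(label_smoothing=eps) up to an overall (1 + lam) scale, where eps = lam / (1 + lam):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 5)           # 8 samples, 5 classes
target = torch.randint(0, 5, (8,))

lam = 0.1
ce_term = nn.CrossEntropyLoss()(logits, target)
log_probs = logits - torch.logsumexp(logits, dim=-1, keepdim=True)
firth_term = -log_probs.mean()
firth_loss = ce_term + lam * firth_term

# Label smoothing mixes the one-hot target with a uniform distribution,
# so eps = lam / (1 + lam) reproduces the Firth loss up to a scale factor.
eps = lam / (1 + lam)
ls_loss = nn.CrossEntropyLoss(label_smoothing=eps)(logits, target)
assert torch.allclose(firth_loss, (1 + lam) * ls_loss, atol=1e-5)
```

Since an overall positive scale on the loss only rescales the gradients, the two formulations induce the same optimization direction.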


<details> <summary>4. <strong>Question:</strong> You seem to have too many directories, files, and a fancy structure. Explain the whole thing as simple as possible?</summary>
               ./main_firth.sh or
              python main_firth.py              make summary                  make figures    
configs/*.json ================> results/*.csv =============> summary/*.csv ===============> figures/*.pdf
                      /\                                                       make tables      tables/*
                      ||
         (below is    ||
          optional    ||
           parts)     ||
                      ||                        python save_features.py                   
                      ======= features/*.hdf5 <========================== checkpoints/*.tar
                                                                                /\
                                                                                ||
                                                                                ||
                                                           python train.py ============== Datasets/*
                                                       or python train_cifar.py

The top horizontal line is the important one for our work.

</details>
<details> <summary>5. <strong>Question:</strong> What are the python environment package requirements?</summary> </details> </details> <details open> <summary><h2>Step-by-Step Guide to the Code</h2></summary> </details> <details> <summary><h2>Configurations and Arguments</h2></summary> </details> <details> <summary><h2>Extra Inner-working Details</h2></summary> </details> <details> <summary><h2>Nice Code Features</h2></summary>

We tried to structure the code to be as user-friendly as possible. The following features are worth considering:

  1. Data Inclusion: All the data needed to produce the figures and tables, including

    1. the extracted features,
    2. the feature backbone parameters,
    3. the datasets,
    4. the experimental results and data,
    5. the generated figures and tables, etc.

    are either included in the repository itself, or available through Google Drive links with automated download scripts.

  2. Download Automation: Manually downloading heavy feature files, datasets, or backbone parameters, and then transferring them to a cluster's storage, can be difficult and time-consuming. To alleviate this, we included automated download scripts for each of these elements. Take a look at ./features/download.sh, ./Datasets/download.sh, or ./checkpoints/download.sh; all the Google Drive links are included, and the scripts will download the files for you and verify their correctness using the md5 checksums included in the repo. These scripts were tested multiple times at the time of writing, and if a breaking change to the Google Drive API happens in the future, we will fix the download code as soon as you let us know!

  3. Python Environment Specification: Not only do we provide our exact Python library dependencies and versions in the requirements.txt file, but we also offer automated helper scripts to create virtual environments. If you would rather run the code in an environment of your choosing, that is totally fine as well.

  4. Reproducibility and Random Effects Matching: All the randomization effects (such as the batch ordering, the parameter initializations, etc.) are controlled through rigorous seeding of the random generators. The results are tested to be deterministically reproducible (i.e., running the same code 10 times will give you the same exact result every time). This can be useful if you want to make a slight algorithmic change, and observe the difference; all the randomized effects will be matched between the two runs.

Just give this code a try; it won't take much of your time to set up. You may even find it a good starting point for your own project :)

</details>

Code Origins

References

@inproceedings{ghaffari2022fslfirth,
    title={On the Importance of Firth Bias Reduction in Few-Shot Classification},
    author={Saba Ghaffari and Ehsan Saleh and David Forsyth and Yu-Xiong Wang},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=DNRADop4ksB}
}