FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging
Requirements
Please see requirements.txt and Dockerfile for detailed dependencies. The major ones include:
- python 3.6 or later (for type annotations and f-strings)
- pytorch==1.5.1
- transformers==3.0.2
Setup
Docker Setup
To build the Docker image, run the following command, with TAG set to the image name of your choice.
DOCKER_BUILDKIT=1 docker build \
-t ${TAG} \
-f Dockerfile .
Data Setup
- Download the data following the examples from here and here.
- Mount the data into /export/home/Data/Glue and /export/home/Data/HANS inside the image.
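One way to mount the data is with bind mounts when the container starts. The following is a minimal sketch, assuming the image was built with the tag ${TAG} as in the Docker Setup step. The host directories GLUE_DIR and HANS_DIR and the helper name run_fastif_container are hypothetical and not part of this repository, so adapt them to your machine.

```shell
# Hypothetical host-side locations of the downloaded data; adjust as needed.
GLUE_DIR="${GLUE_DIR:-$HOME/data/glue}"
HANS_DIR="${HANS_DIR:-$HOME/data/hans}"

# Hypothetical helper: start a container from the image tagged ${TAG} with
# both datasets bind-mounted at the paths this README expects.
# Any arguments are passed through as the command to run in the container.
run_fastif_container() {
  docker run --rm -it \
    -v "${GLUE_DIR}:/export/home/Data/Glue" \
    -v "${HANS_DIR}:/export/home/Data/HANS" \
    "${TAG}" "$@"
}
```

For example, `run_fastif_container bash` would open an interactive shell with the data in place. GPU access flags are left out here because they depend on your Docker setup.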
Experiments
- To train the base models, please use scripts/run_MNLI.sh and scripts/run_HANS.sh.
- To build FAISS indices, please see the function create_FAISS_index in experiments/hans.py.
- Modify the paths in experiments/constants.py based on your setup.
- To run the experiments, please follow the instructions in run_experiments.py, where we have provided most of the default configurations/hyper-parameters.
Code Structure
experiments/
- This directory contains the code used to conduct the experiments.
- However, the entry point for the experiments is run_experiments.py.
influence_utils/
This directory contains the core components of the influence functions. Most of the code is designed to be independent of the experiments, so it can be adapted for other downstream needs. Two of the most important files are:
- influence_utils/nn_influence_utils.py contains the code for influence functions.
- influence_utils/parallel.py contains the code for the parallel variant. Note that when running the parallel variant, make sure to turn off wandb (see here for details), as the current codebase does not work well with wandb turned on.