NLP4Code
Repository for the NLP4Code project at the LILY lab.
Installation
[Recommended] Create a virtualenv or conda environment:
conda create -n nlp4code python=3.8
conda activate nlp4code
Then, install the dependencies:
pip install -r requirements.txt
(Optional) If you run into a Python import problem at any point (e.g., ModuleNotFoundError), try running this in the main (NLP4Code) directory:
export PYTHONPATH=`pwd`
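Where exporting the variable is inconvenient (for example, in a notebook), a similar effect can be obtained from inside Python. The following is only a minimal sketch: the helper name add_repo_to_path is hypothetical and not part of this repository, and it assumes you pass it the path of the main NLP4Code directory.

```python
import os
import sys


def add_repo_to_path(repo_root: str) -> str:
    """Put repo_root at the front of sys.path if it is not already there.

    Hypothetical helper for illustration; not part of NLP4Code.
    """
    root = os.path.abspath(repo_root)
    if root not in sys.path:
        # Front of the list, so modules in the repository are found first.
        sys.path.insert(0, root)
    return root


if __name__ == "__main__":
    # Assumes the script is started from the main (NLP4Code) directory.
    print(add_repo_to_path(os.getcwd()))
```

Unlike the export above, this only affects the current Python process; subprocesses started from the shell will not see the change.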
To run LLaMA-based models, you need to install the development version of the transformers library:
pip install git+https://github.com/huggingface/transformers
Wandb
We use Wandb for experiment tracking. Please register and ask Ansong for an invitation to the Wandb Yale-LILY team before running experiments. When you are ready to run the experiments and log them to the cloud, do the following:
wandb login
Paste your API key and the login is complete. When you start running experiments, you should see something like:
wandb: Tracking run with wandb version 0.12.11
wandb: Run data is saved locally in /home/ansongni/Code/NLP4Code/wandb/run-20220309_150158-1ebacxm4
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run mathqa-gpt-finetuning
wandb: ⭐️ View project at https://wandb.ai/yale-lily/unified-codegen
wandb: 🚀 View run at https://wandb.ai/yale-lily/unified-codegen/runs/1ebacxm4
If you want to do some test runs without logging to the cloud, run wandb offline first, as suggested above.
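Offline mode can also be requested for a single process through the WANDB_MODE environment variable, which wandb reads when a run is initialized. A minimal sketch, assuming it is called before any wandb run is created; the helper name use_offline_wandb is hypothetical and not part of this repository.

```python
import os


def use_offline_wandb() -> None:
    """Ask wandb to keep run data local for this process only.

    Hypothetical helper; must run before wandb initializes a run.
    """
    os.environ["WANDB_MODE"] = "offline"


if __name__ == "__main__":
    use_offline_wandb()
    print(os.environ["WANDB_MODE"])
```

This leaves the machine-wide setting untouched, which may be convenient when only some test runs should stay off the cloud.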
Naming of the experiments
In the *.yaml configuration file, you should see a line like
default_root_dir: &exp_name results/mathqa-gpt_neo_1.3B-finetuning
We automatically take the experiment name from the string after /, and the tags for the experiment are generated by splitting that string on -. In this case, the experiment will be named mathqa-gpt_neo_1.3B-finetuning and the tags will be ["mathqa", "gpt_neo_1.3B", "finetuning"]. Please follow this convention so that we can write all of this in one place.
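The naming convention above can be sketched in a few lines of Python. This is an illustration of the rule as described, not the repository's actual implementation; the function name exp_name_and_tags is hypothetical.

```python
from typing import List, Tuple


def exp_name_and_tags(default_root_dir: str) -> Tuple[str, List[str]]:
    """Derive (experiment name, tags) from a default_root_dir value.

    Hypothetical helper that mirrors the convention described in the text.
    """
    # The experiment name is the part after the last "/".
    name = default_root_dir.rstrip("/").split("/")[-1]
    # The tags are the "-"-separated pieces of that name.
    tags = name.split("-")
    return name, tags


if __name__ == "__main__":
    name, tags = exp_name_and_tags("results/mathqa-gpt_neo_1.3B-finetuning")
    print(name)  # mathqa-gpt_neo_1.3B-finetuning
    print(tags)  # ['mathqa', 'gpt_neo_1.3B', 'finetuning']
```

Note that a hyphen inside a component (for example, a model name written with "-" instead of "_") would be split into two tags, which is why the example uses gpt_neo_1.3B.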
Fine-tuning
(Read the previous sections first if you are ready to run experiments.) For fine-tuning, run the following in the main directory:
python finetuning/trainer.py fit --config finetuning/training_configs/*.yaml
Testing
There are some basic tests in the tests folder (follow this link for more). To run all the tests, do:
python -m unittest discover <test_directory>
# or
python -m unittest discover -s <directory> -p '*_test.py'