Can GNN be Good Adapter for LLMs?
This repository is an implementation of GraphAdapter, proposed in "Can GNN be Good Adapter for LLMs?" (WWW 2024).
Requirements
- python = 3.8
- numpy >= 1.19.5
- pytorch = 1.10.2
- pyg = 2.3.1
- transformers >= 4.28.1
For the largest dataset (Arxiv), 300 GB of storage is required.
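A possible environment setup is sketched below; the version pins come from the list above, and PyG typically needs wheels built for your specific torch/CUDA combination, so adjust accordingly:

pip install "numpy>=1.19.5" torch==1.10.2 torch_geometric==2.3.1 "transformers>=4.28.1"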
How to use our code
The datasets used in this paper can be downloaded from here. Please download them and unzip them into the datasets directory.
Step 1. Preprocess data for training
python3 preprocess.py --dataset_name instagram --gpu 0 --plm_path llama2_path --type pretrain
preprocess.py loads the textual data of Instagram and transforms it into token embeddings with Llama 2, which are saved to saving_path. The saved embeddings are used during the training of GraphAdapter.
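For orientation, here is a minimal sketch of what this step might look like internally, using the Hugging Face transformers API; the file name and saving_path layout are assumptions, not the script's actual interface:

```python
import torch
from transformers import AutoModel, AutoTokenizer

plm_path = "llama2_path"  # local Llama 2 checkpoint (assumed path)
tokenizer = AutoTokenizer.from_pretrained(plm_path)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Llama 2 has no pad token by default

model = AutoModel.from_pretrained(plm_path, torch_dtype=torch.float16).to("cuda").eval()

texts = ["example node text from Instagram"]  # one string per node
with torch.no_grad():
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True).to("cuda")
    token_emb = model(**batch).last_hidden_state  # (num_nodes, seq_len, hidden_dim)

torch.save(token_emb.cpu(), "saving_path/instagram_token_emb.pt")  # hypothetical file name
```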
Step 2. Training GraphAdapter
python3 pretrain.py --dataset_name instagram --hiddensize_gnn 64 --hiddensize_fusion 64 --learning_ratio 5e-4 --batch_size 32 --max_epoch 15 --save_path your_model_save_path
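As a rough mental model of what the flags above configure, here is a minimal, hypothetical sketch: a small GNN with hidden size --hiddensize_gnn whose output is fused with the frozen LLM embeddings through a module of size --hiddensize_fusion (llm_dim=4096 matches Llama-2-7B). The actual pretrain.py may differ in architecture and training objective:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import SAGEConv

class GraphAdapter(nn.Module):
    """Hypothetical sketch: GNN over node embeddings + fusion with LLM states."""

    def __init__(self, llm_dim=4096, hidden_gnn=64, hidden_fusion=64):
        super().__init__()
        self.gnn1 = SAGEConv(llm_dim, hidden_gnn)
        self.gnn2 = SAGEConv(hidden_gnn, hidden_gnn)
        # fusion module combines GNN node states with the frozen LLM embeddings
        self.fusion = nn.Sequential(
            nn.Linear(hidden_gnn + llm_dim, hidden_fusion),
            nn.ReLU(),
            nn.Linear(hidden_fusion, llm_dim),
        )

    def forward(self, x, edge_index):
        # x: (num_nodes, llm_dim) node embeddings from Step 1
        h = self.gnn2(self.gnn1(x, edge_index).relu(), edge_index)
        return self.fusion(torch.cat([h, x], dim=-1))
```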
Step 3. Finetuning for downstream tasks
GraphAdapter requires prompt embeddings for finetuning:
python3 preprocess.py --dataset_name instagram --gpu 0 --plm_path llama2_path --type prompt
After preprocessing the dataset, you can finetune GraphAdapter on downstream tasks:
python3 finetune.py --dataset_name instagram --gpu 0 --metric roc --save_path your_model_save_path
Note: keep your_model_save_path consistent between pretrain.py and finetune.py.
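As an illustration of the --metric roc option, ROC-AUC can be computed with scikit-learn as follows; this is a toy example, not the script's actual evaluation code:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 1, 1, 0]           # ground-truth node labels
y_score = [0.2, 0.8, 0.6, 0.3]  # predicted probabilities from the finetuned model
print(roc_auc_score(y_true, y_score))  # 1.0 for this toy example
```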
Citation
If you find our work or the datasets useful, please consider citing:
@article{huang2024can,
title={Can GNN be Good Adapter for LLMs?},
author={Huang, Xuanwen and Han, Kaiqiao and Yang, Yang and Bao, Dezheng and Tao, Quanjin and Chai, Ziwei and Zhu, Qi},
journal={WWW},
year={2024}
}