GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs

This repository contains the code for the paper "GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs", accepted at EMNLP Findings 2023.

Figure: main model (MainModel.pdf)

Requirements

Dependencies

Install the required packages using the following command:

pip install -r requirements.txt

Dataset

The dataset is available here. Each project folder (ogbl-citation2, ogbn-arxiv, and ogbn-products) contains several key files:

{project}-ogbn.torch: The dataset file, including the adjacency matrix, node classification labels, and split information.
{project}_text.csv/X.all.txt: The raw text content for each node.
mrr_edges.torch: The file containing the edges for the link prediction task.
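
Before training, it can help to confirm that a downloaded project folder has the files listed above. The following is a minimal sketch, not part of the repository: the helper names `expected_files` and `missing_roles` are hypothetical, and it assumes that `{project}` is replaced by the folder name and that `{project}_text.csv` and `X.all.txt` are alternative names for the raw text file.

```python
from pathlib import Path
from typing import Dict, List


def expected_files(project: str) -> Dict[str, List[str]]:
    """Map each role to the file names that may fill it (hypothetical helper).

    Assumption: `{project}` in the file list is the project folder name, and
    the raw text is stored as either `{project}_text.csv` or `X.all.txt`.
    """
    return {
        "graph": [f"{project}-ogbn.torch"],
        "text": [f"{project}_text.csv", "X.all.txt"],
        "link_prediction_edges": ["mrr_edges.torch"],
    }


def missing_roles(folder: Path, project: str) -> List[str]:
    """Return the roles for which none of the candidate files exist in `folder`."""
    return [
        role
        for role, names in expected_files(project).items()
        if not any((folder / name).is_file() for name in names)
    ]
```

For example, `missing_roles(Path("data/ogbn-arxiv"), "ogbn-arxiv")` returns an empty list when every role is covered; the `data/` location is only an illustration, so adjust it to wherever the dataset was extracted.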

Usage

Graph-Centric Language Model for Self-Supervised Pretraining

cd scripts
sh ssl_train.sh

Downstream Evaluation

The evaluation includes the following tasks:

MLP node classification
GraphSage node classification
Link Prediction

cd scripts
sh eval.sh
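
The file name `mrr_edges.torch` suggests that link prediction is scored with mean reciprocal rank (MRR), which is also the standard metric for ogbl-citation2. As an illustration of that metric only, here is a minimal pure-Python sketch; it is not the code run by `eval.sh`, the function names are hypothetical, and breaking ties in favor of the positive edge is an assumption.

```python
from typing import Sequence


def reciprocal_rank(pos_score: float, neg_scores: Sequence[float]) -> float:
    """Reciprocal rank of one positive edge among its negative candidates.

    The rank is 1 plus the number of negatives scoring strictly higher than
    the positive edge (assumption: ties are resolved in the positive's favor).
    """
    rank = 1 + sum(1 for score in neg_scores if score > pos_score)
    return 1.0 / rank


def mean_reciprocal_rank(
    pos_scores: Sequence[float],
    neg_scores: Sequence[Sequence[float]],
) -> float:
    """Average the reciprocal rank over all positive edges."""
    if len(pos_scores) != len(neg_scores):
        raise ValueError("each positive edge needs its own list of negative scores")
    if not pos_scores:
        raise ValueError("at least one positive edge is required")
    total = sum(reciprocal_rank(p, n) for p, n in zip(pos_scores, neg_scores))
    return total / len(pos_scores)
```

In practice the scores would come from comparing the learned node embeddings of each candidate pair, for example with a dot product; higher MRR means true edges are ranked closer to the top.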

Model Checkpoint

The node embedding checkpoint is available here.

Citation

If you use this code for your research, please cite our paper:

@misc{li2023grenade,
      title={GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs}, 
      author={Yichuan Li and Kaize Ding and Kyumin Lee},
      year={2023},
      eprint={2310.15109},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}