MSGC

Code for the Knowledge-Based Systems (KBS) paper "Multiple Sparse Graphs Condensation".

Abstract—The growing scale of graphs presents challenges for applying graph neural networks (GNNs) in practice. Recently, graph condensation (GCond) was proposed to condense an original large-scale graph into a small-scale one, with the goal that GNNs trained on the condensed graph achieve performance close to that of GNNs trained on the original graph. GCond achieves satisfactory performance on some datasets. However, it models the condensed graph as a single fully connected graph, which limits the diversity of the embeddings obtained, especially when there are few synthetic nodes. We propose Multiple Sparse Graphs Condensation (MSGC), which condenses the original large-scale graph into multiple small-scale sparse graphs. MSGC takes standard neighborhood patterns as basic substructures and can construct various connection schemes. Correspondingly, GNNs can obtain multiple sets of embeddings, which significantly enriches their diversity. Experiments show that, compared with GCond and other baselines, MSGC has significant advantages at the same condensed graph scale. MSGC retains nearly 100% of the original performance on the Flickr and Citeseer datasets while reducing their graph scale by over 99.0%.

Environment configuration
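Exact requirements are not pinned here; at minimum the code relies on PyTorch and PyTorch Geometric (inferred from the use of torch_geometric.datasets below). A quick sanity check that both are importable:

```python
# Sanity check that the core dependencies are installed and importable.
# Versions are not pinned here; match them to your Python/CUDA setup.
import torch
import torch_geometric

print(torch.__version__, torch_geometric.__version__)
```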

Dataset

In utils_data.py, change the root argument of the load_data function to your local path, for example:
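A minimal sketch of what this change might look like, assuming load_data wraps the torch_geometric dataset loaders and accepts root as a keyword argument (the actual signature in utils_data.py may differ; Planetoid is used here for illustration and covers Cora/Citeseer, while other datasets such as Flickr use different loader classes):

```python
# utils_data.py -- sketch only; the real function may look different.
from torch_geometric.datasets import Planetoid

def load_data(name, root='/your/local/path/datasets'):
    # 'root' is where torch_geometric downloads and caches the dataset files.
    # Change this default (or pass your own path) before running main.py.
    return Planetoid(root=root, name=name)
```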

The dataset will be downloaded automatically. If the download fails, inspect the source code of the corresponding class in torch_geometric.datasets and update the download URL.
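For example, the Planetoid class in torch_geometric exposes its download location as a class-level url attribute, which can be pointed at a mirror before the dataset is instantiated (other dataset classes name their URL attributes differently, so check the source of the class you are using; the mirror below is a placeholder):

```python
from torch_geometric.datasets import Planetoid

# Redirect the download to a reachable mirror *before* instantiating the
# dataset. 'https://your-mirror.example.com/...' is a placeholder URL.
Planetoid.url = 'https://your-mirror.example.com/planetoid/raw'

dataset = Planetoid(root='/your/local/path/datasets', name='Citeseer')
```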

Run

python -u main.py --dataset=cora --basic-model=SGC --val-model=GCN --n-syn=35 --batch-size-syn=16 --num-run=5 --val-run=5

Modify parameters.py for a more detailed setup.