



Homepage | Paper | Documentation | Discussion Forum | Dataset | 中文 (Chinese)

CogDL is a graph deep learning toolkit that allows researchers and developers to easily train and compare baseline or customized models for node classification, graph classification, and other important tasks in the graph domain.

We summarize the contributions of CogDL as follows:

❗ News

<details> <summary> News History </summary> <br/> </details>

Getting Started

Requirements and Installation

Please follow the instructions here to install PyTorch (https://github.com/pytorch/pytorch#installation).

When PyTorch has been installed, cogdl can be installed using pip as follows:

pip install cogdl

Install from source via:

pip install git+https://github.com/thudm/cogdl.git

Or clone the repository and install with the following commands:

git clone git@github.com:THUDM/cogdl.git
cd cogdl
pip install -e .
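
To confirm that the installation succeeded, a quick check along these lines may help. This is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical and not part of CogDL.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("cogdl")
    print(f"cogdl version: {version}" if version else "cogdl is not installed")
```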


API Usage

You can run all kinds of experiments through the CogDL APIs, especially experiment. You can also use your own datasets and models. A quick-start example can be found in quick_start.py, and more examples are provided in examples/.

from cogdl import experiment

# basic usage
experiment(dataset="cora", model="gcn")

# set other hyper-parameters
experiment(dataset="cora", model="gcn", hidden_size=32, epochs=200)

# run over multiple models on different seeds
experiment(dataset="cora", model=["gcn", "gat"], seed=[1, 2])

# automl usage
def search_space(trial):
    return {
        "lr": trial.suggest_categorical("lr", [1e-3, 5e-3, 1e-2]),
        "hidden_size": trial.suggest_categorical("hidden_size", [32, 64, 128]),
        "dropout": trial.suggest_uniform("dropout", 0.5, 0.8),
    }

experiment(dataset="cora", model="gcn", seed=[1, 2], search_space=search_space)
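
Passing lists for model and seed runs every combination of the given values. The sketch below illustrates that fan-out, assuming a plain Cartesian product over datasets, models, and seeds. expand_variants is a hypothetical helper written for illustration; it is not a CogDL function and may not match how experiment enumerates runs internally.

```python
from itertools import product
from typing import List, Tuple


def _as_list(value):
    """Wrap a single value in a list; pass lists and tuples through."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def expand_variants(dataset, model, seed) -> List[Tuple[str, str, int]]:
    """Hypothetical helper: enumerate every (dataset, model, seed) combination."""
    return list(product(_as_list(dataset), _as_list(model), _as_list(seed)))


runs = expand_variants(dataset="cora", model=["gcn", "gat"], seed=[1, 2])
for run in runs:
    print(run)
```

With two models and two seeds this yields four runs, one per (model, seed) pair on the single dataset.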

Command-Line Usage

You can also use python scripts/train.py --dataset example_dataset --model example_model to run example_model on example_dataset.

For example, if you want to run GCN and GAT on the Cora dataset, with 5 different seeds:

python scripts/train.py --dataset cora --model gcn gat --seed 0 1 2 3 4

Expected output:

('cora', 'gcn')  0.8050±0.0047  0.7940±0.0063
('cora', 'gat')  0.8234±0.0042  0.8088±0.0016
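
Each cell reports a mean and a spread over the seeds. The following is a minimal sketch of producing that format. It assumes the spread is the population standard deviation (the text above does not say which estimator is used), and the per-seed accuracies are made up for illustration; summarize is a hypothetical helper, not a CogDL function.

```python
from statistics import fmean, pstdev
from typing import Sequence


def summarize(scores: Sequence[float]) -> str:
    """Format scores as mean±std with four decimal places."""
    return f"{fmean(scores):.4f}±{pstdev(scores):.4f}"


# Hypothetical accuracies from five seeds, for illustration only.
example_scores = [0.80, 0.81, 0.79, 0.80, 0.80]
print(summarize(example_scores))
```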

If you have any difficulty getting the steps above to work, feel free to open an issue. You can expect a reply within 24 hours.


<details> <summary> How to contribute to CogDL? </summary> <br/>

If you have a well-performing algorithm and are willing to implement it in our toolkit to help more people, first open an issue and then create a pull request. Detailed information can be found here.

Before committing your changes, please run pre-commit install to set up the git hook that checks code format and style using black and flake8. After that, pre-commit runs automatically on every git commit. Detailed information about pre-commit can be found here.

</details> <details> <summary> How to enable fast GNN training? </summary> <br/> CogDL provides a fast sparse matrix-matrix multiplication operator called [GE-SpMM](https://arxiv.org/abs/2007.03179) to speed up training of GNN models on the GPU. The feature will be automatically used if it is available. Note that this feature is still in testing and may not work under some versions of CUDA. </details> <details> <summary> How to run parallel experiments with GPUs on several models? </summary> <br/>

To run experiments in parallel on a server with multiple GPUs, for example GCN and GAT on the Cora dataset, use:

python scripts/train.py --dataset cora --model gcn gat --hidden-size 64 --devices 0 1 --seed 0 1 2 3 4

Expected output:

('cora', 'gcn')  0.8236±0.0033
('cora', 'gat')  0.8262±0.0032
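
One simple way to spread runs over several GPUs is round-robin assignment. The sketch below is illustrative only: it assumes each (dataset, model, seed) run is pinned to one device in turn, which may differ from how scripts/train.py actually schedules work. assign_devices is a hypothetical helper, not a CogDL function.

```python
from itertools import cycle, product
from typing import Dict, List, Sequence, Tuple

Run = Tuple[str, str, int]


def assign_devices(runs: Sequence[Run], devices: Sequence[int]) -> Dict[int, List[Run]]:
    """Hypothetical helper: hand out runs to devices in round-robin order."""
    if not devices:
        raise ValueError("at least one device is required")
    schedule: Dict[int, List[Run]] = {device: [] for device in devices}
    for run, device in zip(runs, cycle(devices)):
        schedule[device].append(run)
    return schedule


# Two models, five seeds, two GPUs, mirroring the command above.
runs = list(product(["cora"], ["gcn", "gat"], range(5)))
schedule = assign_devices(runs, devices=[0, 1])
for device, assigned in schedule.items():
    print(f"GPU {device}: {len(assigned)} runs")
```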
</details> <details> <summary> How to use models from other libraries? </summary> <br/> If you are familiar with other popular graph libraries, you can implement your own model in CogDL using modules from PyTorch Geometric (PyG). To install PyG, follow its installation instructions (https://github.com/rusty1s/pytorch_geometric/#installation). For quick-start examples of using PyG layers, see [examples/pyg](https://github.com/THUDM/cogdl/tree/master/examples/pyg/). </details> <details> <summary> How to make a successful pull request with unit test </summary> <br/> A successful pull request needs at least (1) your model implementation and (2) a unit test.

You might wonder why your pull request was rejected with a 'Coverage decreased ...' message even though your model works fine locally. The reason is that you have not included a unit test that runs through the extra lines of code you added. The Travis CI service used on GitHub runs all unit tests on the code you committed and checks how many lines of code those tests exercise. If a significant portion of your code is not exercised (insufficient coverage), the pull request is rejected.

So how do you do a unit test?
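
The repository's own test layout is not reproduced here, so the following is only a generic sketch of the idea using Python's unittest module: a test calls the new code on a small input so that every added line is executed and checked. mean_aggregate is a hypothetical stand-in for newly added code, not a CogDL function.

```python
import unittest
from typing import Dict, List


def mean_aggregate(features: Dict[int, float], neighbors: Dict[int, List[int]]) -> Dict[int, float]:
    """Average each node's neighbor features; isolated nodes keep their own feature."""
    out = {}
    for node, own in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            out[node] = sum(features[n] for n in nbrs) / len(nbrs)
        else:
            out[node] = own
    return out


class TestMeanAggregate(unittest.TestCase):
    def test_average_over_neighbors(self):
        # Exercises the branch for nodes that have neighbors.
        result = mean_aggregate({0: 1.0, 1: 3.0, 2: 5.0}, {0: [1, 2]})
        self.assertAlmostEqual(result[0], 4.0)

    def test_isolated_node_keeps_feature(self):
        # Exercises the branch for nodes without neighbors.
        result = mean_aggregate({0: 1.0, 1: 3.0}, {0: [1]})
        self.assertAlmostEqual(result[1], 3.0)
```

Running python -m unittest on the file that contains this class executes both branches of the function, which is what a coverage check measures.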


CogDL Team

CogDL is developed and maintained by Tsinghua, ZJU, DAMO Academy, and ZHIPU.AI.

The core development team can be reached at cogdlteam@gmail.com.

Citing CogDL

Please cite our paper if you find our code or results useful for your research:

    title={CogDL: A Comprehensive Library for Graph Deep Learning},
    author={Yukuo Cen and Zhenyu Hou and Yan Wang and Qibin Chen and Yizhen Luo and Zhongming Yu and Hengrui Zhang and Xingcheng Yao and Aohan Zeng and Shiguang Guo and Yuxiao Dong and Yang Yang and Peng Zhang and Guohao Dai and Yu Wang and Chang Zhou and Hongxia Yang and Jie Tang},
    booktitle={Proceedings of the ACM Web Conference 2023 (WWW'23)},