The official code for the paper "Synergistic Deep Graph Clustering Network".
## Reproducibility
To reproduce the results in our paper locally, you should follow these steps:
Step :one: Download the repository to `SynC`:

```bash
git clone https://github.com/Marigoldwu/SynC SynC
```
Step :two: Create a Python virtual environment (e.g. with conda) and install the dependencies:

```bash
conda create --name sync python=3.8
conda activate sync
pip install -r requirements.txt
```
PyTorch is also required (choose a version suited to your device):

```bash
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
```
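After installing, you can optionally verify that PyTorch sees your GPU. This is only a quick sanity check, not part of the original instructions:

```python
# Optional sanity check: confirm the installed PyTorch build and CUDA support.
import torch

print(torch.__version__)          # e.g. 2.1.0+cu118
print(torch.cuda.is_available())  # True if the CUDA build matches your driver
```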
Step :three: Prepare the datasets (unzip the `.rar` file). All datasets can be fetched from Liu's repository [Link]. The data files are organized as follows:
```text
SynC/
├── dataset/
│   ├── acm/
│   │   ├── acm_adj.npy       # Dense adjacency matrix.
│   │   ├── acm_feat.npy      # Feature matrix.
│   │   └── acm_label.npy     # Ground-truth labels.
│   └── dataset_info.py       # Dataset information, e.g. number of clusters, number of nodes, ...
```
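If you want a quick sanity check after unzipping, the three `.npy` files can be inspected directly with NumPy. This is only a minimal sketch; the repository's own loader handles this for you:

```python
# Minimal inspection of the ACM dataset files (paths assume the layout above).
import numpy as np

adj = np.load("dataset/acm/acm_adj.npy")      # (N, N) dense adjacency matrix
feat = np.load("dataset/acm/acm_feat.npy")    # (N, d) node feature matrix
label = np.load("dataset/acm/acm_label.npy")  # (N,) ground-truth labels

print(adj.shape, feat.shape, label.shape)
print("clusters:", len(np.unique(label)))
```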
Step :four: Train with our provided pre-trained weights:

```bash
cd SynC
python main.py -M SYNC -D acm -LS 10 -S 325
# Use the following command to view the optional configuration options.
python main.py --help
```
If you want to train SynC on other datasets (those already provided in `dataset_info.py`; for any other dataset you need to add its information to that file manually), you can pre-train with the following command:

```bash
python main.py -P -M pretrain_tigae_for_sync -D acm -LS 1 -S 325
```
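An entry in `dataset_info.py` records the basic statistics of a dataset. The sketch below is only an illustration with hypothetical field names (check the actual file for the exact schema); the ACM numbers are the commonly reported statistics for this benchmark:

```python
# Hypothetical example of dataset metadata; the field names are illustrative,
# not the actual schema used in dataset_info.py.
ACM_INFO = {
    "name": "acm",
    "n_nodes": 3025,     # number of nodes
    "n_input": 1870,     # feature dimensionality
    "n_clusters": 3,     # number of ground-truth clusters
}
```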
## Results
The results in our paper.
| Dataset | ACC | NMI | ARI | F1 |
|---|---|---|---|---|
| ACM | 92.73±0.04 | 73.58±0.22 | 79.58±0.11 | 92.74±0.04 |
| DBLP | 83.48±0.13 | 55.11±0.24 | 61.70±0.27 | 82.90±0.17 |
| CITE | 71.77±0.27 | 46.37±0.42 | 48.09±0.45 | 65.72±0.36 |
| CORA | 78.58±0.38 | 58.13±0.52 | 57.90±1.06 | 77.65±0.30 |
| AMAP | 82.48±0.04 | 69.70±0.23 | 65.02±0.11 | 80.69±0.11 |
| UAT | 57.33±0.13 | 28.58±0.24 | 26.60±0.17 | 57.34±0.23 |
| Wisconsin | 59.64±1.02 | 32.79±1.42 | 26.86±1.30 | 38.19±1.30 |
| Texas | 64.37±0.80 | 27.61±1.00 | 32.65±1.91 | 39.49±1.53 |
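For reference, ACC and F1 are computed after aligning predicted cluster ids with the ground-truth labels via Hungarian matching, while NMI and ARI are permutation-invariant. The sketch below shows this standard recipe using SciPy and scikit-learn (both present in the environment); the repository's own evaluation code is authoritative:

```python
# Standard clustering metrics: Hungarian matching for ACC/F1, plus NMI/ARI.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import (adjusted_rand_score, f1_score,
                             normalized_mutual_info_score)


def best_map(y_true, y_pred):
    """Relabel predicted cluster ids to best match the ground truth."""
    size = max(y_true.max(), y_pred.max()) + 1
    cost = np.zeros((size, size), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1                          # co-occurrence counts
    rows, cols = linear_sum_assignment(-cost)    # maximize total agreement
    mapping = dict(zip(rows, cols))
    return np.array([mapping[p] for p in y_pred])


y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([1, 1, 0, 0, 2, 2])            # same partition, permuted ids
mapped = best_map(y_true, y_pred)
print("ACC:", (mapped == y_true).mean())                     # 1.0
print("NMI:", normalized_mutual_info_score(y_true, y_pred))  # 1.0
print("ARI:", adjusted_rand_score(y_true, y_pred))           # 1.0
print("F1 :", f1_score(y_true, mapped, average="macro"))     # 1.0
```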
The console output for all eight datasets is recorded in `output.txt`.
The results reproduced on Code Ocean.
The Code Ocean capsule link: https://codeocean.com/capsule/8085961/tree. To reproduce our results, you can click the **Reproducible Run** button. Please note that the environment differs slightly from the one described in the paper; even so, the results in our paper are easy to reproduce.
The Code Ocean Dockerfile:
```dockerfile
# hash:sha256:fe8085911e8d9a8c9a97a82e9bff996f985a0243f467d1578c08a7a47bfa0654
FROM registry.codeocean.com/codeocean/pytorch:2.1.0-cuda11.8.0-mambaforge23.1.0-4-python3.10.12-ubuntu22.04

ARG DEBIAN_FRONTEND=noninteractive

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        unrar=1:6.1.5-1 \
    && rm -rf /var/lib/apt/lists/*

RUN pip install -U --no-cache-dir \
    matplotlib==3.7.5 \
    munkres==1.1.4 \
    numpy==1.24.1 \
    scikit-learn==1.3.2 \
    scipy==1.9.1
```