

Data Repository for PyGOD

Statistics of the available datasets are listed below. #Con. is the number of contextual outliers and #Strct. is the number of structural outliers. #Outliers can be slightly smaller than the sum of the two, because some nodes are outliers of both types and are counted only once.

Dataset | Type | #Nodes | #Edges | #Feat | Avg. Degree | #Con. | #Strct. | #Outliers | Outlier Ratio
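The relationship between the three outlier counts can be sketched with a toy example. The node indices below are made up for illustration and do not come from any dataset in this repository. The outlier ratio is assumed here to be the number of outliers divided by the number of nodes.

```python
# Hypothetical toy graph: 10 nodes with made-up outlier assignments.
num_nodes = 10
contextual = {1, 4, 7}  # nodes marked as contextual outliers
structural = {4, 8}     # nodes marked as structural outliers

# A node in both sets is counted once, so the total is the size of the union.
outliers = contextual | structural
both = contextual & structural

num_outliers = len(outliers)

# Inclusion-exclusion: total = contextual + structural - overlap.
assert num_outliers == len(contextual) + len(structural) - len(both)

# Assumed definition: fraction of nodes that are outliers.
outlier_ratio = num_outliers / num_nodes

print(len(contextual), len(structural), num_outliers, f"{outlier_ratio:.1%}")
```

Node 4 belongs to both sets, which is why the total is one less than the sum of the two per-type counts.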

To use the datasets:

from pygod.utils import load_data
data = load_data('weibo') # in PyG format

Alternative download source in Baidu Disk (Chinese): https://pan.baidu.com/s/1afEZaygCRUYWJPtVbzuRYw Access Code: bond

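For injected/generated datasets, the label meanings are as follows: 0 is an inlier, 1 is a contextual outlier only, 2 is a structural outlier only, and 3 is both a contextual and a structural outlier.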

The labels can be converted as follows:

y = data.y.bool()       # binary labels (inlier/outlier)
yc = (data.y >> 0) & 1  # contextual outliers (bit 0)
ys = (data.y >> 1) & 1  # structural outliers (bit 1)
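The same conversion can be checked on plain Python integers, without PyTorch. This is a minimal sketch that assumes the encoding implied by the bit operations above (bit 0 marks a contextual outlier, bit 1 a structural outlier). The helper `decode_label` and the sample labels are hypothetical and not part of PyGOD.

```python
def decode_label(label: int) -> dict:
    """Split an integer label into outlier flags (assumed bit layout)."""
    return {
        "outlier": label != 0,                 # any bit set
        "contextual": bool((label >> 0) & 1),  # bit 0
        "structural": bool((label >> 1) & 1),  # bit 1
    }


labels = [0, 1, 2, 3, 0, 1]  # made-up sample labels

decoded = [decode_label(label) for label in labels]

# Per-type counts, analogous to the #Con., #Strct. and #Outliers columns.
n_con = sum(d["contextual"] for d in decoded)
n_str = sum(d["structural"] for d in decoded)
n_out = sum(d["outlier"] for d in decoded)

print(n_con, n_str, n_out)
```

A label with both bits set contributes to both per-type counts but only once to the overall outlier count.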