KD-DAGFM

This is the official PyTorch implementation for the paper:

Zhen Tian, Ting Bai, Zibin Zhang, Zhiyuan Xu, Kangyi Lin, Ji-Rong Wen and Wayne Xin Zhao. Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation. WSDM 2023.

Overview

We propose Directed Acyclic Graph Factorization Machines (KD-DAGFM), which learn high-order feature interactions from existing complex interaction models for CTR prediction via knowledge distillation. The proposed lightweight student model, DAGFM, can learn arbitrary explicit feature interactions from teacher networks with approximately lossless performance; this capability is proved with a dynamic programming algorithm.
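
To make the distillation idea concrete, here is a minimal, framework-free sketch of a combined objective in which the student fits the click labels while also matching the teacher's logits. The function names, the mean-squared-error matching term, and the `alpha` weight are assumptions chosen for illustration; they are not taken from this repository and may differ from the loss used in the paper.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bce(label, logit, eps=1e-12):
    # Binary cross-entropy of one click label against one logit.
    p = sigmoid(logit)
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))


def distillation_loss(labels, student_logits, teacher_logits, alpha=0.5):
    # Hypothetical objective: CTR loss on the labels plus alpha times the
    # mean squared gap between student and teacher logits.
    n = len(labels)
    ctr = sum(bce(y, s) for y, s in zip(labels, student_logits)) / n
    kd = sum((s - t) ** 2 for s, t in zip(student_logits, teacher_logits)) / n
    return ctr + alpha * kd
```

When the student reproduces the teacher's logits exactly, the matching term vanishes and only the CTR loss remains.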

Requirements

tensorflow==2.4.1
python==3.7.3
cudatoolkit==11.3.1
pytorch==1.11.0

Download Datasets and Processing

Please download the Criteo, Avazu and MovieLens-1M datasets and put them in the /DataSource folder.

Pre-process the data.

python DataSource/[dataset]_parse.py

Then divide the dataset.

python DataSource/split.py

Quick Start

Train the teacher model

python train.py --config_files=[dataset]_kd_dagfm.yaml --phase=teacher_training

Distillation

python train.py --config_files=[dataset]_kd_dagfm.yaml --phase=distillation --warm_up=/Saved/[teacher_file]

Finetuning

python train.py --config_files=[dataset]_kd_dagfm.yaml --phase=finetuning --warm_up=/Saved/[Student_file]
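
The three phases above run in a fixed order, each warm-starting from the checkpoint saved by the previous one. A small helper along the following lines could assemble the commands for one dataset. `build_commands` is a hypothetical function, not part of this repository; it only uses the flags shown above, and the dataset name `criteo` in the demo assumes the config file is named `criteo_kd_dagfm.yaml`. The checkpoint names are left as placeholders to fill in.

```python
def build_commands(dataset, teacher_file, student_file):
    # Hypothetical helper: returns the three train.py invocations in order.
    config = f"{dataset}_kd_dagfm.yaml"
    base = ["python", "train.py", f"--config_files={config}"]
    return [
        # 1. Train the teacher model.
        base + ["--phase=teacher_training"],
        # 2. Distill into the student, warm-starting from the teacher checkpoint.
        base + ["--phase=distillation", f"--warm_up=/Saved/{teacher_file}"],
        # 3. Finetune the student, warm-starting from the distilled checkpoint.
        base + ["--phase=finetuning", f"--warm_up=/Saved/{student_file}"],
    ]


if __name__ == "__main__":
    # The bracketed names are placeholders for the actual saved file names.
    for cmd in build_commands("criteo", "[teacher_file]", "[Student_file]"):
        print(" ".join(cmd))
```

Each returned command is a list of arguments, so it can be passed to `subprocess.run` directly once the real checkpoint names are known.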

Maintainers

Zhen Tian. If you have any questions, please contact 1204216974@qq.com.

Cite

If you find DAGFM useful for your research or development, please cite the following paper:

@inproceedings{tian2023directed,
  title={Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation},
  author={Tian, Zhen and Bai, Ting and Zhang, Zibin and Xu, Zhiyuan and Lin, Kangyi and Wen, Ji-Rong and Zhao, Wayne Xin},
  booktitle={Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining},
  pages={715--723},
  year={2023}
}