DE-RRD: A Knowledge Distillation Framework for Recommender System
1. Overview
This repository provides the source code of our paper DE-RRD: A Knowledge Distillation Framework for Recommender System, accepted as a full research paper at CIKM'20.
In the paper, we propose two distillation methods:
- Distillation Experts (DE) that distills the teacher's latent knowledge.
- Relaxed Ranking Distillation (RRD) that distills ranking information from the teacher's predictions (a schematic sketch is given below).
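To make the RRD idea concrete, below is a minimal sketch of a relaxed list-wise ranking-matching loss: the student is trained to reproduce the ordering of the teacher's top-ranked ("interesting") items, while the remaining ("uninteresting") items only need to stay below them and their internal order is ignored. The function and argument names are placeholders for illustration; the loss actually used in the paper is the one implemented in main_URRD.py.

```python
import numpy as np

def rrd_style_loss(student_scores, teacher_ranking, num_interesting):
    """Schematic relaxed ranking-matching loss (illustration only; see
    main_URRD.py for the implementation used in the paper).

    student_scores : 1-D NumPy array of the student's scores, indexed by item id.
    teacher_ranking: item ids sorted by the teacher's scores, best first.
    num_interesting: K, the number of top-ranked teacher items whose order
                     the student is asked to reproduce.
    """
    ranking = np.asarray(teacher_ranking)
    s_int = student_scores[ranking[:num_interesting]]    # order is distilled
    s_unint = student_scores[ranking[num_interesting:]]  # internal order is ignored

    # Negative log-likelihood of the teacher's top-K ordering; the uninteresting
    # items appear only in the normalizer, so they merely need to stay below.
    tail = np.sum(np.exp(s_unint))
    loss = 0.0
    for k in range(num_interesting):
        denom = np.sum(np.exp(s_int[k:])) + tail
        loss -= s_int[k] - np.log(denom)
    return loss
```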
2. Evaluation
2.1. Leave-One-Out (LOO) protocol
We provide the leave-one-out evaluation protocol used in the paper. The protocol is as follows:
- For each test user:
  - randomly sample two positive (observed) items; one is used for testing and the other for validation
  - randomly sample 499 negative (unobserved) items
  - evaluate how well each method ranks the test item higher than the sampled negative items (a minimal code sketch follows below)
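A minimal sketch of this protocol is shown below. The function names and arguments are placeholders for illustration, not code from this repository, and model scores are assumed to be stored in a NumPy array indexed by item id.

```python
import numpy as np

def leave_one_out_split(user_items, all_items, num_negatives=499, rng=None):
    """Build the per-user evaluation data described above."""
    rng = rng or np.random.default_rng()
    # randomly sample two positive (observed) items: one for test, one for validation
    test_item, valid_item = rng.choice(sorted(user_items), size=2, replace=False)
    # randomly sample unobserved items as negatives
    candidates = sorted(set(all_items) - set(user_items))
    negatives = rng.choice(candidates, size=num_negatives, replace=False)
    return test_item, valid_item, negatives

def hit_rank(scores, test_item, negatives):
    """Rank of the test item against its sampled negatives (1 = ranked first)."""
    return 1 + int(np.sum(scores[negatives] > scores[test_item]))
```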
2.2. Metrics
We provide three ranking metrics widely adopted in recent papers: HR@N, NDCG@N, and MRR@N. The hit ratio simply measures whether the test item is present in the top-N list.
where δ is the indicator function, U<sub>test</sub> is the set of test users, and p<sub>u</sub> is the ranking position at which the test item is hit for user u. The normalized discounted cumulative gain and the mean reciprocal rank are ranking-position-aware metrics that assign higher scores to hits at higher ranks.
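In the same notation, the standard definitions of N@N and M@N are

$$
N@N = \frac{1}{|U_{test}|}\sum_{u \in U_{test}} \frac{\delta(p_u \leq N)}{\log_2(p_u + 1)}, \qquad
M@N = \frac{1}{|U_{test}|}\sum_{u \in U_{test}} \frac{\delta(p_u \leq N)}{p_u}.
$$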
3. Usage
A. For DE, run "main_DE.py"
B. For RRD, run "main_URRD.py"
We also provide the training log and the learning curve of each method. You can find them in the /logs folder and in the attached Jupyter notebook.
4. Other Work
Please note that Topology Distillation (KDD'21), a follow-up study of DE, is available at https://github.com/SeongKu-Kang/Topology_Distillation_KDD21.
Also, IR-RRD (Information Sciences'21), a follow-up study of RRD, is available at https://github.com/SeongKu-Kang/IR-RRD_INS21.
5. Update
We found that the sampling processes for top-ranked unobserved items are unnecessary, and removing them gave considerable performance improvements for ranking-matching KD methods, including RRD [CIKM'20] and DCD [CIKM'21]. A more detailed explanation and experimental results are provided in our new paper: Distillation from Heterogeneous Models for Top-K Recommendation [WWW'23].