Home

Awesome

A Hybrid Matrix Factorization via Graph SignalProcessing

Abstract

Predicting whether a user would like a particular item has become significantly important for the success of commercial applications like Netflix, YouTube, Spotify, etc. One of the techniques that has become predominant for this task is formulating the problem as a matrix factorization one and trying to explain each rating as a numerical representation of the corresponding user and item. However, it is a standard problem that could be optimized with stochastic gradient descent (SGD) and there is no novelty in that approach.

In this project, we try to extend this approach by utilizing well known techniques from graph signal processing. The extension is consisted of two parts: 1) utilizing graph Fourier transform to switch from vertex graph domain to spectral graph domain, and 2) applying graph filters to denoise the obtained features. Using these techniques we have achieved RMSE of 0.877, an improvement of 4% over our baseline model -- the standard matrix factorization technique with SGD.

Project Structure

The project is organized as follows:

.
├── read_the_data.ipynb             # Jupyter Notebook for exploring the dataset
├── main.ipynb                      # A clear Jupyter Notebook which can be used for new testings
├── main_results.ipynb              # Jupyter Notebook that has the run models, and the figures used in the report
├── oimdb_metadata_crawled.pickle   # Open Movie Database (OMDb) crawled data
└── ml-100k                         # MovieLens 100k dataset

Prerequisites

The project was created and tested with the following dependencies:

Libraries

- Anaconda Python 3.6.5
- pickle
- numpy
- pandas
- sklearn
- matplotlib
- networkx

License

This project is licensed under the terms of the MIT license.

Authors