Home

Awesome

UniRec

Introduction

UniRec is an easy-to-use, lightweight, and scalable implementation of recommender systems. Its primary objective is to enable users to swiftly construct a comprehensive ecosystem of recommenders using a minimal set of robust and practical recommendation models. These models are designed to deliver scalable and competitive performance, encompassing a majority of real-world recommendation scenarios.

It is important to note that this goal differs from those of other well-known public libraries, such as Recommenders and RecBole, whose missions include providing an extensive range of recommendation algorithms or offering a variety of datasets.

The term "Uni-" carries several implications:

Installation

Installation from PyPI

  1. Ensure that PyTorch with CUDA support (version 1.10.0-1.13.1) is installed:

    pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
    
    python -c "import torch; print(torch.__version__)"
    
  2. Install unirec with pip:

    pip install unirec
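
    To confirm the installation, you can run the same sanity check used below for the wheel install (it imports a UniRec utility and prints the current local time):

    python -c "from unirec.utils import general; print(general.get_local_time_str())"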
    

Installation from Wheel Locally

  1. Ensure that PyTorch with CUDA support (version 1.10.0-1.13.1) is installed:

    pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
    
    python -c "import torch; print(torch.__version__)"
    
  2. Clone Git Repo

    git clone https://github.com/microsoft/UniRec.git
    
  3. Build

    cd UniRec
    pip install --user --upgrade setuptools wheel twine
    python setup.py sdist bdist_wheel
    

    After building, the wheel package can be found in UniRec/dist.

  4. Install

    pip install dist/unirec-*.whl 
    

    The specific package name can be found in UniRec/dist.

    Check whether unirec is installed successfully:

    python -c "from unirec.utils import general; print(general.get_local_time_str())"
    

Algorithms

| Algorithm | Type | Paper | Code |
|---|---|---|---|
| MF | Collaborative Filtering | BPR | unirec/model/cf/mf.py |
| UserCF | Collaborative Filtering | - | unirec/model/cf/usercf.py |
| SLIM | Collaborative Filtering | SLIM | unirec/model/cf/slim.py |
| AdmmSLIM | Collaborative Filtering | ADMMSLIM | unirec/model/cf/admmslim.py |
| SAR | Collaborative Filtering | ItemCF, SAR | unirec/model/cf/sar.py |
| EASE | Collaborative Filtering | EASE | unirec/model/cf/ease.py |
| MultiVAE | Collaborative Filtering | MultiVAE | unirec/model/cf/multivae.py |
| SVDPlusPlus | Sequential Model | SVD++ | unirec/model/sequential/svdplusplus.py |
| AvgHist | Sequential Model | - | unirec/model/sequential/avghist.py |
| AttHist | Sequential Model | - | unirec/model/sequential/atthist.py |
| GRU | Sequential Model | - | unirec/model/sequential/gru.py |
| SASRec | Sequential Model | SASRec | unirec/model/sequential/sasrec.py |
| ConvFormer | Sequential Model | ConvFormer | unirec/model/sequential/convformer.py |
| FastConvFormer | Sequential Model | ConvFormer | unirec/model/sequential/fastconvformer.py |
| FM | Ranking Model | Factorization Machine | unirec/model/rank/fm.py |
| BST | Ranking Model | Behavior Sequence Transformer | unirec/model/rank/bst.py |
| MoRec | Multi-objective | MoRec | unirec/facility/morec |

Examples

To go through all the examples listed below, we provide a script for downloading and splitting the ml-100k dataset. Run:

python download_split_ml100k.py

The raw dataset files will be saved under your home directory: ~/.unirec/dataset/ml-100k

Next, convert the raw dataset into a format compatible with UniRec. Use the following script to process the data and save the files in UniRec/data/ml-100k.

cd examples/preprocess
bash preprocess_ml100k.sh

General Training

To train an existing model in UniRec, for instance SASRec on the ml-100k dataset, refer to the script provided in examples/training/train_ml100k.sh.
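
Assuming the ml-100k data has been prepared by the preprocessing step above, that example can be launched directly from the repository root; the model and hyperparameter settings are the ones defined inside the script:

cd examples/training
bash train_ml100k.sh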

Multi-GPU Training

UniRec supports multi-GPU training with the integration of Accelerate. An example script is available at examples/training/multi_gpu_train_ml100k.sh. The key arguments can be found in lines 3-12 of the script:

# Specify the indices of the GPUs to use
GPU_INDICES="0,1" # e.g. "0,1"

# Specify the number of nodes to use (one node may have multiple GPUs)
NUM_NODES=1

# Specify the number of processes in each node (the number should equal the number of GPU_INDICES)
NPROC_PER_NODE=2

For more details about the launching command, please refer to Accelerate Docs.
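
As a sketch, the three variables above map onto standard `accelerate launch` flags as shown below. The entry point and the trailing training arguments are illustrative placeholders, not UniRec's actual command; the authoritative version is the launch line inside examples/training/multi_gpu_train_ml100k.sh.

# Sketch only: the accelerate flags are real, but the entry point and
# training arguments are placeholders to be copied from the example script.
accelerate launch \
    --gpu_ids "$GPU_INDICES" \
    --num_machines $NUM_NODES \
    --num_processes $NPROC_PER_NODE \
    --multi_gpu \
    unirec/main/main.py "$@"   # assumed entry point; take the real one (and its arguments) from the example script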

Hyperparameter Tuning with wandb

UniRec supports hyperparameter tuning (or hyperparameter optimization, HPO) with the integration of WandB. There are three major steps to start a wandb experiment.

  1. Compose a training script and enable wandb. An example is provided in examples/training/train_ml100k_with_wandb.sh. The key arguments are:

    • --use_wandb=1: enable wandb in the training process
    • --wandb_file=/path/to/configuration_file: the configuration file for wandb, including command, metrics, method, and search space.
  2. Define the sweep configuration. Write a YAML-format configuration file to set the command, the metrics to monitor, the tuning method, and the search space; a minimal sketch of such a file is given after the command listing below. An example is available at examples/training/wandb.yaml. For more details about the configuration file, refer to the WandB Docs.

  3. Initialize sweeps and start sweep agents. To start an experiment with wandb, first initialize a sweep controller for selecting hyperparameters and issuing instructions; then an agent actually performs the runs. An example for launching wandb experiments is provided in examples/training/wandb_start.sh. Note that we offer a pipeline command in the script to start the agent automatically after sweep initialization. However, we recommend the simpler manual two-step process:

## Step 1. Initialize sweeps with CLI using configuration file. 
## For more details, please refer to https://docs.wandb.ai/guides/sweeps/initialize-sweeps

wandb sweep config.yaml

## Step 2. After `wandb sweep`, you would get a sweep id and the hint to use `sweep agent`, like:

## wandb: Creating sweep from: ./wandb.yaml
## wandb: Created sweep with ID: xxx
## wandb: View sweep at: https://wandb.ai/xxx/xxx/xxx/xxx
## wandb: Run sweep agent with: wandb agent xxx/xxx/xxx/xxx

wandb agent entity/project/sweep_ID
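
For orientation, the sketch below writes out a minimal sweep configuration from the shell. The field names (program, method, metric, parameters, command) follow the standard wandb sweep schema, but the metric name, search space, and launched script are illustrative assumptions only; examples/training/wandb.yaml is the authoritative example for UniRec.

## A hypothetical sweep configuration, written from the shell for convenience.
cat > my_sweep.yaml <<'EOF'
program: train_ml100k_with_wandb.sh   # assumption: the wandb-enabled training script
method: bayes                         # search strategy: grid, random, or bayes
metric:
  name: ndcg@10                       # hypothetical metric to optimize
  goal: maximize
parameters:
  learning_rate:
    values: [0.0005, 0.001, 0.005]    # hypothetical search space
command:                              # how each run is launched; ${args} appends the sampled parameters
  - ${env}
  - bash
  - ${program}
  - ${args}
EOF

wandb sweep my_sweep.yaml             # prints the sweep ID to pass to `wandb agent`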

Serving with C# and Java

UniRec supports C# and Java inference based on the ONNX format. We provide inference for user embeddings, item embeddings, and user-item scores.

For more details, please refer to examples/serving/README

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.