<p align="center"> <img width="90%" src="https://github.com/AstraZeneca/rexmex/blob/main/rexmex_small.jpg?raw=true?sanitize=true" /> </p>
reXmeX is a recommender system evaluation metric library.
Please look at the Documentation and External Resources.
reXmeX consists of utilities for recommender system evaluation. First, it provides a comprehensive collection of metrics for the evaluation of recommender systems. Second, it includes a variety of methods for reporting and plotting the performance results. Implemented metrics cover a range of well-known metrics and newly proposed metrics from data mining (ICDM, CIKM, KDD) conferences and prominent journals.
Citing
If you find RexMex useful in your research, please consider adding the following citation:
@inproceedings{rexmex,
title = {{rexmex: A General Purpose Recommender Metrics Library for Fair Evaluation.}},
author = {Benedek Rozemberczki and Sebastian Nilsson and Piotr Grabowski and Charles Tapley Hoyt and Gavin Edwards},
year = {2021},
}
An introductory example
The following example loads a synthetic dataset which has the mandatory y_true and y_score keys. The dataset has binary labels and predicted probability scores. We read the dataset and define a default ClassificationMetricSet instance for the evaluation of the predictions. Using this metric set we create a score card and get the predictive performance metrics.
from rexmex import ClassificationMetricSet, DatasetReader, ScoreCard
reader = DatasetReader()
scores = reader.read_dataset()
metric_set = ClassificationMetricSet()
score_card = ScoreCard(metric_set)
report = score_card.get_performance_metrics(scores["y_true"], scores["y_score"])
An advanced example
The following more advanced example loads the same synthetic dataset, which has the source_id, target_id, source_group and target_group keys besides the mandatory y_true and y_score. Using the source_group key we group the predictions and return a performance metric report.
from rexmex import ClassificationMetricSet, DatasetReader, ScoreCard
reader = DatasetReader()
scores = reader.read_dataset()
metric_set = ClassificationMetricSet()
score_card = ScoreCard(metric_set)
report = score_card.generate_report(scores, grouping=["source_group"])
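To make the idea of grouping concrete, here is a minimal pure-Python sketch of a grouped evaluation. It is not how generate_report is implemented. The helper name grouped_accuracy, the 0.5 threshold, and the four toy rows are assumptions made for illustration only.

```python
from collections import defaultdict


def grouped_accuracy(rows, group_key, threshold=0.5):
    """Accuracy of thresholded scores, computed separately for each group.

    Hypothetical helper for illustration; not part of rexmex.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        # Binarize the probability score before comparing it to the label.
        predicted = 1 if row["y_score"] >= threshold else 0
        hits[row[group_key]] += int(predicted == row["y_true"])
        totals[row[group_key]] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Toy rows using the same key names as the synthetic dataset.
rows = [
    {"source_group": "A", "y_true": 1, "y_score": 0.9},
    {"source_group": "A", "y_true": 0, "y_score": 0.7},
    {"source_group": "B", "y_true": 1, "y_score": 0.8},
    {"source_group": "B", "y_true": 0, "y_score": 0.1},
]
per_group = grouped_accuracy(rows, "source_group")
```

A report grouped by source_group is conceptually the same thing, with one row of metrics per group instead of a single accuracy value.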
Scorecard
A rexmex score card allows the reporting of recommender system performance metrics, plotting the performance metrics and saving those. Our framework provides 7 rating, 38 classification, 18 ranking, and 2 coverage metrics.
Metric Sets
Metric sets allow users to calculate a range of evaluation metrics for a pair of vectors: the ground truth labels and the predicted labels. We provide a general MetricSet class, along with specialized metric sets with pre-set metrics in the following general categories:
- Ranking
- Rating
- Classification
- Coverage
Ranking Metric Set
- Normalized Distance Based Performance Measure (NDPM)
- Discounted Cumulative Gain (DCG)
- Normalized Discounted Cumulative Gain (NDCG)
- Reciprocal Rank
- Mean Reciprocal Rank (MRR)
- Spearman's Rho
- Kendall Tau
- HITS@k
- Novelty
- Average Recall @ k
- Mean Average Recall @ k
- Average Precision @ k
- Mean Average Precision @ k
- Personalisation
- Intra List Similarity
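As a rough illustration of three of the ranking metrics listed above, the following sketch computes DCG, NDCG, and (mean) reciprocal rank in plain Python. It assumes the linear-gain form of DCG with a log2 discount, and the function names are hypothetical. The library's own implementations may use different conventions or signatures.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance values given in ranked order."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def reciprocal_rank(relevances):
    """1 / rank of the first relevant item, or 0.0 if nothing is relevant."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(ranked_lists):
    """Average of the reciprocal rank over several ranked lists (one per user)."""
    return sum(reciprocal_rank(r) for r in ranked_lists) / len(ranked_lists)


# Binary relevance of four recommended items, in the order they were ranked.
ranked = [0, 1, 1, 0]
scores = {
    "dcg": dcg(ranked),
    "ndcg": ndcg(ranked),
    "reciprocal_rank": reciprocal_rank(ranked),
}
```

Here the first relevant item sits at rank 2, so the reciprocal rank is 0.5, and NDCG is below 1 because the relevant items are not at the top of the list.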
Rating Metric Set
These metrics assume that items are scored explicitly and ratings are predicted by a regression model.
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- Mean Absolute Error (MAE)
- Mean Absolute Percentage Error (MAPE)
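The four rating metrics above follow their standard textbook definitions, which can be sketched in a few lines of plain Python. The function names and the toy ratings are assumptions for illustration. MAPE is returned as a fraction rather than a percentage, and this sketch assumes no true rating is zero.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared differences between true and predicted ratings."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the mean squared error."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute error relative to the true value (assumes no zero in y_true)."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy explicit ratings and regression outputs.
y_true = [4.0, 2.0, 5.0]
y_pred = [3.0, 2.0, 4.0]
errors = {
    "mse": mse(y_true, y_pred),
    "rmse": rmse(y_true, y_pred),
    "mae": mae(y_true, y_pred),
    "mape": mape(y_true, y_pred),
}
```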
Classification Metric Set
These metrics assume that the items are scored with raw probabilities (these can be binarized).
- Precision (or Positive Predictive Value)
- Recall (Sensitivity, Hit Rate, or True Positive Rate)
- Area Under the Precision Recall Curve (AUPRC)
- Area Under the Receiver Operating Characteristic (AUROC)
- F-1 Score
- Average Precision
- Specificity (Selectivity or True Negative Rate)
- Matthews Correlation
- Accuracy
- Balanced Accuracy
- Fowlkes-Mallows Index
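The sketch below shows how raw probability scores can be binarized and turned into several of the threshold-based metrics listed above. It uses the standard confusion-matrix definitions. The function names and the 0.5 threshold are assumptions for illustration and are not taken from the library.

```python
import math


def binarize(y_score, threshold=0.5):
    """Turn probability scores into 0/1 predictions."""
    return [1 if s >= threshold else 0 for s in y_score]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels and binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_summary(y_true, y_score, threshold=0.5):
    """Hypothetical helper: a handful of confusion-matrix metrics in one dict."""
    tp, fp, tn, fn = confusion_counts(y_true, binarize(y_score, threshold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "matthews": (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0,
        "fowlkes_mallows": math.sqrt(precision * recall),
    }


# Toy binary labels and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.6, 0.4, 0.8, 0.2, 0.1]
summary = classification_summary(y_true, y_score)
```

Threshold-free metrics such as AUROC, AUPRC, and Average Precision work directly on the raw scores and are not covered by this sketch.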
Coverage Metric Set
These metrics measure how well the recommender system covers the available items in the catalog and the possible users. In other words, they measure the diversity of predictions.
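One common way to express coverage is the fraction of the catalog (or of the user base) that appears in at least one recommendation. The following plain-Python sketch assumes recommendations are given as (user, item) pairs. The function names and arguments here are illustrative and are not the library's signatures.

```python
def item_coverage(recommendations, catalog):
    """Fraction of catalog items that appear in at least one recommendation."""
    catalog = set(catalog)
    recommended = {item for _, item in recommendations}
    return len(recommended & catalog) / len(catalog)


def user_coverage(recommendations, users):
    """Fraction of known users that receive at least one recommendation."""
    users = set(users)
    served = {user for user, _ in recommendations}
    return len(served & users) / len(users)


# Toy (user, item) recommendation pairs.
recommendations = [("u1", "a"), ("u1", "b"), ("u2", "a")]
items_covered = item_coverage(recommendations, catalog=["a", "b", "c", "d"])
users_covered = user_coverage(recommendations, users=["u1", "u2", "u3"])
```

In this toy case two of the four catalog items and two of the three users are covered.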
Documentation and Reporting Issues
Head over to our documentation to find out more about installation and data handling, a full list of implemented methods, and datasets.
If you notice anything unexpected, please open an issue and let us know. If you are missing a specific method, feel free to open a feature request. We are motivated to constantly make RexMex even better.
Installation via the command line
RexMex can be installed with the following command after the repo is cloned.
$ pip install .
Use -e/--editable when developing.
Installation via pip
RexMex can be installed with the following pip command.
$ pip install rexmex
As we create new releases frequently, upgrading the package regularly might be beneficial.
$ pip install rexmex --upgrade
Running tests
Tests can be run with tox using the following commands:
$ pip install tox
$ tox -e py
Citation
If you use RexMex in a scientific publication, we would appreciate citations. Please see GitHub's built-in citation tool.
License