Best Practices on Recommendation Systems

| Build Type | Branch | Status | Branch | Status |
| --- | --- | --- | --- | --- |
| Linux CPU | master | Status | staging | Status |
| Linux Spark | master | Status | staging | Status |

NOTE: the tests are executed every night; we use pytest for testing Python utilities and papermill for testing notebooks.
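
As an illustration of that pattern (a sketch, not the repo's actual test code), a pytest test can execute a notebook end-to-end with papermill and fail if any cell raises; the notebook path and parameter name below are placeholders:

```python
import papermill as pm

def test_als_notebook_runs(tmp_path):
    # Run the notebook top to bottom; papermill raises on the first cell error,
    # which makes pytest report the test as failed.
    # The path and parameter name are placeholders, not necessarily what the
    # notebook declares in its parameters cell.
    pm.execute_notebook(
        "notebooks/als_pyspark_movielens.ipynb",
        str(tmp_path / "output.ipynb"),
        parameters={"MOVIELENS_DATA_SIZE": "100k"},
    )
```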

# Recommenders

This repository provides examples and best practices for building recommendation systems, offered as Jupyter notebooks. The examples detail our learnings, illustrating four key tasks:

  1. Preparing and loading data for each recommender algorithm.
  2. Using different algorithms such as Smart Adaptive Recommendations (SAR), Alternating Least Squares (ALS), etc., for building recommender models (a toy sketch of SAR-style scoring follows this list).
  3. Evaluating algorithms with offline metrics.
  4. Operationalizing models in a production environment. The examples work across Python + CPU and PySpark environments, and contain guidance as to which algorithm to run in which environment based on scale and other requirements.
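
To make the second task concrete, here is a toy numpy sketch of SAR-style scoring (a simplification, not the repo's implementation): recommendation scores come from multiplying a user-item affinity matrix by an item-item similarity matrix derived from co-occurrence. SAR itself supports more careful similarity measures (e.g., Jaccard) and time-decayed affinities.

```python
import numpy as np

# Toy user-item affinity matrix: rows are users, columns are items.
# In SAR these entries would typically be time-decayed interaction weights.
affinity = np.array([
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 1.0],
    [2.0, 1.0, 0.0],
])

# Item-item co-occurrence: how often two items are seen by the same user.
seen = (affinity > 0).astype(float)
cooccurrence = seen.T @ seen

# Crude row normalization stands in for SAR's similarity measures (Jaccard, lift, ...).
similarity = cooccurrence / cooccurrence.sum(axis=1, keepdims=True)

# Scores: each user's affinities propagated through item similarity.
scores = affinity @ similarity
print(np.argsort(-scores, axis=1))  # per-user item ranking, best first
```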

Several utilities are provided in reco_utils to help accelerate experimenting with and building recommendation systems. These utility functions are used to load datasets (i.e., via Pandas DataFrames in Python and via Spark DataFrames in PySpark) in the manner expected by different algorithms, evaluate different model outputs, split training data, and perform other common tasks. Reference implementations of several state-of-the-art algorithms are provided for self-study and customization in your own applications.
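
As a sketch of how these pieces fit together, the flow below loads MovieLens into a Pandas DataFrame and splits it for training and testing; the module paths are assumptions based on the repo layout and may differ between versions:

```python
# Module paths below are assumptions based on the reco_utils layout at the time
# of writing; check your checkout, as names may have moved between versions.
from reco_utils.dataset import movielens
from reco_utils.dataset.python_splitters import python_random_split
from reco_utils.evaluation.python_evaluation import precision_at_k

# Load MovieLens 100k as a Pandas DataFrame and split it 75/25 for train/test.
df = movielens.load_pandas_df(size="100k")
train, test = python_random_split(df, ratio=0.75)

# `top_k` would be a DataFrame of each user's top-k recommendations produced by
# a model trained on `train` (omitted here):
# precision = precision_at_k(test, top_k, k=10)
```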

## Environment Setup

See SETUP.md for instructions on how to set up your environment.

## Notebooks Overview

- The Quick-Start Notebooks detail how you can quickly get up and running with state-of-the-art algorithms such as the SAR algorithm.
- The Data Notebooks detail how to prepare and split data properly for recommendation systems.
- The Modeling Notebooks deep dive into implementations of different recommender algorithms.
- The Evaluate Notebooks discuss how to evaluate recommender algorithms with different ranking and rating metrics.
- The Operationalize Notebooks discuss how to deploy models in production systems.

| Notebook | Description |
| --- | --- |
| als_pyspark_movielens | Utilizing the ALS algorithm to predict movie ratings in a PySpark environment. |
| sar_python_cpu_movielens | Utilizing the SAR algorithm to generate movie recommendations in a Python + CPU environment. |
| sar_pyspark_movielens | Utilizing the SAR algorithm to generate movie recommendations in a PySpark environment. |
| sarplus_movielens | Utilizing the SAR+ algorithm to generate movie recommendations in a PySpark environment. |
| data_split | Details on splitting data (randomly, chronologically, etc.). |
| als_deep_dive | Deep dive on the ALS algorithm and implementation. |
| sar_deep_dive | Deep dive on the SAR algorithm and implementation. |
| evaluation | Examples of different rating and ranking metrics in Python + CPU and PySpark environments. |
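
As an example of what these notebooks cover, the core of the als_pyspark_movielens workflow is Spark MLlib's ALS estimator. The sketch below assumes `train` and `test` Spark DataFrames with the column names shown; the hyperparameter values are illustrative, not the notebook's exact settings:

```python
from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator

# `train` and `test` are Spark DataFrames with userId/movieId/rating columns;
# the column names are placeholders for whatever your data uses.
als = ALS(
    userCol="userId",
    itemCol="movieId",
    ratingCol="rating",
    rank=10,
    maxIter=15,
    regParam=0.05,
    coldStartStrategy="drop",  # drop NaN predictions for unseen users/items
)
model = als.fit(train)
predictions = model.transform(test)

rmse = RegressionEvaluator(
    metricName="rmse", labelCol="rating", predictionCol="prediction"
).evaluate(predictions)
print(f"RMSE: {rmse:.3f}")
```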

## Benchmarks

Here we benchmark all the algorithms available in this repository.

NOTES:

- Training and testing times are measured in seconds.
- Ranking metrics (i.e., precision, recall, MAP, and NDCG) are evaluated with k equal to 10.
- Rating metrics (i.e., RMSE, MAE, explained variance, and R squared) are not applied to the SAR-family algorithms (SAR PySpark, SAR+, and SAR CPU), because these algorithms do not predict explicit ratings on the same scale as those in the original input data.
- The machine we used is an Azure DSVM Standard NC6s_v2 with 6 vCPUs, 112 GB of memory, and 1 K80 GPU.

| Dataset | Algorithm | Training time (s) | Testing time (s) | Precision | Recall | MAP | NDCG | RMSE | MAE | Exp Var | R squared |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Movielens 100k | ALS | 5.730 | 0.326 | 0.096 | 0.079 | 0.026 | 0.100 | 1.110 | 0.860 | 0.025 | 0.023 |
| Movielens 100k | SAR PySpark | 0.838 | 9.560 | 0.327 | 0.179 | 0.110 | 0.379 | | | | |
| Movielens 100k | SAR+ | 7.660 | 16.700 | 0.327 | 0.176 | 0.106 | 0.373 | | | | |
| Movielens 100k | SAR CPU | 0.679 | 0.116 | 0.327 | 0.176 | 0.106 | 0.373 | | | | |
| Movielens 1M | ALS | 18.000 | 0.339 | 0.120 | 0.062 | 0.022 | 0.119 | 0.950 | 0.735 | 0.280 | 0.280 |
| Movielens 1M | SAR PySpark | 9.230 | 38.300 | 0.278 | 0.108 | 0.064 | 0.309 | | | | |
| Movielens 1M | SAR+ | 38.000 | 108.000 | 0.278 | 0.108 | 0.064 | 0.309 | | | | |
| Movielens 1M | SAR CPU | 5.830 | 0.586 | 0.277 | 0.109 | 0.064 | 0.308 | | | | |
| Movielens 10M | ALS | 92.000 | 0.169 | 0.090 | 0.057 | 0.015 | 0.084 | 0.850 | 0.647 | 0.359 | 0.359 |
| Movielens 10M | SAR PySpark | | | | | | | | | | |
| Movielens 10M | SAR+ | 170.000 | 80.000 | 0.256 | 0.129 | 0.081 | 0.295 | | | | |
| Movielens 10M | SAR CPU | 111.000 | 12.600 | 0.276 | 0.156 | 0.101 | 0.321 | | | | |
| Movielens 20M | ALS | 142.000 | 0.345 | 0.081 | 0.052 | 0.014 | 0.076 | 0.830 | 0.633 | 0.372 | 0.371 |
| Movielens 20M | SAR PySpark | | | | | | | | | | |
| Movielens 20M | SAR+ | 400.000 | 221.000 | 0.203 | 0.071 | 0.041 | 0.226 | | | | |
| Movielens 20M | SAR CPU | 559.000 | 47.300 | 0.247 | 0.135 | 0.085 | 0.287 | | | | |
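
For reference, the ranking metrics above follow their standard definitions with k = 10. A minimal pure-Python version for a single user is sketched below; the repo's evaluators handle the DataFrame plumbing and the aggregation across users:

```python
import math

def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(item in relevant for item in recommended[:k]) / k

def ndcg_at_k(recommended, relevant, k=10):
    """DCG of the top-k list divided by the DCG of an ideal ordering."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0

# Example: 3 of the 10 recommended items are relevant.
recs = [7, 3, 9, 1, 5, 2, 8, 4, 6, 0]
rel = {3, 5, 6}
print(precision_at_k(recs, rel))          # 0.3
print(round(ndcg_at_k(recs, rel), 3))     # ~0.619
```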

## Contributing

This project welcomes contributions and suggestions. Before contributing, please see our contribution guidelines.