# Data Science Accelerator - *Spark based movie recommender*

## Overview

This accelerator illustrates how to efficiently build a movie recommendation system within 30 minutes.

The repository contains three parts:

- **Data** Schemas and references to the sample data used in the accelerator.
- **Code** Code for training and scoring a movie recommender.
- **Docs** Documents that help build a recommender with Azure Machine Learning Service.

## Business domain

Recommendation (e-commerce, entertainment, retail, etc.).

## Data science problem

The problem a recommendation system tries to solve is:

**Given historical observations of user preferences (i.e., ratings) on a set of items, how to predict and generate the set of items that the users will most probably like.**

## Data understanding

Typically, data in a recommendation system has a schema of

|user|item|rating|[timestamp]|
|---|---|---|---|

where user, item, and rating refer to the user ID, the item ID, and the rating given by a user to an item.

## Modeling

The recommender is built with Spark's built-in collaborative filtering algorithm, a regularized matrix factorization model that is fit with the alternating least squares (ALS) technique.

## Solution architecture

The whole recommendation solution consists of Azure services such as Azure Data Science Virtual Machine, Azure blob storage, Azure Container Registry, Azure Container Services, etc. The building process is completed with Azure Machine Learning Service.
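
## Illustrative sketch of alternating least squares

To make the modeling idea concrete, the following is a minimal, pure-Python sketch of matrix factorization fit by alternating least squares on `(user, item, rating)` tuples. It is not the code in this repository and not Spark's implementation (in PySpark the built-in estimator is `pyspark.ml.recommendation.ALS`). The toy ratings, the function names (`solve`, `update`, `train_als`, `predict`, `recommend`), and the hyperparameters `rank`, `reg`, and `iterations` are all assumptions chosen for illustration, and the sketch uses a plain L2 penalty.

```python
import random


def solve(a, b):
    """Solve the small dense system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def update(fixed, observed, rank, reg):
    """Recompute one side's factors while the other side is held fixed.

    For each key this solves (sum f f^T + reg * I) x = sum rating * f,
    which is a ridge regression on that key's observed ratings.
    """
    updated = {}
    for key, pairs in observed.items():  # pairs: [(other_id, rating), ...]
        a = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for other, rating in pairs:
            f = fixed[other]
            for i in range(rank):
                b[i] += rating * f[i]
                for j in range(rank):
                    a[i][j] += f[i] * f[j]
        updated[key] = solve(a, b)
    return updated


def train_als(ratings, rank=2, reg=0.1, iterations=20, seed=0):
    """Alternate between solving for user factors and item factors."""
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for user, item, rating in ratings:
        by_user.setdefault(user, []).append((item, rating))
        by_item.setdefault(item, []).append((user, rating))
    item_f = {i: [rng.random() for _ in range(rank)] for i in by_item}
    user_f = {}
    for _ in range(iterations):
        user_f = update(item_f, by_user, rank, reg)  # items fixed
        item_f = update(user_f, by_item, rank, reg)  # users fixed
    return user_f, item_f


def predict(user_f, item_f, user, item):
    """Predicted rating is the dot product of the two factor vectors."""
    return sum(p * q for p, q in zip(user_f[user], item_f[item]))


def recommend(ratings, user_f, item_f, user, top_n=2):
    """Rank the items the user has not rated by predicted rating."""
    seen = {i for u, i, _ in ratings if u == user}
    candidates = [i for i in item_f if i not in seen]
    candidates.sort(key=lambda i: predict(user_f, item_f, user, i), reverse=True)
    return candidates[:top_n]


# Hypothetical toy data in the user | item | rating schema (timestamp omitted).
ratings = [
    (1, 10, 5.0), (1, 11, 4.0), (1, 12, 1.0),
    (2, 10, 4.0), (2, 11, 5.0), (2, 13, 1.0),
    (3, 12, 5.0), (3, 13, 4.0), (3, 10, 1.0),
    (4, 12, 4.0), (4, 13, 5.0), (4, 11, 1.0),
]
user_f, item_f = train_als(ratings)
```

Calling `recommend(ratings, user_f, item_f, user=1)` returns the items user 1 has not yet rated, ordered by predicted rating. Each half-step is an exact ridge regression with the other side held fixed, which is why the method alternates rather than optimizing both factor sets at once.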