Time Series Forecasting Best Practices & Examples

Forecasting Best Practices

Time series forecasting is one of the most important topics in data science. Almost every business needs to predict the future in order to make better decisions and allocate resources more effectively.

This repository provides examples and best practice guidelines for building forecasting solutions. The goal of this repository is to build a comprehensive set of tools and examples that leverage recent advances in forecasting algorithms to build solutions and operationalize them. Rather than creating implementations from scratch, we draw from existing state-of-the-art libraries and build additional utilities around processing and featurizing the data, optimizing and evaluating models, and scaling up to the cloud.

The examples and best practices are provided as Python Jupyter notebooks and R Markdown files, together with a library of utility functions. We hope that these examples and utilities can significantly reduce the “time to market” by simplifying the path from defining the business problem to developing a solution. In addition, the example notebooks serve as guidelines and showcase best practices and usage of the tools in Python and R.

Content

The following is a summary of the examples related to the process of building forecasting solutions covered in this repository. The examples are organized according to use cases. Currently, we focus on a retail sales forecasting use case as it is widely used in assortment planning, inventory optimization, and price optimization.

| Example | Models/Methods | Description | Language |
|---|---|---|---|
| Quick Start | Auto ARIMA, Azure AutoML, Linear Regression, LightGBM | Quick start notebooks that demonstrate the workflow of developing a forecast model using one-round training and testing data | Python |
| Data Exploration and Preparation | Statistical Analysis and Data Transformation | Data exploration and preparation examples | Python, R |
| Model Training and Evaluation | Auto ARIMA, LightGBM, Dilated CNN | Deep dive notebooks that perform multi-round training and testing of various classical and deep learning forecast algorithms | Python |
| Model Tuning and Deployment | HyperDrive, LightGBM | Example notebook for model tuning using Azure Machine Learning Service and deploying the best model on Azure | Python |
| R Models | Mean Forecast, ARIMA, ETS, Prophet | Popular statistical forecast models and the Prophet model implemented in R | R |
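
The difference between one-round and multi-round training and testing mentioned above can be illustrated with a small sketch. This is not the repository's own implementation: the function names (`rolling_origin_splits`, `mean_forecast`, `mape`), the sample sales figures, and the choice of MAPE as the error metric are all assumptions made for illustration. Each round trains on a growing window of history and tests on the next `horizon` observations. Setting `n_rounds=1` gives the one-round case.

```python
from statistics import mean


def rolling_origin_splits(n_obs, n_rounds, horizon):
    """Yield (train_indices, test_indices) for each evaluation round.

    The training window grows by `horizon` observations each round, and the
    test window is always the `horizon` observations that follow it.
    """
    first_train_end = n_obs - n_rounds * horizon
    if first_train_end <= 0:
        raise ValueError("series is too short for the requested rounds")
    for r in range(n_rounds):
        train_end = first_train_end + r * horizon
        yield range(0, train_end), range(train_end, train_end + horizon)


def mean_forecast(train_values, horizon):
    """Baseline model: predict the training mean for every future step."""
    return [mean(train_values)] * horizon


def mape(actuals, forecasts):
    """Mean absolute percentage error in percent (actuals must be non-zero)."""
    return 100 * mean(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts))


# Hypothetical weekly sales for a single store and product.
sales = [100, 110, 105, 120, 115, 130, 125, 140, 135, 150, 145, 160]

round_errors = []
for train_idx, test_idx in rolling_origin_splits(len(sales), n_rounds=3, horizon=2):
    train = [sales[i] for i in train_idx]
    test = [sales[i] for i in test_idx]
    forecast = mean_forecast(train, horizon=len(test))
    round_errors.append(mape(test, forecast))

# One error value per round; averaging them gives a more stable estimate
# than a single train/test split.
print([round(e, 1) for e in round_errors])
```

Any model with a fit-then-predict interface could replace `mean_forecast` in the loop. The mean forecast is used here only because it is the simplest of the methods listed.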

Getting Started in Python

To quickly get started with the repository on your local machine, follow these steps.

  1. Install Anaconda with Python >= 3.6. Miniconda is a quick way to get started.

  2. Clone the repository

    git clone https://github.com/microsoft/forecasting
    cd forecasting/
    
  3. Run the setup script to create the conda environment. From the root of the Forecasting repo, execute the command that matches your operating system.

    • Linux
    ./tools/environment_setup.sh
    
    • Windows
    tools\environment_setup.bat
    

    Note that on Windows you need to run the batch script from the Anaconda Prompt. The script creates a conda environment named forecasting_env and installs the forecasting utility library fclib.

  4. Start the Jupyter notebook server

    jupyter notebook
    
  5. Run the LightGBM single-round notebook under the 00_quick_start folder. Make sure that the selected Jupyter kernel is forecasting_env.
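
Before opening the notebook, it can help to confirm that the interpreter behind the selected kernel is the one the setup script created. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper, and it assumes forecasting_env is a standard named conda environment, so that the directory name of `sys.prefix` matches the environment name.

```python
import importlib.util
import os
import sys

EXPECTED_ENV = "forecasting_env"  # environment created by the setup script
EXPECTED_PACKAGE = "fclib"        # utility library installed by the setup script


def check_environment(env_name=EXPECTED_ENV, package=EXPECTED_PACKAGE):
    """Return a dict describing whether the current interpreter looks set up."""
    # Assumption: for a named conda environment, sys.prefix ends in the
    # environment's name (for example .../envs/forecasting_env).
    active_env = os.path.basename(os.path.normpath(sys.prefix))
    return {
        "python_ok": sys.version_info >= (3, 6),
        "active_env": active_env,
        "env_ok": active_env == env_name,
        "package_ok": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Running this in a notebook cell or from the command line prints one line per check. If `env_ok` or `package_ok` is `False`, the kernel is probably not forecasting_env, or the setup script did not finish.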

If you have any issues with the above setup, or want more detailed instructions on how to set up your environment and run the examples on a local or remote machine, please see the Setup Guide.

Getting Started in R

We assume you already have R installed on your machine. If not, simply follow the instructions on CRAN to download and install R.

The recommended editor is RStudio, which supports interactive editing and previewing of R notebooks. However, you can use any editor or IDE that supports RMarkdown. In particular, Visual Studio Code with the R extension can be used to edit and render the notebook files. The rendered .nb.html files can be viewed in any modern web browser.

The examples use the Tidyverts family of packages, a modern framework for time series analysis that builds on the widely used Tidyverse family. The Tidyverts framework is still under active development, so it's recommended that you update your packages regularly to get the latest bug fixes and features.

Target Audience

This repository targets data scientists and machine learning engineers with varying levels of forecasting knowledge. The content is source-only and aimed at custom machine learning modelling. The utilities and examples provided are intended to be solution accelerators for real-world forecasting problems.

Contributing

We hope that the open source community will contribute to the content and bring in the latest SOTA algorithms. This project welcomes contributions and suggestions. Before contributing, please see our Contributing Guide.

Reference

The following is a list of related repositories that you may find helpful.

  • Deep Learning for Time Series Forecasting: A collection of examples for using deep neural networks for time series forecasting with Keras.
  • Demand Forecasting and Price Optimization Solution: A Cortana Intelligence solution how-to guide for demand forecasting and price optimization.

Build Status

| Build | Branch | Status |
|---|---|---|
| Linux CPU | master | (build status badge) |
| Linux CPU | staging | (build status badge) |