Repository for Microsoft Team Data Science Process containing documents and scripts
Debraj GuhaThakurta bf98d5ee96 Update README.md 2017-08-14 22:58:17 -07:00
Docs Update lifecycle-detail.md 2017-08-14 22:57:24 -07:00
README.md Update README.md 2017-08-14 22:58:17 -07:00
TDSP in a nutshell.pdf Datasheet 2016-11-05 14:08:04 -07:00

README.md

Team Data Science Process from Microsoft

Overview | Lifecycle | Team Structure | Project Template | Project Execution | Data Science Tools | TDSP with Azure ML


This repository contains the Team Data Science Process (TDSP) from Microsoft. TDSP is an agile, iterative data science process for executing and delivering advanced analytics solutions. It is designed to improve the collaboration and efficiency of data science teams in enterprise organizations. It is supported through four key components:

  • a data science lifecycle definition
  • a standardized project structure (project documentation and reporting templates)
  • infrastructure for project execution (compute and storage infrastructure, code repositories, etc.)
  • tools for data science project tasks (version control, data exploration and modeling, work planning, etc.)
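
As a rough illustration of what a standardized project structure can mean in practice, the Python sketch below scaffolds a small project skeleton. The folder and file names (`docs`, `data`, `code`, `project_charter.md`, `final_report.md`) and the `scaffold` helper are hypothetical placeholders chosen for this example; they are not the official TDSP project template.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical layout for illustration only; NOT the official TDSP template.
LAYOUT: Dict[str, List[str]] = {
    "docs": ["project_charter.md", "final_report.md"],  # documentation and reporting
    "data": [],                                          # raw and processed data
    "code": [],                                          # exploration and modeling code
}


def scaffold(root: Path, layout: Dict[str, List[str]] = LAYOUT) -> List[Path]:
    """Create the folders and empty placeholder files; return every path created."""
    created: List[Path] = []
    for folder, files in layout.items():
        folder_path = root / folder
        folder_path.mkdir(parents=True, exist_ok=True)
        created.append(folder_path)
        for name in files:
            file_path = folder_path / name
            file_path.touch(exist_ok=True)  # empty placeholder document
            created.append(file_path)
    return created
```

Calling `scaffold(Path("my_project"))` would create the same skeleton for every new project, so each team member finds documents, data, and code in the same place.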

TDSP also provides guidelines on how to structure collaborative teams and tasks, and on how to execute data science projects using Agile planning and version control.

To perform certain stages of a data science project efficiently, TDSP also provides data exploration and (semi)automated modeling tools in R and Python.

TDSP resources on Azure

We provide documentation, end-to-end data science walkthroughs, and templates for TDSP lifecycle stages using different platforms and tools on Azure, such as Azure ML, HDInsight, Microsoft R Server, SQL Server, Azure Data Lake, etc.

TDSP with Azure ML

Contributing to TDSP

We believe that, with the help of the data science community, we can make TDSP even better and help more enterprises and individual data scientists work more efficiently. We welcome contributions to TDSP, whether to the documentation, to the workflow, or to implementations of TDSP on different tools for versioning or work-item management. Feel free to contribute pages at TDSP/wiki.

If you have useful data science tools and utilities to share, we encourage you to contribute to the TDSP-Utilities GitHub repository.

Release Notes

This is version 0.1.2 of TDSP. Version 0.1.1 was released in September 2016. We are continuously improving TDSP based on our accumulated experience and customer feedback.

Questions or suggestions

If you have any questions or suggestions, please create a new discussion thread on the Issues tab.


[Figure: TDSP lifecycle]


Last updated: Aug 15, 2017