- [LightGBM](https://github.com/microsoft/LightGBM) - A fast, distributed, high performance gradient boosting framework
- [Explainable Boosting Machines](https://github.com/interpretml/interpret) - interpretable model developed at Microsoft Research that uses bagging, gradient boosting, and automatic interaction detection to estimate generalized additive models.
### AutoML
- [Neural Network Intelligence](https://github.com/microsoft/nni) - An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
- [Archai](https://github.com/microsoft/archai) - Reproducible Rapid Research for Neural Architecture Search (NAS).
- [FLAML](https://github.com/microsoft/FLAML) - A fast and lightweight AutoML library.
- [Azure Automated Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/concept-automated-ml) - Automated Machine Learning for Tabular data (regression, classification and forecasting) by Azure Machine Learning
### Neural Network
- [bayesianize](https://github.com/microsoft/bayesianize) - A Bayesian neural network wrapper in PyTorch.
- [O-CNN](https://github.com/microsoft/O-CNN) - Octree-based convolutional neural networks for 3D shape analysis.
- [ResNet](https://github.com/KaimingHe/deep-residual-networks) - deep residual network.
- [CNTK](https://github.com/microsoft/CNTK) - Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit.
- [Microsoft Vision Model ResNet50](https://www.microsoft.com/en-us/research/blog/microsoft-vision-model-resnet-50-combines-web-scale-data-and-multi-task-learning-to-achieve-state-of-the-art/) - a large pretrained ResNet-50 vision model trained on a search engine's web-scale image data.
- [Oscar](https://github.com/microsoft/Oscar) - Object-Semantics Aligned Pre-training for Vision-Language Tasks.
### Time Series
- [luminol](https://github.com/linkedin/luminol) - anomaly detection and correlation library.
### NLP
- [T-ULRv2](https://www.microsoft.com/en-us/research/blog/microsoft-turing-universal-language-representation-model-t-ulrv2-tops-xtreme-leaderboard/) - Turing multilingual language model.
- [Turing-NLG](https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/) - Turing Natural Language Generation, 17 billion-parameter language model.
- [DeBERTa](https://github.com/microsoft/DeBERTa) - Decoding-enhanced BERT with Disentangled Attention
- [UniLM](https://github.com/microsoft/unilm) - Unified Language Model Pre-training / Pre-training for NLP and Beyond
- [Unicoder](https://github.com/microsoft/Unicoder) - Unicoder model for understanding and generation.
- [NeuronBlocks](https://github.com/microsoft/NeuronBlocks) - build your NLP DNN models like playing with Lego.
- [Multilingual Model Transfer](https://github.com/microsoft/Multilingual-Model-Transfer) - new deep learning models for bootstrapping language understanding models for languages with no labeled data using labeled data from other languages.
- [MT-DNN](https://github.com/microsoft/MT-DNN) - multi-task deep neural networks for natural language understanding.
- [OpenKP](https://github.com/microsoft/OpenKP) - automatically extracting keyphrases that are salient to the document meanings is an essential step in semantic document understanding.
- [DeText](https://github.com/linkedin/detext) - a deep neural text understanding framework for ranking and classification tasks.
### Interactive Machine Learning
- [Vowpal Wabbit](https://vowpalwabbit.org/) - fast, efficient, and flexible online machine learning techniques for reinforcement learning, supervised learning, and more.
### Recommendation
- [Recommenders](https://github.com/microsoft/recommenders) - examples and best practices for building recommendation systems (A2SVD, DKN, xDeepFM, LightGBM, LSTUR, NAML, NPA, NRMS, RLRMC, SAR, and Vowpal Wabbit were invented or contributed by Microsoft).
- [GDMIX](https://github.com/linkedin/gdmix) - A deep ranking personalization framework
### Distributed
- [DeepSpeed](https://github.com/microsoft/DeepSpeed) - DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
- [MMLSpark](https://github.com/Azure/mmlspark) - machine learning library on Spark.
- [photon-ml](https://github.com/linkedin/photon-ml) - a scalable machine learning library on Apache Spark.
- [TonY](https://github.com/linkedin/TonY) - framework to natively run deep learning frameworks on Apache Hadoop.
### Causal Inference
- [EconML](https://github.com/microsoft/EconML) - Python package for estimating heterogeneous treatment effects from observational data via machine learning.
- [DoWhy](https://github.com/microsoft/dowhy) - Python library for causal inference that supports explicit modeling and testing of causal assumptions.
### Responsible AI
- [InterpretML](https://interpret.ml/) - a toolkit to help understand models and enable responsible machine learning.
- [Interpret Community](https://github.com/interpretml/interpret-community) - extends interpret repo with additional interpretability techniques and utility functions.
- [DiCE](https://github.com/interpretml/DiCE) - diverse counterfactual explanations.
- [Interpret-Text](https://github.com/interpretml/interpret-text) - state-of-the-art explainers for text-based ML models, with a visualization dashboard.
- [fairlearn](https://github.com/fairlearn/fairlearn) - python package to assess and improve fairness of machine learning models.
- [RobustDG](https://github.com/microsoft/robustdg) - Toolkit for building machine learning models that generalize to unseen domains and are robust to privacy and other attacks.
- [SHAP](https://github.com/slundberg/shap) - a game theoretic approach to explain the output of any machine learning model (Scott Lundberg, Microsoft Research).
- [LIME](https://github.com/marcotcr/lime) - explaining the predictions of any machine learning classifier (Marco Tulio Ribeiro, Microsoft Research).
- [BackwardCompatibilityML](https://github.com/microsoft/BackwardCompatibilityML) - Project for open sourcing research efforts on Backward Compatibility in Machine Learning
- [confidential-ml-utils](https://github.com/Azure/confidential-ml-utils) - Python utilities for training and deploying ML models against data you can't see.
- [presidio](https://github.com/Microsoft/presidio) - context aware, pluggable and customizable data protection and anonymization service for text and images.
- [Confidential ONNX Inference Server](https://github.com/microsoft/onnx-server-openenclave) - An Open Enclave port of the ONNX inference server with data encryption and attestation capabilities to enable confidential inference on Azure Confidential Computing.
- [Responsible-AI-Widgets](https://github.com/microsoft/responsible-ai-widgets) - responsible AI user interfaces for Fairlearn, interpret-community, and Error Analysis, as well as foundational building blocks that they rely on.
- [Error Analysis](https://erroranalysis.ai/) - A toolkit to help analyze and improve model accuracy.
### Optimization
- [ONNXRuntime](https://github.com/microsoft/onnxruntime) - cross-platform, high performance ML inference and training accelerator.
- [ONNX Runtime Training Examples](https://github.com/microsoft/onnxruntime-training-examples) - examples for using ONNX Runtime for model training.
- [ONNX Converter](https://github.com/microsoft/onnxconverter-common) - common utilities for ONNX converters.
- [ONNX.js](https://github.com/microsoft/onnxjs) - run ONNX models using JavaScript.
- [ONNX.js Demo](https://github.com/microsoft/onnxjs-demo) - demos for ONNX.js.
- [Olive](https://github.com/microsoft/OLive) - a sequence of Docker images that automates the process of ONNX model shipping.
- [Hummingbird](https://github.com/microsoft/hummingbird) - compiles trained ML models into tensor computations for faster inference.
- [MMdnn](https://github.com/microsoft/MMdnn) - a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization.
- [infinibatch](https://github.com/microsoft/infinibatch) - Efficient, check-pointed data loading for deep learning with massive data sets.
- [InferenceSchema](https://github.com/Azure/InferenceSchema) - Schema decoration for inference code
- [nnfusion](https://github.com/microsoft/nnfusion) - flexible and efficient deep neural network compiler.
### Reinforcement Learning
- [AirSim](https://github.com/microsoft/AirSim) - open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft Research.
- [TextWorld](https://github.com/microsoft/TextWorld) - TextWorld is a sandbox learning environment for the training and evaluation of reinforcement learning (RL) agents on text-based games.
- [Moab](https://github.com/microsoft/moab) - Project Moab, a new open-source balancing robot to help engineers and developers learn how to build real-world autonomous control systems with Project Bonsai.
- [Training Data-Driven or Surrogate Simulators](https://github.com/microsoft/datadrivenmodel) - build simulators from data for use in RL and the Bonsai platform for machine teaching.
- [Bonsai Python SDK](https://github.com/BonsaiAI/bonsai3-py) - A Python library for integrating data sources with Bonsai BRAIN.
### Windows
- [Windows Machine Learning](https://github.com/microsoft/Windows-Machine-Learning) - Machine Learning on Windows.
### Datasets
- [COCO Dataset](https://cocodataset.org/#home) - COCO is a large-scale object detection, segmentation, and captioning dataset.
- [MS MARCO](https://microsoft.github.io/msmarco/) - collection of datasets focused on deep learning in search.
- [InnerEye CreateDataset](https://github.com/microsoft/InnerEye-CreateDataset) - InnerEye dataset creation tool for the InnerEye-DeepLearning library. Transforms DICOM data into masks for training deep learning models.
- [Sepsis Cohort from MIMIC III](https://github.com/microsoft/mimic_sepsis) - Sepsis cohort from MIMIC dataset.
### Debug
- [tensorwatch](https://github.com/microsoft/tensorwatch) - debugging, monitoring and visualization for Python machine learning and data science.
- [Pyright](https://github.com/microsoft/pyright) - static type checker for Python.
- [Bench ML](https://github.com/microsoft/benchml) - Python library to benchmark popular pre-built cloud AI APIs.
- [debugpy](https://github.com/microsoft/debugpy) - An implementation of the Debug Adapter Protocol for Python
- [kineto](https://github.com/pytorch/kineto) - A CPU+GPU profiling library that provides access to timeline traces and hardware performance counters, with contributions from the Azure AI Platform team.
### Pipeline
- [GitHub Actions](https://github.com/features/actions) - Automate all your software workflows, now with world-class CI/CD. Build, test, and deploy your code right from GitHub.
- [Azure Pipelines](https://azure.microsoft.com/en-us/services/devops/pipelines/) - Automate your builds and deployments with Pipelines so you spend less time with the nuts and bolts and more time being creative.
- [Dagli](https://github.com/linkedin/dagli) - framework for defining machine learning models, including feature generation and transformations, as a DAG.
### Platform
- [AI for Earth API Platform](https://github.com/microsoft/AIforEarth-API-Platform) - distributed infrastructure that provides secure, scalable, and customizable API hosting, designed to handle the needs of long-running/asynchronous machine learning model inference.
- [HiveDScheduler](https://github.com/microsoft/hivedscheduler) - Kubernetes Scheduler for Deep Learning.
- [Open Platform for AI (OpenPAI)](https://github.com/Microsoft/pai) - resource scheduling and cluster management for AI.
- [OpenPAI Runtime](https://github.com/microsoft/openpai-runtime) - Runtime for deep learning workloads.
- [MLOS](https://github.com/microsoft/MLOS) - Data Science powered infrastructure and methodology to democratize and automate Performance Engineering.
- [Platform for Situated Intelligence](https://github.com/Microsoft/psi) - an open-source framework for multimodal, integrative AI.
- [Qlib](https://github.com/microsoft/qlib) - an AI-oriented quantitative investment platform.
### Tagging
- [TagAnomaly](https://github.com/microsoft/TagAnomaly) - Anomaly detection analysis and labeling tool, specifically for multiple time series (one time series per category)
- [Visual Studio Code](https://github.com/microsoft/vscode) - Code editor redefined and optimized for building and debugging modern web and cloud applications.
### Sample Code
- [Microsoft AI for Earth](https://www.microsoft.com/en-us/ai/ai-for-earth-tech-resources)
- [acoustic-bird-detection](https://github.com/microsoft/acoustic-bird-detection) - Tutorial: Accurate Bioacoustic Species Detection from Small Numbers of Training Clips Using the Biophony Model
- [belugasounds](https://github.com/microsoft/belugasounds) - Using machine learning to detect beluga whale calls in hydrophone recordings.
- [arcticseals](https://github.com/microsoft/arcticseals) - detect & classify arctic seals in aerial imagery to understand how they’re adapting to a changing world.
- [Camera Trap Tool](https://github.com/microsoft/CameraTraps) - tools for training and running detectors and classifiers for wildlife images collected from motion-triggered cameras.
- [News Threads](https://github.com/microsoft/News-Threads) - The News Threads project analyzes news articles to help find similarities between news articles and trace news provenance across time.
- [InnerEye DeepLearning](https://github.com/microsoft/InnerEye-DeepLearning) - Medical Imaging Deep Learning library to train and deploy models on Azure Machine Learning and Azure Stack
- [Deep Seismic](https://github.com/microsoft/seismic-deeplearning) - Deep Learning for Seismic Imaging and Interpretation
- [Multi-species bioacoustic classification](https://github.com/microsoft/Multi_Species_Bioacoustic_Classification) - Multi-species bioacoustic classification using deep learning algorithms.
- [Nestle Acne Assessment](https://github.com/microsoft/nestle-acne-assessment) - deep learning models to assess acne severity level based on selfie images.
- [Visual Analogies](https://github.com/microsoft/art) - exploring the connections between artworks with deep "Visual Analogies".
- [Forecasting Best Practices](https://github.com/microsoft/forecasting) - time series forecasting best practices & examples.
- [Computer Vision Recipes](https://github.com/microsoft/computervision-recipes) - best practices, code samples, and documentation for Computer Vision.
### Workshop
:runner: coming soon
### Book
- [PRML](https://www.microsoft.com/en-us/research/people/cmbishop/prml-book/) - Pattern Recognition and Machine Learning by Christopher Bishop
- [Foundations of Data Science](https://www.cs.cornell.edu/jeh/book.pdf) - Basic Theory for Data Science.
- [Mastering Azure Machine Learning: Perform large-scale end-to-end advanced machine learning in the cloud with Microsoft Azure Machine Learning](https://www.amazon.com/dp/1789807557/ref=cm_sw_r_tw_dp_SS0HC0EA2GXPK4QHF087)
- [Practical Automated Machine Learning on Azure: Using Azure Machine Learning to Quickly Build AI Solutions](https://www.amazon.com/dp/149205559X/ref=cm_sw_r_tw_dp_EN0VKVBDE5MZ1VK39K3F)
## Learning
- [Microsoft Learn](https://docs.microsoft.com/en-us/learn/) - learning content for Microsoft technologies.
- [Data Science for Managers](https://github.com/microsoft/datascience4managers) - Generalization, Utility, and Experimentation: ML Concepts for Making Better Business Decisions
- [GitHub Learning Lab](https://lab.github.com/) - learning content for GitHub technologies.
### Blog, News & Webinar
- [channel9 - AI Show](https://channel9.msdn.com/Shows/AI-Show) - videos for developers from people building Microsoft products and services.
- [Microsoft Open Source Blog](https://cloudblogs.microsoft.com/opensource/) - blog about Microsoft open source technology.
- [Microsoft Research Event, Conference & Webinars](https://www.microsoft.com/en-us/research/webinar/) - Events, Conferences & Webinars by Microsoft Research.
<br>
---
## Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
## Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).
If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://docs.microsoft.com/en-us/previous-versions/tn-archive/cc751383(v=technet.10)), please report it to us as described below.
## Reporting Security Issues
**Please do not report security vulnerabilities through public GitHub issues.**
Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://msrc.microsoft.com/create-report).
If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc).
You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc).
Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
* Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
* Full paths of source file(s) related to the manifestation of the issue
* The location of the affected source code (tag/branch/commit or direct URL)
* Any special configuration required to reproduce the issue
* Step-by-step instructions to reproduce the issue
* Proof-of-concept or exploit code (if possible)
* Impact of the issue, including how an attacker might exploit the issue
This information will help us triage your report more quickly.
If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://microsoft.com/msrc/bounty) page for more details about our active programs.
## Preferred Languages
We prefer all communications to be in English.
## Policy
Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://www.microsoft.com/en-us/msrc/cvd).