DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
machine-learning
deep-learning
pytorch
gpu
compression
billion-parameters
data-parallelism
inference
mixture-of-experts
model-parallelism
pipeline-parallelism
trillion-parameters
zero
Updated 2024-11-20 04:04:47 +03:00
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyperparameter tuning.
machine-learning
deep-learning
data-science
python
tensorflow
mlops
pytorch
hyperparameter-optimization
hyperparameter-tuning
machine-learning-algorithms
model-compression
nas
neural-architecture-search
neural-network
automated-machine-learning
automl
bayesian-optimization
deep-neural-network
distributed
feature-engineering
Updated 2024-07-03 13:54:08 +03:00
Distributed AI/HPC Monitoring Framework
Updated 2024-06-07 19:24:53 +03:00
Federated Learning Utilities and Tools for Experimentation
machine-learning
pytorch
simulation
gloo
nccl
personalization
privacy-tools
transformers-models
distributed-learning
federated-learning
Updated 2024-01-11 22:20:09 +03:00
Azure Monitor Distro for OpenTelemetry Python
Updated 2023-12-28 22:02:27 +03:00
An IR for efficiently simulating distributed ML computation.
Updated 2022-01-21 01:50:37 +03:00
Machine learning metrics for distributed, scalable PyTorch applications.
Updated 2021-09-14 14:47:59 +03:00
👩‍🔬 Train and Serve TensorFlow Models at Scale with Kubernetes and Kubeflow on Azure
machine-learning
docker
kubernetes
tensorflow
jupyter-notebook
distributed-tensorflow
jupyterhub
kubeflow
tensorflow-serving
Updated 2020-11-13 20:49:23 +03:00
Distributed Deep Learning using AzureML
Updated 2019-11-19 05:28:26 +03:00
A PyTorch extension: tools for easy mixed-precision and distributed training in PyTorch
Updated 2019-09-10 03:57:02 +03:00
Sample code showing how to perform distributed training of a Fizyr Keras-RetinaNet model on the COCO dataset using Horovod on Azure Batch AI
Updated 2018-08-10 20:56:45 +03:00
Distributed Task Queue (development branch)
Updated 2016-07-12 04:06:18 +03:00