Tutorial on how to deploy deep learning models on a GPU-enabled Kubernetes cluster

deep-learning flask gpu kubernetes python tensorflow



This repo is no longer actively maintained. Please see the newer version, which uses Azure Machine Learning, here.

Authors: Mathew Salvaris and Fidan Boylu Uz

Deploy Deep Learning CNN on Kubernetes Cluster with GPUs

Overview

This repository contains tutorials, written as Jupyter notebooks, that give step-by-step instructions for deploying a pretrained deep learning model on a GPU-enabled Kubernetes cluster. The tutorials cover models from the following deep learning frameworks: TensorFlow, Keras (with the TensorFlow backend), and PyTorch.

For each framework, we go through the following steps:

Developing the model: we load the pretrained model and test it by scoring images
Developing the interface our Flask app will use to load and call the model
Building the Docker image with our Flask REST API and model
Testing our Docker image before deployment
Creating our Kubernetes cluster and deploying our application to it
Testing the deployed model
Testing the throughput of our model
Cleaning up resources
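As a rough illustration of the throughput-testing step, the sketch below times a batch of concurrent calls and reports requests per second. It is not the code used in the notebooks: `measure_throughput` and `fake_request` are hypothetical names, and `fake_request` only sleeps to simulate a round trip. In a real test it would be replaced by a function that POSTs an image to the deployed service.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_throughput(send_request, num_requests=100, concurrency=10):
    """Call send_request num_requests times across a thread pool.

    Returns (number of successful calls, requests per second).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: send_request(), range(num_requests)))
    elapsed = time.perf_counter() - start
    succeeded = sum(1 for ok in results if ok)
    return succeeded, num_requests / elapsed


def fake_request():
    """Hypothetical stand-in for an HTTP POST to the scoring endpoint."""
    time.sleep(0.01)  # simulate network plus scoring latency
    return True


if __name__ == "__main__":
    ok, rps = measure_throughput(fake_request, num_requests=50, concurrency=5)
    print(f"{ok} succeeded, {rps:.1f} requests/second")
```

Threads are used here because the work is I/O-bound waiting on a remote service; the concurrency level and request count are arbitrary and would need tuning for a real cluster.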

Design

The application we will develop is a simple image classification service, where we will submit an image and get back what class the image belongs to. The application flow for the deep learning model is as follows:

The client sends an HTTP POST request with the encoded image data.
The Flask app extracts the image from the request.
The image is then appropriately preprocessed and sent to the model for scoring.
The scoring result is then piped into a JSON object and returned to the client.
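The four steps above can be sketched in plain Python, independent of Flask and of any particular framework. This is a minimal sketch under several assumptions that do not come from the notebooks: the image is base64-encoded under a JSON field named `image`, the label list is made up, and `stub_model` is a placeholder for a real pretrained CNN. In the actual service, a Flask route would receive the request body and run logic along the lines of `handle_request`.

```python
import base64
import json

# Hypothetical labels; a real model would use its own class list.
LABELS = ["cat", "dog", "bird"]


def build_payload(image_bytes):
    """Step 1 (client): wrap base64-encoded image bytes in a JSON body."""
    return json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})


def extract_image(body):
    """Step 2: pull the raw image bytes back out of the request body."""
    return base64.b64decode(json.loads(body)["image"])


def preprocess(image_bytes):
    """Step 3a: scale byte values to [0, 1].

    A real app would decode, resize, and normalize the image instead.
    """
    return [b / 255.0 for b in image_bytes]


def stub_model(pixels):
    """Step 3b: placeholder scorer, one score per label.

    Each score is the sum of every len(LABELS)-th value, offset by label index.
    """
    return [sum(pixels[i::len(LABELS)]) for i in range(len(LABELS))]


def handle_request(body):
    """Steps 2-4: extract, preprocess, score, and return a JSON string."""
    pixels = preprocess(extract_image(body))
    scores = stub_model(pixels)
    best = max(range(len(scores)), key=scores.__getitem__)
    return json.dumps({"label": LABELS[best], "scores": scores})


if __name__ == "__main__":
    fake_image = bytes([10, 200, 30, 20, 250, 40])  # not a real image
    print(handle_request(build_payload(fake_image)))
```

Keeping extraction, preprocessing, and scoring as separate functions makes each stage testable without a running web server or a loaded model.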

If you already have a Docker image that you would like to deploy, you can skip the first four notebooks.

NOTE: The tutorial walks step by step through deploying a deep learning model on Azure. It does not cover enterprise best practices such as securing the endpoints or setting up remote logging.

Deploying with GPUs: For a detailed comparison of the deployments of various deep learning models, see the blog post here, which provides evidence that, at least in the scenarios tested, GPUs provide better throughput and stability at a lower cost.

Prerequisites

Linux (Ubuntu). The tutorial was developed on an Azure Linux Data Science Virtual Machine (DSVM).
Docker installed. NOTE: Even with Docker installed, you may need to configure it so that you can run docker commands without sudo; see "Manage Docker as a non-root user".
Docker Hub account
Port 9999 open: Jupyter notebook will use port 9999 so please ensure that it is open. For instructions on how to do that on Azure see here
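Some of these prerequisites can be checked from Python before starting. The sketch below is an illustration rather than part of the tutorial, and `check_prerequisites` is a hypothetical helper. It only inspects the local machine: it does not verify the Docker Hub account or whether port 9999 is reachable through the Azure network rules.

```python
import platform
import shutil
import subprocess


def check_prerequisites():
    """Return a dict mapping each local prerequisite to True or False."""
    checks = {
        "linux": platform.system() == "Linux",
        "docker_installed": shutil.which("docker") is not None,
        "docker_without_sudo": False,
    }
    if checks["docker_installed"]:
        try:
            # `docker info` talks to the daemon, so it fails when the current
            # user cannot access the Docker socket (or the daemon is stopped).
            result = subprocess.run(
                ["docker", "info"],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=30,
            )
            checks["docker_without_sudo"] = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            pass  # leave the check as False
    return checks


if __name__ == "__main__":
    for name, passed in check_prerequisites().items():
        print(f"{name}: {'ok' if passed else 'MISSING'}")
```

A failing `docker_without_sudo` check with Docker installed usually points to the non-root setup described in the note above, although a stopped daemon produces the same result.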

Setup

Clone the repo:

git clone <repo web URL>

Log in to Docker Hub with your account credentials:

docker login

Go to the framework folder you would like to run the notebooks for.
Create a conda environment:

conda env create -f environment.yml

Activate the environment:

source activate <environment name>

Run:

jupyter notebook

Start the first notebook and make sure the kernel corresponding to the above environment is selected.
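One way to confirm that the right kernel is selected is to run a small check in the first notebook cell. The sketch below is a suggestion rather than something the notebooks include, and `describe_kernel` is a hypothetical name. `sys.prefix` is the most direct signal, since it should point inside the conda environment created above; `CONDA_DEFAULT_ENV` is only set when the process was started from an activated conda environment, so it may be absent.

```python
import os
import sys


def describe_kernel():
    """Collect details that identify the Python environment of this kernel."""
    return {
        "python_executable": sys.executable,
        "env_prefix": sys.prefix,
        # May be None if the kernel was not launched from an activated env.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_kernel().items():
        print(f"{key}: {value}")
```

If `env_prefix` does not end with the name of the environment you created, switch kernels from the notebook's Kernel menu before continuing.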

Steps

After following the setup instructions above, run the Jupyter notebooks in order. The same basic steps are followed for each deep learning framework.

Cleaning up

To remove the conda environment you created, see here. The last Jupyter notebook in each folder also explains how to delete the Azure resources associated with this repo.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.