Best Practices, code samples, and documentation for Computer Vision.

artificial-intelligence azure computer-vision convolutional-neural-networks data-science deep-learning image-classification image-processing jupyter-notebook kubernetes machine-learning microsoft object-detection operationalization python similarity tutorial

Ubuntu ab151fcf72 adding classification notebook integration test script		2019-07-01 18:29:41 +00:00
.ci	updated test	2019-06-28 15:30:06 +00:00
.github	Added github templates (copied and slightly modified from recommender) (#210 )	2019-06-03 14:58:50 -04:00
classification	added integration tests for all notebooks	2019-07-01 18:22:42 +00:00
media	added figure source	2019-06-10 14:58:14 -04:00
similarity	added integration tests for all notebooks	2019-07-01 18:22:42 +00:00
tests	adding classification notebook integration test script	2019-07-01 18:29:41 +00:00
tools/repo_metrics	Updated name of cosmos database	2019-06-13 05:04:54 -04:00
utils_cv	added integration tests for all notebooks	2019-07-01 18:22:42 +00:00
.flake8	[#167 ] remove wildcard imports (#218 )	2019-06-06 19:13:40 -04:00
.gitignore	[#167 ] remove wildcard imports (#218 )	2019-06-06 19:13:40 -04:00
.pre-commit-config.yaml	Adding pre-commit package to yml, more rules to flake8 lint config and adding pre-commit config file	2019-03-13 14:21:15 -04:00
CONTRIBUTING.md	Minor doc change	2019-03-21 02:46:46 -04:00
LICENSE	Initial commit	2019-02-11 08:23:56 -08:00
README.md	Merge branch 'master' into staging	2019-06-27 19:47:16 +00:00
pyproject.toml	set black config to use 79 chars per line	2019-02-28 20:50:52 +00:00

README.md

Computer Vision

This repository provides implementations and best practice guidelines for building Computer Vision systems. All examples are given as Jupyter notebooks and use PyTorch as the deep learning library.

Build Status

Build Type	Branch	Branch
Linux GPU	master	staging
Linux CPU	master	staging
Windows GPU	master	staging
Windows CPU	master	staging

Overview

The goal of this repository is to help speed up development of Computer Vision applications. Rather than implementing custom approaches, the focus is on providing examples and links to existing state-of-the-art libraries. In addition, having worked in this space for many years, we aim to answer common questions, point out often observed pitfalls, and show how to use the cloud for deployment and training.

Currently, the main priority is image classification and, to a lesser extent, image segmentation. We are also actively working on providing a basic (but often sufficiently accurate) example for image similarity. Work on object detection is scheduled to start once image classification is completed. See the projects and milestones pages in this repository for more details.

Getting Started

Instructions on how to get started, as well as our example notebooks and discussions, are provided in the classification subfolder.

Note that for certain Computer Vision problems, ready-made or easily customizable solutions exist which do not require any custom coding or machine learning expertise. We strongly recommend evaluating whether these can sufficiently solve your problem. If they are not applicable, or their accuracy is not sufficient, then resorting to more complex and time-consuming custom approaches may be necessary.

The following Microsoft services offer simple solutions to address common Computer Vision tasks:

Cognitive Services provides pre-trained REST APIs that can be called for image classification, face recognition, OCR, video analytics, and much more. These APIs are easy to use and work out of the box (no training required), but customization is limited. See the demos available for each domain to get a feel for the functionality (e.g., computer vision, speech to text).
Custom Vision Service is a SaaS offering that trains and deploys a model as a REST API from a user-provided training set. All steps, from image upload and annotation to model deployment, can be performed using either the UI or a Python SDK. Training image classification or object detection models is supported and requires only minimal machine learning knowledge. The Custom Vision Service offers more flexibility than the pre-trained Cognitive Services APIs, but requires users to bring and annotate their own data.
Azure Machine Learning service (AzureML) is a more general machine learning service that helps users accelerate the training and deployment of machine learning models. While not specific to Computer Vision workloads, the AzureML Python SDK can be used for scalable and reliable training and deployment of machine learning solutions in the cloud. AzureML offers significantly more flexibility than the other options, but it also requires significantly more machine learning and programming knowledge.
Azure AI Reference architectures: While not Computer Vision specific, these reference architectures cover general machine learning aspects such as model deployment or batch scoring.
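
To make "pre-trained REST APIs" more concrete, here is a minimal sketch of how a request to the Cognitive Services Computer Vision "analyze" operation could be assembled with the Python standard library. The function name `build_analyze_request` is hypothetical, and the API path (`vision/v2.0/analyze`), the `visualFeatures` values, and the placeholder endpoint and key are assumptions that should be checked against the current service documentation before use.

```python
import json
import urllib.parse
import urllib.request


def build_analyze_request(endpoint: str, key: str, image_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the service to describe an image.

    Assumptions: the v2.0 'analyze' path and the visualFeatures values below;
    verify both against the service documentation for your resource.
    """
    query = urllib.parse.urlencode({"visualFeatures": "Description,Tags"})
    url = f"{endpoint.rstrip('/')}/vision/v2.0/analyze?{query}"
    # The image is passed by URL in a small JSON body.
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # subscription key of your own resource
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values only; replace with your own endpoint, key, and image.
request = build_analyze_request(
    "https://example.invalid", "<your-key>", "https://example.invalid/image.jpg"
)
print(request.get_method(), request.full_url)
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and parsing the JSON response; no model training is involved at any point.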

Computer Vision Domains

Most applications in Computer Vision (CV) fall into one of these four categories:

Image classification: Given an input image, predict what object is present in the image. This is typically the easiest CV problem to solve; however, classification requires objects to be reasonably large in the image.

Image classification visualization
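
As a minimal sketch of what "predict what object is present" usually means in practice: a classifier produces one raw score per class, and a softmax turns these scores into probabilities before the top class is picked. The function name, scores, and labels below are hypothetical and are not taken from the repository's notebooks.

```python
import math
from typing import List, Sequence, Tuple


def predict_label(scores: Sequence[float], labels: Sequence[str]) -> Tuple[str, float]:
    """Return the most likely label and its softmax probability."""
    if len(scores) == 0 or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equally long")
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps: List[float] = [math.exp(s - top) for s in scores]
    total = sum(exps)
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], exps[best] / total


# Hypothetical raw scores for three classes.
label, prob = predict_label([1.0, 3.0, 2.0], ["can", "bottle", "carton"])
print(label, round(prob, 3))
```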

Object Detection: Given an input image, identify and locate which objects are present (using rectangular coordinates). Object detection can find small objects in an image. Compared to image classification, both model training and manual image annotation are more time-consuming in object detection, since both the label and the location are required.

Object detect visualization
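
Because detections are expressed as rectangles, a predicted box is commonly compared with an annotated box using intersection-over-union (IoU). The sketch below assumes boxes are given as `(left, top, right, bottom)` in pixel coordinates; this convention and the function name are illustrative assumptions, not something defined by this repository.

```python
from typing import Tuple

# Assumed box convention: (left, top, right, bottom).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned rectangles."""
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and annotated boxes.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

An IoU of 1 means the two rectangles coincide, while 0 means they do not overlap at all.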

Image Similarity: Given an input image, find all similar objects in images from a reference dataset. Here, rather than predicting a label and/or rectangle, the task is to sort through a reference dataset to find objects similar to that found in the query image.

Image similarity visualization
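
The "sort through a reference dataset" step can be sketched as ranking reference images by how close their feature vectors are to the query's. The example below assumes each image has already been turned into a numeric feature vector (for instance by a neural network) and uses cosine similarity as one possible measure; the names and vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def rank_by_similarity(
    query: Sequence[float], reference: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (image name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, feats)) for name, feats in reference.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical two-dimensional features; real features are much longer.
reference = {"a.jpg": [1.0, 0.0], "b.jpg": [0.0, 1.0], "c.jpg": [1.0, 1.0]}
for name, score in rank_by_similarity([1.0, 0.1], reference):
    print(name, round(score, 3))
```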

Image Segmentation: Given an input image, assign a label to every pixel (e.g., background, bottle, hand, sky, etc.). In practice, this problem is less common in industry, in large part due to the time required to label the ground-truth segmentation needed to train a solution.

Image segmentation visualization
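
To illustrate what "a label for every pixel" looks like as data, the sketch below represents a segmentation result as a two-dimensional grid of labels with the same shape as the image, and summarizes which fraction of the image each label covers. The tiny mask and the function name are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def pixel_fractions(mask: List[List[str]]) -> Dict[str, float]:
    """Fraction of pixels assigned to each label in a per-pixel label mask."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical 2x4 image: one label per pixel.
mask = [
    ["sky", "sky", "sky", "sky"],
    ["background", "bottle", "bottle", "hand"],
]
print(pixel_fractions(mask))
```

Annotating ground truth means producing such a mask by hand for every training image, which is where the labeling cost mentioned above comes from.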

Contributing

This project welcomes contributions and suggestions. Please see our contribution guidelines.

Data/Telemetry

The Azure Machine Learning service notebooks in this repository collect usage data and send it to Microsoft to help improve our products and services. Read Microsoft's privacy statement to learn more.

To opt out of tracking, please go to the raw .ipynb files and remove the following line of code:

    "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/ComputerVision/classification/notebooks/21_deployment_on_azure_container_instances.png)"

This URL will be slightly different depending on the file.
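
Since the URL differs per notebook, removing the line by hand in every file can be tedious. The following is a minimal sketch of how the removal could be scripted, assuming that every tracking line contains the host and path fragment shown in `MARKER` and that the notebooks use the usual JSON layout with a `cells` list whose entries have a `source` field. The function names are hypothetical and the script is not part of this repository.

```python
import json
from pathlib import Path

# Assumption: every tracking pixel URL contains this host and path fragment.
MARKER = "PixelServer20190423114238.azurewebsites.net/api/impressions"


def strip_impressions(notebook: dict) -> int:
    """Remove source lines containing MARKER in place; return how many were removed."""
    removed = 0
    for cell in notebook.get("cells", []):
        source = cell.get("source", [])
        # 'source' may be a single string or a list of line strings.
        was_string = isinstance(source, str)
        lines = source.splitlines(keepends=True) if was_string else list(source)
        kept = [line for line in lines if MARKER not in line]
        removed += len(lines) - len(kept)
        cell["source"] = "".join(kept) if was_string else kept
    return removed


def strip_impressions_in_file(path: Path) -> int:
    """Rewrite one .ipynb file only if at least one tracking line was found."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    removed = strip_impressions(notebook)
    if removed:
        text = json.dumps(notebook, indent=1, ensure_ascii=False) + "\n"
        path.write_text(text, encoding="utf-8")
    return removed
```

One way to use it is to call `strip_impressions_in_file(path)` for each `path` in `Path(".").rglob("*.ipynb")` from the repository root, ideally on a clean working tree so the resulting diff can be reviewed before committing.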