Best Practices, code samples, and documentation for Computer Vision.
Alexandra Teste f3e030bb4b Commented out several tests to allow for job to complete 2019-04-04 09:40:20 -07:00
.ci Replaced script by bash type + added echo of OS type 2019-04-03 18:21:13 -07:00
docs/media created empty directory structure 2019-02-13 14:48:03 -05:00
image_classification Commented out several tests to allow for job to complete 2019-04-04 09:40:20 -07:00
tests Changed azureml version + adjusted image conversion functions + improved paths for unittests + changed test wrt ACI 2019-03-28 15:49:37 -07:00
utils_cv Added init file in utils_cv/ 2019-03-26 18:33:38 -07:00
.flake8 benchmark + parameter_sweeper module (#55) 2019-03-21 12:25:46 -04:00
.gitignore benchmark + parameter_sweeper module (#55) 2019-03-21 12:25:46 -04:00
.pre-commit-config.yaml Adding pre-commit package to yml, more rules to flake8 lint config and adding pre-commit config file 2019-03-13 14:21:15 -04:00
CONTRIBUTING.md Minor doc change 2019-03-21 02:46:46 -04:00
LICENSE Initial commit 2019-02-11 08:23:56 -08:00
README.md Pabuehle/readme edits (#85) 2019-03-29 15:18:00 -04:00
__init__.py Added license + created tests/ folder outside IC + improved probability extraction in scoring script 2019-03-25 13:17:05 -07:00
ms_products.md Pabuehle/readme edits (#85) 2019-03-29 15:18:00 -04:00
pyproject.toml set black config to use 79 chars per line 2019-02-28 20:50:52 +00:00

README.md

Computer Vision Best Practices

This repository provides implementations and best-practice guidelines for building Computer Vision systems. All examples are given as Jupyter notebooks and use PyTorch as the deep learning library.


Overview

The goal of this repository is to help speed up development of Computer Vision applications. Rather than implementing custom approaches, the focus is on providing examples and links to existing state-of-the-art libraries. In addition, having worked in this space for many years, we aim to answer common questions, point out often observed pitfalls, and show how to use the cloud for deployment and training.

Currently, the main priority is image classification and, to a lesser extent, image segmentation. We are also actively working on a basic (but often sufficiently accurate) example of how to do image similarity. Work on object detection is scheduled to start once image classification is completed. See the projects and milestones in this repository for more details.

Getting Started

Instructions on how to get started, as well as our example notebooks and discussions, are provided in the image classification subfolder.

Note that for certain Computer Vision problems, ready-made or easily customizable solutions exist which do not require any custom coding or machine learning expertise. We strongly recommend evaluating whether any of these address the problem at hand. Only if that is not the case, or if the accuracy of these solutions is not sufficient, do we recommend the much more time-consuming and difficult path of building custom models, which requires expert knowledge.

These Microsoft services address common Computer Vision tasks:

  • Cognitive Services: Pre-trained REST APIs which can be called to do e.g. image classification, face recognition, OCR, video analytics, and much more. These APIs are easy to use and work out of the box (no training is required); however, customization is limited. See the various demos to get a feeling for their functionality, e.g. on this site.

  • Custom Vision Service: SaaS service to train and deploy a model as a REST API given a user-provided training set. All steps, from image upload and annotation to model deployment, can be performed either using a UI or, optionally, a Python SDK. Training of both image classification and object detection models is supported and requires only minimal machine learning knowledge. The Custom Vision Service hence offers more flexibility than the pre-trained Cognitive Services APIs, but requires users to bring and annotate their own datasets.

  • Azure Machine Learning service (AzureML): Scenario-agnostic machine learning service that helps users accelerate training and deployment of machine learning models. While not specific to Computer Vision workloads, the AzureML Python SDK can be used to deploy scalable and reliable web services using e.g. Kubernetes, or to run heavily parallel training on a cloud-based GPU cluster. AzureML offers significantly more flexibility than the other options above, but it also requires significantly more machine learning and programming knowledge.

Computer Vision Domains

Most applications in Computer Vision fall into one of these four categories:

  • Image classification: Given an input image, predict what objects are present. This is typically the easiest CV problem to solve, but it requires objects to be reasonably large in the image.

       Image classification visualization

  • Object Detection: Given an input image, predict what objects are present and where they are (using rectangular coordinates). Object detection approaches work even if the object is small. However, model training takes longer than for image classification, and manually annotating images is more time-consuming.

       Object detect visualization

  • Image Similarity: Given an input image, find all similar images in a reference dataset. Here, rather than predicting a label or a rectangle, the task is to sort the images in a reference dataset by their similarity to the query image.

       Image similarity visualization

  • Image Segmentation: Given an input image, assign a label to every pixel, e.g. background, bottle, hand, sky, etc. In practice, this problem is less common in industry, in large part because of the segmentation masks required during training.

       Image segmentation visualization
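
To make the image similarity task above more concrete, here is a minimal sketch of sorting a reference dataset by similarity to a query image. It is not this repository's implementation: it assumes each image has already been converted into a feature vector (for example by a pre-trained network), and it uses cosine similarity as the similarity measure, which is one common choice among several. The file names, feature values, and the `cosine_similarity` and `rank_by_similarity` helpers are all hypothetical and only serve the illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat an all-zero vector as dissimilar to everything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, reference):
    """Sort reference images from most to least similar to the query.

    `reference` maps an image name to its feature vector.
    Returns a list of (name, score) pairs.
    """
    scored = [
        (name, cosine_similarity(query, features))
        for name, features in reference.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, hand-written feature vectors (assumption: a real system
# would compute these with a trained model).
reference_features = {
    "bottle_01.jpg": [0.9, 0.1, 0.0],
    "bottle_02.jpg": [0.8, 0.2, 0.1],
    "hand_01.jpg": [0.0, 0.1, 0.9],
}
query_features = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query_features, reference_features)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these made-up vectors, the two bottle images rank ahead of the hand image. In a real system, the quality of the ranking depends almost entirely on how the feature vectors are produced.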

Contributing

This project welcomes contributions and suggestions. Before contributing, please see our contribution guidelines.