updated readmes
Parent: 946248c86a
Commit: a57d45d248

@@ -30,7 +30,7 @@ The following is a summary of commonly used Computer Vision scenarios that are c
| -------- | ----------- | ----------- |
| [Classification](scenarios/classification) | Base | Image Classification is a supervised machine learning technique that allows you to learn and predict the category of a given image. |
| [Similarity](scenarios/similarity) | Base | Image Similarity is a way to compute a similarity score given a pair of images. Given an image, it allows you to identify the most similar image in a given dataset. |
-| [Detection](scenarios/detection) | Base | Object Detection is a supervised machine learning technique that allows you to detect the bounding box of an object within an image. |
+| [Detection](scenarios/detection) | Base | Object Detection is a technique that allows you to detect the bounding box of an object within an image. |
| [Keypoints](scenarios/keypoints) | Base | Keypoint detection can be used to detect specific points on an object. A pre-trained model is provided to detect body joints for human pose estimation. |
| [Action recognition](contrib/action_recognition) | Contrib | Action recognition to identify in video/webcam footage what actions are performed (e.g. "running", "opening a bottle") and at what respective start/end times.|
| [Crowd counting](contrib/crowd_counting) | Contrib | Counting the number of people in low-crowd-density (e.g. less than 10 people) and high-crowd-density (e.g. thousands of people) scenarios.|

@@ -4,7 +4,7 @@
| -------- | ----------- |
| [Classification](classification) | Image Classification is a supervised machine learning technique that allows you to learn and predict the category of a given image. |
| [Similarity](similarity) | Image Similarity is a way to compute a similarity score given a pair of images. Given an image, it allows you to identify the most similar image in a given dataset. |
-| [Detection](detection) | Object Detection is a supervised machine learning technique that allows you to detect the bounding box of an object within an image. |
+| [Detection](detection) | Object Detection is a technique that allows you to detect the bounding box of an object within an image. |
| [Keypoints](scenarios/keypoints) | Keypoint detection can be used to detect specific points on an object. A pre-trained model is provided to detect body joints for human pose estimation. |
# Scenarios

@@ -24,7 +24,7 @@ While the field of Computer Vision is growing rapidly, the majority of vision ap
<img align="center" src="./media/intro_od_vis.jpg" height="150" alt="Object detect visualization"/>
- **Keypoint Detection**: Given an input image, identify and locate keypoints. Conceptually this runs an object detector first, followed by detecting keypoints on the objects. In practice, a single model runs both steps (almost) at once.

<img align="center" src="./media/intro_kp_vis.jpg" height="150" alt="Keypoint detect visualization"/>

@@ -16,51 +16,41 @@ This document tries to answer frequent questions related to object detection. Fo
* [Intersection-over-Union overlap metric](#intersection-over-union-overlap-metric)
* [Non-maxima suppression](#non-maxima-suppression)
* [Mean Average Precision](#mean-average-precision)
* Training
* [How to improve accuracy?](#how-to-improve-accuracy)
## General
### Why Torchvision?
Torchvision has a large active user-base and hence its object detection implementation is easy to use, well tested, and uses state-of-the-art technology which has proven itself in the community. For these reasons we decided to use Torchvision as our object detection library. For advanced users who want to experiment with the latest cutting-edge technology, we recommend starting with our Torchvision notebooks and then also looking into more research-oriented implementations such as the [mmdetection](https://github.com/open-mmlab/mmdetection) repository.

## Data
### How to annotate images?
-Annotated object locations are required to train and evaluate an object detector. One of the best open source UIs which runs on Windows and Linux is [VOTT](https://github.com/Microsoft/VoTT/releases). VOTT can be used to manually draw rectangles around one or more objects in an image. These annotations can then be exported in Pascal-VOC format (single xml-file per image) which the provided notebooks know how to read.
+Annotated object locations are required to train and evaluate an object detector. One of the best open source UIs which runs on Windows and Linux is [VOTT](https://github.com/Microsoft/VoTT/releases). Another good tool is [LabelImg](https://github.com/tzutalin/labelImg/releases).
+
+VOTT can be used to manually draw rectangles around one or more objects in an image. These annotations can then be exported in Pascal-VOC format (single xml-file per image) which the provided notebooks know how to read.
<p align="center">
<img src="media/vott_ui.jpg" width="600" align="center"/>
</p>
When creating a new project in VOTT, note that the "source connection" can simply point to a local folder which contains the images to be annotated, and respectively the "target connection" to a folder where to write the output. Pascal VOC style annotations can be exported by selecting "Pascal VOC" in the "Export Settings" tab and then using the "Export Project" button in the "Tags Editor" tab.
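
Since each exported annotation is a single xml file, it is easy to inspect what VOTT produced. Below is a rough sketch (not the notebooks' actual loading code) of reading the bounding boxes from one Pascal VOC file with only Python's standard library; the path `output/image1.xml` is a hypothetical example:

```python
import xml.etree.ElementTree as ET

def read_voc_boxes(xml_path):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples from a Pascal VOC xml file."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.findall("object"):
        label = obj.find("name").text
        bndbox = obj.find("bndbox")
        coords = [int(float(bndbox.find(k).text)) for k in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((label, *coords))
    return boxes

# Hypothetical file exported by VOTT:
# print(read_voc_boxes("output/image1.xml"))
```
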
-For mask (segmentation) annotation, an easy-to-use online tool is
-[Labelbox](https://labelbox.com/). Other alternatives include
-[CVAT](https://github.com/opencv/cvat) and
-[RectLabel](https://rectlabel.com/) (Mac only).
+For mask (segmentation) annotation, an easy-to-use online tool is [Labelbox](https://labelbox.com/), shown in the screenshot below. See the demo [Introducing Image Segmentation at Labelbox](https://labelbox.com/blog/introducing-image-segmentation/) on how to use the tool, and the [02_mask_rcnn notebook](02_mask_rcnn.ipynb) on how to convert the Labelbox annotations to Pascal VOC format. Alternatives to Labelbox include [CVAT](https://github.com/opencv/cvat) or [RectLabel](https://rectlabel.com/) (Mac only).

<p align="center"> <img src="media/labelbox_mask_annotation.png"
width="600"/> </p>
-A good demo can be found at [Introducing Image Segmentation at
-Labelbox](https://labelbox.com/blog/introducing-image-segmentation/).
-Besides for annotating mask, Labelbox can also be used to annotate
-bounding box, polyline and keypoint.
+Besides drawing masks, Labelbox can also be used to annotate keypoints.

<p align="center">
<img src="media/labelbox_keypoint_annotation.png" width="600"/>
</p>
-However, it has limitation to the number of labeled images per year
-for free account. And it does not provide export options for COCO or
-PASCAL VOC. Annotations at Labelbox still needs to be converted into
-the format used in our notebooks, which is explained in our [Mask R-CNN
-notebook](02_mask_rcnn.ipynb).

Selecting and annotating images is complex and consistency is key. For example:
* All objects in an image need to be annotated, even if the image contains many of them. Consider removing the image if this would take too much time.

@@ -24,6 +24,7 @@ We provide several notebooks to show how object detection algorithms can be desi
| [00_webcam.ipynb](./00_webcam.ipynb)| Quick-start notebook which demonstrates how to build an object detection system using a single image or webcam as input. |
| [01_training_introduction.ipynb](./01_training_introduction.ipynb)| Notebook which explains the basic concepts around model training and evaluation.|
| [02_mask_rcnn.ipynb](./02_mask_rcnn.ipynb) | In addition to detecting objects, also find their precise pixel-masks in an image. |
+| [03_keypoint_rcnn.ipynb](../detection/03_keypoint_rcnn.ipynb)| Notebook which shows how to (i) run a pre-trained model for human pose estimation; and (ii) train a custom keypoint detection model.|
| [11_exploring_hyperparameters_on_azureml.ipynb](./11_exploring_hyperparameters_on_azureml.ipynb)| Performs highly parallel parameter sweeping using AzureML's HyperDrive. |
| [12_hard_negative_sampling.ipynb](./12_hard_negative_sampling.ipynb) | Demonstrates how to sample hard negatives to improve model performance. |
| [20_deployment_on_kubernetes.ipynb](./20_deployment_on_kubernetes.ipynb) | Deploys a trained model using AzureML. |

@@ -1,14 +1,14 @@
# Keypoint Detection
-This repository contains examples and best practice guidelines for building keypoint detection systems.
+This repository contains examples and best practice guidelines for building keypoint detection systems. It also shows how to use a pre-trained model for human pose estimation.

Keypoints are defined as points-of-interest on objects. For example, one might be interested in finding the position of the lid on a bottle. Another example is to find body joints (hands, shoulders, etc.) for human pose estimation.
-We use an extension of Mask R-CNN which simultaneously detects objects and their keypoints. The underlying technology is very similar to our approach for object detection, i.e. based on [Torchvision's](https://pytorch.org/docs/stable/torchvision/index.html) Mask R-CNN implementation. We therefore placed our example notebook for keypoint localization in the [detection](../detection) folder.
+We use an extension of Mask R-CNN which simultaneously detects objects and their keypoints. The underlying technology is very similar to our approach for object detection, i.e. based on [Torchvision's](https://pytorch.org/docs/stable/torchvision/index.html) Mask R-CNN. The example notebook for keypoint localization is therefore in the [detection](../detection) folder.
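
As a minimal sketch of what running the pre-trained model looks like (illustrative only, not the notebook's exact code; it assumes Torchvision is installed and uses a placeholder image path):

```python
import torch
import torchvision
from torchvision import transforms
from PIL import Image

# Keypoint R-CNN pre-trained on COCO person keypoints (17 body joints per person).
model = torchvision.models.detection.keypointrcnn_resnet50_fpn(pretrained=True)
model.eval()

# "person.jpg" is a placeholder path for any test image containing people.
image = transforms.ToTensor()(Image.open("person.jpg").convert("RGB"))

with torch.no_grad():
    prediction = model([image])[0]

# Each detection has a box, a confidence score, and 17 (x, y, visibility) keypoints.
print(prediction["boxes"].shape, prediction["scores"].shape, prediction["keypoints"].shape)
```
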
-| Human pose estimation using a pre-trained model | Detecting the top and bottom on our fridge objects | Detecting various keypoints on milk bottles |
+| Detecting the top and bottom on our fridge objects | Detecting various keypoints on milk bottles | Human pose estimation using the provided pre-trained model |
|--|--|--|
| <img align="center" src="./media/kp_example1.jpg" height="200"/> | <img align="center" src="./media/kp_example3.jpg" height="200"/> | <img align="center" src="./media/kp_example2_zoom.jpg" height="200"/>
|
||||
| <img align="center" src="./media/kp_example3.jpg" height="200"/> | <img align="center" src="./media/kp_example2_zoom.jpg" height="200"/> | <img align="center" src="./media/kp_example1.jpg" height="200"/> |
|
||||
|
||||
## Frequently asked questions

@@ -19,7 +19,7 @@ See the [FAQ.md](../detection/FAQ.md) in the object detection folder.
| Notebook name | Description |
| --- | --- |
-| [03_keypoint_rcnn.ipynb](../detection.00_webcam.ipynb)| Notebook which shows how to (i) run a pre-trained model for human pose estimation; and (ii) train a custom keypoint detection model.|
+| [03_keypoint_rcnn.ipynb](../detection/03_keypoint_rcnn.ipynb)| Notebook which shows how to (i) run a pre-trained model for human pose estimation; and (ii) train a custom keypoint detection model.|

## Contribution guidelines