This commit is contained in:
Jen Looper 2022-05-06 18:05:29 -04:00
Родитель 968117204e
Коммит a97f86c5e6
2 изменённых файлов: 32 добавлений и 16 удалений

Просмотреть файл

@ -1,21 +1,21 @@
# Segmentation
We have already learnt about Object Detection, which allows us to locate objects in the image by predicting their *bounding boxes*. However, for some tasks we do not only need bounding boxes, but also more precise object localization. This task is called **segmentation**.
We have previously learned about Object Detection, which allows us to locate objects in the image by predicting their *bounding boxes*. However, for some tasks we do not only need bounding boxes, but also more precise object localization. This task is called **segmentation**.
## [Pre-lecture quiz](https://black-ground-0cc93280f.1.azurestaticapps.net/quiz/112)
Segmentation can be viewed as **pixel classification**, whereas for **each** pixel of image we must predict its class (*background* being one of the classes). There are two main segmentation algorithms:
* **Semantic segmentation** only tells pixel class, and does not make a distinction between different objects of the same class
* **Semantic segmentation** only tells the pixel class, and does not make a distinction between different objects of the same class
* **Instance segmentation** divides classes into different instances.
For instance segmentation 10 sheep are different objects, for semantic segmentation all sheep are represented by one class.
For instance segmentation, these sheep are different objects, but for semantic segmentation all sheep are represented by one class.
<img src="images/instance_vs_semantic.jpeg" width="50%">
> Image from [this blog post](https://nirmalamurali.medium.com/image-classification-vs-semantic-segmentation-vs-instance-segmentation-625c33a08d50)
There are different neural architectures for segmentation, but they all have the same structure. In a way, it is similar to autoencoder, but instead of deconstructing the original image, our goal is to deconstruct a **mask**. Thus, segmentation network has the following parts:
There are different neural architectures for segmentation, but they all have the same structure. In a way, it is similar to the autoencoder you learned about previously, but instead of deconstructing the original image, our goal is to deconstruct a **mask**. Thus, a segmentation network has the following parts:
* **Encoder** extracts features from input image
* **Decoder** transforms those features into the **mask image**, with the same size and number of channels corresponding to the number of classes.
@ -24,31 +24,44 @@ There are different neural architectures for segmentation, but they all have the
> Image from [this publication](https://arxiv.org/pdf/2001.05566.pdf)
We should especially mention the loss function that is used for segmentation. In classical autoencoders we need to measure the similarity between two images, and we can use mean square error to do that. In segmentation, each pixel in the target mask image represents the class number (one-hot-encoded along the third dimension), so we need to use loss functions specific for classification - cross-entropy loss, averaged over all pixels. If the mask is binary - **binary cross-entropy loss** (BCE) is used.
We should especially mention the loss function that is used for segmentation. When using classical autoencoders, we need to measure the similarity between two images, and we can use mean square error (MSE) to do that. In segmentation, each pixel in the target mask image represents the class number (one-hot-encoded along the third dimension), so we need to use loss functions specific for classification - cross-entropy loss, averaged over all pixels. If the mask is binary - **binary cross-entropy loss** (BCE) is used.
> ✅ One-hot encoding is a way to encode a class label into a vector of length equal to the number of classes. Take a look at [this article](https://datagy.io/sklearn-one-hot-encode/) on this technique.
## Segmentation for Medical Imaging
In this lesson, we will see the segmentation in action by training the network to recognize human nevi on the medical images. We will be using <a href="https://www.fc.up.pt/addi/ph2%20database.html">PH<sup>2</sup> Database</a> of dermoscopy images. This dataset contains 200 images of three classes: typical nevus, atypical nevus, and melanoma. All images also contain corresponding **mask** that outline the nevus.
In this lesson, we will see the segmentation in action by training the network to recognize human nevi (also known as moles) on medical images. We will be using <a href="https://www.fc.up.pt/addi/ph2%20database.html">PH<sup>2</sup> Database</a> of dermoscopy images as the image source. This dataset contains 200 images of three classes: typical nevus, atypical nevus, and melanoma. All images also contain a corresponding **mask** that outlines the nevus.
<img src="images/navi.png"/>
> ✅ This technique is particularly appropriate for this type of medical imaging, but what other real-world applications could you envision?
We will train a model to segment any nevus from the background.
<img alt="navi" src="images/navi.png"/>
## Notebooks
> Image from the PH<sup>2</sup> Database
Open the notebook to learn more about different semantic segmentation architectures and see them in action.
We will train a model to segment any nevus from its background.
## ✍️ Exercises: Semantic Segmentation
Open the notebooks below to learn more about different semantic segmentation architectures, practice working with them, and see them in action.
* [Semantic Segmentation Pytorch](SemanticSegmentationPytorch.ipynb)
* [Semantic Segmentation TensorFlow](SemanticSegmentationTF.ipynb)
## [Post-lecture quiz](https://black-ground-0cc93280f.1.azurestaticapps.net/quiz/212)
## [Lab](lab/README.md)
## Conclusion
In this lab, we encourage you to try **human body segmentation** using [Segmentation Full Body MADS Dataset](https://www.kaggle.com/datasets/tapakah68/segmentation-full-body-mads-dataset) from Kaggle.
Segmentation is a very powerful technique for image classification, moving beyond bounding boxes to pixel-level classification. It is a technique used in medical imaging, among other applications.
> ✅ Todo: Conclusion
## Assignment
## 🚀 Challenge
Body segmentation is just one of the common tasks that we can do with images of people. Another important tasks include **skeleton detection** and **pose detection**. Try out [OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) library to see how pose detection can be used.
## Review & Self Study
This [wikipedia article](https://wikipedia.org/wiki/Image_segmentation) offers a good overview of the various applications of this technique. Learn more on your own about the subdomains of Instance segmentation and Panoptic segmentation in this field of inquiry.
## [Assignment](lab/README.md)
In this lab, try **human body segmentation** using [Segmentation Full Body MADS Dataset](https://www.kaggle.com/datasets/tapakah68/segmentation-full-body-mads-dataset) from Kaggle.

Просмотреть файл

@ -4,7 +4,10 @@
In this section we will learn about:
* Intro to Computer Vision and OpenCV: coming soon
* [Convolutional Neural Networks](07-ConvNets/README.md)
* [Pre-trained Networks and Transfer Learning](08-TransferLearning/README.md)
* [Autoencoders](09-Autoencoders/README.md)
* [Generative Adversarial Networks](10-GANs/README.md)
* [Generative Adversarial Networks](10-GANs/README.md)
* Object Detection: coming soon
* [Semantic Segmentation](12-Segmentation/README.md)