E2E Sample - Live Stream Video Object Detection with Onnx in WPF Desktop App (#602)

* Added UWP project with simple camera support

* added simple rectangle overlay drawing

* Added ML.NET predictions with yolo model

* added TODO comment for Bitmap type issue

* added shared and wpf project (and it's broken)

* reference shared classes from web app

* DeepLearning_ObjectDetection_Onnx -> ObjectDetection_Onnx

* draw a square

* Rough wpf web cam code

* added toolkit

* view live webcam stream

* initial working prototype

* use same colors as web app

* only provide the model once and share across projs

* Renaming

* Use MicrosoftMLVersion in csproj files

* fix build

* one more 1.3.1 -> MicrosoftMLVersion

* cleanup performance and comments

* use .net core 3.0

* fix build def

* fix camera bug on some devices (thanks @eerhardt)

* update web project to .net core 3.0

* edits and cleanup to original README

* update README to include information on the WPF desktop app

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/OnnxObjectDetectionApp/MainWindow.xaml.cs

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/OnnxObjectDetectionApp/MainWindow.xaml.cs

Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>

* readme edits and formatting. thanks @bamurtaugh!

* readme edits and fixes per @nicolehaugen review. thanks!
This commit is contained in:
Colby Williams 2019-08-20 09:11:30 -04:00 committed by GitHub
Parent bd8bac45e1
Commit 7e8ce494bb
No key found matching this signature
GPG key ID: 4AEE18F83AFDEB23
98 changed files: 670 additions and 266 deletions

View file

@ -163,10 +163,16 @@ phases:
- phase: ObjectDetectionE2EAPP
  queue: Hosted VS2017
  steps:
  - task: UseDotNet@2
    displayName: 'Use .NET Core 3.0'
    inputs:
      version: 3.0.x
      includePreviewVersions: true
      installationPath: $(Agent.ToolsDirectory)/dotnet
  - task: DotNetCoreCLI@2
    displayName: Build Object Detection E2E (Onnx Scorer)
    inputs:
      projects: '.\samples\csharp\end-to-end-apps\DeepLearning_ObjectDetection_Onnx\OnnxObjectDetectionE2EApp.sln'
      projects: '.\samples\csharp\end-to-end-apps\ObjectDetection-Onnx\OnnxObjectDetection.sln'
- phase: SalesSpikeChangeDetectionE2E
  queue: Hosted VS2017

View file

@ -1,24 +0,0 @@
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;
namespace OnnxObjectDetectionE2EAPP
{
public class Program
{
public static void Main(string[] args)
{
CreateWebHostBuilder(args).Build().Run();
}
public static IWebHostBuilder CreateWebHostBuilder(string[] args) =>
WebHost.CreateDefaultBuilder(args)
.UseStartup<Startup>();
}
}

View file

@ -1,25 +0,0 @@

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Version 16
VisualStudioVersion = 16.0.28803.452
MinimumVisualStudioVersion = 10.0.40219.1
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "OnnxObjectDetectionE2EAPP", "OnnxObjectDetectionE2EAPP\OnnxObjectDetectionE2EAPP.csproj", "{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Any CPU = Debug|Any CPU
Release|Any CPU = Release|Any CPU
EndGlobalSection
GlobalSection(ProjectConfigurationPlatforms) = postSolution
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|Any CPU.Build.0 = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|Any CPU.ActiveCfg = Release|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|Any CPU.Build.0 = Release|Any CPU
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE
EndGlobalSection
GlobalSection(ExtensibilityGlobals) = postSolution
SolutionGuid = {E4E2676A-8816-4A2F-A0F0-1E2718DAFC78}
EndGlobalSection
EndGlobal

View file

@ -1,152 +0,0 @@
# Object Detection - ASP.NET Core Web/Service Sample
| ML.NET version | API type | Status | App Type | Data type | Scenario | ML Task | Algorithms |
|----------------|-------------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------|
| v1.3.1 | Dynamic API | Up-to-date | End-End app | image files | Object Detection | Deep Learning | Tiny Yolo2 ONNX model |
## Problem
Object detection is one of the classical problems in computer vision: Recognize what objects are inside a given image and also where they are in the image. For these cases, you can either use pre-trained models or train your own model to classify images specific to your custom domain.
## How the app works
When the app runs, it shows the list of images at the bottom in the **Sample Input Images** section. Select any image to process it. After the image is processed, it is shown under the **Processed Images** section with bounding boxes around the detected objects, as shown below.
![](./docs/Screenshots/ObjectDetection.gif)
Alternatively you can try uploading your own images as shown below.
![](./docs/Screenshots/FileUpload.gif)
## DataSet
There are two data sources: the `tsv` file and the image files. The [tsv file](./OnnxObjectDetectionE2EAPP/TestImages/tags.tsv) contains two columns: the first one is defined as `ImagePath` and the second one is the `Label` corresponding to the image. As you can observe, the file does not have a header row, and looks like this:
The images are located in the [TestImages](./OnnxObjectDetectionE2EAPP/TestImages) folder. These images were downloaded from the internet.
For example, below are the URLs the images were downloaded from:
https://github.com/simo23/tinyYOLOv2/blob/master/dog.jpg
https://github.com/simo23/tinyYOLOv2/blob/master/person.jpg
## Pre-trained model
There are multiple models that are pre-trained for identifying multiple objects in images. Here we use the pre-trained **Tiny Yolo2** model in **ONNX** format. This model is a real-time neural network for object detection that detects 20 different classes. It is made up of 9 convolutional layers and 6 max-pooling layers and is a smaller version of the more complex full [YOLOv2](https://pjreddie.com/darknet/yolov2/) network.
The Open Neural Network eXchange, i.e. [ONNX](http://onnx.ai/), is an open format to represent deep learning models. With ONNX, developers can move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners.
The model is downloaded from the [ONNX Model Zoo](https://github.com/onnx/models/tree/master/tiny_yolov2), which is a collection of pre-trained, state-of-the-art models in the ONNX format.
The Tiny YOLO2 model was trained on the [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) dataset. Below are the model's prerequisites.
**Model input and output**
**Input**
Input image of the shape (3x416x416)
**Output**
Output is a (1x125x13x13) array
**Pre-processing steps**
Resize the input image to a (3x416x416) array of type float32.
**Post-processing steps**
The output is a (125x13x13) tensor where 13x13 is the number of grid cells that the image gets divided into. Each grid cell corresponds to 125 channels, made up of the 5 bounding boxes predicted by the grid cell and the 25 data elements that describe each bounding box (5x25=125). For more information on how to derive the final bounding boxes and their corresponding confidence scores, refer to this [post](http://machinethink.net/blog/object-detection-with-yolo/).
## Solution
The sample contains a Razor web app that includes both **Razor UI pages** and **API controller** classes to process images.
## Code Walkthrough
The difference between the [getting started object detection sample](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx) and this end-to-end sample is that the getting started sample loads the images from **files**, whereas this end-to-end sample loads the images from **in-memory**.
Define the data schema in a class type and reference that type while loading data into an IDataView. Here the class type is **ImageInputData**. ML.NET supports the Bitmap type for images. To load images from memory, you just need to specify the **Bitmap** type in the class, decorated with the [ImageType(height, width)] attribute as shown below.
```csharp
public class ImageInputData
{
[ImageType(416, 416)]
public Bitmap Image { get; set; }
}
```
### ML.NET: Configure the model
The first step is to create an empty DataView, as we just need the schema of the data when configuring the model.
```csharp
var dataView = _mlContext.Data.LoadFromEnumerable(new List<ImageInputData>());
```
The second step is to define the estimator pipeline. Usually, when dealing with deep neural networks, you must adapt the images to the format expected by the network. This is the reason images are resized and then transformed (mainly, pixel values are normalized across all R,G,B channels).
```csharp
var pipeline = _mlContext.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill, outputColumnName: "image", imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight, inputColumnName: nameof(ImageInputData.Image))
.Append(_mlContext.Transforms.ExtractPixels(outputColumnName: "image"))
.Append(_mlContext.Transforms.ApplyOnnxModel(modelFile: onnxModelFilePath, outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput }, inputColumnNames: new[] { TinyYoloModelSettings.ModelInput }));
```
You also need to inspect the neural network to check the names of the input and output nodes. To inspect the model, you can use tools like [Netron](https://github.com/lutzroeder/netron), which is automatically installed with [Visual Studio Tools for AI](https://visualstudio.microsoft.com/downloads/ai-tools-vs/).
These names are used later in the definition of the estimator pipeline: in the case of the Tiny Yolo2 network, the input tensor is named 'image' and the output is named 'grid'.
Define the **input** and **output** parameters of the Tiny Yolo2 Onnx Model.
```csharp
public struct TinyYoloModelSettings
{
// To check the Tiny Yolo2 model's input and output parameter names,
// you can use tools like Netron,
// which is installed by Visual Studio Tools for AI.
// Input tensor name
public const string ModelInput = "image";
// Output tensor name
public const string ModelOutput = "grid";
}
```
![inspecting neural network with netron](./docs/Netron/netron.PNG)
Create the model by fitting the dataview.
```csharp
var model = pipeline.Fit(dataView);
```
## Detect objects in the image
After the model is configured, we need to save the model, load the saved model, and then pass the image to the model to detect objects.
When obtaining the prediction, we get an array of floats in the property `PredictedLabels`. The array is a float array of size **21125**. This is the output of the model, i.e. 125x13x13, as discussed earlier. This output is interpreted by the `YoloOutputParser` class, which returns a number of bounding boxes for each image. These boxes are then filtered so that we retrieve only the 5 bounding boxes with the best confidence (how certain the model is that a box contains an object) for each object in the image.
```csharp
var probs = model.Predict(imageInputData).PredictedLabels;
IList<YoloBoundingBox> boundingBoxes = _parser.ParseOutputs(probs);
filteredBoxes = _parser.FilterBoundingBoxes(boundingBoxes, 5, .5F);
```
## Draw bounding boxes around detected objects in the image
The final step is to draw the bounding boxes around the objects using the Paint API and return the image to the browser, where it is displayed.
```csharp
var img = _objectDetectionService.DrawBoundingBox(imageFilePath);
using (MemoryStream m = new MemoryStream())
{
img.Save(m, img.RawFormat);
byte[] imageBytes = m.ToArray();
// Convert byte[] to Base64 String
base64String = Convert.ToBase64String(imageBytes);
var result = new Result { imageString = base64String };
return result;
}
```
**Note:** The Tiny Yolo2 model is significantly less accurate than the full YOLO2 model. Since this is a sample program, we use the tiny version of the YOLO model, i.e. Tiny_Yolo2.

View file

@ -0,0 +1,93 @@

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Version 16
VisualStudioVersion = 16.0.28803.452
MinimumVisualStudioVersion = 10.0.40219.1
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "OnnxObjectDetectionWeb", "OnnxObjectDetectionWeb\OnnxObjectDetectionWeb.csproj", "{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}"
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "OnnxObjectDetection", "OnnxObjectDetection\OnnxObjectDetection.csproj", "{7B159949-6D64-41B2-A30F-1952FA8EBA3E}"
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "OnnxObjectDetectionApp", "OnnxObjectDetectionApp\OnnxObjectDetectionApp.csproj", "{30411590-5517-4E40-8AC6-88E916B66B09}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Any CPU = Debug|Any CPU
Debug|ARM = Debug|ARM
Debug|ARM64 = Debug|ARM64
Debug|x64 = Debug|x64
Debug|x86 = Debug|x86
Release|Any CPU = Release|Any CPU
Release|ARM = Release|ARM
Release|ARM64 = Release|ARM64
Release|x64 = Release|x64
Release|x86 = Release|x86
EndGlobalSection
GlobalSection(ProjectConfigurationPlatforms) = postSolution
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|Any CPU.Build.0 = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|ARM.ActiveCfg = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|ARM.Build.0 = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|ARM64.ActiveCfg = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|ARM64.Build.0 = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|x64.ActiveCfg = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|x64.Build.0 = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|x86.ActiveCfg = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|x86.Build.0 = Debug|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|Any CPU.ActiveCfg = Release|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|Any CPU.Build.0 = Release|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|ARM.ActiveCfg = Release|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|ARM.Build.0 = Release|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|ARM64.ActiveCfg = Release|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|ARM64.Build.0 = Release|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|x64.ActiveCfg = Release|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|x64.Build.0 = Release|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|x86.ActiveCfg = Release|Any CPU
{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|x86.Build.0 = Release|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|Any CPU.Build.0 = Debug|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|ARM.ActiveCfg = Debug|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|ARM.Build.0 = Debug|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|ARM64.ActiveCfg = Debug|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|ARM64.Build.0 = Debug|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|x64.ActiveCfg = Debug|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|x64.Build.0 = Debug|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|x86.ActiveCfg = Debug|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|x86.Build.0 = Debug|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|Any CPU.ActiveCfg = Release|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|Any CPU.Build.0 = Release|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|ARM.ActiveCfg = Release|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|ARM.Build.0 = Release|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|ARM64.ActiveCfg = Release|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|ARM64.Build.0 = Release|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|x64.ActiveCfg = Release|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|x64.Build.0 = Release|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|x86.ActiveCfg = Release|Any CPU
{7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|x86.Build.0 = Release|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Debug|Any CPU.Build.0 = Debug|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Debug|ARM.ActiveCfg = Debug|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Debug|ARM.Build.0 = Debug|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Debug|ARM64.ActiveCfg = Debug|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Debug|ARM64.Build.0 = Debug|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Debug|x64.ActiveCfg = Debug|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Debug|x64.Build.0 = Debug|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Debug|x86.ActiveCfg = Debug|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Debug|x86.Build.0 = Debug|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Release|Any CPU.ActiveCfg = Release|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Release|Any CPU.Build.0 = Release|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Release|ARM.ActiveCfg = Release|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Release|ARM.Build.0 = Release|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Release|ARM64.ActiveCfg = Release|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Release|ARM64.Build.0 = Release|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Release|x64.ActiveCfg = Release|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Release|x64.Build.0 = Release|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Release|x86.ActiveCfg = Release|Any CPU
{30411590-5517-4E40-8AC6-88E916B66B09}.Release|x86.Build.0 = Release|Any CPU
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE
EndGlobalSection
GlobalSection(ExtensibilityGlobals) = postSolution
SolutionGuid = {E4E2676A-8816-4A2F-A0F0-1E2718DAFC78}
EndGlobalSection
EndGlobal

View file

@ -1,7 +1,7 @@
using Microsoft.ML.Transforms.Image;
using System.Drawing;
namespace OnnxObjectDetectionE2EAPP
namespace OnnxObjectDetection
{
public class ImageInputData
{

View file

@ -1,6 +1,6 @@
using Microsoft.ML.Data;
namespace OnnxObjectDetectionE2EAPP
namespace OnnxObjectDetection
{
public class ImageObjectPrediction
{

View file

@ -1,12 +1,9 @@
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML;
using Microsoft.ML.Transforms.Image;
using OnnxObjectDetectionE2EAPP.Utilities;
using System.Collections.Generic;
using System.Linq;
namespace OnnxObjectDetectionE2EAPP.MLModel
namespace OnnxObjectDetection
{
public class OnnxModelConfigurator
{
@ -16,7 +13,8 @@ namespace OnnxObjectDetectionE2EAPP.MLModel
public OnnxModelConfigurator(string onnxModelFilePath)
{
_mlContext = new MLContext();
// Model creation and pipeline definition for images needs to run just once, so calling it from the constructor:
// Model creation and pipeline definition for images needs to run just once,
// so calling it from the constructor:
_mlModel = SetupMlNetModel(onnxModelFilePath);
}
@ -28,14 +26,13 @@ namespace OnnxObjectDetectionE2EAPP.MLModel
public struct TinyYoloModelSettings
{
// for checking TIny yolo2 Model input and output parameter names,
//you can use tools like Netron,
// which is installed by Visual Studio AI Tools
// To check Tiny Yolo2 Model input and output parameter names,
// you can use tools like Netron: https://github.com/lutzroeder/netron
// input tensor name
// Input tensor name
public const string ModelInput = "image";
// output tensor name
// Output tensor name
public const string ModelOutput = "grid";
}
@ -52,6 +49,11 @@ namespace OnnxObjectDetectionE2EAPP.MLModel
return mlNetModel;
}
public PredictionEngine<ImageInputData, ImageObjectPrediction> GetMlNetPredictionEngine()
{
return _mlContext.Model.CreatePredictionEngine<ImageInputData, ImageObjectPrediction>(_mlModel);
}
public void SaveMLNetModel(string mlnetModelFilePath)
{
// Save/persist the model to a .ZIP file to be loaded by the PredictionEnginePool
@ -59,4 +61,3 @@ namespace OnnxObjectDetectionE2EAPP.MLModel
}
}
}

View file

@ -0,0 +1,19 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>netstandard2.1</TargetFramework>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.ML" Version="$(MicrosoftMLVersion)" />
<PackageReference Include="Microsoft.ML.ImageAnalytics" Version="$(MicrosoftMLVersion)" />
<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="$(MicrosoftMLVersion)" />
</ItemGroup>
<ItemGroup>
<None Update="ML\OnnxModel\TinyYolo2_model.onnx">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
</ItemGroup>
</Project>

View file

@ -1,4 +1,4 @@
namespace OnnxObjectDetectionE2EAPP
namespace OnnxObjectDetection
{
public class DimensionsBase
{

View file

@ -1,6 +1,6 @@
using System.Drawing;
namespace OnnxObjectDetectionE2EAPP
namespace OnnxObjectDetection
{
public class BoundingBoxDimensions : DimensionsBase { }

View file

@ -3,21 +3,37 @@ using System.Collections.Generic;
using System.Drawing;
using System.Linq;
namespace OnnxObjectDetectionE2EAPP
namespace OnnxObjectDetection
{
class YoloOutputParser
public class YoloOutputParser
{
class CellDimensions : DimensionsBase { }
// The number of rows in the grid the image is divided into.
public const int ROW_COUNT = 13;
// The number of columns in the grid the image is divided into.
public const int COL_COUNT = 13;
// The total number of values contained in one cell of the grid.
public const int CHANNEL_COUNT = 125;
// The number of bounding boxes in a cell.
public const int BOXES_PER_CELL = 5;
// The number of features contained within a box (x,y,height,width,confidence).
public const int BOX_INFO_FEATURE_COUNT = 5;
// The number of class predictions contained in each bounding box.
public const int CLASS_COUNT = 20;
// The width of one cell in the image grid.
public const float CELL_WIDTH = 32;
// The height of one cell in the image grid.
public const float CELL_HEIGHT = 32;
// The starting position of the current cell in the grid.
private int channelStride = ROW_COUNT * COL_COUNT;
private float[] anchors = new float[]
@ -58,12 +74,14 @@ namespace OnnxObjectDetectionE2EAPP
Color.DarkTurquoise
};
// Applies the sigmoid function that outputs a number between 0 and 1.
private float Sigmoid(float value)
{
var k = (float)Math.Exp(value);
return k / (1.0f + k);
}
// Normalizes an input vector into a probability distribution.
private float[] Softmax(float[] values)
{
var maxVal = values.Max();
@ -73,6 +91,7 @@ namespace OnnxObjectDetectionE2EAPP
return exp.Select(v => (float)(v / sumExp)).ToArray();
}
// Maps elements in the one-dimensional model output to the corresponding position in a 125 x 13 x 13 tensor.
private int GetOffset(int x, int y, int channel)
{
// YOLO outputs a tensor that has a shape of 125x13x13, which
@ -82,6 +101,7 @@ namespace OnnxObjectDetectionE2EAPP
return (channel * this.channelStride) + (y * COL_COUNT) + x;
}
// Extracts the bounding box dimensions using the GetOffset method from the model output.
private BoundingBoxDimensions ExtractBoundingBoxDimensions(float[] modelOutput, int x, int y, int channel)
{
return new BoundingBoxDimensions
@ -93,11 +113,14 @@ namespace OnnxObjectDetectionE2EAPP
};
}
// Extracts the confidence value which states how sure the model is that it has detected an object
// and uses the Sigmoid function to turn it into a percentage.
private float GetConfidence(float[] modelOutput, int x, int y, int channel)
{
return Sigmoid(modelOutput[GetOffset(x, y, channel + 4)]);
}
// Uses the bounding box dimensions and maps them onto its respective cell within the image.
private CellDimensions MapBoundingBoxToCell(int x, int y, int box, BoundingBoxDimensions boxDimensions)
{
return new CellDimensions
@ -109,6 +132,8 @@ namespace OnnxObjectDetectionE2EAPP
};
}
// Extracts the class predictions for the bounding box from the model output using the GetOffset
// method and turns them into a probability distribution using the Softmax method.
public float[] ExtractClasses(float[] modelOutput, int x, int y, int channel)
{
float[] predictedClasses = new float[CLASS_COUNT];
@ -120,6 +145,7 @@ namespace OnnxObjectDetectionE2EAPP
return Softmax(predictedClasses);
}
// Selects the class from the list of predicted classes with the highest probability.
private ValueTuple<int, float> GetTopResult(float[] predictedClasses)
{
return predictedClasses
@ -128,6 +154,7 @@ namespace OnnxObjectDetectionE2EAPP
.First();
}
// Filters overlapping bounding boxes with lower probabilities.
private float IntersectionOverUnion(RectangleF boundingBoxA, RectangleF boundingBoxB)
{
var areaA = boundingBoxA.Width * boundingBoxA.Height;

View file

@ -0,0 +1,9 @@
<Application x:Class="OnnxObjectDetectionApp.App"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:local="clr-namespace:OnnxObjectDetectionApp"
StartupUri="MainWindow.xaml">
<Application.Resources>
</Application.Resources>
</Application>

View file

@ -0,0 +1,11 @@
using System.Windows;
namespace OnnxObjectDetectionApp
{
/// <summary>
/// Interaction logic for App.xaml
/// </summary>
public partial class App : Application
{
}
}

View file

@ -0,0 +1,16 @@
<Window x:Class="OnnxObjectDetectionApp.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:local="clr-namespace:OnnxObjectDetectionApp"
mc:Ignorable="d"
Title="ML.NET Object Detection (Onnx)" Height="506" Width="640">
<Grid Background="Black">
<Grid.RowDefinitions>
<RowDefinition Height="*" />
</Grid.RowDefinitions>
<Image x:Name="WebCamImage" Grid.Row="0" />
<Canvas x:Name="WebCamCanvas" Grid.Row="0" Width="{Binding Path=ActualWidth, ElementName=WebCamImage}"/>
</Grid>
</Window>

View file

@ -0,0 +1,188 @@
using Microsoft.ML;
using OnnxObjectDetection;
using OpenCvSharp;
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Diagnostics;
using System.Drawing;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Media;
using System.Windows.Media.Imaging;
using Rectangle = System.Windows.Shapes.Rectangle;
namespace OnnxObjectDetectionApp
{
public partial class MainWindow : System.Windows.Window
{
private VideoCapture capture;
private CancellationTokenSource cameraCaptureCancellationTokenSource;
private readonly YoloOutputParser yoloParser = new YoloOutputParser();
private PredictionEngine<ImageInputData, ImageObjectPrediction> predictionEngine;
public MainWindow()
{
InitializeComponent();
LoadModel();
}
protected override void OnActivated(EventArgs e)
{
base.OnActivated(e);
StartCameraCapture();
}
protected override void OnDeactivated(EventArgs e)
{
base.OnDeactivated(e);
StopCameraCapture();
}
private void LoadModel()
{
var onnxModel = "TinyYolo2_model.onnx";
var modelDirectory = Path.Combine(Environment.CurrentDirectory, @"ML\OnnxModel");
var onnxPath = Path.Combine(modelDirectory, onnxModel);
var onnxModelConfigurator = new OnnxModelConfigurator(onnxPath);
predictionEngine = onnxModelConfigurator.GetMlNetPredictionEngine();
}
private void StartCameraCapture()
{
cameraCaptureCancellationTokenSource = new CancellationTokenSource();
Task.Run(() => CaptureCamera(cameraCaptureCancellationTokenSource.Token), cameraCaptureCancellationTokenSource.Token);
}
private void StopCameraCapture()
{
cameraCaptureCancellationTokenSource?.Cancel();
}
private async Task CaptureCamera(CancellationToken token)
{
if (capture == null)
capture = new VideoCapture(CaptureDevice.DShow);
capture.Open(0);
if (capture.IsOpened())
{
while (!token.IsCancellationRequested)
{
using MemoryStream memoryStream = capture.RetrieveMat().Flip(FlipMode.Y).ToMemoryStream();
await Application.Current.Dispatcher.InvokeAsync(() =>
{
var imageSource = new BitmapImage();
imageSource.BeginInit();
imageSource.CacheOption = BitmapCacheOption.OnLoad;
imageSource.StreamSource = memoryStream;
imageSource.EndInit();
WebCamImage.Source = imageSource;
});
var bitmapImage = new Bitmap(memoryStream);
await ParseWebCamFrame(bitmapImage);
}
capture.Release();
}
}
async Task ParseWebCamFrame(Bitmap bitmap)
{
if (predictionEngine == null)
return;
var frame = new ImageInputData { Image = bitmap };
var filteredBoxes = DetectObjectsUsingModel(frame);
await Application.Current.Dispatcher.InvokeAsync(() =>
{
DrawOverlays(filteredBoxes, (int)WebCamImage.ActualHeight, (int)WebCamImage.ActualWidth);
});
}
public IList<YoloBoundingBox> DetectObjectsUsingModel(ImageInputData imageInputData)
{
var labels = predictionEngine.Predict(imageInputData).PredictedLabels;
var boundingBoxes = yoloParser.ParseOutputs(labels);
var filteredBoxes = yoloParser.FilterBoundingBoxes(boundingBoxes, 5, 0.5f);
return filteredBoxes;
}
private void DrawOverlays(IList<YoloBoundingBox> filteredBoxes, int originalHeight, int originalWidth)
{
WebCamCanvas.Children.Clear();
foreach (var box in filteredBoxes)
{
// process output boxes
var x = (uint)Math.Max(box.Dimensions.X, 0);
var y = (uint)Math.Max(box.Dimensions.Y, 0);
var width = (uint)Math.Min(originalWidth - x, box.Dimensions.Width);
var height = (uint)Math.Min(originalHeight - y, box.Dimensions.Height);
// fit to current image size
x = (uint)originalWidth * x / OnnxModelConfigurator.ImageSettings.imageWidth;
y = (uint)originalHeight * y / OnnxModelConfigurator.ImageSettings.imageHeight;
width = (uint)originalWidth * width / OnnxModelConfigurator.ImageSettings.imageWidth;
height = (uint)originalHeight * height / OnnxModelConfigurator.ImageSettings.imageHeight;
var boxColor = box.BoxColor.ToMediaColor();
var description = $"{box.Label} ({(box.Confidence * 100).ToString("0")}%)";
var objBox = new Rectangle
{
Width = width,
Height = height,
Fill = new SolidColorBrush(Colors.Transparent),
Stroke = new SolidColorBrush(boxColor),
StrokeThickness = 2.0,
Margin = new Thickness(x, y, 0, 0)
};
var objDescription = new TextBlock
{
Margin = new Thickness(x + 4, y + 4, 0, 0),
Text = description,
FontWeight = FontWeights.Bold,
Width = 126,
Height = 21,
TextAlignment = TextAlignment.Center
};
var objDescriptionBackground = new Rectangle
{
Width = 134,
Height = 29,
Fill = new SolidColorBrush(boxColor),
Margin = new Thickness(x, y, 0, 0)
};
WebCamCanvas.Children.Add(objDescriptionBackground);
WebCamCanvas.Children.Add(objDescription);
WebCamCanvas.Children.Add(objBox);
}
}
}
internal static class ColorExtensions
{
internal static System.Windows.Media.Color ToMediaColor(this System.Drawing.Color drawingColor)
{
return System.Windows.Media.Color.FromArgb(drawingColor.A, drawingColor.R, drawingColor.G, drawingColor.B);
}
}
}

View file

@ -0,0 +1,21 @@
<Project Sdk="Microsoft.NET.Sdk.WindowsDesktop">
<PropertyGroup>
<OutputType>WinExe</OutputType>
<TargetFramework>netcoreapp3.0</TargetFramework>
<UseWPF>true</UseWPF>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.ML" Version="$(MicrosoftMLVersion)" />
<PackageReference Include="Microsoft.ML.ImageAnalytics" Version="$(MicrosoftMLVersion)" />
<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="$(MicrosoftMLVersion)" />
<PackageReference Include="Microsoft.Windows.Compatibility" Version="2.1.1" />
<PackageReference Include="OpenCvSharp3-AnyCPU" Version="4.0.0.20181129" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\OnnxObjectDetection\OnnxObjectDetection.csproj" />
</ItemGroup>
</Project>

View file: binary image (unchanged, 371 KiB)

View file: binary image (unchanged, 70 KiB)

View file: binary image (unchanged, 74 KiB)

View file: binary image (unchanged, 391 KiB)

View file

@ -9,6 +9,7 @@ using Microsoft.Extensions.Logging;
using OnnxObjectDetectionE2EAPP.Infrastructure;
using OnnxObjectDetectionE2EAPP.Services;
using OnnxObjectDetectionE2EAPP.Utilities;
using OnnxObjectDetection;
namespace OnnxObjectDetectionE2EAPP.Controllers
{

View file

@ -25,13 +25,13 @@ namespace OnnxObjectDetectionE2EAPP.Infrastructure
public static ImageFormat GetImageFormat(byte[] bytes)
{
// see http://www.mikekunz.com/image_file_header.html
var bmp = Encoding.ASCII.GetBytes("BM"); // BMP
var gif = Encoding.ASCII.GetBytes("GIF"); // GIF
var png = new byte[] { 137, 80, 78, 71 }; // PNG
var tiff = new byte[] { 73, 73, 42 }; // TIFF
var tiff2 = new byte[] { 77, 77, 42 }; // TIFF
var jpeg = new byte[] { 255, 216, 255, 224 }; // jpeg
var jpeg2 = new byte[] { 255, 216, 255, 225 }; // jpeg canon
var bmp = Encoding.ASCII.GetBytes("BM"); // BMP
var gif = Encoding.ASCII.GetBytes("GIF"); // GIF
var png = new byte[] { 137, 80, 78, 71 }; // PNG
var tiff = new byte[] { 73, 73, 42 }; // TIFF
var tiff2 = new byte[] { 77, 77, 42 }; // TIFF
var jpeg = new byte[] { 255, 216, 255, 224 }; // jpeg
var jpeg2 = new byte[] { 255, 216, 255, 225 }; // jpeg canon
var jpg1 = new byte[] { 255, 216, 255, 219 };
var jpg2 = new byte[] { 255, 216, 255, 226 };

View file

@ -1,8 +1,7 @@
<Project Sdk="Microsoft.NET.Sdk.Web">
<PropertyGroup>
<TargetFramework>netcoreapp2.2</TargetFramework>
<AspNetCoreHostingModel>InProcess</AspNetCoreHostingModel>
<TargetFramework>netcoreapp3.0</TargetFramework>
<UserSecretsId>07b5f177-2aea-4405-8e78-4121f2a881ff</UserSecretsId>
</PropertyGroup>
@ -30,18 +29,18 @@
<ItemGroup>
<PackageReference Include="Microsoft.AspNetCore.App" />
<PackageReference Include="Microsoft.AspNetCore.Razor.Design" Version="2.2.0" PrivateAssets="All" />
<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="$(MicrosoftMLVersion)" />
<PackageReference Include="Microsoft.ML" Version="$(MicrosoftMLVersion)" />
<PackageReference Include="Microsoft.ML.ImageAnalytics" Version="$(MicrosoftMLVersion)" />
<PackageReference Include="Microsoft.ML.OnnxTransformer" Version="$(MicrosoftMLVersion)" />
<PackageReference Include="Microsoft.Extensions.ML" Version="$(MicrosoftMLPreviewVersion)" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\OnnxObjectDetection\OnnxObjectDetection.csproj" />
</ItemGroup>
<ItemGroup>
<None Update="ML\OnnxModel\TinyYolo2_model.onnx">
<CopyToOutputDirectory>Always</CopyToOutputDirectory>
</None>
<None Update="ML\MLNETModel\ReadMe.txt">
<CopyToOutputDirectory>Always</CopyToOutputDirectory>
</None>

View file

@ -0,0 +1,20 @@
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Hosting;
namespace OnnxObjectDetectionE2EAPP
{
public class Program
{
public static void Main(string[] args)
{
CreateHostBuilder(args).Build().Run();
}
public static IHostBuilder CreateHostBuilder(string[] args) =>
Host.CreateDefaultBuilder(args)
.ConfigureWebHostDefaults(webBuilder =>
{
webBuilder.UseStartup<Startup>();
});
}
}

View file

@ -3,7 +3,7 @@ using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Drawing2D;
using OnnxObjectDetectionE2EAPP.MLModel;
using OnnxObjectDetection;
namespace OnnxObjectDetectionE2EAPP.Services
{
@ -15,20 +15,20 @@ namespace OnnxObjectDetectionE2EAPP.Services
public class ObjectDetectionService : IObjectDetectionService
{
private readonly YoloOutputParser _parser = new YoloOutputParser();
IList<YoloBoundingBox> filteredBoxes;
private readonly PredictionEnginePool<ImageInputData, ImageObjectPrediction> model;
private readonly YoloOutputParser yoloParser = new YoloOutputParser();
private readonly PredictionEnginePool<ImageInputData, ImageObjectPrediction> predictionEngine;
public ObjectDetectionService(PredictionEnginePool<ImageInputData, ImageObjectPrediction> model)
public ObjectDetectionService(PredictionEnginePool<ImageInputData, ImageObjectPrediction> predictionEngine)
{
this.model = model;
this.predictionEngine = predictionEngine;
}
public void DetectObjectsUsingModel(ImageInputData imageInputData)
{
var probs = model.Predict(imageInputData).PredictedLabels;
IList<YoloBoundingBox> boundingBoxes = _parser.ParseOutputs(probs);
filteredBoxes = _parser.FilterBoundingBoxes(boundingBoxes, 5, .5F);
var probs = predictionEngine.Predict(imageInputData).PredictedLabels;
IList<YoloBoundingBox> boundingBoxes = yoloParser.ParseOutputs(probs);
filteredBoxes = yoloParser.FilterBoundingBoxes(boundingBoxes, 5, .5F);
}
public Image DrawBoundingBox(string imageFilePath)

View file

@ -1,15 +1,14 @@
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using OnnxObjectDetectionE2EAPP.Infrastructure;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.ML;
using OnnxObjectDetectionE2EAPP.Infrastructure;
using OnnxObjectDetectionE2EAPP.Services;
using System.IO;
using OnnxObjectDetectionE2EAPP.Utilities;
using OnnxObjectDetectionE2EAPP.MLModel;
using OnnxObjectDetection;
namespace OnnxObjectDetectionE2EAPP
{
@ -21,8 +20,8 @@ namespace OnnxObjectDetectionE2EAPP
{
Configuration = configuration;
_onnxModelFilePath = GetAbsolutePath(Configuration["MLModel:OnnxModelFilePath"]);
_mlnetModelFilePath = GetAbsolutePath(Configuration["MLModel:MLNETModelFilePath"]);
_onnxModelFilePath = CommonHelpers.GetAbsolutePath(Configuration["MLModel:OnnxModelFilePath"]);
_mlnetModelFilePath = CommonHelpers.GetAbsolutePath(Configuration["MLModel:MLNETModelFilePath"]);
OnnxModelConfigurator onnxModelConfigurator = new OnnxModelConfigurator(_onnxModelFilePath);
@ -40,8 +39,9 @@ namespace OnnxObjectDetectionE2EAPP
options.CheckConsentNeeded = context => true;
options.MinimumSameSitePolicy = SameSiteMode.None;
});
services.AddMvc().SetCompatibilityVersion(CompatibilityVersion.Version_2_2);
services.AddControllers();
services.AddRazorPages();
services.AddPredictionEnginePool<ImageInputData, ImageObjectPrediction>().
FromFile(_mlnetModelFilePath);
@ -51,9 +51,9 @@ namespace OnnxObjectDetectionE2EAPP
}
// This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
public void Configure(IApplicationBuilder app, IHostingEnvironment env)
public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
if (env.IsDevelopment())
if (env.EnvironmentName == Environments.Development)
{
app.UseDeveloperExceptionPage();
}
@ -65,16 +65,11 @@ namespace OnnxObjectDetectionE2EAPP
app.UseStaticFiles();
app.UseCookiePolicy();
app.UseMvc();
app.UseRouting();
app.UseEndpoints(endpoints => {
endpoints.MapControllers();
endpoints.MapRazorPages();
});
}
public static string GetAbsolutePath(string relativePath)
{
FileInfo _dataRoot = new FileInfo(typeof(Program).Assembly.Location);
string assemblyFolderPath = _dataRoot.Directory.FullName;
string fullPath = Path.Combine(assemblyFolderPath, relativePath);
return fullPath;
}
}
}

View file: binary image (unchanged, 371 KiB)

View file: binary image (unchanged, 70 KiB)

View file: binary image (unchanged, 74 KiB)

View file: binary image (unchanged, 391 KiB)

View file: binary image (unchanged, 31 KiB)

View file

@ -0,0 +1,199 @@
# Object Detection - ASP.NET Core Web & WPF Desktop Sample
| ML.NET version | API type | Status | App Type | Data type | Scenario | ML Task | Algorithms |
|----------------|-------------|------------|-------------|-------------|------------------|---------------|------------------------|
| v1.3.1 | Dynamic API | Up-to-date | End-End app | image files | Object Detection | Deep Learning | Tiny YOLOv2 ONNX model |
## Problem
Object detection is one of the classic problems in computer vision: Recognize what objects are inside a given image and also where they are in the image. For these cases, you can either use pre-trained models or train your own model to classify images specific to your custom domain. In this sample, we'll use a pre-trained model.
## How the sample works
This sample consists of two separate apps:
- An ASP.NET Core Web App that allows the user to upload or select an image. The Web app then runs the image through an object detection model using ML.NET, and paints bounding boxes with labels indicating the objects detected.
- A WPF Core desktop app that renders a live-stream of the device's web cam, runs the video frames through an object detection model using ML.NET, and paints bounding boxes with labels indicating the objects detected in real-time.
The Web app shows the images listed on the right, and each image can be selected for processing. Once the image is processed, it is drawn in the middle of the screen with labeled bounding boxes around each detected object, as shown below.
![Animated image showing object detection web sample](./docs/Screenshots/ObjectDetection.gif)
Alternatively you can try uploading your own images as shown below.
![Animated image showing object detection web sample](./docs/Screenshots/FileUpload.gif)
## Pre-trained model
There are multiple pre-trained models for identifying multiple objects in the images. Both the **WPF app** and the **Web app** use the pre-trained model, **Tiny YOLOv2** in [**ONNX**](http://onnx.ai/) format. This model is a real-time neural network for object detection that detects 20 different classes. It is made up of 9 convolutional layers and 6 max-pooling layers and is a smaller version of the more complex full [YOLOv2](https://pjreddie.com/darknet/yolov2/) network.
The Open Neural Network eXchange, i.e. [ONNX](http://onnx.ai/), is an open format to represent deep learning models. With ONNX, developers can move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners, including Microsoft.
The model is downloaded from the [ONNX Model Zoo](https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny_yolov2), which is a collection of pre-trained, state-of-the-art models in the ONNX format.
The Tiny YOLOv2 model was trained on the [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) dataset. Below are the model's prerequisites.
### Model input and output
- **Input:** An image of the shape (3x416x416)
- **Output:** A (1x125x13x13) array
### Pre-processing steps
Resize the input image to a (3x416x416) array of type `float32`.
### Post-processing steps
The output is a (125x13x13) tensor where 13x13 is the number of grid cells that the image gets divided into. Each grid cell corresponds to 125 channels, made up of the 5 bounding boxes predicted by the grid cell and the 25 data elements that describe each bounding box (5x25=125). For more information on how to derive the final bounding boxes and their corresponding confidence scores, refer to this [post](http://machinethink.net/blog/object-detection-with-yolo/).
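To make the offset arithmetic concrete, below is a condensed sketch of how this sample's `YoloOutputParser` indexes the flat prediction array (the constant names match the parser; the full decoding, anchor boxes, and filtering are omitted):
```csharp
// The 21,125 floats are laid out channel-major over the 125x13x13 grid:
// all 13x13 values for channel 0 come first, then channel 1, and so on.
const int ROW_COUNT = 13;       // grid rows
const int COL_COUNT = 13;       // grid columns
const int CHANNEL_COUNT = 125;  // 5 boxes x 25 values per cell

int channelStride = ROW_COUNT * COL_COUNT;

// Offset of grid cell (x, y) in a given channel within the flat output.
int GetOffset(int x, int y, int channel) =>
    (channel * channelStride) + (y * COL_COUNT) + x;
```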
## Solution
**The projects in this solution use .NET Core 3.0. In order to run this sample, you must install the .NET Core SDK 3.0. To do this either:**
1. Manually install the SDK by going to [.NET Core 3.0 download page](https://aka.ms/netcore3download) and download the latest **.NET Core Installer** in the **SDK** column.
2. Or, if you're using Visual Studio 2019, go to: _**Tools > Options > Environment > Preview Features**_ and check the box next to: _**Use previews of the .NET Core SDK**_
### The solution contains three projects
- [**OnnxObjectDetection**](./OnnxObjectDetection) is a .NET Standard library used by both the WPF app and the Web app. It contains most of the logic for running images through the model and parsing the resulting prediction. This project also contains the Onnx model file. With the exception of drawing the labeled bounding boxes on the image/screen, all of the following code snippets are contained in this project.
- [**OnnxObjectDetectionWeb**](./OnnxObjectDetectionWeb) contains an ASP.NET Core Web App that contains both **Razor UI pages** and an **API controller** to process and render images.
- [**OnnxObjectDetectionApp**](./OnnxObjectDetectionApp) contains a .NET Core WPF desktop app that uses [OpenCvSharp](https://github.com/shimat/opencvsharp) to capture the video from the device's webcam.
## Code Walkthrough
_This sample differs from the [getting-started object detection sample](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx) in that here we load/process the images **in-memory** whereas the getting-started sample loads the images from a **file**._
Create a class that defines the data schema to use while loading data into an `IDataView`. ML.NET supports the `Bitmap` type for images, so we'll specify a `Bitmap` property decorated with the `ImageTypeAttribute`, as shown below.
```csharp
public class ImageInputData
{
[ImageType(416, 416)]
public Bitmap Image { get; set; }
}
```
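For example, the WPF app wraps each captured webcam frame in this class before making a prediction; here is a minimal sketch (the file path is just a placeholder for the in-memory `Bitmap` the app actually builds from a video frame):
```csharp
using System.Drawing;

// "frame.jpg" is a placeholder; in the WPF app the Bitmap comes from
// a webcam frame held in memory rather than from a file on disk.
var input = new ImageInputData { Image = new Bitmap("frame.jpg") };
```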
### ML.NET: Configure the model
The first step is to create an empty DataView, since we just need the schema of the data while configuring the model.
```csharp
var dataView = _mlContext.Data.LoadFromEnumerable(new List<ImageInputData>());
```
The second step is to define the estimator pipeline. Usually when dealing with deep neural networks, you must adapt the images to the format expected by the network. For this reason, the code below resizes and transforms the images (pixel values are normalized across all R,G,B channels).
```csharp
var pipeline = _mlContext.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill, outputColumnName: "image", imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight, inputColumnName: nameof(ImageInputData.Image))
.Append(_mlContext.Transforms.ExtractPixels(outputColumnName: "image"))
.Append(_mlContext.Transforms.ApplyOnnxModel(modelFile: onnxModelFilePath, outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput }, inputColumnNames: new[] { TinyYoloModelSettings.ModelInput }));
```
You also need to inspect the neural network to get the **names** of the **input** and **output** nodes, which are used later when we define the estimation pipeline. To do this, you can use tools like [Netron](https://github.com/lutzroeder/netron), a GUI visualizer for neural networks, deep learning, and machine learning models.
Below is an example of what we'd see upon opening this sample's Tiny YOLOv2 model with Netron:
![Output from inspecting the Tiny YOLOv2 model with Netron](./docs/Netron/netron.PNG)
From the Netron output above, we can see that our Tiny YOLOv2 network's input tensor is named **'image'** and its output is named **'grid.'**
We'll use these to define the **input** and **output** parameters of the Tiny YOLOv2 Onnx Model.
```csharp
public struct TinyYoloModelSettings
{
// To check Tiny YOLOv2 Model input and output parameter names,
// you can use tools like Netron: https://github.com/lutzroeder/netron
// Input tensor name
public const string ModelInput = "image";
// Output tensor name
public const string ModelOutput = "grid";
}
```
Create the model by fitting the DataView.
```csharp
var model = pipeline.Fit(dataView);
```
## Load model and create PredictionEngine
After the model is configured, we need to save the model, load the saved model, create a `PredictionEngine`, and then pass the image to the engine to detect objects using the model.
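Saving the configured model is a single call on `MLContext`; below is a condensed sketch of the `SaveMLNetModel` method in `OnnxModelConfigurator` (here `dataView` is assumed to be the empty, schema-only view created earlier):
```csharp
public void SaveMLNetModel(string mlnetModelFilePath)
{
    // Persist the fitted model to a .zip file so it can be loaded later,
    // e.g. by the Web app's PredictionEnginePool.
    _mlContext.Model.Save(_mlModel, dataView.Schema, mlnetModelFilePath);
}
```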
This is one place that the **Web** app and the **WPF** app differ slightly.
The **Web** app uses a `PredictionEnginePool` to efficiently manage and provide the service with a `PredictionEngine` to use for making predictions. Internally, it is optimized so that the object dependencies are cached and shared across HTTP requests with minimal overhead when creating those objects.
```csharp
public ObjectDetectionService(PredictionEnginePool<ImageInputData, ImageObjectPrediction> predictionEngine)
{
this.predictionEngine = predictionEngine;
}
```
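The pool itself is registered in the Web app's `Startup.ConfigureServices`, loading the ML.NET model that was previously saved to disk:
```csharp
services.AddPredictionEnginePool<ImageInputData, ImageObjectPrediction>()
    .FromFile(_mlnetModelFilePath);
```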
The **WPF** desktop app, on the other hand, creates a single `PredictionEngine` and caches it locally to be used for each frame prediction. The key point to note is that the calling code that instantiates the `PredictionEngine` is responsible for handling the caching (in contrast to the `PredictionEnginePool`).
```csharp
public PredictionEngine<ImageInputData, ImageObjectPrediction> GetMlNetPredictionEngine()
{
return _mlContext.Model.CreatePredictionEngine<ImageInputData, ImageObjectPrediction>(_mlModel);
}
```
## Detect objects in the image
When obtaining the prediction, we get a `float` array of size **21125** in the `PredictedLabels` property. This is the 125x13x13 output of the model discussed earlier. We then use the `YoloOutputParser` class to interpret and return a number of bounding boxes for each image. Again, these boxes are filtered so that we retrieve only 5 with high confidence.
```csharp
var labels = predictionEngine.Predict(imageInputData).PredictedLabels;
var boundingBoxes = yoloParser.ParseOutputs(labels);
var filteredBoxes = yoloParser.FilterBoundingBoxes(boundingBoxes, 5, 0.5f);
```
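Each `YoloBoundingBox` that survives filtering exposes the label, confidence, and pixel dimensions that both apps use to draw their overlays. For example (a small sketch that writes to the console instead of a UI):
```csharp
foreach (var box in filteredBoxes)
{
    // Label, confidence, and dimensions come from the parsed model output.
    Console.WriteLine($"{box.Label} ({box.Confidence * 100:0}%) at " +
        $"({box.Dimensions.X:0}, {box.Dimensions.Y:0}), " +
        $"size {box.Dimensions.Width:0}x{box.Dimensions.Height:0}");
}
```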
## Draw bounding boxes around detected objects in Image
The final step is to draw the bounding boxes around the objects.
The **Web** app draws the boxes directly onto the image using the Paint API and returns the image to display it in the browser.
```csharp
var img = _objectDetectionService.DrawBoundingBox(imageFilePath);
using (MemoryStream m = new MemoryStream())
{
img.Save(m, img.RawFormat);
byte[] imageBytes = m.ToArray();
// Convert byte[] to Base64 String
base64String = Convert.ToBase64String(imageBytes);
var result = new Result { imageString = base64String };
return result;
}
```
Alternatively, the **WPF** app draws the bounding boxes on a [`Canvas`](https://docs.microsoft.com/en-us/dotnet/api/system.windows.controls.canvas?view=netcore-3.0) element that overlaps the streaming video playback.
```csharp
DrawOverlays(filteredBoxes, WebCamImage.ActualHeight, WebCamImage.ActualWidth);
WebCamCanvas.Children.Clear();
foreach (var box in filteredBoxes)
{
var objBox = new Rectangle {/* ... */ };
var objDescription = new TextBlock {/* ... */ };
var objDescriptionBackground = new Rectangle {/* ... */ };
WebCamCanvas.Children.Add(objDescriptionBackground);
WebCamCanvas.Children.Add(objDescription);
WebCamCanvas.Children.Add(objBox);
}
```
## Note on accuracy
Tiny YOLOv2 is significantly less accurate than the full YOLOv2 model, but the tiny version is sufficient for this sample app.

View file: binary image (unchanged, 17 KiB)

View file: binary image (unchanged, 1.9 MiB)

View file: binary image (unchanged, 965 KiB)

View file: binary image (unchanged, 584 KiB)

View file: binary image (unchanged, 439 KiB)