E2E Sample - Live Stream Video Object Detection with Onnx in WPF Desktop App (#602)
* Added UWP project with simple camera support
* added simple rectangle overlay drawing
* Added ML.NET predictions with yolo model
* added TODO comment for Bitmap type issue
* added shared and wpf project (and it's broken)
* reference shared classes from web app
* DeepLearning_ObjectDetection_Onnx -> ObjectDetection_Onnx
* draw a square
* Rough wpf web cam code
* added toolkit
* view live webcam stream
* initial working prototype
* use same colors as web app
* only provide the model once and share across projs
* Renaming
* Use MicrosoftMLVersion in csproj files
* fix build
* one more 1.3.1 -> MicrosoftMLVersion
* cleanup performance and comments
* use .net core 3.0
* fix build def
* fix camera bug on some devices (thanks @eerhardt)
* update web project to .net core 3.0
* edits and cleanup to original README
* update README to include information on the WPF desktop app
* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md (×9, Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>)
* Update samples/csharp/end-to-end-apps/ObjectDetection-Onnx/OnnxObjectDetectionApp/MainWindow.xaml.cs (×2, Co-Authored-By: Brigit Murtaugh <brigit.murtaugh@microsoft.com>)
* readme edits and formatting. thanks @bamurtaugh!
* readme edits and fixes per @nicolehaugen review. thanks!
@@ -163,10 +163,16 @@ phases:
 - phase: ObjectDetectionE2EAPP
   queue: Hosted VS2017
   steps:
+  - task: UseDotNet@2
+    displayName: 'Use .NET Core 3.0'
+    inputs:
+      version: 3.0.x
+      includePreviewVersions: true
+      installationPath: $(Agent.ToolsDirectory)/dotnet
   - task: DotNetCoreCLI@2
     displayName: Build Object Detection E2E (Onnx Scorer)
     inputs:
-      projects: '.\samples\csharp\end-to-end-apps\DeepLearning_ObjectDetection_Onnx\OnnxObjectDetectionE2EApp.sln'
+      projects: '.\samples\csharp\end-to-end-apps\ObjectDetection-Onnx\OnnxObjectDetection.sln'

 - phase: SalesSpikeChangeDetectionE2E
   queue: Hosted VS2017
@@ -1,24 +0,0 @@
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;

namespace OnnxObjectDetectionE2EAPP
{
    public class Program
    {
        public static void Main(string[] args)
        {
            CreateWebHostBuilder(args).Build().Run();
        }

        public static IWebHostBuilder CreateWebHostBuilder(string[] args) =>
            WebHost.CreateDefaultBuilder(args)
                .UseStartup<Startup>();
    }
}
@@ -1,25 +0,0 @@

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Version 16
VisualStudioVersion = 16.0.28803.452
MinimumVisualStudioVersion = 10.0.40219.1
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "OnnxObjectDetectionE2EAPP", "OnnxObjectDetectionE2EAPP\OnnxObjectDetectionE2EAPP.csproj", "{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}"
EndProject
Global
    GlobalSection(SolutionConfigurationPlatforms) = preSolution
        Debug|Any CPU = Debug|Any CPU
        Release|Any CPU = Release|Any CPU
    EndGlobalSection
    GlobalSection(ProjectConfigurationPlatforms) = postSolution
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|Any CPU.Build.0 = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|Any CPU.ActiveCfg = Release|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|Any CPU.Build.0 = Release|Any CPU
    EndGlobalSection
    GlobalSection(SolutionProperties) = preSolution
        HideSolutionNode = FALSE
    EndGlobalSection
    GlobalSection(ExtensibilityGlobals) = postSolution
        SolutionGuid = {E4E2676A-8816-4A2F-A0F0-1E2718DAFC78}
    EndGlobalSection
EndGlobal
@@ -1,152 +0,0 @@
# Object Detection - Asp.Net cpre Web/Service Sample

| ML.NET version | API type | Status | App Type | Data type | Scenario | ML Task | Algorithms |
|----------------|-------------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------|
| v1.3.1 | Dynamic API | Up-to-date | End-End app | image files | Object Detection | Deep Learning | Tiny Yolo2 ONNX model |

## Problem
Object detection is one of the classical problems in computer vision: Recognize what objects are inside a given image and also where they are in the image. For these cases, you can either use pre-trained models or train your own model to classify images specific to your custom domain.

How the app works?

When the app runs it shows the images list on the bottom at **Sample Input Images** section.select any image to process. After the image is processed, it is shown under **Processed Images** section with the bounding boxes around detected objects as shown below.

![](./docs/Screenshots/ObjectDetection.gif)

Alternatively you can try uploading your own images as shown below.

![](./docs/Screenshots/FileUpload.gif)

## DataSet
There are two data sources: the `tsv` file and the image files. The [tsv file](./OnnxObjectDetectionE2EAPP/TestImages/tags.tsv) contains two columns: the first one is defined as `ImagePath` and the second one is the `Label` corresponding to the image. As you can observe, the file does not have a header row, and looks like this:

The images are located in the [TestImages](./OnnxObjectDetectionE2EAPP/TestImages) folder. These images have been downloaded from internet.

For example, below are urls from which the images downloaded from:

https://github.com/simo23/tinyYOLOv2/blob/master/dog.jpg

https://github.com/simo23/tinyYOLOv2/blob/master/person.jpg

## Pre-trained model
There are multiple models which are pre-trained for identifying multiple objects in the images. here we are using the pretrained model, **Tiny Yolo2** in **ONNX** format. This model is a real-time neural network for object detection that detects 20 different classes. It is made up of 9 convolutional layers and 6 max-pooling layers and is a smaller version of the more complex full [YOLOv2](https://pjreddie.com/darknet/yolov2/) network.

The Open Neural Network eXchange i.e [ONNX](http://onnx.ai/) is an open format to represent deep learning models. With ONNX, developers can move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners.

The model is downloaded from the [ONNX Model Zoo](https://github.com/onnx/models/tree/master/tiny_yolov2) which is a is a collection of pre-trained, state-of-the-art models in the ONNX format.

The Tiny YOLO2 model was trained on the [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) dataset. Below are the model's prerequisites.

**Model input and output**

**Input**

Input image of the shape (3x416x416)

**Output**

Output is a (1x125x13x13) array

**Pre-processing steps**

Resize the input image to a (3x416x416) array of type float32.

**Post-processing steps**

The output is a (125x13x13) tensor where 13x13 is the number of grid cells that the image gets divided into. Each grid cell corresponds to 125 channels, made up of the 5 bounding boxes predicted by the grid cell and the 25 data elements that describe each bounding box (5x25=125). For more information on how to derive the final bounding boxes and their corresponding confidence scores, refer to this [post](http://machinethink.net/blog/object-detection-with-yolo/).

## Solution
The sample contains Razor Webapp which contains both **Razor UI pages** and **API controller** classes to process images.

## Code Walkthrough

The difference between the [getting started object detection sample](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx) and this end-to-end sample is we load the images from **file** in getting started sample where as we load the images from **in-memory** in end-to-end sample.

Define the schema of data in a class type and refer that type while loading data into IDataView using TextLoader. Here the class type is **ImageInputData**. ML.Net supports Bitmap type for images. To load the images from in-memory you just need to specify **Bitmap** type in the class decorated with [ImageType(height, width)] attribute as shown below.

```csharp
public class ImageInputData
{
    [ImageType(416, 416)]
    public Bitmap Image { get; set; }
}
```

### ML.NET: Configure the model

The first step is to create an empty dataview as we just need schema of data while configuring up model.

```csharp
var dataView = _mlContext.Data.LoadFromEnumerable(new List<ImageNetData>());
```

The second step is to define the estimator pipeline. Usually, when dealing with deep neural networks, you must adapt the images to the format expected by the network. This is the reason images are resized and then transformed (mainly, pixel values are normalized across all R,G,B channels).

```csharp
var pipeline = _mlContext.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill, outputColumnName: "image", imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight, inputColumnName: nameof(ImageInputData.Image))
    .Append(_mlContext.Transforms.ExtractPixels(outputColumnName: "image"))
    .Append(_mlContext.Transforms.ApplyOnnxModel(modelFile: onnxModelFilePath, outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput }, inputColumnNames: new[] { TinyYoloModelSettings.ModelInput }));
```

You also need to check the neural network, and check the names of the input / output nodes. In order to inspect the model, you can use tools like [Netron](https://github.com/lutzroeder/netron), which is automatically installed with [Visual Studio Tools for AI](https://visualstudio.microsoft.com/downloads/ai-tools-vs/).
These names are used later in the definition of the estimation pipe: in the case of the inception network, the input tensor is named 'image' and the output is named 'grid'

Define the **input** and **output** parameters of the Tiny Yolo2 Onnx Model.

```
public struct TinyYoloModelSettings
{
    // for checking TIny yolo2 Model input and output parameter names,
    //you can use tools like Netron,
    // which is installed by Visual Studio AI Tools

    // input tensor name
    public const string ModelInput = "image";

    // output tensor name
    public const string ModelOutput = "grid";
}
```

![inspecting neural network with netron](./docs/Netron/netron.PNG)

Create the model by fitting the dataview.

```csharp
var model = pipeline.Fit(dataView);
```

# Detect objects in the image:

After the model is configured, we need to save the model, load the saved model and the pass the image to the model to detect objects.
When obtaining the prediction, we get an array of floats in the property `PredictedLabels`. The array is a float array of size **21125**. This is the output of model i,e 125x13x13 as discussed earlier. This output is interpreted by `YoloOutputParser` class and returns a number of bounding boxes for each image. Again these boxes are filtered so that we retrieve only 5 bounding boxes which have better confidence(how much certain that a box contains the obejct) for each object of the image.

```
var probs = model.Predict(imageInputData).PredictedLabels;
IList<YoloBoundingBox> boundingBoxes = _parser.ParseOutputs(probs);
filteredBoxes = _parser.FilterBoundingBoxes(boundingBoxes, 5, .5F);
```

# Draw bounding boxes around detected objects in Image.

The final step is we draw the bounding boxes around the objects using Paint API and return the image to the browser and it is displayed on the browser

```
var img = _objectDetectionService.DrawBoundingBox(imageFilePath);

using (MemoryStream m = new MemoryStream())
{
    img.Save(m, img.RawFormat);
    byte[] imageBytes = m.ToArray();

    // Convert byte[] to Base64 String
    base64String = Convert.ToBase64String(imageBytes);
    var result = new Result { imageString = base64String };
    return result;
}
```
**Note** The Tiny Yolo2 model is not having much accuracy compare to full YOLO2 model. As this is a sample program we are using Tiny version of Yolo model i.e Tiny_Yolo2
@@ -0,0 +1,93 @@

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Version 16
VisualStudioVersion = 16.0.28803.452
MinimumVisualStudioVersion = 10.0.40219.1
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "OnnxObjectDetectionWeb", "OnnxObjectDetectionWeb\OnnxObjectDetectionWeb.csproj", "{4A91FD3C-80FC-40E9-9A0B-0F832B313C24}"
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "OnnxObjectDetection", "OnnxObjectDetection\OnnxObjectDetection.csproj", "{7B159949-6D64-41B2-A30F-1952FA8EBA3E}"
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "OnnxObjectDetectionApp", "OnnxObjectDetectionApp\OnnxObjectDetectionApp.csproj", "{30411590-5517-4E40-8AC6-88E916B66B09}"
EndProject
Global
    GlobalSection(SolutionConfigurationPlatforms) = preSolution
        Debug|Any CPU = Debug|Any CPU
        Debug|ARM = Debug|ARM
        Debug|ARM64 = Debug|ARM64
        Debug|x64 = Debug|x64
        Debug|x86 = Debug|x86
        Release|Any CPU = Release|Any CPU
        Release|ARM = Release|ARM
        Release|ARM64 = Release|ARM64
        Release|x64 = Release|x64
        Release|x86 = Release|x86
    EndGlobalSection
    GlobalSection(ProjectConfigurationPlatforms) = postSolution
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|Any CPU.Build.0 = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|ARM.ActiveCfg = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|ARM.Build.0 = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|ARM64.ActiveCfg = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|ARM64.Build.0 = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|x64.ActiveCfg = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|x64.Build.0 = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|x86.ActiveCfg = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Debug|x86.Build.0 = Debug|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|Any CPU.ActiveCfg = Release|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|Any CPU.Build.0 = Release|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|ARM.ActiveCfg = Release|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|ARM.Build.0 = Release|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|ARM64.ActiveCfg = Release|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|ARM64.Build.0 = Release|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|x64.ActiveCfg = Release|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|x64.Build.0 = Release|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|x86.ActiveCfg = Release|Any CPU
        {4A91FD3C-80FC-40E9-9A0B-0F832B313C24}.Release|x86.Build.0 = Release|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|Any CPU.Build.0 = Debug|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|ARM.ActiveCfg = Debug|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|ARM.Build.0 = Debug|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|ARM64.ActiveCfg = Debug|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|ARM64.Build.0 = Debug|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|x64.ActiveCfg = Debug|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|x64.Build.0 = Debug|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|x86.ActiveCfg = Debug|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Debug|x86.Build.0 = Debug|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|Any CPU.ActiveCfg = Release|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|Any CPU.Build.0 = Release|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|ARM.ActiveCfg = Release|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|ARM.Build.0 = Release|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|ARM64.ActiveCfg = Release|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|ARM64.Build.0 = Release|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|x64.ActiveCfg = Release|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|x64.Build.0 = Release|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|x86.ActiveCfg = Release|Any CPU
        {7B159949-6D64-41B2-A30F-1952FA8EBA3E}.Release|x86.Build.0 = Release|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Debug|Any CPU.Build.0 = Debug|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Debug|ARM.ActiveCfg = Debug|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Debug|ARM.Build.0 = Debug|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Debug|ARM64.ActiveCfg = Debug|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Debug|ARM64.Build.0 = Debug|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Debug|x64.ActiveCfg = Debug|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Debug|x64.Build.0 = Debug|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Debug|x86.ActiveCfg = Debug|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Debug|x86.Build.0 = Debug|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Release|Any CPU.ActiveCfg = Release|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Release|Any CPU.Build.0 = Release|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Release|ARM.ActiveCfg = Release|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Release|ARM.Build.0 = Release|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Release|ARM64.ActiveCfg = Release|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Release|ARM64.Build.0 = Release|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Release|x64.ActiveCfg = Release|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Release|x64.Build.0 = Release|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Release|x86.ActiveCfg = Release|Any CPU
        {30411590-5517-4E40-8AC6-88E916B66B09}.Release|x86.Build.0 = Release|Any CPU
    EndGlobalSection
    GlobalSection(SolutionProperties) = preSolution
        HideSolutionNode = FALSE
    EndGlobalSection
    GlobalSection(ExtensibilityGlobals) = postSolution
        SolutionGuid = {E4E2676A-8816-4A2F-A0F0-1E2718DAFC78}
    EndGlobalSection
EndGlobal
@@ -1,7 +1,7 @@
 using Microsoft.ML.Transforms.Image;
 using System.Drawing;

-namespace OnnxObjectDetectionE2EAPP
+namespace OnnxObjectDetection
 {
     public class ImageInputData
     {
@@ -1,6 +1,6 @@
 using Microsoft.ML.Data;

-namespace OnnxObjectDetectionE2EAPP
+namespace OnnxObjectDetection
 {
     public class ImageObjectPrediction
     {
@@ -1,12 +1,9 @@
-using System;
-using System.Collections.Generic;
-using System.IO;
-using System.Linq;
-using Microsoft.ML;
+using Microsoft.ML;
 using Microsoft.ML.Transforms.Image;
-using OnnxObjectDetectionE2EAPP.Utilities;
+using System.Collections.Generic;
+using System.Linq;

-namespace OnnxObjectDetectionE2EAPP.MLModel
+namespace OnnxObjectDetection
 {
     public class OnnxModelConfigurator
     {
@@ -16,7 +13,8 @@ namespace OnnxObjectDetectionE2EAPP.MLModel
         public OnnxModelConfigurator(string onnxModelFilePath)
         {
             _mlContext = new MLContext();
-            // Model creation and pipeline definition for images needs to run just once, so calling it from the constructor:
+            // Model creation and pipeline definition for images needs to run just once,
+            // so calling it from the constructor:
             _mlModel = SetupMlNetModel(onnxModelFilePath);
         }

@@ -28,14 +26,13 @@ namespace OnnxObjectDetectionE2EAPP.MLModel

         public struct TinyYoloModelSettings
         {
-            // for checking TIny yolo2 Model input and output parameter names,
-            //you can use tools like Netron,
-            // which is installed by Visual Studio AI Tools
+            // To check Tiny Yolo2 Model input and output parameter names,
+            // you can use tools like Netron: https://github.com/lutzroeder/netron

-            // input tensor name
+            // Input tensor name
             public const string ModelInput = "image";

-            // output tensor name
+            // Output tensor name
             public const string ModelOutput = "grid";
         }

@@ -52,6 +49,11 @@ namespace OnnxObjectDetectionE2EAPP.MLModel
             return mlNetModel;
         }

+        public PredictionEngine<ImageInputData, ImageObjectPrediction> GetMlNetPredictionEngine()
+        {
+            return _mlContext.Model.CreatePredictionEngine<ImageInputData, ImageObjectPrediction>(_mlModel);
+        }
+
         public void SaveMLNetModel(string mlnetModelFilePath)
         {
             // Save/persist the model to a .ZIP file to be loaded by the PredictionEnginePool
@@ -59,4 +61,3 @@ namespace OnnxObjectDetectionE2EAPP.MLModel
         }
     }
 }
@@ -0,0 +1,19 @@
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>netstandard2.1</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.ML" Version="$(MicrosoftMLVersion)" />
    <PackageReference Include="Microsoft.ML.ImageAnalytics" Version="$(MicrosoftMLVersion)" />
    <PackageReference Include="Microsoft.ML.OnnxTransformer" Version="$(MicrosoftMLVersion)" />
  </ItemGroup>

  <ItemGroup>
    <None Update="ML\OnnxModel\TinyYolo2_model.onnx">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    </None>
  </ItemGroup>

</Project>
@@ -1,4 +1,4 @@
-namespace OnnxObjectDetectionE2EAPP
+namespace OnnxObjectDetection
 {
     public class DimensionsBase
     {
@@ -1,6 +1,6 @@
 using System.Drawing;

-namespace OnnxObjectDetectionE2EAPP
+namespace OnnxObjectDetection
 {
     public class BoundingBoxDimensions : DimensionsBase { }

@@ -3,21 +3,37 @@ using System.Collections.Generic;
 using System.Drawing;
 using System.Linq;

-namespace OnnxObjectDetectionE2EAPP
+namespace OnnxObjectDetection
 {
-    class YoloOutputParser
+    public class YoloOutputParser
     {
         class CellDimensions : DimensionsBase { }

+        // The number of rows in the grid the image is divided into.
         public const int ROW_COUNT = 13;
+
+        // The number of columns in the grid the image is divided into.
         public const int COL_COUNT = 13;
+
+        // The total number of values contained in one cell of the grid.
         public const int CHANNEL_COUNT = 125;
+
+        // The number of bounding boxes in a cell.
         public const int BOXES_PER_CELL = 5;
+
+        // The number of features contained within a box (x,y,height,width,confidence).
         public const int BOX_INFO_FEATURE_COUNT = 5;
+
+        // The number of class predictions contained in each bounding box.
         public const int CLASS_COUNT = 20;
+
+        // The width of one cell in the image grid.
         public const float CELL_WIDTH = 32;
+
+        // The height of one cell in the image grid.
         public const float CELL_HEIGHT = 32;
+
+        // The starting position of the current cell in the grid.
         private int channelStride = ROW_COUNT * COL_COUNT;

         private float[] anchors = new float[]
@@ -58,12 +74,14 @@ namespace OnnxObjectDetectionE2EAPP
             Color.DarkTurquoise
         };

+        // Applies the sigmoid function that outputs a number between 0 and 1.
         private float Sigmoid(float value)
         {
             var k = (float)Math.Exp(value);
             return k / (1.0f + k);
         }

+        // Normalizes an input vector into a probability distribution.
         private float[] Softmax(float[] values)
         {
             var maxVal = values.Max();
@@ -73,6 +91,7 @@ namespace OnnxObjectDetectionE2EAPP
             return exp.Select(v => (float)(v / sumExp)).ToArray();
         }

+        // Maps elements in the one-dimensional model output to the corresponding position in a 125 x 13 x 13 tensor.
         private int GetOffset(int x, int y, int channel)
         {
             // YOLO outputs a tensor that has a shape of 125x13x13, which
@@ -82,6 +101,7 @@ namespace OnnxObjectDetectionE2EAPP
             return (channel * this.channelStride) + (y * COL_COUNT) + x;
         }

+        // Extracts the bounding box dimensions using the GetOffset method from the model output.
         private BoundingBoxDimensions ExtractBoundingBoxDimensions(float[] modelOutput, int x, int y, int channel)
         {
             return new BoundingBoxDimensions
@@ -93,11 +113,14 @@ namespace OnnxObjectDetectionE2EAPP
             };
         }

+        // Extracts the confidence value which states how sure the model is that it has detected an object
+        // and uses the Sigmoid function to turn it into a percentage.
         private float GetConfidence(float[] modelOutput, int x, int y, int channel)
         {
             return Sigmoid(modelOutput[GetOffset(x, y, channel + 4)]);
         }

+        // Uses the bounding box dimensions and maps them onto its respective cell within the image.
         private CellDimensions MapBoundingBoxToCell(int x, int y, int box, BoundingBoxDimensions boxDimensions)
         {
             return new CellDimensions
@@ -109,6 +132,8 @@ namespace OnnxObjectDetectionE2EAPP
             };
         }

+        // Extracts the class predictions for the bounding box from the model output using the GetOffset
+        // method and turns them into a probability distribution using the Softmax method.
         public float[] ExtractClasses(float[] modelOutput, int x, int y, int channel)
         {
             float[] predictedClasses = new float[CLASS_COUNT];
@@ -120,6 +145,7 @@ namespace OnnxObjectDetectionE2EAPP
             return Softmax(predictedClasses);
         }

+        // Selects the class from the list of predicted classes with the highest probability.
         private ValueTuple<int, float> GetTopResult(float[] predictedClasses)
         {
             return predictedClasses
@@ -128,6 +154,7 @@ namespace OnnxObjectDetectionE2EAPP
                 .First();
         }

+        // Filters overlapping bounding boxes with lower probabilities.
         private float IntersectionOverUnion(RectangleF boundingBoxA, RectangleF boundingBoxB)
         {
             var areaA = boundingBoxA.Width * boundingBoxA.Height;
@@ -0,0 +1,9 @@
<Application x:Class="OnnxObjectDetectionApp.App"
             xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
             xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
             xmlns:local="clr-namespace:OnnxObjectDetectionApp"
             StartupUri="MainWindow.xaml">
    <Application.Resources>

    </Application.Resources>
</Application>
@@ -0,0 +1,11 @@
using System.Windows;

namespace OnnxObjectDetectionApp
{
    /// <summary>
    /// Interaction logic for App.xaml
    /// </summary>
    public partial class App : Application
    {
    }
}
@@ -0,0 +1,16 @@
<Window x:Class="OnnxObjectDetectionApp.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        xmlns:local="clr-namespace:OnnxObjectDetectionApp"
        mc:Ignorable="d"
        Title="ML.NET Object Detection (Onnx)" Height="506" Width="640">
    <Grid Background="Black">
        <Grid.RowDefinitions>
            <RowDefinition Height="*" />
        </Grid.RowDefinitions>
        <Image x:Name="WebCamImage" Grid.Row="0" />
        <Canvas x:Name="WebCamCanvas" Grid.Row="0" Width="{Binding Path=ActualWidth, ElementName=WebCamImage}"/>
    </Grid>
</Window>
@@ -0,0 +1,188 @@
using Microsoft.ML;
using OnnxObjectDetection;
using OpenCvSharp;
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Diagnostics;
using System.Drawing;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Media;
using System.Windows.Media.Imaging;
using Rectangle = System.Windows.Shapes.Rectangle;

namespace OnnxObjectDetectionApp
{
    public partial class MainWindow : System.Windows.Window
    {
        private VideoCapture capture;
        private CancellationTokenSource cameraCaptureCancellationTokenSource;

        private readonly YoloOutputParser yoloParser = new YoloOutputParser();
        private PredictionEngine<ImageInputData, ImageObjectPrediction> predictionEngine;

        public MainWindow()
        {
            InitializeComponent();
            LoadModel();
        }

        protected override void OnActivated(EventArgs e)
        {
            base.OnActivated(e);
            StartCameraCapture();
        }

        protected override void OnDeactivated(EventArgs e)
        {
            base.OnDeactivated(e);
            StopCameraCapture();
        }

        private void LoadModel()
        {
            var onnxModel = "TinyYolo2_model.onnx";
            var modelDirectory = Path.Combine(Environment.CurrentDirectory, @"ML\OnnxModel");
            var onnxPath = Path.Combine(modelDirectory, onnxModel);

            var onnxModelConfigurator = new OnnxModelConfigurator(onnxPath);
            predictionEngine = onnxModelConfigurator.GetMlNetPredictionEngine();
        }

        private void StartCameraCapture()
        {
            cameraCaptureCancellationTokenSource = new CancellationTokenSource();
            Task.Run(() => CaptureCamera(cameraCaptureCancellationTokenSource.Token), cameraCaptureCancellationTokenSource.Token);
        }

        private void StopCameraCapture()
        {
            cameraCaptureCancellationTokenSource?.Cancel();
        }

        private async Task CaptureCamera(CancellationToken token)
        {
            if (capture == null)
                capture = new VideoCapture(CaptureDevice.DShow);

            capture.Open(0);

            if (capture.IsOpened())
            {
                while (!token.IsCancellationRequested)
                {
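                    // Grab the next frame, mirror it around the vertical axis (FlipMode.Y)
                    // so the preview behaves like a mirror, and buffer it in memory.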
                    using MemoryStream memoryStream = capture.RetrieveMat().Flip(FlipMode.Y).ToMemoryStream();

                    await Application.Current.Dispatcher.InvokeAsync(() =>
                    {
                        var imageSource = new BitmapImage();

                        imageSource.BeginInit();
                        imageSource.CacheOption = BitmapCacheOption.OnLoad;
                        imageSource.StreamSource = memoryStream;
                        imageSource.EndInit();

                        WebCamImage.Source = imageSource;
                    });
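                    // Reuse the same in-memory frame to build a System.Drawing.Bitmap
                    // for the ML.NET prediction pass.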
                    var bitmapImage = new Bitmap(memoryStream);

                    await ParseWebCamFrame(bitmapImage);
                }

                capture.Release();
            }
        }

        async Task ParseWebCamFrame(Bitmap bitmap)
        {
            if (predictionEngine == null)
                return;

            var frame = new ImageInputData { Image = bitmap };
            var filteredBoxes = DetectObjectsUsingModel(frame);

            await Application.Current.Dispatcher.InvokeAsync(() =>
            {
                DrawOverlays(filteredBoxes, (int)WebCamImage.ActualHeight, (int)WebCamImage.ActualWidth);
            });
        }

        public IList<YoloBoundingBox> DetectObjectsUsingModel(ImageInputData imageInputData)
        {
            var labels = predictionEngine.Predict(imageInputData).PredictedLabels;
            var boundingBoxes = yoloParser.ParseOutputs(labels);
            var filteredBoxes = yoloParser.FilterBoundingBoxes(boundingBoxes, 5, 0.5f);

            return filteredBoxes;
        }

        private void DrawOverlays(IList<YoloBoundingBox> filteredBoxes, int originalHeight, int originalWidth)
        {
            WebCamCanvas.Children.Clear();

            foreach (var box in filteredBoxes)
            {
                // process output boxes
                var x = (uint)Math.Max(box.Dimensions.X, 0);
                var y = (uint)Math.Max(box.Dimensions.Y, 0);
                var width = (uint)Math.Min(originalWidth - x, box.Dimensions.Width);
                var height = (uint)Math.Min(originalHeight - y, box.Dimensions.Height);

                // fit to current image size
                x = (uint)originalWidth * x / OnnxModelConfigurator.ImageSettings.imageWidth;
                y = (uint)originalHeight * y / OnnxModelConfigurator.ImageSettings.imageHeight;
                width = (uint)originalWidth * width / OnnxModelConfigurator.ImageSettings.imageWidth;
                height = (uint)originalHeight * height / OnnxModelConfigurator.ImageSettings.imageHeight;

                var boxColor = box.BoxColor.ToMediaColor();

                var description = $"{box.Label} ({(box.Confidence * 100).ToString("0")}%)";

                var objBox = new Rectangle
                {
                    Width = width,
                    Height = height,
                    Fill = new SolidColorBrush(Colors.Transparent),
                    Stroke = new SolidColorBrush(boxColor),
                    StrokeThickness = 2.0,
                    Margin = new Thickness(x, y, 0, 0)
                };

                var objDescription = new TextBlock
                {
                    Margin = new Thickness(x + 4, y + 4, 0, 0),
                    Text = description,
                    FontWeight = FontWeights.Bold,
                    Width = 126,
                    Height = 21,
                    TextAlignment = TextAlignment.Center
                };

                var objDescriptionBackground = new Rectangle
                {
                    Width = 134,
                    Height = 29,
                    Fill = new SolidColorBrush(boxColor),
                    Margin = new Thickness(x, y, 0, 0)
                };

                WebCamCanvas.Children.Add(objDescriptionBackground);
                WebCamCanvas.Children.Add(objDescription);
                WebCamCanvas.Children.Add(objBox);
            }
        }
    }

    internal static class ColorExtensions
    {
        internal static System.Windows.Media.Color ToMediaColor(this System.Drawing.Color drawingColor)
        {
            return System.Windows.Media.Color.FromArgb(drawingColor.A, drawingColor.R, drawingColor.G, drawingColor.B);
        }
    }
}
@@ -0,0 +1,21 @@
<Project Sdk="Microsoft.NET.Sdk.WindowsDesktop">

  <PropertyGroup>
    <OutputType>WinExe</OutputType>
    <TargetFramework>netcoreapp3.0</TargetFramework>
    <UseWPF>true</UseWPF>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.ML" Version="$(MicrosoftMLVersion)" />
    <PackageReference Include="Microsoft.ML.ImageAnalytics" Version="$(MicrosoftMLVersion)" />
    <PackageReference Include="Microsoft.ML.OnnxTransformer" Version="$(MicrosoftMLVersion)" />
    <PackageReference Include="Microsoft.Windows.Compatibility" Version="2.1.1" />
    <PackageReference Include="OpenCvSharp3-AnyCPU" Version="4.0.0.20181129" />
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="..\OnnxObjectDetection\OnnxObjectDetection.csproj" />
  </ItemGroup>

</Project>
Before Width: | Height: | Size: 143 KiB  After Width: | Height: | Size: 143 KiB
Before Width: | Height: | Size: 219 KiB  After Width: | Height: | Size: 219 KiB
Before Width: | Height: | Size: 371 KiB  After Width: | Height: | Size: 371 KiB
Before Width: | Height: | Size: 70 KiB   After Width: | Height: | Size: 70 KiB
Before Width: | Height: | Size: 74 KiB   After Width: | Height: | Size: 74 KiB
Before Width: | Height: | Size: 391 KiB  After Width: | Height: | Size: 391 KiB
@@ -9,6 +9,7 @@ using Microsoft.Extensions.Logging;
 using OnnxObjectDetectionE2EAPP.Infrastructure;
 using OnnxObjectDetectionE2EAPP.Services;
 using OnnxObjectDetectionE2EAPP.Utilities;
+using OnnxObjectDetection;

 namespace OnnxObjectDetectionE2EAPP.Controllers
 {
@@ -25,13 +25,13 @@ namespace OnnxObjectDetectionE2EAPP.Infrastructure
         public static ImageFormat GetImageFormat(byte[] bytes)
         {
             // see http://www.mikekunz.com/image_file_header.html
-            var bmp = Encoding.ASCII.GetBytes("BM"); // BMP
-            var gif = Encoding.ASCII.GetBytes("GIF"); // GIF
-            var png = new byte[] { 137, 80, 78, 71 }; // PNG
-            var tiff = new byte[] { 73, 73, 42 }; // TIFF
-            var tiff2 = new byte[] { 77, 77, 42 }; // TIFF
-            var jpeg = new byte[] { 255, 216, 255, 224 }; // jpeg
-            var jpeg2 = new byte[] { 255, 216, 255, 225 }; // jpeg canon
+            var bmp = Encoding.ASCII.GetBytes("BM");       // BMP
+            var gif = Encoding.ASCII.GetBytes("GIF");      // GIF
+            var png = new byte[] { 137, 80, 78, 71 };      // PNG
+            var tiff = new byte[] { 73, 73, 42 };          // TIFF
+            var tiff2 = new byte[] { 77, 77, 42 };         // TIFF
+            var jpeg = new byte[] { 255, 216, 255, 224 };  // jpeg
+            var jpeg2 = new byte[] { 255, 216, 255, 225 }; // jpeg canon
             var jpg1 = new byte[] { 255, 216, 255, 219 };
             var jpg2 = new byte[] { 255, 216, 255, 226 };

@@ -1,8 +1,7 @@
 <Project Sdk="Microsoft.NET.Sdk.Web">

   <PropertyGroup>
-    <TargetFramework>netcoreapp2.2</TargetFramework>
-    <AspNetCoreHostingModel>InProcess</AspNetCoreHostingModel>
+    <TargetFramework>netcoreapp3.0</TargetFramework>
     <UserSecretsId>07b5f177-2aea-4405-8e78-4121f2a881ff</UserSecretsId>
   </PropertyGroup>

@@ -30,18 +29,18 @@

   <ItemGroup>
-    <PackageReference Include="Microsoft.AspNetCore.App" />
-    <PackageReference Include="Microsoft.AspNetCore.Razor.Design" Version="2.2.0" PrivateAssets="All" />
-    <PackageReference Include="Microsoft.ML.OnnxTransformer" Version="$(MicrosoftMLVersion)" />
     <PackageReference Include="Microsoft.ML" Version="$(MicrosoftMLVersion)" />
     <PackageReference Include="Microsoft.ML.ImageAnalytics" Version="$(MicrosoftMLVersion)" />
+    <PackageReference Include="Microsoft.ML.OnnxTransformer" Version="$(MicrosoftMLVersion)" />
     <PackageReference Include="Microsoft.Extensions.ML" Version="$(MicrosoftMLPreviewVersion)" />
   </ItemGroup>

+  <ItemGroup>
+    <ProjectReference Include="..\OnnxObjectDetection\OnnxObjectDetection.csproj" />
+  </ItemGroup>
+
   <ItemGroup>
     <None Update="ML\OnnxModel\TinyYolo2_model.onnx">
       <CopyToOutputDirectory>Always</CopyToOutputDirectory>
     </None>
     <None Update="ML\MLNETModel\ReadMe.txt">
       <CopyToOutputDirectory>Always</CopyToOutputDirectory>
     </None>
@@ -0,0 +1,20 @@
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Hosting;

namespace OnnxObjectDetectionE2EAPP
{
    public class Program
    {
        public static void Main(string[] args)
        {
            CreateHostBuilder(args).Build().Run();
        }

        public static IHostBuilder CreateHostBuilder(string[] args) =>
            Host.CreateDefaultBuilder(args)
                .ConfigureWebHostDefaults(webBuilder =>
                {
                    webBuilder.UseStartup<Startup>();
                });
    }
}
@@ -3,7 +3,7 @@ using System;
 using System.Collections.Generic;
 using System.Drawing;
 using System.Drawing.Drawing2D;
-using OnnxObjectDetectionE2EAPP.MLModel;
+using OnnxObjectDetection;

 namespace OnnxObjectDetectionE2EAPP.Services
 {
@@ -15,20 +15,20 @@ namespace OnnxObjectDetectionE2EAPP.Services

     public class ObjectDetectionService : IObjectDetectionService
     {
-        private readonly YoloOutputParser _parser = new YoloOutputParser();
         IList<YoloBoundingBox> filteredBoxes;
-        private readonly PredictionEnginePool<ImageInputData, ImageObjectPrediction> model;
+        private readonly YoloOutputParser yoloParser = new YoloOutputParser();
+        private readonly PredictionEnginePool<ImageInputData, ImageObjectPrediction> predictionEngine;

-        public ObjectDetectionService(PredictionEnginePool<ImageInputData, ImageObjectPrediction> model)
+        public ObjectDetectionService(PredictionEnginePool<ImageInputData, ImageObjectPrediction> predictionEngine)
         {
-            this.model = model;
+            this.predictionEngine = predictionEngine;
         }

         public void DetectObjectsUsingModel(ImageInputData imageInputData)
         {
-            var probs = model.Predict(imageInputData).PredictedLabels;
-            IList<YoloBoundingBox> boundingBoxes = _parser.ParseOutputs(probs);
-            filteredBoxes = _parser.FilterBoundingBoxes(boundingBoxes, 5, .5F);
+            var probs = predictionEngine.Predict(imageInputData).PredictedLabels;
+            IList<YoloBoundingBox> boundingBoxes = yoloParser.ParseOutputs(probs);
+            filteredBoxes = yoloParser.FilterBoundingBoxes(boundingBoxes, 5, .5F);
         }

         public Image DrawBoundingBox(string imageFilePath)
@@ -1,15 +1,14 @@
 using Microsoft.AspNetCore.Builder;
 using Microsoft.AspNetCore.Hosting;
 using Microsoft.AspNetCore.Http;
-using Microsoft.AspNetCore.Mvc;
 using Microsoft.Extensions.Configuration;
 using Microsoft.Extensions.DependencyInjection;
-using OnnxObjectDetectionE2EAPP.Infrastructure;
+using Microsoft.Extensions.Hosting;
 using Microsoft.Extensions.ML;
+using OnnxObjectDetectionE2EAPP.Infrastructure;
 using OnnxObjectDetectionE2EAPP.Services;
-using System.IO;
 using OnnxObjectDetectionE2EAPP.Utilities;
-using OnnxObjectDetectionE2EAPP.MLModel;
+using OnnxObjectDetection;

 namespace OnnxObjectDetectionE2EAPP
 {
@@ -21,8 +20,8 @@ namespace OnnxObjectDetectionE2EAPP
         {
             Configuration = configuration;

-            _onnxModelFilePath = GetAbsolutePath(Configuration["MLModel:OnnxModelFilePath"]);
-            _mlnetModelFilePath = GetAbsolutePath(Configuration["MLModel:MLNETModelFilePath"]);
+            _onnxModelFilePath = CommonHelpers.GetAbsolutePath(Configuration["MLModel:OnnxModelFilePath"]);
+            _mlnetModelFilePath = CommonHelpers.GetAbsolutePath(Configuration["MLModel:MLNETModelFilePath"]);

             OnnxModelConfigurator onnxModelConfigurator = new OnnxModelConfigurator(_onnxModelFilePath);

@@ -40,8 +39,9 @@ namespace OnnxObjectDetectionE2EAPP
                 options.CheckConsentNeeded = context => true;
                 options.MinimumSameSitePolicy = SameSiteMode.None;
             });

-            services.AddMvc().SetCompatibilityVersion(CompatibilityVersion.Version_2_2);
+            services.AddControllers();
+            services.AddRazorPages();

             services.AddPredictionEnginePool<ImageInputData, ImageObjectPrediction>().
                 FromFile(_mlnetModelFilePath);
@@ -51,9 +51,9 @@ namespace OnnxObjectDetectionE2EAPP
         }

         // This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
-        public void Configure(IApplicationBuilder app, IHostingEnvironment env)
+        public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
         {
-            if (env.IsDevelopment())
+            if (env.EnvironmentName == Environments.Development)
             {
                 app.UseDeveloperExceptionPage();
             }
@@ -65,16 +65,11 @@ namespace OnnxObjectDetectionE2EAPP
             app.UseStaticFiles();
             app.UseCookiePolicy();

-            app.UseMvc();
+            app.UseRouting();
+            app.UseEndpoints(endpoints => {
+                endpoints.MapControllers();
+                endpoints.MapRazorPages();
+            });
         }
-
-        public static string GetAbsolutePath(string relativePath)
-        {
-            FileInfo _dataRoot = new FileInfo(typeof(Program).Assembly.Location);
-            string assemblyFolderPath = _dataRoot.Directory.FullName;
-
-            string fullPath = Path.Combine(assemblyFolderPath, relativePath);
-            return fullPath;
-        }
     }
 }
Before Width: | Height: | Size: 371 KiB  After Width: | Height: | Size: 371 KiB
Before Width: | Height: | Size: 70 KiB   After Width: | Height: | Size: 70 KiB
Before Width: | Height: | Size: 74 KiB   After Width: | Height: | Size: 74 KiB
Before Width: | Height: | Size: 391 KiB  After Width: | Height: | Size: 391 KiB
Before Width: | Height: | Size: 31 KiB   After Width: | Height: | Size: 31 KiB
@@ -0,0 +1,199 @@
# Object Detection - ASP.NET Core Web & WPF Desktop Sample

| ML.NET version | API type | Status | App Type | Data type | Scenario | ML Task | Algorithms |
|----------------|-------------|------------|-------------|-------------|------------------|---------------|------------------------|
| v1.3.1 | Dynamic API | Up-to-date | End-End app | image files | Object Detection | Deep Learning | Tiny YOLOv2 ONNX model |

## Problem

Object detection is one of the classic problems in computer vision: recognize what objects are inside a given image and also where they are in the image. For these cases, you can either use pre-trained models or train your own model to classify images specific to your custom domain. In this sample, we'll use a pre-trained model.

## How the sample works

This sample consists of two separate apps:

- An ASP.NET Core Web App that allows the user to upload or select an image. The Web app then runs the image through an object detection model using ML.NET and paints bounding boxes with labels indicating the objects detected.
- A WPF Core desktop app that renders a live stream of the device's web cam, runs the video frames through an object detection model using ML.NET, and paints bounding boxes with labels indicating the objects detected in real time.

The Web app shows the images listed on the right, and each image may be selected for processing. Once the image is processed, it is drawn in the middle of the screen with labeled bounding boxes around each detected object, as shown below.

![Animated image showing object detection web sample](./docs/Screenshots/ObjectDetection.gif)

Alternatively, you can try uploading your own images, as shown below.

![Animated image showing object detection file upload](./docs/Screenshots/FileUpload.gif)

## Pre-trained model

There are multiple pre-trained models for identifying multiple objects in images. Both the **WPF app** and the **Web app** use the pre-trained model **Tiny YOLOv2** in [**ONNX**](http://onnx.ai/) format. This model is a real-time neural network for object detection that detects 20 different classes. It is made up of 9 convolutional layers and 6 max-pooling layers and is a smaller version of the more complex full [YOLOv2](https://pjreddie.com/darknet/yolov2/) network.

The Open Neural Network eXchange, i.e. [ONNX](http://onnx.ai/), is an open format to represent deep learning models. With ONNX, developers can move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners, including Microsoft.

The model is downloaded from the [ONNX Model Zoo](https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny_yolov2), which is a collection of pre-trained, state-of-the-art models in the ONNX format.

The Tiny YOLOv2 model was trained on the [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) dataset. Below are the model's prerequisites.

### Model input and output

- **Input:** An image of shape (3x416x416)
- **Output:** A (1x125x13x13) array

### Pre-processing steps

Resize the input image to a (3x416x416) array of type `float32`.

### Post-processing steps

The output is a (125x13x13) tensor where 13x13 is the number of grid cells that the image gets divided into. Each grid cell corresponds to 125 channels, made up of the 5 bounding boxes predicted by the grid cell and the 25 data elements that describe each bounding box (5x25=125). For more information on how to derive the final bounding boxes and their corresponding confidence scores, refer to this [post](http://machinethink.net/blog/object-detection-with-yolo/).
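
To make that layout concrete, here is a small indexing sketch mirroring `YoloOutputParser.GetOffset` from this sample; the one-dimensional model output is addressed channel-first, then row, then column (the `BoxChannel` helper is illustrative and not part of the parser itself):

```csharp
// Indexing into the flattened 125x13x13 output (21,125 floats), channel-major.
const int ROW_COUNT = 13, COL_COUNT = 13;   // grid cells
const int BOX_INFO_FEATURE_COUNT = 5;       // x, y, width, height, confidence
const int CLASS_COUNT = 20;                 // Pascal VOC classes
int channelStride = ROW_COUNT * COL_COUNT;

// Offset of (channel, y, x) within the one-dimensional model output.
int GetOffset(int x, int y, int channel)
    => (channel * channelStride) + (y * COL_COUNT) + x;

// Illustrative: the 25 values for box 'b' (0..4) of a cell start at this channel.
int BoxChannel(int b) => b * (BOX_INFO_FEATURE_COUNT + CLASS_COUNT);
```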

## Solution

**The projects in this solution use .NET Core 3.0. In order to run this sample, you must install the .NET Core SDK 3.0. To do this, either:**

1. Manually install the SDK by going to the [.NET Core 3.0 download page](https://aka.ms/netcore3download) and downloading the latest **.NET Core Installer** in the **SDK** column.
2. Or, if you're using Visual Studio 2019, go to: _**Tools > Options > Environment > Preview Features**_ and check the box next to: _**Use previews of the .NET Core SDK**_.

### The solution contains three projects

- [**OnnxObjectDetection**](./OnnxObjectDetection) is a .NET Standard library used by both the WPF app and the Web app. It contains most of the logic for running images through the model and parsing the resulting predictions. This project also contains the ONNX model file. With the exception of drawing the labeled bounding boxes on the image/screen, all of the following code snippets are contained in this project.
- [**OnnxObjectDetectionWeb**](./OnnxObjectDetectionWeb) contains an ASP.NET Core Web App that contains both **Razor UI pages** and an **API controller** to process and render images.
- [**OnnxObjectDetectionApp**](./OnnxObjectDetectionApp) contains a .NET Core WPF Desktop App that uses [OpenCvSharp](https://github.com/shimat/opencvsharp) to capture the video from the device's webcam.

## Code Walkthrough

_This sample differs from the [getting-started object detection sample](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx) in that here we load/process the images **in-memory**, whereas the getting-started sample loads the images from a **file**._

Create a class that defines the data schema to use while loading data into an `IDataView`. ML.NET supports the `Bitmap` type for images, so we'll specify a `Bitmap` property decorated with the `ImageTypeAttribute`, as shown below.

```csharp
public class ImageInputData
{
    [ImageType(416, 416)]
    public Bitmap Image { get; set; }
}
```

### ML.NET: Configure the model

The first step is to create an empty DataView, since we only need the schema of the data when configuring the model.

```csharp
var dataView = _mlContext.Data.LoadFromEnumerable(new List<ImageInputData>());
```

The second step is to define the estimator pipeline. Usually when dealing with deep neural networks, you must adapt the images to the format expected by the network. For this reason, the code below resizes and transforms the images (pixel values are normalized across all R,G,B channels).

```csharp
var pipeline = _mlContext.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill, outputColumnName: "image", imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight, inputColumnName: nameof(ImageInputData.Image))
    .Append(_mlContext.Transforms.ExtractPixels(outputColumnName: "image"))
    .Append(_mlContext.Transforms.ApplyOnnxModel(modelFile: onnxModelFilePath, outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput }, inputColumnNames: new[] { TinyYoloModelSettings.ModelInput }));
```

You also need to inspect the neural network to get the **names** of its **input** and **output** nodes, which are the names referenced by the estimator pipeline above. To do this, you can use tools like [Netron](https://github.com/lutzroeder/netron), a GUI visualizer for neural networks, deep learning, and machine learning models.

Below is an example of what we'd see upon opening this sample's Tiny YOLOv2 model with Netron:

![Output from inspecting the Tiny YOLOv2 model with Netron](./docs/Netron/netron.PNG)

From the Netron output above, we can see that our Tiny YOLOv2 network's input tensor is named **'image'** and its output is named **'grid'**.

We'll use these to define the **input** and **output** parameters of the Tiny YOLOv2 ONNX model.

```csharp
public struct TinyYoloModelSettings
{
    // To check Tiny YOLOv2 model input and output parameter names,
    // you can use tools like Netron: https://github.com/lutzroeder/netron

    // Input tensor name
    public const string ModelInput = "image";

    // Output tensor name
    public const string ModelOutput = "grid";
}
```

Create the model by fitting the DataView.

```csharp
var model = pipeline.Fit(dataView);
```
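
The fitted model is then persisted so the Web app can load it later. Below is a minimal sketch of `OnnxModelConfigurator.SaveMLNetModel`, assuming the standard `MLContext.Model.Save` API (this PR's hunk shows only the method's signature and comment):

```csharp
public void SaveMLNetModel(string mlnetModelFilePath)
{
    // Save/persist the model to a .ZIP file to be loaded by the PredictionEnginePool.
    // Sketch: the input schema is rebuilt from an empty enumerable, as above.
    var schema = _mlContext.Data.LoadFromEnumerable(new List<ImageInputData>()).Schema;
    _mlContext.Model.Save(_mlModel, schema, mlnetModelFilePath);
}
```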

## Load model and create PredictionEngine

After the model is configured, we need to save the model, load the saved model, create a `PredictionEngine`, and then pass the image to the engine to detect objects using the model.
This is one place where the **Web** app and the **WPF** app differ slightly.

The **Web** app uses a `PredictionEnginePool` to efficiently manage and provide the service with a `PredictionEngine` to use for making predictions. Internally, it is optimized so that the object dependencies are cached and shared across HTTP requests with minimal overhead when creating those objects.

```csharp
public ObjectDetectionService(PredictionEnginePool<ImageInputData, ImageObjectPrediction> predictionEngine)
{
    this.predictionEngine = predictionEngine;
}
```
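
The pool itself is wired up at application startup; in this PR's `Startup.ConfigureServices`, it is loaded from the saved ML.NET model file:

```csharp
services.AddPredictionEnginePool<ImageInputData, ImageObjectPrediction>().
    FromFile(_mlnetModelFilePath);
```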

The **WPF** desktop app, by contrast, creates a single `PredictionEngine` and caches it locally to be used for each frame prediction. The key point to clarify is that the calling code that instantiates the `PredictionEngine` is responsible for handling the caching (as compared to the `PredictionEnginePool`).

```csharp
public PredictionEngine<ImageInputData, ImageObjectPrediction> GetMlNetPredictionEngine()
{
    return _mlContext.Model.CreatePredictionEngine<ImageInputData, ImageObjectPrediction>(_mlModel);
}
```
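
In the WPF app this happens once, in `MainWindow.LoadModel` (from this PR), and the engine is then reused for every frame:

```csharp
var onnxModelConfigurator = new OnnxModelConfigurator(onnxPath);
predictionEngine = onnxModelConfigurator.GetMlNetPredictionEngine();
```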

## Detect objects in the image

When obtaining the prediction, we get a `float` array of size **21125** in the `PredictedLabels` property. This is the 125x13x13 output of the model discussed earlier. We then use the `YoloOutputParser` class to interpret the output and return a number of bounding boxes for each image. Again, these boxes are filtered so that we retrieve only the top 5 with high confidence.

```csharp
var labels = predictionEngine.Predict(imageInputData).PredictedLabels;
var boundingBoxes = yoloParser.ParseOutputs(labels);
var filteredBoxes = yoloParser.FilterBoundingBoxes(boundingBoxes, 5, 0.5f);
```

## Draw bounding boxes around detected objects in the image

The final step is to draw the bounding boxes around the objects.

The **Web** app draws the boxes directly onto the image using the Paint API and returns the image to display in the browser.

```csharp
var img = _objectDetectionService.DrawBoundingBox(imageFilePath);

using (MemoryStream m = new MemoryStream())
{
    img.Save(m, img.RawFormat);
    byte[] imageBytes = m.ToArray();

    // Convert byte[] to Base64 String
    base64String = Convert.ToBase64String(imageBytes);
    var result = new Result { imageString = base64String };
    return result;
}
```

Alternatively, the **WPF** app draws the bounding boxes on a [`Canvas`](https://docs.microsoft.com/en-us/dotnet/api/system.windows.controls.canvas?view=netcore-3.0) element that overlaps the streaming video playback.

```csharp
DrawOverlays(filteredBoxes, WebCamImage.ActualHeight, WebCamImage.ActualWidth);

WebCamCanvas.Children.Clear();

foreach (var box in filteredBoxes)
{
    var objBox = new Rectangle { /* ... */ };

    var objDescription = new TextBlock { /* ... */ };

    var objDescriptionBackground = new Rectangle { /* ... */ };

    WebCamCanvas.Children.Add(objDescriptionBackground);
    WebCamCanvas.Children.Add(objDescription);
    WebCamCanvas.Children.Add(objBox);
}
```
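
Before the shapes are added to the canvas, each box is clamped to the frame and rescaled from the model's coordinate space to the rendered image size; this is the arithmetic from `MainWindow.DrawOverlays` in this PR:

```csharp
// Clamp the box to the frame...
var x = (uint)Math.Max(box.Dimensions.X, 0);
var y = (uint)Math.Max(box.Dimensions.Y, 0);
var width = (uint)Math.Min(originalWidth - x, box.Dimensions.Width);
var height = (uint)Math.Min(originalHeight - y, box.Dimensions.Height);

// ...then scale from model (416x416) coordinates to the rendered size.
x = (uint)originalWidth * x / OnnxModelConfigurator.ImageSettings.imageWidth;
y = (uint)originalHeight * y / OnnxModelConfigurator.ImageSettings.imageHeight;
width = (uint)originalWidth * width / OnnxModelConfigurator.ImageSettings.imageWidth;
height = (uint)originalHeight * height / OnnxModelConfigurator.ImageSettings.imageHeight;
```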

## Note on accuracy

Tiny YOLOv2 is significantly less accurate than the full YOLOv2 model, but the tiny version is sufficient for this sample app.
Before Width: | Height: | Size: 17 KiB   After Width: | Height: | Size: 17 KiB
Before Width: | Height: | Size: 1.9 MiB  After Width: | Height: | Size: 1.9 MiB
Before Width: | Height: | Size: 965 KiB  After Width: | Height: | Size: 965 KiB
Before Width: | Height: | Size: 584 KiB  After Width: | Height: | Size: 584 KiB
Before Width: | Height: | Size: 439 KiB  After Width: | Height: | Size: 439 KiB