{"id":3726,"date":"2017-07-31T11:17:44","date_gmt":"2017-07-31T18:17:44","guid":{"rendered":"https:\/\/www.microsoft.com\/reallifecode\/?p=3726"},"modified":"2020-03-19T14:06:01","modified_gmt":"2020-03-19T21:06:01","slug":"using-object-detection-complex-image-classification-scenarios","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/ise\/using-object-detection-complex-image-classification-scenarios\/","title":{"rendered":"Using Object Detection for Complex Image Classification Scenarios"},"content":{"rendered":"<p>We recently worked with\u00a0<a href=\"http:\/\/www.smart-it.com\/\">SMART Business<\/a>, a Ukrainian consulting services company,\u00a0along with their partner, a large manufacturer of confectionery products in Central &amp; Eastern Europe, to build a machine learning model which validates whether distributors are stocking chocolates correctly.<\/p>\n<p>This code story provides an overview of different image classification approaches for various levels of complexity that we explored while developing our solution.<\/p>\n<h2 id=\"background\">Background<\/h2>\n<p>The company we worked with has a huge distribution network of supermarket chains across over fourteen countries. Each of these distributors is required to arrange chocolates on their stands according to standardized policies. Each policy describes what shelf a given chocolate should be on and in what order it should be stocked.<\/p>\n<p>There are huge costs associated with \u201croutine\u201d audit activities to enforce these policies. 
SMART Business\u00a0wanted to develop a system in which an auditor or store manager could take a picture and be told immediately whether the shelf was stocked correctly, like in the image below.<\/p>\n<p><figure id=\"attachment_3728\" aria-labelledby=\"figcaption_attachment_3728\" class=\"wp-caption aligncenter\" ><img decoding=\"async\" class=\"wp-image-3728\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2020\/03\/valid_invalid.jpg\" alt=\"example images of correct and incorrect shelf stocking\" width=\"413\" height=\"276\" \/><figcaption id=\"figcaption_attachment_3728\" class=\"wp-caption-text\">Valid policy (left), invalid policy (right)<\/figcaption><\/figure><\/p>\n<h2 id=\"opportunities-for-reuse\">Investigation<\/h2>\n<p>During our scoping, we investigated a couple of approaches to image classification including Microsoft&#8217;s <a href=\"https:\/\/www.customvision.ai\">Custom Vision Service<\/a>, <a href=\"https:\/\/docs.microsoft.com\/en-us\/cognitive-toolkit\/Build-your-own-image-classifier-using-Transfer-Learning\">Transfer Learning using CNTK ResNet<\/a>, and <a href=\"https:\/\/docs.microsoft.com\/en-us\/cognitive-toolkit\/Object-Detection-using-Fast-R-CNN\">Object Detection with CNTK Fast-RCNN<\/a>. While\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Object_detection\">Object Detection<\/a> with Fast-RCNN ended up being the best fit, during our investigation we determined that each of these approaches involved different levels of complexity, each with its own strengths and weaknesses.<\/p>\n<h3>Custom Vision Service<\/h3>\n<p>Training and consuming a REST-based service is dramatically easier than training, deploying and updating a custom computer vision model. As a result, we first investigated Microsoft&#8217;s\u00a0<a href=\"https:\/\/www.customvision.ai\">Custom Vision Service<\/a>. The Custom Vision Service is a tool for building custom image classifiers and improving them over time. 
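<\/p>
<p>Once trained, a Custom Vision model is consumed through a single REST call against its prediction endpoint. The sketch below shows the general shape of such a call in Python; the region, project ID, key, and exact endpoint path are placeholders that vary by account and API version, so treat it as an illustration rather than the service&#8217;s definitive contract.<\/p>

```python
# Hypothetical sketch of calling a Custom Vision prediction endpoint.
# The region, project ID, and key below are placeholders, and the
# endpoint path may differ by API version -- check your account's
# prediction URL before using this shape.
PROJECT_ID = "00000000-0000-0000-0000-000000000000"
PREDICTION_KEY = "<your-prediction-key>"

def build_prediction_request(project_id, prediction_key):
    """Assemble the URL and headers for an image-prediction POST."""
    url = ("https://southcentralus.api.cognitive.microsoft.com"
           "/customvision/v1.0/Prediction/" + project_id + "/image")
    headers = {
        "Prediction-Key": prediction_key,
        "Content-Type": "application/octet-stream",
    }
    return url, headers

url, headers = build_prediction_request(PROJECT_ID, PREDICTION_KEY)

# The actual call requires network access and valid credentials:
# import requests
# with open("shelf.jpg", "rb") as f:
#     response = requests.post(url, headers=headers, data=f.read())
# predictions = response.json()["Predictions"]  # tag/probability pairs
```

<p>Ranking the tag probabilities returned for each request (for us, &#8220;Valid&#8221; vs. &#8220;Invalid&#8221;) then yields the classification for the photographed shelf.<\/p>
<p>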
\u00a0We trained a model on a sample data set of 882 images (sorted into 505 valid and 377 invalid images) containing standalone shelves of chocolate.<\/p>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-10899\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-data-scaled.jpg\" alt=\"Image roshen custom vision data\" width=\"2560\" height=\"1296\" srcset=\"https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-data-scaled.jpg 2560w, https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-data-300x152.jpg 300w, https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-data-1024x518.jpg 1024w, https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-data-768x389.jpg 768w, https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-data-1536x777.jpg 1536w, https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-data-2048x1037.jpg 2048w\" sizes=\"(max-width: 2560px) 100vw, 2560px\" \/><\/p>\n<p>We were able to train a relatively strong baseline model using the Custom Vision Service with the following benchmarks below:<\/p>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-10900\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-results.jpg\" alt=\"Image roshen custom vision results\" width=\"991\" height=\"817\" srcset=\"https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-results.jpg 991w, https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-results-300x247.jpg 300w, 
https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-custom-vision-results-768x633.jpg 768w\" sizes=\"(max-width: 991px) 100vw, 991px\" \/><\/p>\n<p>Additionally, we ran the model on an unseen dataset of 500 images to supplement this data and make sure that we had a consistent baseline.<\/p>\n<div>For more information on how to benchmark and put a Custom Vision Service Model into production, please see our previous code story, <a href=\"\/developerblog\/2017\/05\/12\/food-classification-custom-vision-service\/\">Food Classification with Custom Vision Service<\/a>. For a detailed explanation of standard classification metrics, please see\u00a0<a href=\"http:\/\/machinelearningmastery.com\/metrics-evaluate-machine-learning-algorithms-python\/\">Metrics To Evaluate Machine Learning Algorithms in Python<\/a>.<\/div>\n<div>\n<table style=\"height: 170px; width: 625px;\">\n<tbody>\n<tr>\n<td>\u00a0<strong>Label<\/strong><\/td>\n<td><strong>Precision\u00a0<\/strong><\/td>\n<td><strong>\u00a0Recall<\/strong><\/td>\n<td><strong>\u00a0F-1 Score<\/strong><\/td>\n<td><strong>Support<\/strong><\/td>\n<\/tr>\n<tr>\n<td>\u00a0Invalid<\/td>\n<td>0.71<\/td>\n<td>0.74<\/td>\n<td>0.72<\/td>\n<td>170<\/td>\n<\/tr>\n<tr>\n<td>Valid<\/td>\n<td>0.87<\/td>\n<td>0.85<\/td>\n<td>0.86<\/td>\n<td>353<\/td>\n<\/tr>\n<tr>\n<td>Avg \/ Total<\/td>\n<td>0.82<\/td>\n<td>0.81<\/td>\n<td>0.82<\/td>\n<td>523<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div><strong>Confusion matrix<\/strong><\/div>\n<div><\/div>\n<table style=\"width: 129px; height: 30px;\">\n<tbody>\n<tr>\n<td>125<\/td>\n<td>45<\/td>\n<\/tr>\n<tr>\n<td>52<\/td>\n<td>301<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div>While the Custom Vision Service performed well for the given scenario and proved itself to be a very powerful image classification tool, the service had some limitations that made it prohibitive for a production use case.<\/div>\n<p>These limitations are best expressed in the following quote 
from the official\u00a0<a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/cognitive-services\/custom-vision-service\/home\">Custom Vision Service documentation<\/a>:<\/p>\n<blockquote><p>The methods Custom Vision Service uses are robust to differences, which allows you to start prototyping with so little data. In theory very few images are required to create a classifier &#8212; 30 images per class is enough to start your prototype. However, this means Custom Vision Service is usually not well suited to scenarios where you want to detect very subtle differences.<\/p><\/blockquote>\n<p>The\u00a0Custom Vision Service worked well when we narrowed the scope of our policy problem down to one policy and standalone shelves of chocolate; however, the service&#8217;s 1000-image training limit constrained our ability to fine-tune the model around certain consistent but subtle policy edge cases.<\/p>\n<p>For example, the Custom Vision Service excelled at detecting the flagrant policy violations that represented the majority of our dataset, like the examples below:<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-3982\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2020\/03\/170103_000018025_000000019_001-e1498485378740.jpg\" alt=\"\" width=\"146\" height=\"305\" \/>\u00a0 \u00a0<img decoding=\"async\" class=\"alignnone wp-image-3985\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2020\/03\/o-94c3fec911354f6c945ed7b83dca75c6-e1498485838915-161x300.jpg\" alt=\"\" width=\"165\" height=\"307\" \/><\/p>\n<p>However, it consistently failed to recognize more subtle yet persistent violations that were off by one chocolate, such as the first shelf in this picture:<\/p>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-10898\" 
src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/07\/o-8e4df48d62a143bb8ba6b5cf88e87a7a-e1498485982237.jpg\" alt=\"Image o 8e4df48d62a143bb8ba6b5cf88e87a7a e1498485982237\" width=\"512\" height=\"988\" srcset=\"https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/o-8e4df48d62a143bb8ba6b5cf88e87a7a-e1498485982237.jpg 512w, https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/o-8e4df48d62a143bb8ba6b5cf88e87a7a-e1498485982237-155x300.jpg 155w\" sizes=\"(max-width: 512px) 100vw, 512px\" \/><\/p>\n<p>To transcend the limitations of the Custom Vision Service, we toyed with the idea of creating multiple models and then ensembling their results using a <a href=\"http:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.ensemble.VotingClassifier.html\">voting classifier<\/a>. While this would have no doubt improved the results of the model, and may be worth investigating for other scenarios, it would have also increased the API cost, as well as run-time. In addition, it would still not be scalable beyond one or two policies, since the\u00a0Custom Vision Service caps the number of models per account at nineteen.<\/p>\n<h3><strong>Transfer Learning using CNTK and ResNet<\/strong><\/h3>\n<p>To work around the dataset limits of the Custom Vision Service, we next investigated building an image recognition model with CNTK and <a href=\"https:\/\/en.wikipedia.org\/wiki\/Transfer_learning\">Transfer learning<\/a> on top of ResNet with the following <a href=\"https:\/\/docs.microsoft.com\/en-us\/cognitive-toolkit\/Build-your-own-image-classifier-using-Transfer-Learning\">tutorial<\/a>. 
<a href=\"https:\/\/arxiv.org\/abs\/1512.03385\">ResNet <\/a>is a deep convolutional neural network architecture developed by Microsoft for the image-net competition in 2015.<\/p>\n<p>In this evaluation, our training dataset contained two sets of 795 images representing valid and invalid policy.<\/p>\n<p><figure class=\"wp-caption aligncenter\" ><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2020\/03\/021717_1842_QuickStartG4.png\" alt=\"\" width=\"542\" height=\"193\" \/><figcaption class=\"wp-caption-text\">Fig. 3: Representation of a ResNet CNN with an image from ImageNet. The input is an RGB image of a cat, the output is a probability vector, whose maximum corresponds to the label \u201ctabby cat\u201d.<\/figcaption><\/figure><\/p>\n<p>Since we did not have enough data (i.e., tens of thousands of samples) or the processing power to train our own large-scale CNN\u00a0model from scratch, we decided to leverage ResNet by retraining its output layer on our train dataset.<\/p>\n<h4><strong>Results<\/strong><\/h4>\n<p>We\u00a0ran the transfer learning ResNet model on three runs of 20, 200, and 2000 epochs respectively. 
We received the best results on our test dataset during our run of 2000 epochs.<\/p>\n<table style=\"height: 170px; width: 625px;\">\n<tbody>\n<tr>\n<td>\u00a0<strong>Label<\/strong><\/td>\n<td><strong>Precision\u00a0<\/strong><\/td>\n<td><strong>\u00a0Recall<\/strong><\/td>\n<td><strong>\u00a0F-1 Score<\/strong><\/td>\n<td><strong>Support<\/strong><\/td>\n<\/tr>\n<tr>\n<td>\u00a0Invalid<\/td>\n<td>0.38<\/td>\n<td>0.96<\/td>\n<td>0.54<\/td>\n<td>171<\/td>\n<\/tr>\n<tr>\n<td>Valid<\/td>\n<td>0.93<\/td>\n<td>0.23<\/td>\n<td>0.37<\/td>\n<td>353<\/td>\n<\/tr>\n<tr>\n<td>Avg \/ Total<\/td>\n<td>0.75<\/td>\n<td>0.47<\/td>\n<td>0.43<\/td>\n<td>524<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><strong>Confusion matrix<\/strong><\/p>\n<table style=\"width: 129px; height: 30px;\">\n<tbody>\n<tr>\n<td>165<\/td>\n<td>6<\/td>\n<\/tr>\n<tr>\n<td>272<\/td>\n<td>81<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>As we can see, the transfer learning approach severely underperformed compared to the Custom Vision Service.<\/p>\n<p>Transfer learning on ResNet can be a powerful tool\u00a0for training strong object recognition models with limited datasets. 
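<\/p>
<p>The precision, recall, and F-1 figures in the table above follow mechanically from the confusion matrix, reading rows as true labels and columns as predictions, with &#8220;Invalid&#8221; first. A quick sketch of the computation:<\/p>

```python
# How the precision/recall/F-1 figures derive from a confusion matrix,
# using the 2000-epoch transfer-learning matrix reported in this post
# (rows = true label, columns = predicted label, "Invalid" first).
confusion = [[165, 6],    # true Invalid: 165 caught, 6 missed
             [272, 81]]   # true Valid: 272 flagged as invalid, 81 correct

def metrics_for_class(cm, i):
    """Precision, recall, and F-1 for class index i of a square matrix."""
    tp = cm[i][i]
    predicted_i = sum(row[i] for row in cm)  # column sum: predicted as i
    actual_i = sum(cm[i])                    # row sum: support for class i
    precision = tp / predicted_i
    recall = tp / actual_i
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

for i, label in enumerate(["Invalid", "Valid"]):
    p, r, f1 = metrics_for_class(confusion, i)
    print("%s: precision=%.2f recall=%.2f f1=%.2f" % (label, p, r, f1))
# Invalid: precision=0.38 recall=0.96 f1=0.54
# Valid: precision=0.93 recall=0.23 f1=0.37
```

<p>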
However, when we apply the model to new images that stray too far from the domain of the original 1000 ImageNet classes, the model struggles to capture new representative features due to its reused abstract features &#8220;learned&#8221; from the <a href=\"http:\/\/www.image-net.org\/\">ImageNet<\/a> training set.<\/p>\n<h4><strong>Investigation Conclusions<\/strong><\/h4>\n<p>We saw promising results classifying individual policies using object recognition approaches such as the Custom Vision Service. However, considering the large quantity of brands and the many potential ways to order them on a shelf, it was prohibitive to determine policy adherence from an image using standard object recognition with the amount of available data.<\/p>\n<p>The complexity of the validation corner cases, combined with SMART Business&#8217; concern about the ease of building new models for each policy using standard object recognition methods, encouraged us to examine a more creative solution for their problem domain.<\/p>\n<div class=\"postbody\">\n<h2 id=\"the-solution\">The Solution<\/h2>\n<h3><strong>Object Detection and Fast R-CNN<\/strong><\/h3>\n<p>To strengthen the policy signal while maintaining classification accuracy, we decided to use <a href=\"https:\/\/docs.microsoft.com\/en-us\/cognitive-toolkit\/Object-Detection-using-Fast-R-CNN\">Object Detection and Fast R-CNN<\/a> with\u00a0<a href=\"https:\/\/papers.nips.cc\/paper\/4824-imagenet-classification-with-deep-convolutional-neural-networks\">AlexNet<\/a> to detect valid shelves in images. If we detected all valid shelves in a picture, then we could consider that stand as valid. In this way, we were not only able to classify our images but also to reuse pre-classified shelves to generate new configurable policies. 
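<\/p>
<p>That decision rule reduces to a small post-processing step over the detector&#8217;s output. The sketch below assumes a hypothetical detection format of (label, confidence) pairs, a 0.5 confidence threshold, and a known expected shelf count per stand; none of these specifics come from the production pipeline.<\/p>

```python
# Hypothetical post-processing of the shelf detector's output: a stand
# is considered valid only when the expected number of shelves are all
# confidently detected as validly stocked. The (label, confidence)
# detection format, 0.5 threshold, and known shelf count are assumptions
# for illustration, not the exact production pipeline.
def stand_is_valid(detections, expected_shelves, threshold=0.5):
    """detections: list of (label, confidence) pairs, one per region."""
    confident_valid = [
        conf for label, conf in detections
        if label == "valid_shelf" and conf >= threshold
    ]
    return len(confident_valid) >= expected_shelves

# A three-shelf stand where every shelf is confidently detected as valid:
print(stand_is_valid(
    [("valid_shelf", 0.91), ("valid_shelf", 0.88), ("valid_shelf", 0.95)],
    expected_shelves=3))  # True

# A shelf without a confident "valid" detection fails the whole stand:
print(stand_is_valid(
    [("valid_shelf", 0.91), ("valid_shelf", 0.42)],
    expected_shelves=3))  # False
```

<p>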
We decided to use Fast R-CNN over alternatives such as Faster R-CNN since the implementation and evaluation pipeline was already proven to work well on top of CNTK; see <a href=\"\/developerblog\/2017\/04\/10\/object-detection-using-cntk\/\">Object Detection Using CNTK<\/a>.<\/p>\n<p>First, we used the new image support feature of the <a href=\"https:\/\/github.com\/CatalystCode\/VOTT\">Visual Object Tagging Tool (VoTT)<\/a> to tag a valid policy on a slightly larger 2600-image dataset. For an explanation of how to tag image directories with VoTT, see\u00a0<a href=\"https:\/\/github.com\/CatalystCode\/VoTT#tagging-an-image-directory\">Tagging an Image Directory<\/a>.<\/p>\n<p><figure id=\"attachment_3994\" aria-labelledby=\"figcaption_attachment_3994\" class=\"wp-caption aligncenter\" ><img decoding=\"async\" class=\"wp-image-3994 size-medium\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/07\/vott-examplel-300x252.jpg\" alt=\"\" width=\"300\" height=\"252\" \/><figcaption id=\"figcaption_attachment_3994\" class=\"wp-caption-text\">Note that all three shelves are validly stocked; therefore, this image represents a valid policy.<\/figcaption><\/figure><\/p>\n<p style=\"text-align: left;\">By tweaking the aspect ratio filters and the number and minimum size of our regions of interest, we were able to get quality results from our dataset.<\/p>\n<\/div>\n<h2 id=\"code\">Results<\/h2>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-10901\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-results.jpg\" alt=\"Image roshen results\" width=\"935\" height=\"506\" srcset=\"https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-results.jpg 935w, https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-results-300x162.jpg 300w, 
https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2017\/07\/roshen-results-768x416.jpg 768w\" sizes=\"(max-width: 935px) 100vw, 935px\" \/><\/p>\n<p>While at first glance the results of this model seem marginally worse than the results of the Custom Vision Service solution, the gains\u00a0in modularity\u00a0and the ability to generalize across certain persistent edge cases encouraged SMART Business to continue with the more advanced object detection approach.<\/p>\n<div class=\"postbody\">\n<h2 id=\"opportunities-for-reuse\">Opportunities for Reuse<\/h2>\n<div class=\"postbody\">\n<p>Below is an outline\u00a0of the advantages and disadvantages of the methodologies for image classification that we investigated, in order of complexity.<\/p>\n<\/div>\n<div class=\"postbody\">\n<table>\n<tbody>\n<tr bgcolor=\"#4470b2\">\n<td width=\"133\"><b>Methodology<\/b><\/td>\n<td width=\"200\"><b>Advantages<\/b><\/td>\n<td width=\"200\"><b>Disadvantages<\/b><\/td>\n<td width=\"200\"><b>When to use<\/b><\/td>\n<\/tr>\n<tr bgcolor=\"#cfd5e9\">\n<td width=\"133\">Custom Vision Service<\/td>\n<td width=\"200\">\n<ul>\n<li>Easy to get started from small datasets. No GPU required.<\/li>\n<li>Evaluated images can be retagged to improve the model.<\/li>\n<li>Supported service with one click to production.<\/li>\n<\/ul>\n<\/td>\n<td width=\"200\">\n<ul>\n<li>Struggles to detect subtle changes.<\/li>\n<li>Cannot run the model locally.<\/li>\n<li>Limit of 1000 training images.<\/li>\n<\/ul>\n<\/td>\n<td width=\"200\">\n<ul>\n<li>Cloud services such as the Custom Vision Service are well suited for object classification problems where there is limited training data. 
This is the simplest methodology to attempt.<\/li>\n<li>In our investigations, the service performed the best on our dataset but struggled to scale to multiple policies and detect persistent edge cases.<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr bgcolor=\"#e9ebf5\">\n<td width=\"133\">CNN \/ Transfer Learning<\/td>\n<td width=\"200\">\n<ul>\n<li>Leverages existing model layers so that the model doesn\u2019t need to be trained from scratch.<\/li>\n<li>Easy to train; just point the training script to sorted image directories.<\/li>\n<li>No training set size limit, and models can run offline.<\/li>\n<\/ul>\n<\/td>\n<td width=\"200\">\n<ul>\n<li>Struggles to classify data whose abstract features are dissimilar to those learned from ImageNet.<\/li>\n<li>Needs a GPU to train.<\/li>\n<li>More complex to put in production than the Custom Vision Service.<\/li>\n<\/ul>\n<\/td>\n<td width=\"200\">\n<ul>\n<li>CNN transfer learning on pre-trained models such as ResNet or Inception works best with mid-size datasets that share similar properties with ImageNet categories. 
Note that if you have a large dataset with at least tens of thousands of samples, it may be worthwhile to retrain all the layers in a model.<\/li>\n<li>Of the methodologies we investigated, transfer learning performed the worst for our complex classification scenario.<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr bgcolor=\"#cfd5e9\">\n<td width=\"133\">Object Detection using VoTT<\/td>\n<td width=\"200\">\n<ul>\n<li>Better suited for detecting subtle differences between image classes.<\/li>\n<li>Detected regions are modular and can be reused if complex classification criteria change.<\/li>\n<li>No training set size limit, and models can run offline.<\/li>\n<\/ul>\n<\/td>\n<td width=\"200\">\n<ul>\n<li>Requires annotating bounding boxes on all images (though this is made easier with the VoTT tool).<\/li>\n<li>Needs a GPU to train.<\/li>\n<li>More complex to put in production than the Custom Vision Service.<\/li>\n<li>Algorithms such as Fast R-CNN struggle to detect small areas.<\/li>\n<\/ul>\n<\/td>\n<td>\n<ul>\n<li>Using a combination of object detection and heuristics for image classification is well suited for scenarios where users have a mid-size dataset yet need to detect subtle differences to differentiate image classes.<\/li>\n<li>Of the methodologies outlined, this was the most complex to implement but provided the most robust results across our test set. This is the approach that SMART Business\u00a0decided to use.<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The deep learning ecosystem is rapidly advancing, with new algorithms quickly iterated upon and released daily. After reading about state-of-the-art performance on traditional benchmarks, it can be tempting to throw the newest available DNN algorithm at a classification problem. 
However, it is equally (if not more) important to evaluate these new technologies within the context of their application.\u00a0Too\u00a0often in machine learning, the use\u00a0of &#8220;new algorithms&#8221; overshadows the importance of well-thought-out methodology.<\/p>\n<\/div>\n<p>The approaches investigated during our collaboration revealed a spectrum of classification methodologies of varying complexity, with trade-offs that are useful to consider when building image classification systems.<\/p>\n<p>Our investigations show that it is important to\u00a0weigh the trade-offs of factors such as implementation\u00a0complexity, scalability, and optimization potential against others like dataset size, class instance variation, class similarity, and performance.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>An overview of different image classification approaches including Microsoft Azure Custom Vision Service and CNTK for various levels of classification complexity.<\/p>\n","protected":false},"author":21353,"featured_media":10897,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[19],"tags":[123,279],"class_list":["post-3726","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","tag-cntk","tag-object-detection"],"acf":[],"blog_post_summary":"<p>An overview of different image classification approaches including Microsoft Azure Custom Vision Service and CNTK for various levels of classification 
complexity.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/posts\/3726","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/users\/21353"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/comments?post=3726"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/posts\/3726\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/media\/10897"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/media?parent=3726"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/categories?post=3726"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/tags?post=3726"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}