{"id":2120,"date":"2017-01-01T16:00:00","date_gmt":"2017-01-01T16:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/reallifecode\/index.php\/2017\/01\/01\/predicting-expense-type-from-receipts-with-microsoft-cognitive-services\/"},"modified":"2020-03-15T06:20:18","modified_gmt":"2020-03-15T13:20:18","slug":"predicting-expense-type-from-receipts-with-microsoft-cognitive-services","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/ise\/predicting-expense-type-from-receipts-with-microsoft-cognitive-services\/","title":{"rendered":"Predicting Expense Type from Receipts with Microsoft Cognitive Services"},"content":{"rendered":"<p>This post explores how we can leverage machine learning techniques to help partially automate the processes of accounting and expenditure reimbursement. Often, such methods require manual input of information from an invoice or receipt, such as total amount spent, tax amount, type of expenditure, transaction date, etc. This code story will demonstrate how multiclass classification algorithms and Optical Character Recognition (OCR) can be leveraged to predict the type of expense from an imaged receipt automatically. By the end of this post, readers will be able to build a Xamarin-based app that recognizes the expense type from an imaged receipt, using a model built in Azure ML Studio and deployed as a web service.<\/p>\n<p>Before we can predict or recognize the type of expense from a receipt, we must first convert a database of imaged receipts into structured data via OCR to extract the information into text format. 
This information is then used to train a predictive model.<\/p>\n<h2 id=\"overall-structure\">Overall Structure<\/h2>\n<p>The figure below shows the overall structure of the solution in <a href=\"https:\/\/studio.azureml.net\/\">Azure Machine Learning (ML) Studio<\/a>, with the following assumptions:<\/p>\n<ul>\n<li>A database of imaged receipts already exists<\/li>\n<li>Images are stored in <a href=\"https:\/\/azure.microsoft.com\/en-gb\/services\/storage\/blobs\/\">Azure Blob Storage<\/a><\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/01\/mlstudio-main.jpg\" alt=\"Jpg: mlstudio-overall\" \/><\/p>\n<p>This example will load training images from blob storage and extract text using <a href=\"https:\/\/www.microsoft.com\/cognitive-services\/en-us\/computer-vision-api\"><strong>OCR<\/strong><\/a>. The data is then used to train a predictive model using a multiclass neural network (with default settings), which is finally published as a web service.<\/p>\n<h2 id=\"dataset\">Dataset<\/h2>\n<p>We are basing our example on a private dataset of ~1200 images of receipts of different expense types, such as snacks, groceries, dining, clothes, fuel and entertainment. The figure below shows the distribution of these six classes.\n<img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/01\/data-distribution-6-classes.jpg\" alt=\"jpg: data-distribution-6-classes\" \/><\/p>\n<h2 id=\"extract-text-via-ocr\">Extract Text via OCR<\/h2>\n<p>Below is an example of how you can call Microsoft\u2019s Cognitive Services from within Azure ML Studio using the <strong><code class=\"highlighter-rouge\">Execute Python Script<\/code><\/strong> module. The Python code below will extract the text from those images via Microsoft\u2019s OCR. 
This code should reside within the <strong><code class=\"highlighter-rouge\">Execute Python Script<\/code><\/strong> module.<\/p>\n<p>The snippet below shows the required packages and sets the URL for OCR in the Vision API from Microsoft Cognitive Services.<\/p>\n<div class=\"language-python highlighter-rouge\">\n<pre class=\"highlight\"><code>  <span class=\"c\"># The script MUST contain a function named azureml_main<\/span>\r\n  <span class=\"c\"># which is the entry point for this module.<\/span>\r\n\r\n  <span class=\"c\"># imports up here can be used to <\/span>\r\n  <span class=\"kn\">import<\/span> <span class=\"nn\">pandas<\/span> <span class=\"kn\">as<\/span> <span class=\"nn\">pd<\/span>\r\n  <span class=\"kn\">import<\/span> <span class=\"nn\">json<\/span>\r\n  <span class=\"kn\">import<\/span> <span class=\"nn\">time<\/span>\r\n  <span class=\"kn\">import<\/span> <span class=\"nn\">requests<\/span>\r\n  <span class=\"kn\">from<\/span> <span class=\"nn\">io<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">StringIO<\/span>\r\n  \r\n  <span class=\"c\"># url for Microsoft's Cognitive Services - Vision API - OCR<\/span>\r\n  <span class=\"c\">#_url = 'https:\/\/api.projectoxford.ai\/vision\/v1.0\/ocr' # previous url, still works<\/span>\r\n  <span class=\"n\">_url<\/span> <span class=\"o\">=<\/span> <span class=\"s\">'https:\/\/westus.api.cognitive.microsoft.com\/vision\/v1.0\/ocr'<\/span> <span class=\"c\"># latest url<\/span>\r\n  \r\n  <span class=\"c\"># maximum number of retries when posting a request<\/span>\r\n  <span class=\"n\">_maxNumRetries<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">10<\/span>\r\n<\/code><\/pre>\n<\/div>\n<p>Below is the entry point function for the <strong><code class=\"highlighter-rouge\">Execute Python Script<\/code><\/strong> module within the Azure ML Studio experiment. 
It sets up parameters for the OCR API, processes requests, and returns a new data frame which contains text extracted from a receipt, and its associated label (that is, its expensing category).<\/p>\n<div class=\"language-python highlighter-rouge\">\n<pre class=\"highlight\"><code>  <span class=\"c\"># The entry point function can contain up to two input arguments:<\/span>\r\n  <span class=\"c\">#   Param&lt;dataframe1&gt;: a pandas.DataFrame<\/span>\r\n  <span class=\"c\">#   Param&lt;dataframe2&gt;: a pandas.DataFrame<\/span>\r\n  <span class=\"k\">def<\/span> <span class=\"nf\">azureml_main<\/span><span class=\"p\">(<\/span><span class=\"n\">dataframe1<\/span> <span class=\"o\">=<\/span> <span class=\"bp\">None<\/span><span class=\"p\">,<\/span> <span class=\"n\">dataframe2<\/span> <span class=\"o\">=<\/span> <span class=\"bp\">None<\/span><span class=\"p\">):<\/span>\r\n\r\n      <span class=\"c\"># Get the OCR key<\/span>\r\n      <span class=\"n\">VISION_API_KEY<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">str<\/span><span class=\"p\">(<\/span><span class=\"n\">dataframe2<\/span><span class=\"p\">[<\/span><span class=\"s\">'Col1'<\/span><span class=\"p\">][<\/span><span class=\"mi\">0<\/span><span class=\"p\">])<\/span>\r\n      \r\n      <span class=\"c\"># Load the file containing image url and label<\/span>\r\n      <span class=\"n\">df_url_label<\/span> <span class=\"o\">=<\/span> <span class=\"n\">dataframe1<\/span>\r\n            \r\n      <span class=\"c\"># create an empty pandas data frame<\/span>\r\n      <span class=\"n\">df<\/span> <span class=\"o\">=<\/span> <span class=\"n\">pd<\/span><span class=\"o\">.<\/span><span class=\"n\">DataFrame<\/span><span class=\"p\">({<\/span><span class=\"s\">'Text'<\/span> <span class=\"p\">:<\/span> <span class=\"p\">[],<\/span> <span class=\"s\">'Category'<\/span> <span class=\"p\">:<\/span> <span class=\"p\">[],<\/span> <span class=\"s\">'ReceiptID'<\/span> <span class=\"p\">:<\/span> <span 
class=\"p\">[]})<\/span>\r\n      \r\n      <span class=\"c\"># extract image url, setting OCR API parameters, process request<\/span>\r\n      <span class=\"k\">for<\/span> <span class=\"n\">index<\/span><span class=\"p\">,<\/span> <span class=\"n\">row<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">df_url_label<\/span><span class=\"o\">.<\/span><span class=\"n\">iterrows<\/span><span class=\"p\">():<\/span>\r\n          <span class=\"n\">imageurl<\/span> <span class=\"o\">=<\/span> <span class=\"n\">row<\/span><span class=\"p\">[<\/span><span class=\"s\">'Url'<\/span><span class=\"p\">]<\/span>\r\n          \r\n          <span class=\"c\"># setting OCR parameters<\/span>\r\n          <span class=\"n\">params<\/span> <span class=\"o\">=<\/span> <span class=\"p\">{<\/span> <span class=\"s\">'language'<\/span><span class=\"p\">:<\/span> <span class=\"s\">'en'<\/span><span class=\"p\">,<\/span> <span class=\"s\">'detectOrientation'<\/span><span class=\"p\">:<\/span> <span class=\"s\">'true'<\/span><span class=\"p\">}<\/span> \r\n          <span class=\"n\">headers<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">dict<\/span><span class=\"p\">()<\/span>\r\n          <span class=\"n\">headers<\/span><span class=\"p\">[<\/span><span class=\"s\">'Ocp-Apim-Subscription-Key'<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span>  <span class=\"n\">VISION_API_KEY<\/span>\r\n          <span class=\"n\">headers<\/span><span class=\"p\">[<\/span><span class=\"s\">'Content-Type'<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"s\">'application\/json'<\/span> \r\n          \r\n          <span class=\"n\">image_url<\/span> <span class=\"o\">=<\/span> <span class=\"p\">{<\/span> <span class=\"s\">'url'<\/span><span class=\"p\">:<\/span> <span class=\"n\">imageurl<\/span> <span class=\"p\">}<\/span> <span class=\"p\">;<\/span> \r\n          <span class=\"n\">image_file<\/span> <span class=\"o\">=<\/span> <span 
class=\"bp\">None<\/span>\r\n          <span class=\"n\">result<\/span> <span class=\"o\">=<\/span> <span class=\"n\">processRequest<\/span><span class=\"p\">(<\/span> <span class=\"n\">image_url<\/span><span class=\"p\">,<\/span> <span class=\"n\">image_file<\/span><span class=\"p\">,<\/span> <span class=\"n\">headers<\/span><span class=\"p\">,<\/span> <span class=\"n\">params<\/span> <span class=\"p\">)<\/span>\r\n          \r\n          <span class=\"k\">if<\/span> <span class=\"n\">result<\/span> <span class=\"ow\">is<\/span> <span class=\"ow\">not<\/span> <span class=\"bp\">None<\/span><span class=\"p\">:<\/span>\r\n              <span class=\"c\"># extract text<\/span>\r\n              <span class=\"n\">text<\/span> <span class=\"o\">=<\/span> <span class=\"n\">extractText<\/span><span class=\"p\">(<\/span><span class=\"n\">result<\/span><span class=\"p\">);<\/span> \r\n              \r\n              <span class=\"c\"># populate dataframe<\/span>\r\n              <span class=\"n\">df<\/span><span class=\"o\">.<\/span><span class=\"n\">loc<\/span><span class=\"p\">[<\/span><span class=\"n\">index<\/span><span class=\"p\">,<\/span><span class=\"s\">'Text'<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">text<\/span>\r\n          <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\r\n              <span class=\"c\"># populate dataframe<\/span>\r\n              <span class=\"n\">df<\/span><span class=\"o\">.<\/span><span class=\"n\">loc<\/span><span class=\"p\">[<\/span><span class=\"n\">index<\/span><span class=\"p\">,<\/span><span class=\"s\">'Text'<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"bp\">None<\/span>\r\n            \r\n          <span class=\"c\"># 'Category' is the label<\/span>\r\n          <span class=\"n\">df<\/span><span class=\"o\">.<\/span><span class=\"n\">loc<\/span><span class=\"p\">[<\/span><span class=\"n\">index<\/span><span class=\"p\">,<\/span><span 
class=\"s\">'Category'<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">row<\/span><span class=\"p\">[<\/span><span class=\"s\">'Category'<\/span><span class=\"p\">]<\/span>\r\n          <span class=\"n\">df<\/span><span class=\"o\">.<\/span><span class=\"n\">loc<\/span><span class=\"p\">[<\/span><span class=\"n\">index<\/span><span class=\"p\">,<\/span><span class=\"s\">'ReceiptID'<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">imageurl<\/span><span class=\"p\">[<\/span><span class=\"o\">-<\/span><span class=\"mi\">17<\/span><span class=\"p\">:<\/span><span class=\"o\">-<\/span><span class=\"mi\">4<\/span><span class=\"p\">]<\/span>\r\n          \r\n      <span class=\"c\"># Return value must be a sequence of pandas.DataFrame<\/span>\r\n      <span class=\"k\">return<\/span> <span class=\"n\">df<\/span>\r\n<\/code><\/pre>\n<\/div>\n<p><code class=\"highlighter-rouge\">extractText<\/code> seeks and extracts only texts recognized by OCR, and ignores other information such as <code class=\"highlighter-rouge\">regions<\/code>, <code class=\"highlighter-rouge\">lines<\/code> and <code class=\"highlighter-rouge\">words<\/code>. 
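<\/p>\n<p>For reference, the OCR result is nested JSON in which each region contains lines and each line contains words. The short standalone sketch below (the response values are invented for illustration) shows the structure that <code class=\"highlighter-rouge\">extractText<\/code> traverses:<\/p>\n<div class=\"language-python highlighter-rouge\">\n<pre class=\"highlight\"><code>  # A hypothetical OCR response; all values are invented for illustration\r\n  sample_result = {\r\n      'language': 'en',\r\n      'regions': [\r\n          {'lines': [\r\n              {'words': [{'text': 'TOTAL'}, {'text': '12.99'}]}\r\n          ]}\r\n      ]\r\n  }\r\n  \r\n  # The same region -&gt; line -&gt; word traversal that extractText performs\r\n  text = ''\r\n  for region in sample_result['regions']:\r\n      for line in region['lines']:\r\n          for word in line['words']:\r\n              text = text + ' ' + word.get('text')\r\n  \r\n  print(text)  # ' TOTAL 12.99'\r\n<\/code><\/pre>\n<\/div>\n<p>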
While that information is not utilized in this example, it could be useful if the location of the text is of interest.<\/p>\n<div class=\"language-python highlighter-rouge\">\n<pre class=\"highlight\"><code>  <span class=\"c\"># Extract text only from OCR's response<\/span>\r\n  <span class=\"k\">def<\/span> <span class=\"nf\">extractText<\/span><span class=\"p\">(<\/span><span class=\"n\">result<\/span><span class=\"p\">):<\/span>\r\n      <span class=\"n\">text<\/span> <span class=\"o\">=<\/span> <span class=\"s\">\"\"<\/span>\r\n      <span class=\"k\">for<\/span> <span class=\"n\">region<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">result<\/span><span class=\"p\">[<\/span><span class=\"s\">'regions'<\/span><span class=\"p\">]:<\/span>\r\n          <span class=\"k\">for<\/span> <span class=\"n\">line<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">region<\/span><span class=\"p\">[<\/span><span class=\"s\">'lines'<\/span><span class=\"p\">]:<\/span>\r\n              <span class=\"k\">for<\/span> <span class=\"n\">word<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">line<\/span><span class=\"p\">[<\/span><span class=\"s\">'words'<\/span><span class=\"p\">]:<\/span>\r\n                  <span class=\"n\">text<\/span> <span class=\"o\">=<\/span> <span class=\"n\">text<\/span> <span class=\"o\">+<\/span> <span class=\"s\">\" \"<\/span> <span class=\"o\">+<\/span> <span class=\"n\">word<\/span><span class=\"o\">.<\/span><span class=\"n\">get<\/span><span class=\"p\">(<\/span><span class=\"s\">'text'<\/span><span class=\"p\">)<\/span>\r\n      <span class=\"k\">return<\/span> <span class=\"n\">text<\/span>\r\n<\/code><\/pre>\n<\/div>\n<p><code class=\"highlighter-rouge\">processRequest<\/code> processes the REST API request to the OCR API. 
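<\/p>\n<p>Stripped of its retry logic and error handling, the request it sends is a single authenticated POST. The standalone sketch below builds that same request without sending it (the subscription key and image URL are placeholders, and the region in the endpoint is assumed from the snippet above):<\/p>\n<div class=\"language-python highlighter-rouge\">\n<pre class=\"highlight\"><code>  import requests\r\n  \r\n  _key = 'YOUR_VISION_API_KEY'  # placeholder subscription key\r\n  _url = 'https:\/\/westus.api.cognitive.microsoft.com\/vision\/v1.0\/ocr'\r\n  \r\n  headers = {'Ocp-Apim-Subscription-Key': _key,\r\n             'Content-Type': 'application\/json'}\r\n  params = {'language': 'en', 'detectOrientation': 'true'}\r\n  image_url = {'url': 'https:\/\/example.com\/receipt.jpg'}  # hypothetical image\r\n  \r\n  # Build (without sending) the POST that processRequest would issue;\r\n  # requests.Session().send(prepared) would actually execute it.\r\n  prepared = requests.Request('POST', _url, json=image_url,\r\n                              headers=headers, params=params).prepare()\r\n  print(prepared.url)  # endpoint plus the language and orientation parameters\r\n<\/code><\/pre>\n<\/div>\n<p>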
For more information on this routine, see <a href=\"https:\/\/github.com\/Microsoft\/Cognitive-Vision-Python\/blob\/master\/Jupyter%20Notebook\/Computer%20Vision%20API%20Example.ipynb\">an example<\/a> on GitHub.<\/p>\n<div class=\"language-python highlighter-rouge\">\n<pre class=\"highlight\"><code>  <span class=\"c\"># Process request<\/span>\r\n  <span class=\"k\">def<\/span> <span class=\"nf\">processRequest<\/span><span class=\"p\">(<\/span> <span class=\"n\">image_url<\/span><span class=\"p\">,<\/span> <span class=\"n\">image_file<\/span><span class=\"p\">,<\/span> <span class=\"n\">headers<\/span><span class=\"p\">,<\/span> <span class=\"n\">params<\/span> <span class=\"p\">):<\/span>\r\n\r\n      <span class=\"s\">\"\"\"\r\n      Ref: https:\/\/github.com\/Microsoft\/Cognitive-Vision-Python\/blob\/master\/Jupyter<\/span><span class=\"si\">%20<\/span><span class=\"s\">Notebook\/Computer<\/span><span class=\"si\">%20<\/span><span class=\"s\">Vision<\/span><span class=\"si\">%20<\/span><span class=\"s\">API<\/span><span class=\"si\">%20<\/span><span class=\"s\">Example.ipynb\r\n      Helper function to process the request to Project Oxford\r\n      Parameters:\r\n      json: Used when processing images from its URL. See API Documentation\r\n      data: Used when processing image read from disk. 
See API Documentation\r\n      headers: Used to pass the key information and the data type request\r\n      \"\"\"<\/span>\r\n\r\n      <span class=\"n\">retries<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">0<\/span>\r\n      <span class=\"n\">result<\/span> <span class=\"o\">=<\/span> <span class=\"bp\">None<\/span>\r\n\r\n      <span class=\"k\">while<\/span> <span class=\"bp\">True<\/span><span class=\"p\">:<\/span>\r\n          <span class=\"n\">response<\/span> <span class=\"o\">=<\/span> <span class=\"n\">requests<\/span><span class=\"o\">.<\/span><span class=\"n\">request<\/span><span class=\"p\">(<\/span> <span class=\"s\">'post'<\/span><span class=\"p\">,<\/span> <span class=\"n\">_url<\/span><span class=\"p\">,<\/span> <span class=\"n\">json<\/span> <span class=\"o\">=<\/span> <span class=\"n\">image_url<\/span><span class=\"p\">,<\/span> <span class=\"n\">data<\/span> <span class=\"o\">=<\/span> <span class=\"n\">image_file<\/span><span class=\"p\">,<\/span> <span class=\"n\">headers<\/span> <span class=\"o\">=<\/span> <span class=\"n\">headers<\/span><span class=\"p\">,<\/span> <span class=\"n\">params<\/span> <span class=\"o\">=<\/span> <span class=\"n\">params<\/span> <span class=\"p\">)<\/span>\r\n          \r\n          <span class=\"k\">if<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">status_code<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">429<\/span><span class=\"p\">:<\/span> \r\n              <span class=\"k\">print<\/span><span class=\"p\">(<\/span> <span class=\"s\">\"Message: <\/span><span class=\"si\">%<\/span><span class=\"s\">s\"<\/span> <span class=\"o\">%<\/span> <span class=\"p\">(<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">json<\/span><span class=\"p\">()[<\/span><span class=\"s\">'message'<\/span><span class=\"p\">]<\/span> <span class=\"p\">)<\/span> <span class=\"p\">)<\/span>\r\n\r\n              <span class=\"k\">if<\/span> 
<span class=\"n\">retries<\/span> <span class=\"o\">&lt;=<\/span> <span class=\"n\">_maxNumRetries<\/span><span class=\"p\">:<\/span> \r\n                  <span class=\"n\">time<\/span><span class=\"o\">.<\/span><span class=\"n\">sleep<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span> \r\n                  <span class=\"n\">retries<\/span> <span class=\"o\">+=<\/span> <span class=\"mi\">1<\/span>\r\n                  <span class=\"k\">continue<\/span>\r\n              <span class=\"k\">else<\/span><span class=\"p\">:<\/span> \r\n                  <span class=\"k\">print<\/span><span class=\"p\">(<\/span> <span class=\"s\">'Error: failed after retrying!'<\/span> <span class=\"p\">)<\/span>\r\n                  <span class=\"k\">break<\/span>\r\n\r\n          <span class=\"k\">elif<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">status_code<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">200<\/span> <span class=\"ow\">or<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">status_code<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">201<\/span><span class=\"p\">:<\/span>\r\n              <span class=\"k\">if<\/span> <span class=\"s\">'content-length'<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">headers<\/span> <span class=\"ow\">and<\/span> <span class=\"nb\">int<\/span><span class=\"p\">(<\/span><span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">headers<\/span><span class=\"p\">[<\/span><span class=\"s\">'content-length'<\/span><span class=\"p\">])<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span><span class=\"p\">:<\/span> \r\n                  <span class=\"n\">result<\/span> <span class=\"o\">=<\/span> <span class=\"bp\">None<\/span> \r\n              <span class=\"k\">elif<\/span> <span class=\"s\">'content-type'<\/span> 
<span class=\"ow\">in<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">headers<\/span> <span class=\"ow\">and<\/span> <span class=\"nb\">isinstance<\/span><span class=\"p\">(<\/span><span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">headers<\/span><span class=\"p\">[<\/span><span class=\"s\">'content-type'<\/span><span class=\"p\">],<\/span> <span class=\"nb\">str<\/span><span class=\"p\">):<\/span> \r\n                  <span class=\"k\">if<\/span> <span class=\"s\">'application\/json'<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">headers<\/span><span class=\"p\">[<\/span><span class=\"s\">'content-type'<\/span><span class=\"p\">]<\/span><span class=\"o\">.<\/span><span class=\"n\">lower<\/span><span class=\"p\">():<\/span> \r\n                      <span class=\"n\">result<\/span> <span class=\"o\">=<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">json<\/span><span class=\"p\">()<\/span> <span class=\"k\">if<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">content<\/span> <span class=\"k\">else<\/span> <span class=\"bp\">None<\/span> \r\n                  <span class=\"k\">elif<\/span> <span class=\"s\">'image'<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">headers<\/span><span class=\"p\">[<\/span><span class=\"s\">'content-type'<\/span><span class=\"p\">]<\/span><span class=\"o\">.<\/span><span class=\"n\">lower<\/span><span class=\"p\">():<\/span> \r\n                      <span class=\"n\">result<\/span> <span class=\"o\">=<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">content<\/span>\r\n          <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\r\n              <span class=\"k\">print<\/span><span class=\"p\">(<\/span><span 
class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">json<\/span><span class=\"p\">())<\/span> \r\n              <span class=\"k\">print<\/span><span class=\"p\">(<\/span> <span class=\"s\">\"Error code: <\/span><span class=\"si\">%<\/span><span class=\"s\">d\"<\/span> <span class=\"o\">%<\/span> <span class=\"p\">(<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">status_code<\/span> <span class=\"p\">)<\/span> <span class=\"p\">);<\/span> \r\n              <span class=\"k\">print<\/span><span class=\"p\">(<\/span> <span class=\"s\">\"Message: <\/span><span class=\"si\">%<\/span><span class=\"s\">s\"<\/span> <span class=\"o\">%<\/span> <span class=\"p\">(<\/span> <span class=\"n\">response<\/span><span class=\"o\">.<\/span><span class=\"n\">json<\/span><span class=\"p\">()[<\/span><span class=\"s\">'message'<\/span><span class=\"p\">]<\/span> <span class=\"p\">)<\/span> <span class=\"p\">);<\/span> \r\n\r\n          <span class=\"k\">break<\/span>\r\n          \r\n      <span class=\"k\">return<\/span> <span class=\"n\">result<\/span>\r\n<\/code><\/pre>\n<\/div>\n<p>All the above snippets should be included in the <strong><code class=\"highlighter-rouge\">Execute Python Script<\/code><\/strong> module within the Azure ML Studio experiment.<\/p>\n<h2 id=\"results\">Results<\/h2>\n<p>The <strong><a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/azure\/Dn905963.aspx\">multiclass decision jungle<\/a><\/strong> and <strong><a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/azure\/Dn906030.aspx\">multiclass neural network<\/a><\/strong> modules have been tested, and the results are as shown below:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align: center\">Algorithm<\/th>\n<th style=\"text-align: center\">Decision Jungle<\/th>\n<th style=\"text-align: center\">Neural Network<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align: center\">Overall Accuracy<\/td>\n<td style=\"text-align: 
center\">0.786517<\/td>\n<td style=\"text-align: center\">0.837079<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table>\n<thead>\n<tr>\n<th style=\"text-align: center\">Decision Jungle<\/th>\n<th style=\"text-align: center\">Neural Network<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align: center\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/01\/6-class-decision-jungle-tuned-confusion-matrix.jpg\" alt=\"jpg: 6-class-decision-jungle-tuned-confusion-matrix\" \/><\/td>\n<td style=\"text-align: center\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/01\/6-class-nn-tuned-confusion-matrix.jpg\" alt=\"jpg: 6-class-nn-tuned-confusion-matrix\" \/><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 id=\"integration-into-a-mobile-app\">Integration into a Mobile App<\/h2>\n<p>A mobile app that consumes the published expense predictor can be built from this <a href=\"https:\/\/github.com\/CatalystCode\/receipt-recognition\">Xamarin-based mobile phone app<\/a>, found under the <code class=\"highlighter-rouge\">MobileApp<\/code> folder of our example repository. The app takes a picture of a receipt, sends it to the web service, and receives the predicted type of expense in return.<\/p>\n<h2 id=\"experiment-settings\">Experiment Settings<\/h2>\n<p>This section provides detailed information about the experiment settings. 
Readers are welcome to experiment with different settings and see how they affect the model performance.<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align: center\">Text Preprocessing<\/th>\n<th style=\"text-align: center\">Feature Hashing<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align: center\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/01\/preprocess-text.jpg\" alt=\"jpg:preprocess-text.jpg\" \/><\/td>\n<td style=\"text-align: center\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/01\/feature-hashing.jpg\" alt=\"jpg: feature-hashing\" \/><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table>\n<thead>\n<tr>\n<th style=\"text-align: center\">Neural Network<\/th>\n<th style=\"text-align: center\">Decision Jungle<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align: center\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/01\/multiclass-nn-settings.jpg\" alt=\"jpg:multiclass-nn-settings\" \/><\/td>\n<td style=\"text-align: center\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/01\/multiclass-decision-jungle-settings.jpg\" alt=\"jpg: multiclass-decision-jungle-settings\" \/><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><strong>Hyperparameter Tuning<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align: center\">Neural Network<\/th>\n<th style=\"text-align: center\">Decision Jungle<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align: center\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/01\/multiclass-nn-tuning.jpg\" alt=\"jpg:multiclass-nn-tuning.jpg\" \/><\/td>\n<td style=\"text-align: center\"><img decoding=\"async\" 
src=\"https:\/\/devblogs.microsoft.com\/cse\/wp-content\/uploads\/sites\/55\/2017\/01\/multiclass-decision-jungle-tuning.jpg\" alt=\"jpg: multiclass-decision-jungle-tuning\" \/><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 id=\"conclusions\">Conclusions<\/h2>\n<p>An Optical Character Recognition application can be built using Azure ML Studio for easy model development and deployment as a web service, interfacing with the Microsoft Cognitive Services Vision API via the <code class=\"highlighter-rouge\">Execute Python Script<\/code> module for custom Python code, and using Xamarin as the front-end user interface.<\/p>\n<h3 id=\"further-information\">Further Information<\/h3>\n<p>Please see the <a href=\"https:\/\/channel9.msdn.com\/Blogs\/Seth-Juarez\/Automate-your-expense-tracking-with-AgitareTech-and-Azure-Technologies\">Channel 9 video<\/a> for the story behind this project.<\/p>\n<h3 id=\"code\">Code<\/h3>\n<p><a href=\"https:\/\/github.com\/CatalystCode\/receipt-recognition\">Receipt-recognition<\/a> is the related GitHub repository.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Using Microsoft Cognitive Services Vision API Optical Character Recognition within Azure ML Studio to Predict Expense Type from Receipts.<\/p>\n","protected":false},"author":21356,"featured_media":11055,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[19],"tags":[66,81,250,284,393],"class_list":["post-2120","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","tag-azure-blob-storage","tag-azure-machine-learning-ml-studio","tag-microsoft-cognitive-services","tag-optical-character-recognition-ocr","tag-xamarin"],"acf":[],"blog_post_summary":"<p>Using Microsoft Cognitive Services Vision API Optical Character Recognition within Azure ML Studio to Predict Expense Type from 
Receipts.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/posts\/2120","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/users\/21356"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/comments?post=2120"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/posts\/2120\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/media\/11055"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/media?parent=2120"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/categories?post=2120"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/tags?post=2120"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}