{"id":32010,"date":"2021-03-01T09:00:15","date_gmt":"2021-03-01T16:00:15","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/dotnet\/?p=32010"},"modified":"2021-02-26T13:46:53","modified_gmt":"2021-02-26T20:46:53","slug":"serve-ml-net-models-as-http-apis-with-minimal-configuration","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/dotnet\/serve-ml-net-models-as-http-apis-with-minimal-configuration\/","title":{"rendered":"Serve ML.NET Models as HTTP APIs with minimal configuration"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p>One of the most difficult tasks of building machine learning applications is deploying them to production. The ML.NET team is exploring ways to simplify the process and would like to hear your feedback.<\/p>\n<p>When it comes to deploying machine learning models as web services, the bare minimum you need is a single endpoint to handle making predictions. One way to do that in .NET is using a technique known as &#8220;route-to-code.&#8221; In this post, I&#8217;ll show how &#8220;route-to-code&#8221; can help you quickly build highly scalable machine learning web services in about 60 lines of code!<\/p>\n<p>For more information on <a href=\"https:\/\/docs.microsoft.com\/aspnet\/core\/web-api\/route-to-code?view=aspnetcore-5.0\">&#8220;route-to-code&#8221;<\/a>, see the ASP.NET documentation.<\/p>\n<h2>Install the Microsoft.Extensions.ML NuGet package<\/h2>\n<p>For web applications, it&#8217;s recommended to use the <code>PredictionEnginePool<\/code> service. This is a scalable service which provides an <code>ObjectPool<\/code> containing <code>PredictionEngine<\/code> objects that use an ML.NET model to make predictions on new data. 
The <code>PredictionEnginePool<\/code> service is part of the <a href=\"https:\/\/www.nuget.org\/packages\/Microsoft.Extensions.ML\/\"><code>Microsoft.Extensions.ML<\/code> NuGet package<\/a>.<\/p>\n<p>If you&#8217;re using the CLI, you can use the following command to install the NuGet package.<\/p>\n<pre><code class=\"dotnetcli\">dotnet add package Microsoft.Extensions.ML\r\n<\/code><\/pre>\n<h3>Configure the web application<\/h3>\n<p>A standard ASP.NET convention is to configure your application&#8217;s services and request pipeline in a class called <code>Startup<\/code>. Because we&#8217;re working with a single service and endpoint, we&#8217;ll instead configure our application inside the <code>Program<\/code> class. Add the following code to your <code>Main<\/code> method.<\/p>\n<pre><code class=\"csharp\">WebHost.CreateDefaultBuilder()\r\n    .ConfigureServices(services =&gt; {\r\n        \/\/ Register PredictionEnginePool service\r\n        services.AddPredictionEnginePool&lt;Input,Output&gt;()\r\n            .FromUri(\"https:\/\/github.com\/dotnet\/samples\/raw\/master\/machine-learning\/models\/sentimentanalysis\/sentiment_model.zip\");\r\n    })\r\n    .Configure(app =&gt; {\r\n        app.UseHttpsRedirection();\r\n    })\r\n    .Build()\r\n    .Run();\r\n<\/code><\/pre>\n<p>This code defines and builds the application&#8217;s web host. It also registers a <code>PredictionEnginePool<\/code> for a model hosted on GitHub. Once registered, you can access this service anywhere in your application through dependency injection.<\/p>\n<h3>Define model input and output schemas<\/h3>\n<p>Machine learning models use patterns learned from the training process to generate predictions using new data as input. 
The machine learning model used in this sample analyzes sentiment from input text and categorizes it as positive or negative.<\/p>\n<p>Define both of these classes in your application.<\/p>\n<pre><code class=\"csharp\">public class Input\r\n{\r\n    public string SentimentText { get; set; }\r\n\r\n    [ColumnName(\"Label\")]\r\n    public bool Sentiment { get; set; }\r\n}\r\n\r\npublic class Output\r\n{\r\n    [ColumnName(\"PredictedLabel\")]\r\n    public bool Prediction { get; set; }\r\n\r\n    public float Probability { get; set; }\r\n\r\n    public float Score { get; set; }\r\n}\r\n<\/code><\/pre>\n<p>The <code>Input<\/code> and <code>Output<\/code> classes define the schema of the model&#8217;s input and output, respectively. Note that the schema members are defined as properties rather than fields; <code>System.Text.Json<\/code>, which the handler uses to deserialize the request body, ignores public fields by default. Given an input value in the <code>SentimentText<\/code> property, the model outputs a boolean value for the <code>Prediction<\/code>, where <code>false<\/code> is negative and <code>true<\/code> is positive sentiment.<\/p>\n<h3>Create a handler to make predictions<\/h3>\n<p>To process incoming requests, you&#8217;ll want to create a handler. The handler is a method that leverages the <code>HttpContext<\/code> to access registered services (in this case <code>PredictionEnginePool<\/code>), read the request, and write out a response.<\/p>\n<pre><code class=\"csharp\">static async Task PredictHandler(HttpContext http)\r\n{\r\n    \/\/ Get PredictionEnginePool service\r\n    var predEngine = http.RequestServices.GetRequiredService&lt;PredictionEnginePool&lt;Input,Output&gt;&gt;();\r\n\r\n    \/\/ Deserialize HTTP request JSON body\r\n    var input = await JsonSerializer.DeserializeAsync&lt;Input&gt;(http.Request.Body);\r\n\r\n    \/\/ Predict using PredictionEnginePool service\r\n    var prediction = predEngine.Predict(input);\r\n\r\n    \/\/ Return prediction as JSON response\r\n    await http.Response.WriteAsJsonAsync(prediction);\r\n}\r\n<\/code><\/pre>\n<h3>Configure routes<\/h3>\n<p>Now that you have a handler, configure your application to route requests to it. 
Add the following code inside the <code>Configure<\/code> method.<\/p>\n<pre><code class=\"csharp\">app.UseRouting();\r\napp.UseEndpoints(endpoints =&gt; {\r\n    \/\/ Define prediction endpoint\r\n    endpoints.MapPost(\"\/predict\", PredictHandler);\r\n});\r\n<\/code><\/pre>\n<p>This code maps HTTP POST requests to the <code>\/predict<\/code> endpoint and uses the <code>PredictHandler<\/code> created previously to process those requests.<\/p>\n<p>Your final <code>Main<\/code> method should look like the following:<\/p>\n<pre><code class=\"csharp\">WebHost.CreateDefaultBuilder()\r\n    .ConfigureServices(services =&gt; {\r\n        \/\/ Register PredictionEnginePool service\r\n        services.AddPredictionEnginePool&lt;Input,Output&gt;()\r\n            .FromUri(\"https:\/\/github.com\/dotnet\/samples\/raw\/master\/machine-learning\/models\/sentimentanalysis\/sentiment_model.zip\");\r\n    })\r\n    .Configure(app =&gt; {\r\n        app.UseHttpsRedirection();\r\n        app.UseRouting();\r\n        app.UseEndpoints(endpoints =&gt; {\r\n            \/\/ Define prediction endpoint\r\n            endpoints.MapPost(\"\/predict\", PredictHandler);\r\n        });\r\n    })\r\n    .Build()\r\n    .Run();\r\n<\/code><\/pre>\n<p>That&#8217;s all you need to serve your machine learning model as an HTTP API!<\/p>\n<h2>Test your API<\/h2>\n<p>To test your API, run the application and make an HTTP POST request with a JSON body containing the <code>SentimentText<\/code> property.<\/p>\n<pre><code class=\"json\">{\r\n    \"SentimentText\": \"This is a very bad steak\"\r\n}\r\n<\/code><\/pre>\n<p>You should receive an <code>Output<\/code> response similar to the following:<\/p>\n<pre><code class=\"json\">{\r\n  \"prediction\": false,\r\n  \"probability\": 0.5,\r\n  \"score\": 0\r\n}\r\n<\/code><\/pre>\n<p>The <code>false<\/code> value in <code>prediction<\/code> indicates that the <code>SentimentText<\/code> provided in the request is 
negative.<\/p>\n<h2>Conclusion<\/h2>\n<p>In this post, we showed how &#8220;route-to-code&#8221; can help you quickly write a highly scalable machine learning ASP.NET web service. Try deploying your own machine learning models using the minimal &#8220;route-to-code&#8221; method and <a href=\"https:\/\/github.com\/dotnet\/machinelearning-modelbuilder\/issues\">give us feedback<\/a> on how to make it better.<\/p>\n<p>You can find a complete version of this application in the <a href=\"https:\/\/github.com\/luisquintanilla\/mlnet-http-api\">ML.NET HTTP API GitHub repository<\/a>. In that repository you&#8217;ll also find a <a href=\"https:\/\/github.com\/luisquintanilla\/mlnet-http-api\/tree\/api-endpoints\">sample<\/a> that uses the API Endpoints NuGet package to enable model binding and OpenAPI\/Swagger.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Learn how to serve machine learning models from HTTP APIs using minimal configuration<\/p>\n","protected":false},"author":26108,"featured_media":32051,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[685,691],"tags":[4,93,96],"class_list":["post-32010","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-dotnet","category-ml-dotnet","tag-net","tag-machine-learning","tag-ml-net"],"acf":[],"blog_post_summary":"<p>Learn how to serve machine learning models from HTTP APIs using minimal 
configuration<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/32010","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/users\/26108"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/comments?post=32010"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/32010\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media\/32051"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media?parent=32010"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/categories?post=32010"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/tags?post=32010"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}