Announcing ‘Machine Learning .NET’ 0.5

Cesar De la Torre

Today, coinciding with .NET Conf 2018, we’re announcing the release of ML.NET 0.5. It’s been a few months since we released ML.NET 0.1, a cross-platform, open source machine learning framework for .NET developers, at //Build 2018. As we evolve through new preview releases, we are getting great feedback, and we would like to thank the community for your engagement as we continue to develop ML.NET together in the open.

In this 0.5 release we are adding TensorFlow model scoring as a transform to ML.NET. This enables using an existing TensorFlow model within an ML.NET experiment. We are also addressing a variety of issues and feedback we received from the community. We welcome feedback and contributions to the conversation: relevant issues can be found here.

As part of the road ahead for ML.NET, we really want your feedback on making ML.NET easier to use. We are working on a new ML.NET API that improves flexibility and ease of use. Once the new API is ready and mature enough, we plan to deprecate the current LearningPipeline API. Because this will be a significant change, we are sharing our proposals for the API options, with comparisons, at the end of this blog post. We also want an open discussion where you can provide feedback and help shape the long-term API for ML.NET.

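This blog post provides details about the following topics in ML.NET:

  • Added a TensorFlow model scoring transform (TensorFlowTransform)

  • Explore the upcoming new ML.NET API (after 0.5) and provide feedback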

Added a TensorFlow model scoring transform (TensorFlowTransform)

TensorFlow is a popular deep learning and machine learning toolkit that enables training deep neural networks (and general numeric computations).

Deep learning is a subset of AI and machine learning that teaches programs to do what comes naturally to humans: learn by example.
Its main differentiator compared to traditional machine learning is that a deep learning model can learn to perform tasks such as object detection, classification, speech recognition and language translation directly from images, sound or text, whereas traditional ML approaches rely heavily on feature engineering and data processing.
Deep learning models need to be trained by using very large sets of labeled data and neural networks that contain multiple layers. Deep learning is currently popular for two main reasons. First, it simply performs better on some tasks, such as computer vision. Second, it can take advantage of the huge amounts of data that are becoming available nowadays (and it requires that volume in order to perform well).

With ML.NET 0.5 we are starting to add support for Deep Learning in ML.NET. Today we are introducing the first level of integration with TensorFlow in ML.NET through the new TensorFlowTransform, which enables taking an existing TensorFlow model, either trained by you or downloaded from somewhere else, and getting the scores from the TensorFlow model in ML.NET.

This new TensorFlow scoring capability doesn’t require you to have a working knowledge of TensorFlow internal details. Longer term we will be working on making the experience for performing Deep Learning with ML.NET even easier.

The implementation of this transform is based on code from TensorFlowSharp.

As shown in the following diagram, you simply add a reference to the ML.NET NuGet packages in your .NET Core or .NET Framework apps. Under the covers, ML.NET includes and references the native TensorFlow library which allows you to write code that loads an existing trained TensorFlow model file for scoring.

TensorFlow-ML.NET application diagram

The following code snippet shows how to use the TensorFlow transform in the ML.NET pipeline:

The complete code example for the snippet above, which uses the TensorFlowTransform, the TensorFlow Inception v3 model and the existing LearningPipeline API, can be found here.

The code example above uses the pre-trained TensorFlow model named Inception v3, which you can download from here. Inception v3 is a very popular image recognition model trained on the ImageNet dataset, where the TensorFlow model tries to classify entire images into a thousand classes, like “Umbrella”, “Jersey”, and “Dishwasher”.

The Inception v3 model is a deep convolutional neural network that can achieve reasonable performance on hard visual recognition tasks, matching or exceeding human performance in some domains. The model/algorithm was developed by multiple researchers and is based on the original paper: “Rethinking the Inception Architecture for Computer Vision” by Szegedy et al.

In the next ML.NET releases, we will add functionality to enable identifying the expected inputs and outputs of TensorFlow models. For now, use the TensorFlow APIs or a tool like Netron to explore the TensorFlow model.

If you open the previous sample TensorFlow model file (tensorflow_inception_graph.pb) with Netron and explore the model’s graph, you can see how it correlates the InputColumn with the node’s input at the beginning of the graph:

TensorFlow model's input in graph

You can also see how the OutputColumn correlates with the softmax2_pre_activation node’s output, almost at the end of the graph:

TensorFlow model's output in graph

Limitations: We are currently updating the ML.NET APIs for improved flexibility, because there are a few limitations when using TensorFlow in ML.NET today. For now (when using the LearningPipeline API), these scores can only be used within a LearningPipeline as inputs (numeric vectors) to a learner, such as a classifier. With the upcoming new ML.NET APIs, the TensorFlow model scores will be directly accessible, so you will be able to score with the TensorFlow model without adding an additional learner and its related training process, as this sample currently has to do. The sample creates a multi-class classification ML.NET model based on a StochasticDualCoordinateAscentClassifier, using a label (object name) paired with the numeric feature vector that the TensorFlow model generates (scores) for each image file.
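
The two-stage arrangement described in the limitations above (model scores used as numeric features for a separately trained classifier) can be sketched in plain C# without any ML.NET types. This is only an illustration under stated assumptions: `Score` is a hypothetical stand-in for the TensorFlow model, the file names are made up, and a simple nearest-centroid classifier is swapped in for the StochasticDualCoordinateAscentClassifier to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScoresAsFeaturesDemo
{
    // Hypothetical stand-in for the TensorFlow model: returns a fixed score
    // vector per image file name. A real model computes scores from pixels.
    public static double[] Score(string imageFile)
    {
        return imageFile.StartsWith("umbrella", StringComparison.Ordinal)
            ? new[] { 0.9, 0.1 }
            : new[] { 0.2, 0.8 };
    }

    // Swapped-in learner (nearest centroid, not SDCA): training averages
    // the score vectors of all images that share a label.
    public static Dictionary<string, double[]> Train(
        IEnumerable<KeyValuePair<string, string>> fileToLabel)
    {
        return fileToLabel
            .GroupBy(pair => pair.Value, pair => Score(pair.Key))
            .ToDictionary(
                group => group.Key,
                group => new[] { group.Average(v => v[0]), group.Average(v => v[1]) });
    }

    // Prediction scores the image first, then picks the closest label centroid.
    public static string Predict(Dictionary<string, double[]> centroids, string imageFile)
    {
        var scores = Score(imageFile);
        return centroids
            .OrderBy(c => Math.Pow(c.Value[0] - scores[0], 2)
                        + Math.Pow(c.Value[1] - scores[1], 2))
            .First()
            .Key;
    }

    public static void Main()
    {
        var training = new Dictionary<string, string>
        {
            ["umbrella1.jpg"] = "Umbrella",
            ["umbrella2.jpg"] = "Umbrella",
            ["jersey1.jpg"] = "Jersey"
        };

        var model = Train(training);
        Console.WriteLine(Predict(model, "umbrella3.jpg"));
        Console.WriteLine(Predict(model, "jersey2.jpg"));
    }
}
```

The point of the sketch is the shape of the flow: the scoring step never produces a label by itself, so a second, trained model is needed before a prediction comes out.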

Keep in mind that the TensorFlow code examples mentioned here use the current LearningPipeline API available in v0.5. Moving forward, the ML.NET API for using TensorFlow will be slightly different and not based on the “pipeline”. This relates to the next section of this blog post, which focuses on the upcoming new API for ML.NET.

Finally, we also want to highlight that the ML.NET framework is currently surfacing TensorFlow, but in the future we might look into additional Deep Learning library integrations, such as Torch and CNTK.

You can find an additional code example/test using the TensorFlowTransform with the existing LearningPipeline API here.

Explore the upcoming new ML.NET API (after 0.5) and provide feedback

As mentioned at the beginning of this blog post, we are really looking forward to getting your feedback as we create the new ML.NET API. This evolution in ML.NET offers more flexible capabilities than the current LearningPipeline API. The LearningPipeline API will be deprecated when this new API is ready and mature enough.

The following are examples of feedback we received, in the form of GitHub issues, about the limitations of the LearningPipeline API:

Therefore, based on feedback on the LearningPipeline API, quite a few weeks ago we decided to switch to a new ML.NET API that would address most of the limitations the LearningPipeline API currently has.

Design principles for this new ML.NET API

We are designing this new API based on the following principles:

  • Using terminology that parallels other well-known frameworks like Scikit-Learn, TensorFlow and Spark. We will try to be consistent in naming and concepts, making it easier for developers to understand and learn ML.NET.

  • Keeping simple ML scenarios, such as basic train and predict, simple and concise.

  • Allowing advanced ML scenarios (not possible with the current LearningPipeline API as explained in the next section).

We have also explored API approaches like Fluent API, declarative, and imperative.
For additional deeper discussion on principles and required scenarios, check out this issue in GitHub.

Why is ML.NET switching from the LearningPipeline API to a new API?

As part of the preview version crafting process (remember that ML.NET is still in early previews), we’ve been getting LearningPipeline API feedback and discovered quite a few limitations we need to address by creating a more flexible API.

Specifically, the new ML.NET API offers attractive features which aren’t possible with the current LearningPipeline API:

  • Strongly-typed API: This new strongly-typed API takes advantage of C# capabilities so that errors can be discovered at compile time, along with improved IntelliSense in the editors.

  • Better flexibility: This API provides a decomposable train and predict process, eliminating rigid and linear pipeline execution. With the new API, you can execute a certain code path and then fork the execution so that multiple paths can re-use the initial common execution. For example, you can share a given transform’s execution and transformed data with multiple learners and trainers, or decompose pipelines and add multiple learners.

This new API is based on concepts such as Estimators, Transforms and DataView, shown in the code later in this blog post.

  • Improved usability: You call the APIs directly from your code, with no more scaffolding or isolation layer creating an obscure separation between what the user/developer writes and the internal APIs. Entrypoints are no longer mandatory.

  • Ability to simply score with TensorFlow models. Thanks to the mentioned flexibility in the API, you can also simply load a TensorFlow model and score by using it without needing to add any additional learner and training process, as explained in the previous “Limitations” topic within the TensorFlow section.

  • Better visibility of the transformed data: You have better visibility of the data while applying transformers.

Comparison of strongly-typed API vs. LearningPipeline API

Another important comparison is related to the Strongly Typed API feature in the new API.
As an example of the issues you can run into without a strongly typed API, the LearningPipeline API (as illustrated in the following code) provides access to data columns by specifying the columns’ names as strings, so if you make a typo (e.g. you write “Descrption”, without the ‘i’, instead of “Description”, as in the sample code), you will get a run-time exception:

However, the new ML.NET API is strongly typed, so if you make a typo, it will be caught at compile time, and you can also take advantage of IntelliSense in the editor.
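
The difference can be illustrated without any ML.NET types. The sketch below is not the ML.NET API: `GitHubIssue`, `GetByName` and `GetTyped` are hypothetical names chosen for this illustration. It contrasts looking up a column by a string name, where a typo surfaces only at run time, with selecting it through a typed member, where the compiler rejects the typo.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type, used only for this illustration (not an ML.NET type).
public class GitHubIssue
{
    public string Title { get; set; }
    public string Description { get; set; }
}

public static class ColumnAccessDemo
{
    // String-keyed access: a misspelled column name compiles fine
    // and fails only when the code runs.
    public static string GetByName(IDictionary<string, string> row, string columnName)
    {
        return row[columnName]; // throws KeyNotFoundException for an unknown name
    }

    // Typed access: the column is selected through a member,
    // so the compiler checks that it exists.
    public static string GetTyped(GitHubIssue issue, Func<GitHubIssue, string> selector)
    {
        return selector(issue);
    }

    public static void Main()
    {
        var row = new Dictionary<string, string>
        {
            ["Title"] = "Crash on startup",
            ["Description"] = "The app crashes when it starts."
        };

        try
        {
            GetByName(row, "Descrption"); // typo: missing 'i'
        }
        catch (KeyNotFoundException)
        {
            Console.WriteLine("Run-time failure: column 'Descrption' not found");
        }

        var issue = new GitHubIssue { Title = row["Title"], Description = row["Description"] };
        Console.WriteLine(GetTyped(issue, i => i.Description));
        // GetTyped(issue, i => i.Descrption); // would not compile: no such member
    }
}
```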

Details on decomposable train and predict API

The following code snippet shows how the transforms and training process of the “GitHub issues labeler” sample app can be implemented with the new API in ML.NET.

This is our current proposal and based on your feedback this API will probably evolve accordingly.

New ML.NET API code example:

Compare with the following old LearningPipeline API code snippet that lacks flexibility because the pipeline execution is not decomposable but linear:

Old LearningPipeline API code example:

The old LearningPipeline API is a fully linear code path, so you can’t decompose it in multiple pieces.
For instance, the BikeSharing ML.NET sample (available at the machine-learning-samples GitHub repo) is using the current LearningPipeline API.

This sample compares the regression learner accuracy using the evaluators API by:

  • Performing several data transforms to the original dataset
  • Training and creating seven different ML.NET models based on seven different regression trainers/algorithms (such as FastTreeRegressor, FastTreeTweedieRegressor, StochasticDualCoordinateAscentRegressor, etc.)

The intent is to help you compare the regression learners for a given problem.

Since the data transformations are the same for those models, you might want to re-use the code execution related to transforms. However, because the LearningPipeline API only provides a single linear execution, you need to run the same data transformation steps for every model you create/train, as shown in the following code excerpt from the BikeSharing ML.NET sample.

The BuildAndTrain() method needs to include both the data transforms and the algorithm that differs per case, as shown in the following code:

With the old LearningPipeline API, every time you train with a different algorithm you need to run the same process again, repeating the following steps:

  • Load dataset from file
  • Make column transformations (concat, copy, or additional featurizers or dictionarizers, if needed)

But with the new ML.NET API based on Estimators and DataView you will be able to re-use parts of the execution, like in this case, re-using the data transforms execution as the base for multiple models using different algorithms.
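
The cost of the linear approach, and the benefit of sharing the transformed data, can be sketched in plain C#. This is a simplified illustration rather than the new ML.NET API: `LoadAndTransform` and the trainer delegates are hypothetical stand-ins, and a counter records how often the data-preparation steps run.

```csharp
using System;
using System.Collections.Generic;

public static class SharedTransformDemo
{
    // Counts how many times the data-preparation step runs.
    public static int TransformRuns;

    // Hypothetical stand-in for "load dataset from file + column transformations".
    public static double[] LoadAndTransform()
    {
        TransformRuns++;
        return new[] { 1.0, 2.0, 3.0 };
    }

    // Linear style: every model repeats the load and transform steps.
    public static List<string> TrainLinear(IList<Func<double[], string>> trainers)
    {
        var models = new List<string>();
        foreach (var train in trainers)
        {
            models.Add(train(LoadAndTransform()));
        }
        return models;
    }

    // Decomposed style: transform once, then share the result with every trainer.
    public static List<string> TrainShared(IList<Func<double[], string>> trainers)
    {
        var transformed = LoadAndTransform();
        var models = new List<string>();
        foreach (var train in trainers)
        {
            models.Add(train(transformed));
        }
        return models;
    }

    public static void Main()
    {
        // Each "trainer" is a placeholder that just describes the model it would build.
        var trainers = new List<Func<double[], string>>
        {
            data => "model A trained on " + data.Length + " values",
            data => "model B trained on " + data.Length + " values",
            data => "model C trained on " + data.Length + " values"
        };

        TransformRuns = 0;
        TrainLinear(trainers);
        Console.WriteLine("Linear: transforms ran " + TransformRuns + " times");

        TransformRuns = 0;
        TrainShared(trainers);
        Console.WriteLine("Shared: transforms ran " + TransformRuns + " time(s)");
    }
}
```

With three trainers, the linear version runs the transforms three times, while the shared version runs them once. The gap grows with the number of algorithms being compared, such as the seven regression trainers in the BikeSharing sample.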

You can also explore other “aspirational code examples” with the new API here

Because this will be a significant change in ML.NET we want to share our proposals and start an open discussion with you where you can provide your feedback and help shape the long-term API for ML.NET.

Provide your feedback on the new API

Want to get involved? Start by providing feedback in the comments on this blog post or through issues at the ML.NET GitHub repo.

Get started!

If you haven’t already, get started with ML.NET here!

Next, explore some other great resources:

We look forward to your feedback and welcome you to file issues with any suggestions or enhancements in the ML.NET GitHub repo.

This blog was authored by Cesar de la Torre, Gal Oshri, John Alexander, and Ankit Asthana

Thanks,

The ML.NET Team

Cesar De la Torre

Principal Program Manager, .NET

