ML.NET is an open-source, cross-platform machine learning framework for .NET developers that enables integration of custom machine learning into .NET apps.
In this post, we’ll cover the following items:
- Notebook Editor in Visual Studio
- Model Builder updates
- Progress on addressing ML.NET pain points
Notebook Editor in Visual Studio
Interactive Notebooks are used extensively in data science and machine learning. They are great for data exploration and preparation, experimentation, model explainability, and even education.
Last year, .NET Interactive Notebooks were announced, and you can currently use .NET Interactive Notebooks in VS Code as an extension.
After talking to customers, the team decided to experiment with Interactive Notebooks in Visual Studio, which resulted in the new Notebook Editor extension!
Getting started with Notebook Editor
Notebook Editor is only available in Visual Studio 2022 starting with Preview 4 and is currently offered as an experimental (preview) extension.
To try it out, you should first:
- Install Visual Studio 2022 Preview 4 (or newer).
- Install the Notebook Editor extension from the Visual Studio Marketplace.
Then, there are two entry points to get started with Notebook Editor in Visual Studio.
The first entry point is from ML.NET Model Builder, where you can get a generated Notebook with content based on your own data and model.
To get a Notebook from Model Builder:
- Install the latest version of Model Builder for VS 2022.
- Train a model with Model Builder and go to the Consume step.
- Under Project templates, add the Notebook to your solution.
- Double-click the .ipynb file that is now in the Solution Explorer to open the Notebook in Notebook Editor.
The generated Notebook from Model Builder contains:
- The training pipeline for the model chosen by Model Builder, so that you can see how your model was trained and easily retrain it.
- Plots and graphs for data exploration and model explainability techniques, so that you can more easily understand and explain your data and model.
The second entry point for Notebooks is to simply add a new Notebook from the Add New Item dialog.
- Right-click your project in the Solution Explorer.
- Select Add > New Item…
- In the Add New Item dialog, select Notebook and add it to your project.
This creates a blank Notebook with no content. You can try adding some C# code to a cell in the Notebook and running it.
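As a minimal sketch of a first cell, the snippet below computes an average over a few made-up numbers and prints it. The `temperatures` values are hypothetical sample data and are not from Model Builder or the generated Notebook; any small piece of C# would serve equally well to confirm that cells run.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; any values will do for a first cell.
var temperatures = new[] { 18.5, 21.0, 19.5, 23.0 };

// LINQ works in a Notebook cell just as it does in a regular C# project.
var average = temperatures.Average();

Console.WriteLine($"Average of {temperatures.Length} readings: {average:F2}");
```

Variables declared in one cell should remain available to later cells, so a follow-up cell could reuse `temperatures` without redefining it.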
This is the first version of Notebooks in Visual Studio! If you have any feedback, issues, or questions about Notebook Editor or Notebooks in Visual Studio, please file an issue in our GitHub repo.
Consumption code improvements
After you train a model in Model Builder, the Consumption file is generated and added to your project. This Consumption file contains a Predict() method which you can use to make predictions with your model in your end-user application.
This method abstracts away several steps that are needed to consume an ML.NET model:
- Initializing an MLContext
- Loading the model
- Creating a PredictionEngine
- Using the PredictionEngine and the model to make the prediction on the input data
In the previously generated model consumption code, all of these steps ran inside the Predict() method, so they were repeated on every call. This slowed down every prediction.
We updated the code so that the setup steps (initializing the MLContext, loading the model, and creating the PredictionEngine) happen only once, no matter how many times Predict() is called.
The new code is shown below:
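    // Assumed declaration: the post does not show this field, but PredictEngine.Value
    // implies a lazily initialized static along these lines.
    private static Lazy<PredictionEngine<ModelInput, ModelOutput>> PredictEngine =
        new Lazy<PredictionEngine<ModelInput, ModelOutput>>(() => CreatePredictEngine(), true);

    // Reuses the cached PredictionEngine; setup runs only on the first call.
    public static ModelOutput Predict(ModelInput input)
    {
        var predEngine = PredictEngine.Value;
        return predEngine.Predict(input);
    }

    // Runs once: creates the MLContext, loads the model, and builds the engine.
    private static PredictionEngine<ModelInput, ModelOutput> CreatePredictEngine()
    {
        var mlContext = new MLContext();
        ITransformer mlModel = mlContext.Model.Load(MLNetModelPath, out var _);
        return mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(mlModel);
    }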
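The difference between running setup on every call and running it once can be sketched without ML.NET. The snippet below is only an illustration under stated assumptions: `CreateEngine` is a hypothetical stand-in for the expensive setup (it returns a simple function instead of a real PredictionEngine), and `setupCalls` counts how often that setup runs under each approach.

```csharp
using System;
using System.Threading;

// Counts how many times the (stand-in) expensive setup runs.
int setupCalls = 0;

// Hypothetical stand-in for CreatePredictEngine(): pretend that building the
// engine is costly, and return a simple function instead of a real model.
Func<int, int> CreateEngine()
{
    setupCalls++;
    return x => x * 2;
}

// Old shape: setup happens inside every prediction call.
int PredictEveryTime(int input) => CreateEngine()(input);

// New shape: Lazy<T> runs the factory once, on first access to .Value.
var lazyEngine = new Lazy<Func<int, int>>(CreateEngine, LazyThreadSafetyMode.ExecutionAndPublication);
int PredictLazy(int input) => lazyEngine.Value(input);

for (int i = 0; i < 3; i++) PredictEveryTime(i);
int callsWithoutLazy = setupCalls;   // 3

setupCalls = 0;
for (int i = 0; i < 3; i++) PredictLazy(i);
int callsWithLazy = setupCalls;      // 1

Console.WriteLine($"Setup runs without Lazy: {callsWithoutLazy}, with Lazy: {callsWithLazy}");
```

In this sketch the per-call version runs setup three times for three predictions, while the Lazy<T> version runs it once. That is the saving the updated Predict() method is after.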
Read more about all of this month’s updates in the Release Notes.
Progress on addressing ML.NET pain points
As mentioned in the last ML.NET blog post, the following items were identified as the top pain points or blockers in this year’s ML.NET customer development:
- Small ML.NET Community
- Afraid Microsoft will abandon the framework
- Lack of / quality docs and samples
- Lack of deep learning support
- Specific ML scenario or algorithm not supported by ML.NET
Below we have outlined the steps we’ve taken so far and progress we’ve made in each area.
Small ML.NET Community
The team continues to host the Machine Learning .NET Community Standup every other week to talk about what we’re working on and to educate and engage with the community. We’ve also added a new story to the ML.NET Customer Showcase and are working on adding more.
We are also encouraging contributions to ML.NET. Issues labeled good first issue and up-for-grabs on GitHub are a great place to start!
Additionally, following the previous .NET monthly themes of F#, Razor, and IoT, October will be focused on machine learning! The team is currently planning out lots of machine learning and ML.NET content and is looking forward to working with the community on this.
Afraid Microsoft will abandon the framework
ML.NET is .NET, and to make it feel more like a part of .NET, we’ve decided to align with the .NET release schedule. This means that we will ship our next version of ML.NET (v1.7.0) with .NET 6.0 in November 2021 and will ship subsequent major releases (ML.NET 2.0, 3.0, etc.) with major releases of .NET. In between, we will ship production-ready preview releases so that we can continue adding new features to the framework throughout the year.
We are also taking steps to organize the dotnet/machinelearning repo and keep it up to date. We are currently revising our triage processes so that we can address your issues and feedback faster. Issues will be linked to version releases in the Projects section of the repo so you can see what we’re actively working on and when we plan to release.
Check out the roadmap to see what we have planned for ML.NET this year.
Lack of / quality docs and samples
We have invested more resources in content development so that our Docs stay up to date, documentation for new features arrives faster, and more relevant samples are added.
@luisquintanilla (Microsoft content developer) and @jwood803 (ML.NET community member and newly contracted docs/samples developer) have both been working hard to ensure that we increase the quality of ML.NET documentation. They have set several goals, including reducing the average days to close Docs issues and publishing documentation for new features no more than two weeks after a new feature is released.
In the past two months, 19 articles have been updated, and a new article on how to label images for object detection has been added. This month, the team is working on adding two new tutorials for image classification and recommendation in Model Builder to Docs as well as updating the samples to the newest version of ML.NET.
You can file issues and make suggestions for ML.NET documentation in the dotnet/docs repo and for ML.NET samples in the dotnet/machinelearning-samples repo.
Lack of deep learning support
This past year we’ve been working on our plan for deep learning in .NET, and now we are ready to execute that plan to expand ML.NET’s deep learning support.
As part of this plan, we will:
- Make it easier to consume ONNX models in ML.NET using the ONNX Runtime (RT)
- Fully support and productionize TorchSharp for building neural networks in .NET
- Build a bridge between TorchSharp and ML.NET
Read more about the deep learning plan and leave your feedback in this tracking issue.
Specific ML scenario or algorithm not supported by ML.NET
We have added Named Entity Recognition, a highly requested scenario since ML.NET was released, to the roadmap.
The deep learning plan will also enable a variety of other scenarios so that you can train custom models for object detection, NLP tasks, and more in .NET.
If there is still a scenario or algorithm missing that is not covered in the roadmap, please let us know by filing an issue.
Get started and resources
Learn more about ML.NET and Model Builder in Microsoft Docs.
If you run into any issues or have feature requests or feedback, please file an issue in the ML.NET API repo or the ML.NET Tooling (Model Builder & ML.NET CLI) repo on GitHub.
Join the ML.NET Community Discord.
Tune in to the Machine Learning .NET Community Standup every other Wednesday at 10am Pacific Time.
I will never touch this because of pain point #2.
Can you elaborate? We are dedicated to empowering .NET Developers to leverage machine learning! We’d love to have you try it out!
Very glad to see the new release. I almost lost my patience when Microsoft.Data.Analysis 0.4.0 was going nowhere for more than a year.
Great news guys! Love to see the TorchSharp bridge!… Keep up the good work!
It would be a good start to be able to easily compile TorchSharp. It’s difficult and time-consuming due to many third-party libraries and version constraints.
Glad to see DataFrame is not dead and you guys are still planning to do more work on it.
Link to Notebook Editor extension is broken.
Found this one:
https://marketplace.visualstudio.com/items?itemName=MLNET.notebook
Good catch! It’s been updated now. 🙂
https://marketplace.visualstudio.com/manage/publishers/mlnet/extensions/notebook/ is 404
right url: https://marketplace.visualstudio.com/items?itemName=MLNET.notebook
PS C:\Users\geffzhang\tke> dotnet tool install -g --add-source "https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-tools/nuget/v3/index.json" Microsoft.dotnet-interactive
Welcome to .NET 6.0!
SDK Version: 6.0.100-rc.1.21463.6
Telemetry
The .NET tools collect usage data in order to help us improve your experience. It is collected by Microsoft and shared with the community. You can opt out of telemetry by setting the DOTNET_CLI_TELEMETRY_OPTOUT environment variable to "1" or "true" using your favorite shell.
Read more about .NET CLI Tools telemetry: https://aka.ms/dotnet-cli-telemetry
Installed an ASP.NET Core HTTPS development certificate.
To trust the certificate, run "dotnet dev-certs https --trust" (Windows and macOS only).
Learn about HTTPS: https://aka.ms/dotnet-https
Write your first app: https://aka.ms/dotnet-hello-world
Find out what's new: https://aka.ms/dotnet-whats-new
Explore documentation: https://aka.ms/dotnet-docs
Report issues and find source on GitHub: https://github.com/dotnet/core
Use "dotnet --help" to see available commands or visit: https://aka.ms/dotnet-cli
C:\Program Files\dotnet\sdk\6.0.100-rc.1.21463.6\NuGet.targets(130,5): error : Unable to load the service index for source http://118.25.132.153:30020/v3/index.json. [d:\Temp\o2zvszea.ovl\restore.csproj]
C:\Program Files\dotnet\sdk\6.0.100-rc.1.21463.6\NuGet.targets(130,5): error : No connection could be made because the target machine actively refused it. (118.25.132.153:30020)...