Crossgen2 is an exciting new platform addition and part of the .NET 6 release. It is a new tool that enables both generating and optimizing code in a new way.
The crossgen2 project is a significant effort, and is the focus of multiple engineers. I thought it might be interesting to try a more conversational approach to exploring new features. I sent a set of questions to the team. Simon Nattress offered to tell us more about crossgen2. Let’s see what he said. I’ll provide my own thoughts, too.
What is crossgen for and when should it be used?
Simon: Crossgen is a tool that provides ahead-of-time (AOT) compilation for your code so that the need for JITting at runtime is reduced. When publishing your application, Crossgen runs the JIT over all assemblies and stores the JITted code in an extra section that can be quickly fetched at runtime. Crossgen should be used in scenarios where fast startup is important.
Rich: You might see crossgen and readytorun terms used interchangeably. Crossgen is a tool that generates native code in (at least today) the readytorun format. The readytorun format is primarily oriented toward being compatible across assemblies, and having the same compatibility guarantee as IL, while offering the performance benefits of ahead-of-time compiled code. Starting with crossgen2, it has some other modes with other characteristics.
Why are we making a new version of crossgen? What are our goals?
Simon: Crossgen’s pedigree comes from the early .NET Framework days. Its implementation is tightly coupled with the runtime (it is essentially just the runtime and JIT attached to a PE file emitter). We are building a new version of Crossgen, Crossgen2, which starts with a new code base architected to be a compiler that can perform analysis and optimizations not possible with the previous version.
Rich: As the .NET Core project became more mature and we saw usage grow across multiple application scenarios, we realized that crossgen’s limitation of only really being able to produce native code of one flavor with one set of characteristics was going to be a big problem. For example, we might want to generate code with different characteristics for Windows desktop on one hand and Linux containers on the other. The need for that level of code generation diversity is what motivated the project.
Is crossgen -> crossgen2 similar to the native code csc -> managed Roslyn transition? How long has it been worked on?
Simon: The Roslyn transition to managed was not just a rewrite in a different language. It defined an analysis platform for using CSC as an API. It can be used as a compiler and as a source code analyzer in an editor. Similarly, Crossgen2 is not simply a rewrite in managed code. The architecture uses a graph to drive analysis and compilation. This allows scanners, optimizers, and analyzers to all work off a common representation of the assembly being compiled. The project has been worked on for 2 years; the Crossgen2 compiler originated as a research project around 2016.
Rich: We have a lot of people on the team that primarily write C/C++ (even assembly), but most people prefer writing C# and are more productive in it. Every release, more of the product gets moved to C# for this and other reasons.
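To make the graph-driven idea Simon mentions a little more concrete, here is a minimal sketch in Python. It is a toy model, not Crossgen2's actual code: the `CALLS` table and the `compile_closure` function are hypothetical names invented for illustration. It assumes that compiling one node reveals the nodes it depends on, and that compilation proceeds from a set of roots until no new nodes are discovered.

```python
from collections import deque

# Hypothetical call graph: method name -> methods it calls.
# In a real compiler these edges would be discovered while compiling each method.
CALLS = {
    "Main": ["Parse", "Run"],
    "Parse": ["Tokenize"],
    "Run": ["Tokenize", "Log"],
    "Tokenize": [],
    "Log": [],
    "Unused": ["Log"],
}


def compile_closure(roots, calls):
    """Return the methods reachable from the roots, in the order first reached."""
    marked = set(roots)        # nodes already scheduled for compilation
    worklist = deque(roots)    # nodes waiting to be processed
    order = []
    while worklist:
        method = worklist.popleft()
        order.append(method)   # stand-in for "generate code for this node"
        for dep in calls[method]:
            if dep not in marked:
                marked.add(dep)
                worklist.append(dep)
    return order


compiled = compile_closure(["Main"], CALLS)
print(compiled)
```

In this toy graph, `Unused` is never reached from `Main`, so it is never compiled. A scanner or analyzer could walk the same graph without generating any code, which is the kind of sharing a common representation makes possible.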
What are the key benefits and also the drawbacks from writing crossgen in C#?
Simon: Writing in C# gives us access to a rich set of .NET APIs as well as memory safety guarantees provided by using a managed language. A drawback of using C# is increased processing time when using Crossgen2 on many small assemblies at once because of the overhead of starting the runtime many times. Fortunately, we can mitigate much of that by running Crossgen2 on itself!
Rich: It is also super helpful being on the same team as the folks adding new capabilities to C# and .NET libraries. There is a lot of shared thinking and collaboration on low-level scenarios to enable C# to be a high-performance language. The more challenges we run into to make low-level code fast, the more we add features to fix that. It’s a virtuous cycle.
Can you describe some of the projects that are planned that are made possible with crossgen2?
Simon: Crossgen2 (unlike native Crossgen) allows us to analyze and compile multiple assemblies at once as a single servicing unit with extra optimizations allowed within the compile set.
Rich: Version bubbles is the feature that Simon is referring to, and is one of my favorite new features. By default, readytorun code is versionable, and that’s a great characteristic. I work a lot on containers and they have a key characteristic of immutability, which makes versionability unimportant. Version bubbles trade versionability for performance. That’s perfect for scenarios like containers where you’d much prefer greater performance and don’t have to give anything up for it. I’m looking forward to offering more nuanced and opinionated code in scenarios where it makes sense.
Rich: Versionability is a big topic, but I feel the need to expand on it a little. Let’s start with the Book of the Runtime. “When changes to managed code are made, we have to make sure that all the artifacts in a native code image only depend on information in other modules that cannot change without breaking the compatibility rules. What is interesting about this problem is that the constraints only come into play when you cross module boundaries.” Inlining is the perfect example. Methods can be inlined within the same assembly (equivalent to “module”) because the method being inlined and the method it is being inlined into reside within the same compatibility boundary. You cannot update one without updating the other. If you inline across assembly boundaries, then the original code (that was inlined) could change, and what was a performance optimization now exhibits functionally incorrect behavior. That’s very bad. Version bubbles enable redefining the version boundary, but it is up to you to maintain that contract, and it isn’t a .NET code generation bug if you don’t.
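The inlining rule described above can be sketched as a small decision function. This is a simplified illustration in Python, not the logic Crossgen2 actually uses: `can_inline`, the assembly names, and the `version_bubble` parameter are all hypothetical, and the real rules are more nuanced than a single set-membership check.

```python
def can_inline(caller_asm, callee_asm, version_bubble=frozenset()):
    """Toy rule: inline only when caller and callee are guaranteed to version together."""
    # Same assembly: both methods are always updated together.
    if caller_asm == callee_asm:
        return True
    # Different assemblies: allowed only if both were placed in one version bubble.
    return caller_asm in version_bubble and callee_asm in version_bubble


# Default behavior: each assembly is its own compatibility boundary.
print(can_inline("App.dll", "App.dll"))            # True
print(can_inline("App.dll", "Lib.dll"))            # False

# Opting both assemblies into one bubble widens the boundary.
bubble = frozenset({"App.dll", "Lib.dll"})
print(can_inline("App.dll", "Lib.dll", bubble))    # True
print(can_inline("App.dll", "Other.dll", bubble))  # False
```

The last call shows the contract: an assembly outside the bubble can still be serviced independently, so code from it is not inlined. Everything inside the bubble has to be rebuilt and shipped together.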
Rich: Cross-compilation is another really important feature. You’ll be able to produce native code for Arm64 on an x64 machine and vice versa. For example, when you want to generate Arm64 code on an x64 machine, the SDK will acquire the Arm64 RyuJIT compiled for x64 so that it will run on an x64 machine. Cross-compilation is a key tenet of the architecture.
Could crossgen2 ever be used to target a runtime other than CoreCLR? For example, to enable the native AOT form factor?
Simon: Yes – much of the current Crossgen2 code is shared with the NativeAOT project which targets a different runtime. The managed type system implementation has been designed with extension points to allow for this flexibility.
What’s with the name? What’s the name you would prefer and why?
Simon: Crossgen originally started life as a cross-architecture AOT code generator for Windows Phone.
Rich: At one point, I tried to rename the tool “genr2r”, like “generator” but “r2r” at the end for “ready-to-run” but no one else was keen on that idea. At this point, I’m hoping that we’ll revert to just calling the tool “crossgen” after we’ve dropped our use of the existing crossgen tool.
Closing
First, thanks Simon for taking some time to tell us all about crossgen2. We also appreciate all your efforts on crossgen2. Simon has since moved to the Cosmos DB team. They use .NET, too!
While many of you will not use crossgen2 directly, you will certainly take advantage of the .NET platform being more optimized with this new tool. Going forward, crossgen2 will give us even more options to make higher performance choices for the platform and for your code.
This post was the first one that I’ve posted in a conversational style. Did you like it? Should we do this again? If so, which topics should we have a conversation about next?
Thank you for taking the time to write to us. Appreciate the useful info.
So this is a complicated quasi-AOT, where the complexity involves maintaining IL which is needed for versioning and compatibility. Is the purpose for large componentized systems like Visual Studio, where workloads need to be compatible with each other?
That would explain why Microsoft is very interested in this Crossgen technology (if it's needed to build VS), but app developers want a true AOT technology (because they just want to compile an app to native code specifically...
Great question and insight, Charles. Yes and no.
The key design scenario of readytorun isn't componentized systems, although it lends itself really well to that. It is servicing. There is so much context to describe here. I'll do my best. Unfortunately, some history is required, but I'll try to be brief.
With .NET Framework, we had NGEN. NGEN images are always what we call "fragile". NGEN has no concept of version bubbles. Unlike readytorun where the default...
Thanks for the very thorough and thoughtful response!
.NET Core is an amazing platform, and C# is an amazing language. Yet Microsoft chose Go to build Dapr. Is it for political reasons, or is .NET Core not as good as Go?
In addition to what Richard has said, Go is probably a reasonably good fit at this stage from a technical point of view too. Dapr is essentially a set of fairly lightweight components that run in sidecars, and for these, Go does a few things right:
Essentially, it's just a good way to build lightweight components for sidecars. IMO C# is a more capable language in general, and often quite a bit faster, but you'd have...
It is pretty simple, actually. This is all from my point of view, so the Dapr team might say something different, but I think they'd agree.
Dapr was intended to reach the hearts and minds of the CNCF community where golang is popular. When you are working with another community, it is a good idea to reduce friction as much as possible. Writing Dapr in Go helps with that. Microsoft is seen as a leading...
A few other important new capabilities of crossgen2 are:
1. Ability to build native composite binaries: with this, all precompiled code (including the Framework libraries + application code) will be placed in a single binary. This enables further optimizations like inlining across assembly boundaries, and compiling code for generic instantiations. We have seen measurable startup gain on Linux with composite mode.
2. Ability to specify native instruction sets like AVX, AVX2.
What is the difference between crossgen and ngen?
ngen is a technology which is included within .NET Framework. It's also a precompilation technique but runs on the target machine. Crossgen (and the latest crossgen2) are available in .NET Core and can be run during build time to generate native code based on OS/architecture.
Thank you, now I get it. I perceive crossgenX as being similar to cross compilation on Linux: building native code for a platform/architecture different from the host's. ngen, by contrast, was/is a .NET Framework and Windows-specific tool used after building apps. If I remember correctly, the main purpose of ngen was to generate native code for the BCL of the .NET Framework and third-party libraries.
Here is the way I think of it. There are three key characteristics:
Questions:
- Is the tool just a separate build of the runtime or a specialized tool?
- Can it cross-target to other OSes and architectures?
- Can it generate native images in the build?
- Can it generate images in multiple flavors?
Answers:
- NGEN and Crossgen 1 are a separate build of the runtime, and Crossgen 2 is a specialized tool.
- NGEN...
Thanks for the update; always nice to read what you are working on!
With regard to the format, I think it worked quite well for this and would like to see it used again 👍.
Thanks. I have two more already planned on related topics. Might as well stick with a similar theme.
While interesting, the juicy part was left out: what is on the roadmap for .NET Native? Is it going to be left unmaintained in its current state, and should we just move to desktop for anything beyond C# 7?
According to here
https://github.com/dotnet/runtimelab/issues/1120
There's no real roadmap or plans besides being experimental. As I mentioned to other people with the same problem as me, I think it is time to move on with Vala+GTK+Glade (for XML UI similar to what we have in XAML). There were quite a few nice apps there made in Vala, which is very similar to C#. It translates the software to C and compiles it. You have two advantages...
This project and .NET Native don't have a lot of overlap. Assuming we're talking about the same thing, .NET Native is used for UWP apps. We're waiting on the Project Reunion plan to come to fruition before making any changes for UWP and related apps. In terms of .NET Native itself, it is maintained but has a low level of investment. We will not be enabling .NET Native with Project Reunion. Any new native AOT...
We will not be enabling .NET Native with Project Reunion.
This is actually a large problem and puts MAUI in a difficult situation. Their current samples use Project Reunion: https://github.com/dotnet/net6-mobile-samples/blob/main/HelloWinUI3/HelloWinUI3/HelloWinUI3.csproj
Maybe they could avoid Project Reunion and use WinUI with UWP (which is AOT-compatible I believe). But then they might have to make a separate WPF platform as Xamarin currently does.
I'm not sure I have all the details but there is no point in Project...
Thanks for replying.
The lack of public information, just confirms that after being burned with XNA, Silverlight, WinRT => UAP => UWP rewrites, C++/CX being dropped without proper replacement (C++/WinRT with no VS tooling ain't it), and now .NET Native uncertainty, that focusing on Win32 is the only safe bet to avoid the continuous rewrite stream coming out from some Microsoft teams.
My long term experience on Microsoft eco-system has taught me to understand such reply as...
Just curious, what’s wrong with .NET Native? It is an already working, stable technology giving a great performance boost, sometimes up to 5-6x on modern hardware. Why did managers decide not to extend it to the whole of .NET?
Check out the form factors doc. A lot of the context is there.
Briefly:
This is great news.
Does this benefit those of us who use .NET Core for building AWS Lambda functions? In the past ReadyToRun was suggested as a way of optimizing performance on such setups. Can we expect an improvement if we were to use crossgen2 when AWS Lambda starts supporting .NET 6?
I'm on the AWS .NET team and I'm excited about this work.
This is the type of work in .NET Core that can really help Lambda cold starts. Lambda is a perfect use case for what Rich called a Version Bubble as Lambda is an immutable environment. So anything crossgen2 can do even going across module boundary I'm all for it when it comes to Lambda.
Besides the performance aspects the cross-compilation will really help the developer...
Norm,
That’s great.
Our team is moving from running dockerized ASP.NET Core APIs on ECS to a serverless approach. It’s been a recurring discussion point whether .NET is the right platform to build for Lambdas; especially given the cold start delays.
We are excited to see you guys are actively looking into these aspects.
Cold start has been a weaker aspect of .NET. The features landing in .NET 6 and .NET 7 should help a lot.
Richard,
That’s great to hear.
I guess, we’ll still not be able to get the benefits of .NET 7 on AWS Lambda as it’s not going to be LTS; just as we are stuck with 3.1 right now 🙁
Looking forward to 6.