RyuJIT CTP2: Getting Ready for Prime-time
This post announces an updated preview of the .NET team’s new 64-bit Just-In-Time (JIT) compiler. It was written by Mani Ramaswamy, Program Manager for the .NET Dynamic Code Execution Team.
Note: RyuJIT CTP3 is available here: http://blogs.msdn.com/b/dotnet/archive/2014/04/03/the-next-generation-of-net.aspx.
The developer preview of RyuJIT, CTP1, received a thunderous response (so much so we had to post a FAQ soon after). Two questions commonly asked were when would there be an update and when would it support feature X or Y that is in the existing 64-bit .NET JIT compiler. CTP2 answers both questions. This release of RyuJIT has equivalent functionality of existing JIT64: there aren’t any feature differences between RyuJIT and the existing JIT64 at this point. RyuJIT generates code that’s on average better than the existing JIT64, while it continues to maintain the 2X throughput wins over JIT64.
Improvements: Features, Reliability, Performance
The two main features which weren’t supported in CTP1 were “opportunistic” tail calls and Edit & Continue. With CTP2, both of these features are supported. Additionally, a host of other features have been added to achieve functional parity with JIT64. Along the way, we’ve (the .NET Code Generation team) also added a number of performance tweaks and optimizations so that code generated using RyuJIT is generated fast (the throughput metric) and runs fast (the code quality metric).
But why stop there? We have thrown every test at our disposal at RyuJIT and it has come out with flying colors – whether it be running common server software using IKVM.NET (a Java Virtual Machine implemented in .NET), or complex ASP.NET workloads, or even simple Windows Store apps. Thanks to everyone who tried out the first CTP of RyuJIT and filed bug reports – we’ve fixed every single one of them, and at this point, RyuJIT doesn’t have any known bugs.
We continue to look at ways to improve the overall quality of RyuJIT, and will likely discover a few more bugs along the way. From the enthusiastic response we got from the first CTP, we’ll surely hear back with a few more bugs from our early adopters, i.e. you.
When it comes to performance, CTP1 demonstrated that RyuJIT handily beats JIT64 on throughput (how fast the compiler generates code) by a factor of 2X. We’ve been careful to maintain our throughput wins, and this CTP should yield similar throughput numbers. With CTP1, the focus was on throughput and to get some early feedback, and not so much on code quality (how fast the generated code executes).
While with CTP1, we were in the same ball park as JIT64, we were still 10-20% slower on code quality, with some outliers. With CTP2, we’ve addressed that – at this point, on average we should be at par or beating JIT64 on code quality. If during your evaluation, you find a benchmark where RyuJIT is trailing JIT64 performance significantly, please reach out to us – by the time we’re done, RyuJIT should be producing code that’s better than what JIT64 produced. This is not to say that there couldn’t be a few micro-benchmarks where JIT64 produces more optimal code, but rather to say that on average RyuJIT should be on par or better, and in the few (rare) cases it does trail JIT64 performance, it trails by only a few percentage points. We tried out many common code quality benchmark suites internally, and found that RyuJIT code quality on average is better than the existing .NET JIT64 compiler – thus if you do find an outlier, we’re most interested.
The chart below shows our performance, relative to JIT64’s across a number of benchmarks, some very small, others fairly large. Positive numbers indicate RyuJIT performing better than JIT64. Negative numbers indicate the opposite. The gray section is the limit of “statistical noise” for each benchmark, so any bar that is within the gray area indicates effectively identical performance. Check the CodeGen blog within a day or two for a detailed description of the methodology and specifics about the benchmarks we’re running. Overall, we’re doing quite well, with only a handful of losses, and some very nice wins!
While we needed to first get all the functionality and quality metrics lined up and achieve parity on performance (code quality) with JIT64 (we’re already 2X faster on throughput, in case you forgot), our re-architecture puts us in a great place for optimizing .NET dynamic code execution scenarios. Over the next few months, you will continu