{"id":231151,"date":"2024-06-26T10:55:09","date_gmt":"2024-06-26T17:55:09","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/java\/?p=231151"},"modified":"2024-06-26T10:57:34","modified_gmt":"2024-06-26T17:57:34","slug":"improving-openjdk-scalar-replacement-part-2-3","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/java\/improving-openjdk-scalar-replacement-part-2-3\/","title":{"rendered":"Improving OpenJDK Scalar Replacement &#8211; Part 2\/3"},"content":{"rendered":"<p style=\"text-align: justify;\"><span data-contrast=\"auto\">In the <a href=\"https:\/\/devblogs.microsoft.com\/java\/improving-openjdk-scalar-replacement-part-1-3\/\">previous part of this blog series<\/a>, we explored the foundational concepts and purpose of scalar replacement (SR) in OpenJDK, laying the groundwork for understanding how this optimization can boost the performance of Java applications. Now, in the second installment of the series, we shift our focus to the specific enhancements that we have introduced to the SR implementation, highlighting how it lifts the constraints identified in the original SR implementation in the C2 compiler.<\/span><\/p>\n<h2><span data-ccp-props=\"{}\">The Improvement\u00a0<\/span><\/h2>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">Listing 1 shows a version of the <\/span><i><span data-contrast=\"auto\">CompositeChecksum<\/span><\/i> <span data-contrast=\"auto\">method that\u2019s slightly different from the previous one. In this version, there is a check to see if the message that we obtained from the list is null or not. If it is null, it creates a message object with the <\/span><i><span data-contrast=\"auto\">Clear<\/span><\/i><span data-contrast=\"auto\"> string as payload; otherwise, it just creates the message object using the <\/span><i><span data-contrast=\"auto\">msg<\/span><\/i><span data-contrast=\"auto\"> payload. 
This is quite a simple piece of code and a fairly common pattern. However, before our work to improve C2 scalar replacement, C2 would not scalar replace the <\/span><i><span data-contrast=\"auto\">Message <\/span><\/i><span data-contrast=\"auto\">objects in cases like this.\u00a0<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\"><a href=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing1.png\"><img decoding=\"async\" class=\"alignnone wp-image-231231\" src=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing1-300x143.png\" alt=\"Image listing1\" width=\"566\" height=\"270\" srcset=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing1-300x143.png 300w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing1-1024x488.png 1024w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing1-768x366.png 768w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing1.png 1480w\" sizes=\"(max-width: 566px) 100vw, 566px\" \/><\/a><\/span><\/p>\n<p style=\"text-align: center;\"><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\">\u00a0<\/span><b><span data-contrast=\"auto\">Listing 1:<\/span><\/b><span data-contrast=\"auto\"> The CompositeChecksum control flow merge version.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\">\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">To understand why this is a more complicated scenario for scalar replacing the objects, let us see how C2 represents the method shown in <\/span><i><span data-contrast=\"auto\">Listing 2<\/span><\/i><span data-contrast=\"auto\">, which is 
basically the first line of the loop in Listing 1. In the method two <\/span><i><span data-contrast=\"auto\">Message <\/span><\/i><span data-contrast=\"auto\">objects are allocated in different branches of an \u201c<\/span><i><span data-contrast=\"auto\">if<\/span><\/i><span data-contrast=\"auto\">\u201d but only a single field of one of the objects is later used.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><span data-ccp-props=\"{}\"><a href=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing2.png\"><img decoding=\"async\" class=\"alignnone wp-image-231232\" src=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing2-300x154.png\" alt=\"Image listing2\" width=\"560\" height=\"287\" srcset=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing2-300x154.png 300w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing2-1024x526.png 1024w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing2-768x395.png 768w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing2.png 1160w\" sizes=\"(max-width: 560px) 100vw, 560px\" \/><\/a><\/span><\/p>\n<p style=\"text-align: center;\"><b><span data-contrast=\"auto\">Listing 2:<\/span><\/b><span data-contrast=\"auto\"> A simple method with an allocation merge.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\">\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">From C2\u2019s \u201cperspective\u201d there is more than one place (allocation site) where the object that <\/span><i><span data-contrast=\"auto\">m<\/span><\/i><span data-contrast=\"auto\"> will eventually reference may be allocated. 
To be more precise, the variable <\/span><i><span data-contrast=\"auto\">m<\/span><\/i><span data-contrast=\"auto\"> will reference the object allocated in the <\/span><i><span data-contrast=\"auto\">then<\/span><\/i><span data-contrast=\"auto\"> block or the object allocated in the <em>else<\/em> block of the conditional. Note that this is a simple example where there are only two allocation sites; in practice, there may be many places where an object is assigned to the same reference. There are other cases where you would expect C2 to scalar replace objects, but it does not do so. However, the allocation merge issue was the most prominent at the time that we started our investigation. See this article where we discuss those cases: <\/span><a href=\"https:\/\/cr.openjdk.org\/~cslucas\/escape-analysis\/EscapeAnalysis.html\"><span data-contrast=\"none\">HotSpot Escape Analysis and Scalar Replacement Status<\/span><\/a><span data-contrast=\"auto\">.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">A detailed explanation of why C2 was not scalar replacing objects in such a code pattern is a topic for a more technical compiler-related blog post. 
However, just to give an idea why this is a complicated piece of code to optimize, consider these points:\u00a0<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<ol>\n<li><span data-contrast=\"auto\">There may be more than two allocation sites in the allocation merge.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/li>\n<li><span data-contrast=\"auto\">Some of the allocation sites may be an object being returned by another method.\u00a0<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/li>\n<li><span data-contrast=\"auto\">The allocation merge may be of Arrays, not of single dimensional (scalar) objects.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/li>\n<li><span data-contrast=\"auto\">An allocation site can simply be a <\/span><i><span data-contrast=\"auto\">null<\/span><\/i><span data-contrast=\"auto\"> expression.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/li>\n<li><span data-contrast=\"auto\">Objects can also be used, including changed, by other methods before they are assigned to <\/span><i><span data-contrast=\"auto\">m<\/span><\/i><span data-contrast=\"auto\">.\u00a0<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/li>\n<li><span data-contrast=\"auto\">In more extreme cases the object may be stored in global variables and accessed by other threads before the assignment to <\/span><i><span data-contrast=\"auto\">m<\/span><\/i><span data-contrast=\"auto\">.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/li>\n<li><span data-contrast=\"auto\">During scalar replacement, the compiler must track the values of each field of each potential object. 
There may be many fields and some of these fields are other objects.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/li>\n<li><span data-contrast=\"auto\">The objects may be created from different classes.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/li>\n<li><span data-contrast=\"auto\">C2 will need to handle objects used in deoptimization points.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/li>\n<\/ol>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">Another complicating factor is the approach used by C2 to represent the code of the method while compiling it. C2 uses a concept called sea-of-nodes to represent methods internally. That representation has a property called <\/span><a href=\"https:\/\/en.wikipedia.org\/wiki\/Static_single-assignment_form\"><span data-contrast=\"none\">Static Single-Assignment<\/span><\/a><span data-contrast=\"auto\"> (SSA) which basically means that every piece of data (variable, object, etc.) can only be created in one specific point in the method. One way of making sure that this property is maintained is by labelling every piece of <em>data<\/em> in the method with a <\/span><i><span data-contrast=\"auto\">version<\/span><\/i><span data-contrast=\"auto\"> number. You can imagine a version as <em>roughly<\/em> the line number of where the data is created. If the data (variable, object, etc.) can be created in more than one place it will have different <\/span><i><span data-contrast=\"auto\">versions<\/span><\/i><span data-contrast=\"auto\">. 
At some points in the method, it will be necessary to \u201cmerge\u201d the versions of the variables, and to do that a <\/span><i><span data-contrast=\"auto\">Phi <\/span><\/i><span data-contrast=\"auto\">function or <\/span><i><span data-contrast=\"auto\">Phi <\/span><\/i><span data-contrast=\"auto\">node is used. When the compiler parses a method and notes that a variable can point to more than one version of data, it inserts a <\/span><i><span data-contrast=\"auto\">Phi<\/span><\/i><span data-contrast=\"auto\"> node that represents all data versions that the variable can point to.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">Figure 1 shows a partial illustration of how C2 will represent the method shown in Listing 2. In this illustration, rectangles and squares represent instructions in the compiler\u2019s internal representation, while an arrow connecting two instructions means that the target of the arrow uses data created by the source of the arrow. The two <\/span><i><span data-contrast=\"auto\">Allocate<\/span><\/i><span data-contrast=\"auto\"> nodes represent instructions used to allocate memory regions for each of the two allocation sites in the method. The <\/span><i><span data-contrast=\"auto\">CheckCastPP<\/span><\/i><span data-contrast=\"auto\"> nodes represent instructions that cast the pointers to the allocated memory regions into pointers to actual Message objects. 
Note that this is how the compiler <\/span><i><span data-contrast=\"auto\">represents<\/span><\/i><span data-contrast=\"auto\"> the method. It does not mean that both object allocations will always happen; it only means that they <\/span><i><span data-contrast=\"auto\">can<\/span><\/i><span data-contrast=\"auto\"> happen.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">In our illustration, C2 inserted the <\/span><i><span data-contrast=\"auto\">Phi<\/span><\/i><span data-contrast=\"auto\"> node to represent the fact that the variable <\/span><i><span data-contrast=\"auto\">m<\/span><\/i><span data-contrast=\"auto\"> can point to an object allocated in one of two different allocation sites. This <\/span><i><span data-contrast=\"auto\">Phi<\/span><\/i><span data-contrast=\"auto\"> node is what we referred to earlier as an allocation merge. The <\/span><i><span data-contrast=\"auto\">Phi<\/span><\/i><span data-contrast=\"auto\"> node itself is also versioned and so the instructions after it, in this case, refer to a single version. 
Just for completeness, the AddP instruction computes the address of a field of an object, the Load instruction fetches data from that field, and the Return instruction represents returning some data to the caller of the method.\u00a0<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\"><a href=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Untitled-1.png\"><img decoding=\"async\" class=\" wp-image-231155 aligncenter\" src=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Untitled-1-231x300.png\" alt=\"Image Untitled 1\" width=\"325\" height=\"422\" srcset=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Untitled-1-231x300.png 231w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Untitled-1-789x1024.png 789w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Untitled-1-768x996.png 768w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Untitled-1.png 952w\" sizes=\"(max-width: 325px) 100vw, 325px\" \/><\/a>\n<\/span><\/p>\n<p style=\"text-align: center;\"><b><span data-contrast=\"auto\">Figure 1:<\/span><\/b><span data-contrast=\"auto\"> How C2 internally represents part of the <\/span><i><span data-contrast=\"auto\">whichPayload<\/span><\/i><span data-contrast=\"auto\"> method.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Scalar replacement, however, needs to know exactly which object is being used at every point where it needs to patch the code. 
If there is ambiguity, that is, if a <\/span><i><span data-contrast=\"auto\">Phi<\/span><\/i><span data-contrast=\"auto\"> node involves the object that we want to scalar replace, then we need a more complex algorithm to decide which object to use and, moreover, to patch the code of the method<\/span><b><span data-contrast=\"auto\">.<\/span><\/b><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">We tried different approaches but, in the end, the most obvious solution turned out to be the easiest one to implement and the one that performed best in the benchmarks. The idea is very simple: instead of using the conditional only to decide which object we should load the field from, we replicate the field load itself inside the branches of the conditional, so that every branch has a field load, and then the <\/span><i><span data-contrast=\"auto\">Phi<\/span><\/i><span data-contrast=\"auto\"> node is used to represent that we can use either of the field loads, not either of the objects.\u00a0<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{}\"><a href=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Figure2.png\"><img decoding=\"async\" class=\" wp-image-231157 aligncenter\" src=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Figure2-231x300.png\" alt=\"Image Figure2\" width=\"353\" height=\"458\" srcset=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Figure2-231x300.png 231w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Figure2-789x1024.png 789w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Figure2-768x996.png 768w, 
https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/Figure2.png 952w\" sizes=\"(max-width: 353px) 100vw, 353px\" \/><\/a><\/span><\/p>\n<p style=\"text-align: center;\"><b><span data-contrast=\"auto\">Figure 2:<\/span><\/b><span data-contrast=\"auto\"> How we solved the allocation merge problem.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\">\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">Figure 2 shows an illustration of what I am referring to and Listing 3 shows the Java version of it. Note that there are 2 field loads, one for each object, and the <\/span><i><span data-contrast=\"auto\">Phi <\/span><\/i><span data-contrast=\"auto\">node now uses the values loaded from each of the objects. So now what we are doing is using the <\/span><i><span data-contrast=\"auto\">Phi<\/span><\/i><span data-contrast=\"auto\"> node to decide which of the loaded values we should return, not which of the objects we should load the field from. After that transformation, the earlier scalar replacement logic in C2 can scalar replace both objects represented at the top because the objects themselves are not used by <\/span><i><span data-contrast=\"auto\">Phi <\/span><\/i><span data-contrast=\"auto\">nodes. 
Listing 4 shows the code after scalar replacement and after other optimizations kick in to just simplify the code.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/carbon-3.png\"><img decoding=\"async\" class=\"wp-image-231239 aligncenter\" src=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/carbon-3-300x150.png\" alt=\"Image carbon 3\" width=\"662\" height=\"331\" srcset=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/carbon-3-300x150.png 300w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/carbon-3-1024x513.png 1024w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/carbon-3-768x385.png 768w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/carbon-3.png 1130w\" sizes=\"(max-width: 662px) 100vw, 662px\" \/><\/a><\/p>\n<p style=\"text-align: center;\"><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\">\u00a0<\/span><b><span data-contrast=\"auto\">Listing 3:<\/span><\/b><span data-contrast=\"auto\"> Merging field loads instead of objects.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\">\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\"><a href=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing4.png\"><img decoding=\"async\" class=\" wp-image-231234 aligncenter\" src=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing4-300x125.png\" alt=\"Image listing4\" width=\"610\" height=\"254\" srcset=\"https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing4-300x125.png 300w, 
https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing4-1024x427.png 1024w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing4-768x320.png 768w, https:\/\/devblogs.microsoft.com\/java\/wp-content\/uploads\/sites\/51\/2024\/06\/listing4.png 1160w\" sizes=\"(max-width: 610px) 100vw, 610px\" \/><\/a><\/span><b><span data-contrast=\"auto\">Listing 4:<\/span><\/b><span data-contrast=\"auto\"> After scalar replacing untangled allocation merges.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">As I mentioned earlier, the idea was simple. However, as expected, the implementation was the most difficult part. I showed just the overall idea that we used to solve the problem and the simplest case, where objects are used only to load a single field. In practice there are many more use cases that we need to handle. The most obvious one is when you are loading several fields from the objects or storing values to these fields, as well as a mix of these operations. Other use cases, like the objects being used as a Monitor, also must be considered. The ways in which the objects are used were not the only challenge: getting the representation of the method updated correctly was particularly difficult. In practice these representations can have thousands of nodes. It is also necessary to cooperate with other optimizations that change the representation before and after scalar replacement.\u00a0<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">Two other big challenges remained. The first was handling deoptimization: what happens when compiled code is executing and some event forces execution of the method to fall back to the interpreter. 
That was something that needed a lot of work and we had to do it before everything else that we worked on. The other major challenge was updating and\/or traversing the memory graph. By that I mean finding which value to use for a field when an object is scalar replaced. In the example that I showed earlier, the field loads and the use of the data were very close together, but in a bigger method they can be very far apart, so you need to traverse the graph to find the node that holds the correct value for the field. Of course, we also had to avoid breaking any of the existing code, especially the existing scalar replacement code.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<h2 aria-level=\"2\"><span data-contrast=\"none\">How to Give it a Try<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:160,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h2>\n<p style=\"text-align: justify;\"><span data-contrast=\"auto\">These improvements are enabled by default in OpenJDK Tip and the Microsoft Build of OpenJDK versions 11, 17 and 21. To use them, download our JDK and launch your application as you normally would! If you want to <em>disable<\/em> the optimization, let\u2019s say for benchmarking purposes, or because you found an issue, you can do so by adding these options to the JVM launch configuration: <\/span><i><span data-contrast=\"auto\">-XX:+UnlockDiagnosticVMOptions -XX:-ReduceAllocationMerges<\/span><\/i><span data-contrast=\"auto\">.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:6,&quot;335551620&quot;:6}\">\u00a0<\/span><\/p>\n<h2 aria-level=\"2\">Conclusion<\/h2>\n<p style=\"text-align: justify;\">In summary, by transforming object allocation merges into merges of object fields, we make it possible for the C2 compiler to scalar replace the objects and also perform additional optimizations to the code. 
These enhancements make applications run smoother and faster, using resources more effectively. <em><strong>In the final part of this series, we\u2019ll share the results of our work and how these changes positively impact application performance.<\/strong><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the previous part of this blog series, we explored the foundational concepts and purpose of scalar replacement (SR) in OpenJDK, laying the groundwork for understanding how this optimization can boost the performance of Java applications. Now, in the second installment of the series, we shift our focus to the specific enhancements that we have [&hellip;]<\/p>\n","protected":false},"author":125793,"featured_media":227205,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"image","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[751,248,319,819],"class_list":["post-231151","post","type-post","status-publish","format-image","has-post-thumbnail","hentry","category-java","tag-garbage-collection","tag-java","tag-openjdk","tag-scalar-replacement","post_format-post-format-image"],"acf":[],"blog_post_summary":"<p>In the previous part of this blog series, we explored the foundational concepts and purpose of scalar replacement (SR) in OpenJDK, laying the groundwork for understanding how this optimization can boost the performance of Java applications. 
Now, in the second installment of the series, we shift our focus to the specific enhancements that we have [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/posts\/231151","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/users\/125793"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/comments?post=231151"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/posts\/231151\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/media\/227205"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/media?parent=231151"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/categories?post=231151"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/tags?post=231151"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}