{"id":35259,"date":"2025-05-13T10:00:25","date_gmt":"2025-05-13T10:00:25","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/cppblog\/?p=35259"},"modified":"2025-04-23T18:37:36","modified_gmt":"2025-04-23T18:37:36","slug":"introducing-the-forceinterlockedfunctions-switch-for-arm64","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/cppblog\/introducing-the-forceinterlockedfunctions-switch-for-arm64\/","title":{"rendered":"Introducing the \/forceInterlockedFunctions switch for ARM64"},"content":{"rendered":"<p>In Visual Studio 2022 17.14, we are introducing the <code>\/forceInterlockedFunctions[-]<\/code> switch, which generates and links with out-of-line atomics that select Armv8.1+ Large System Extension (LSE) atomic instructions based on CPU support.<\/p>\n<p>This switch is on by default for Armv8.0 and off for Armv8.1+. Outlining is necessary in Armv8.0 because this version&#8217;s interlocked intrinsics use exclusive instructions\u2014<code>LoadExcl<\/code>\/<code>StoreExcl<\/code>\u2014that do not guarantee forward progress. This can cause performance issues due to intermittent livelocks. <a href=\"https:\/\/developer.arm.com\/documentation\/ddi0487\/ka\">See Arm Architecture Reference Manual for A-profile architecture<\/a>, section &#8220;B2.17.5 Load-Exclusive and Store-Exclusive instruction usage restrictions&#8221; for examples of when the <code>LoadExcl<\/code>\/<code>StoreExcl<\/code> loop may not make forward progress.<\/p>\n<p>Below is an example of code that was previously generated when using the <code>_InterlockedAdd64<\/code> intrinsic. 
You can see the <code>ldaxr<\/code> and <code>stlxr<\/code> instructions being used.<\/p>\n<table style=\"width: 74.6316%;\">\n<tbody>\n<tr>\n<td style=\"width: 49.3177%;\">Main.cpp<\/td>\n<td style=\"width: 115.99%;\">Main.asm snippet<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 49.3177%;\">\n<pre class=\"prettyprint language-cpp\"><code class=\"language-cpp\">#include &lt;intrin.h&gt;\r\n#include &lt;stdio.h&gt;\r\n#include &lt;Windows.h&gt;\r\n\r\nint main() {\r\n    volatile __int64 Addend = 5;\r\n    __int64 Value = 1;\r\n    _InterlockedAdd64(&amp;Addend, Value);\r\n}<\/code><\/pre>\n<\/td>\n<td style=\"width: 115.99%;\">\n<pre class=\"prettyprint language-default\"><code class=\"language-default\">; _InterlockedAdd64(&amp;Addend, Value);\r\nldr x10,[sp]\r\nadd x9,sp,#8\r\n|$LN3@main|\r\nldaxr x8,[x9]\r\nadd x8,x8,x10\r\nstlxr wip0,x8,[x9]\r\ncbnz wip0,|$LN3@main|\r\ndmb ish<\/code><\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>With the <code>\/forceInterlockedFunctions<\/code> option, the <code>ldaxr<\/code> and <code>stlxr<\/code> instructions are gone, replaced by a single <code>bl _InterlockedAdd64<\/code> instruction that calls the out-of-line atomic function.<\/p>\n<table>\n<tbody>\n<tr>\n<td>Main.cpp<\/td>\n<td>Main.asm snippet<\/td>\n<\/tr>\n<tr>\n<td>\n<pre class=\"prettyprint language-cpp\"><code class=\"language-cpp\">#include &lt;intrin.h&gt;\r\n#include &lt;stdio.h&gt;\r\n#include &lt;Windows.h&gt;\r\n\r\nint main() {\r\n    volatile __int64 Addend = 5;\r\n    __int64 Value = 1;\r\n    _InterlockedAdd64(&amp;Addend, Value);\r\n}<\/code><\/pre>\n<\/td>\n<td>\n<pre class=\"prettyprint language-default\"><code class=\"language-default\">; _InterlockedAdd64(&amp;Addend, Value);\r\nldr x1,[sp,#0x10]\r\nadd x0,sp,#0x18\r\nbl _InterlockedAdd64\r\nnop<\/code><\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The <code>\/forceInterlockedFunctions<\/code> option applies only to Arm64 targets and is ignored otherwise. 
Additionally, enabling the LSE feature overrides the default outlining behavior in Armv8.0.<\/p>\n<p>Note that the option is on by default for all ARM64EC versions. We do not recommend turning the option off for ARM64EC, as outlining helps address the memory model differences between Arm64 and x64.<\/p>\n<p>This switch affects the following interlocked intrinsics:<\/p>\n<p>Key:<\/p>\n<ul>\n<li>Full: supports plain, <code>_acq<\/code>, <code>_rel<\/code>, and <code>_nf<\/code> forms.<\/li>\n<li>None: not supported.<\/li>\n<\/ul>\n<table>\n<tbody>\n<tr>\n<td><strong>Operation<\/strong><\/td>\n<td><strong>8<\/strong><\/td>\n<td><strong>16<\/strong><\/td>\n<td><strong>32<\/strong><\/td>\n<td><strong>64<\/strong><\/td>\n<td><strong>128<\/strong><\/td>\n<td><strong>Pointer<\/strong><\/td>\n<\/tr>\n<tr>\n<td><code>Add<\/code><\/td>\n<td>None<\/td>\n<td>None<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>None<\/td>\n<td>None<\/td>\n<\/tr>\n<tr>\n<td><code>And<\/code><\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>None<\/td>\n<td>None<\/td>\n<\/tr>\n<tr>\n<td><code>CompareExchange<\/code><\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<\/tr>\n<tr>\n<td><code>Decrement<\/code><\/td>\n<td>None<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>None<\/td>\n<td>None<\/td>\n<\/tr>\n<tr>\n<td><code>Exchange<\/code><\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>None<\/td>\n<td>Full<\/td>\n<\/tr>\n<tr>\n<td><code>ExchangeAdd<\/code><\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>None<\/td>\n<td>None<\/td>\n<\/tr>\n<tr>\n<td><code>Increment<\/code><\/td>\n<td>None<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>None<\/td>\n<td>None<\/td>\n<\/tr>\n<tr>\n<td><code>Or<\/code><\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>None<\/td>\n<td>None<\/td>\n<\/tr>\n<tr>\n<td
><code>Xor<\/code><\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>None<\/td>\n<td>None<\/td>\n<\/tr>\n<tr>\n<td><code>bittestandset<\/code><\/td>\n<td>None<\/td>\n<td>None<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>None<\/td>\n<td>None<\/td>\n<\/tr>\n<tr>\n<td><code>bittestandreset<\/code><\/td>\n<td>None<\/td>\n<td>None<\/td>\n<td>Full<\/td>\n<td>Full<\/td>\n<td>None<\/td>\n<td>None<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h4><strong>See also<\/strong><\/h4>\n<p><a href=\"https:\/\/learn.microsoft.com\/cpp\/build\/reference\/force-interlocked-functions?view=msvc-170\">\/forceInterlockedFunctions | Microsoft Learn<\/a><\/p>\n<p><a href=\"https:\/\/learn.microsoft.com\/cpp\/intrinsics\/arm64-intrinsics?view=msvc-170\">ARM64 intrinsics | Microsoft Learn<\/a><\/p>\n<p><a href=\"https:\/\/learn.microsoft.com\/cpp\/build\/reference\/feature-arm64?view=msvc-170\">\/feature (ARM64) | Microsoft Learn<\/a><\/p>\n<p><a href=\"https:\/\/learn.arm.com\/learning-paths\/servers-and-cloud-computing\/lse\/intro\/\">Introduction to Large System Extensions | Arm Learning Paths<\/a><\/p>\n<h4><strong>Feedback<\/strong><\/h4>\n<p>That covers the new compiler option and its default setting, both available starting in Visual Studio 2022 version 17.14. Please give it a try and let us know how it goes! We always welcome feedback, questions, or concerns from the community, as it helps make Visual Studio better.<\/p>\n<p>Please share your thoughts, comments, and questions with us through <a href=\"https:\/\/developercommunity.visualstudio.com\/home\">Developer Community<\/a>. 
You can also reach us on X <a href=\"https:\/\/x.com\/visualc\">@VisualC<\/a>, or via email at visualcpp@microsoft.com.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In Visual Studio 2022 17.14, we are introducing the \/forceInterlockedFunctions[-] switch, which generates and links with out-of-line atomics that select Armv8.1+ Large System Extension (LSE) atomic instructions based on CPU support. This switch is on by default for Armv8.0 and off for Armv8.1+. Outlining is necessary in Armv8.0 because this version&#8217;s interlocked intrinsics use exclusive [&hellip;]<\/p>\n","protected":false},"author":184952,"featured_media":35994,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[270,3946,1],"tags":[],"class_list":["post-35259","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-announcement","category-backend","category-cplusplus"],"acf":[],"blog_post_summary":"<p>In Visual Studio 2022 17.14, we are introducing the \/forceInterlockedFunctions[-] switch, which generates and links with out-of-line atomics that select Armv8.1+ Large System Extension (LSE) atomic instructions based on CPU support. This switch is on by default for Armv8.0 and off for Armv8.1+. 
Outlining is necessary in Armv8.0 because this version&#8217;s interlocked intrinsics use exclusive [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/35259","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/users\/184952"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/comments?post=35259"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/35259\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media\/35994"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media?parent=35259"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/categories?post=35259"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/tags?post=35259"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}