{"id":1165,"date":"2018-04-11T01:06:58","date_gmt":"2018-04-10T17:06:58","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/seteplia\/?p=1165"},"modified":"2019-06-11T21:48:16","modified_gmt":"2019-06-12T04:48:16","slug":"performance-traps-of-ref-locals-and-ref-returns-in-c","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/premier-developer\/performance-traps-of-ref-locals-and-ref-returns-in-c\/","title":{"rendered":"Performance traps of ref locals and ref returns in C#"},"content":{"rendered":"<p>The C# language from the very first version supported passing arguments by value or by reference. But before C# 7 the C# compiler supported only one way of returning a value from a method (or a property) &#8211; returning by value. This has been changed in C# 7 with two new features: ref returns and ref locals.<\/p>\n<p>But unlike other features that were recently added to the C# language I&#8217;ve found these two a bit more controversial than the others.<\/p>\n<h4>The motivation<\/h4>\n<p>There are many differences between arrays and other collections from the CLR&#8217;s perspective. Arrays have been part of the CLR from the very beginning, and you can think of them as built-in generics. 
The CLR and the JIT compiler are aware of arrays, but they&#8217;re special in one more respect: <strong>the indexer of an array returns the element by reference, not by value<\/strong>.<\/p>\n<p>To demonstrate this behavior we have to go to the dark side and use a mutable value type:<\/p>\n<pre class=\"lang:default decode:true \">public struct Mutable\r\n{\r\n    private int _x;\r\n    public Mutable(int x) =&gt; _x = x;\r\n \r\n    public int X =&gt; _x;\r\n \r\n    public void IncrementX() { _x++; }\r\n}\r\n \r\n[Test]\r\npublic void CheckMutability()\r\n{\r\n    var ma = new[] {new Mutable(1)};\r\n    ma[0].IncrementX();\r\n    \/\/ X has been changed!\r\n    Assert.That(ma[0].X, Is.EqualTo(2));\r\n \r\n    var ml = new List&lt;Mutable&gt; {new Mutable(1)};\r\n    ml[0].IncrementX();\r\n    \/\/ X hasn't been changed!\r\n    Assert.That(ml[0].X, Is.EqualTo(1));\r\n}<\/pre>\n<p>The test will pass because the indexer of the array is quite different from the indexer of the <code>List&lt;T&gt;<\/code>.<\/p>\n<p>The C# compiler emits a special instruction for the array indexer &#8211; <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/system.reflection.emit.opcodes.ldelema(v=vs.110).aspx\"><code>ldelema<\/code><\/a>, which returns a managed reference to a given array element. In other words, the array indexer returns an element by reference. But <code>List&lt;T&gt;<\/code> can&#8217;t have the same behavior because it wasn&#8217;t possible (*) to return an alias to the internal state in C#. That&#8217;s why the <code>List&lt;T&gt;<\/code> indexer returns the element by value, i.e. 
returning a copy of the given element.<\/p>\n<p>(*) As we&#8217;ll see in a moment, it is still impossible for the <code>List&lt;T&gt;<\/code>&#8217;s indexer to return an element by reference.<\/p>\n<p>This means that <code>ma[0].IncrementX()<\/code> calls a mutation method on the first element inside of the array, but <code>ml[0].IncrementX()<\/code> calls a mutation method on a copy, keeping the original list unchanged.<\/p>\n<h4>Ref locals and ref returns 101<\/h4>\n<p>The basic idea behind these features is very simple: a <code>ref return<\/code> allows a method to return an alias to an existing variable, and a ref local can store that alias in a local variable.<\/p>\n<ol>\n<li>Simple example<\/li>\n<\/ol>\n<pre class=\"lang:default decode:true\">[Test]\r\npublic void RefLocalsAndRefReturnsBasics()\r\n{\r\n    int[] array = { 1, 2 };\r\n \r\n    \/\/ Capture an alias to the first element into a local\r\n    ref int first = ref array[0];\r\n    first = 42;\r\n    Assert.That(array[0], Is.EqualTo(42));\r\n \r\n    \/\/ Local function that returns the first element by ref\r\n    ref int GetByRef(int[] a) =&gt; ref a[0];\r\n    \/\/ Weird syntax: the result of a function call is assignable\r\n    GetByRef(array) = -1;\r\n    Assert.That(array[0], Is.EqualTo(-1));\r\n}<\/pre>\n<ol start=\"2\">\n<li>Ref returns and readonly ref returns<\/li>\n<\/ol>\n<p>Ref returns can return an alias to instance fields, and starting with C# 7.2 you can return a readonly alias using <code>ref readonly<\/code>:<\/p>\n<pre class=\"lang:default decode:true \">class EncapsulationWentWrong\r\n{\r\n    private readonly Guid _guid;\r\n    private int _x;\r\n \r\n    public EncapsulationWentWrong(int x) =&gt; _x = x;\r\n \r\n    \/\/ Return an alias to the private field. 
No encapsulation any more.\r\n    public ref int X =&gt; ref _x;\r\n \r\n    \/\/ Return a readonly alias to the private field.\r\n    public ref readonly Guid Guid =&gt; ref _guid;\r\n}\r\n \r\n[Test]\r\npublic void NoEncapsulation()\r\n{\r\n    var instance = new EncapsulationWentWrong(42);\r\n    instance.X++;\r\n \r\n    Assert.That(instance.X, Is.EqualTo(43));\r\n \r\n    \/\/ Cannot assign to property 'EncapsulationWentWrong.Guid' because it is a readonly variable\r\n    \/\/ instance.Guid = Guid.Empty;\r\n}<\/pre>\n<ul>\n<li>Methods and properties can return an &#8220;alias&#8221; to an internal state. <strong>The property, in this case, cannot have a setter.<\/strong><\/li>\n<li>Return by reference breaks encapsulation because the client obtains full control over the object&#8217;s internal state.<\/li>\n<li>Returning by readonly reference avoids a redundant copy for value types but prevents the client from mutating the internal state.<\/li>\n<li>You may use ref readonly for reference types even though it makes no sense for non-generic cases.<\/li>\n<\/ul>\n<ol start=\"3\">\n<li>Existing restrictions<\/li>\n<\/ol>\n<p>Returning an alias can be dangerous: using an alias to a stack-allocated variable after the method has returned can crash the app. 
To make the feature safe, the C# compiler enforces various restrictions:<\/p>\n<ul>\n<li>You cannot return a reference to a local variable.<\/li>\n<li>You cannot return a reference to <code>this<\/code> in structs.<\/li>\n<li>You can return a reference to a heap-allocated variable (like class members).<\/li>\n<li>You can return a reference to ref\/out parameters.<\/li>\n<\/ul>\n<p>For more information see the excellent post <a href=\"http:\/\/mustoverride.com\/safe-to-return\/\">Safe to return rules for ref returns<\/a> by Vladimir Sadov, the author of this feature in the C# compiler.<\/p>\n<p>Now that we know what these features are, let&#8217;s see when they can be useful.<\/p>\n<h4>Using ref returns for indexers<\/h4>\n<p>To test the performance impact of these features we&#8217;re going to create a custom immutable collection called <code>NaiveImmutableList&lt;T&gt;<\/code> and compare it with <code>T[]<\/code> and <code>List&lt;T&gt;<\/code> for structs of different sizes (4, 16, 32 and 48 bytes).<\/p>\n<pre class=\"lang:default decode:true\">public class NaiveImmutableList&lt;T&gt;\r\n{\r\n    private readonly int _length;\r\n    private readonly T[] _data;\r\n    public NaiveImmutableList(params T[] data) \r\n        =&gt; (_data, _length) = (data, data.Length);\r\n \r\n    public ref readonly T this[int idx]\r\n        \/\/ R# 2017.3.2 is completely confused with this syntax!\r\n        \/\/ =&gt; ref (idx &gt;= _length ? 
ref Throw() : ref _data[idx]);\r\n        {\r\n            get\r\n            {\r\n                \/\/ Extracting 'throw' statement into a different\r\n                \/\/ method helps the jitter to inline a property access.\r\n                if ((uint)idx &gt;= (uint)_length)\r\n                    ThrowIndexOutOfRangeException();\r\n \r\n                return ref _data[idx];\r\n            }\r\n        }\r\n \r\n    private static void ThrowIndexOutOfRangeException() =&gt;\r\n        throw new IndexOutOfRangeException();\r\n}\r\n \r\nstruct LargeStruct_48\r\n{\r\n    public int N { get; }\r\n    private readonly long l1, l2, l3, l4, l5;\r\n \r\n    public LargeStruct_48(int n) : this()\r\n        =&gt; N = n;\r\n}\r\n \r\n\/\/ Other structs like LargeStruct_16, LargeStruct_32 etc<\/pre>\n<p>The benchmarks iterate over the collections and sum the <code>N<\/code> property values of all the elements:<\/p>\n<pre class=\"lang:default decode:true \">private const int elementsCount = 100_000;\r\nprivate static LargeStruct_48[] CreateArray_48() =&gt; \r\n    Enumerable.Range(1, elementsCount).Select(v =&gt; new LargeStruct_48(v)).ToArray();\r\nprivate readonly LargeStruct_48[] _array48 = CreateArray_48();\r\n \r\n[BenchmarkCategory(\"BigStruct_48\")]\r\n[Benchmark(Baseline = true)]\r\npublic int TestArray_48()\r\n{\r\n    int result = 0;\r\n    \/\/ Using elementsCount rather than array.Length to force the bounds check\r\n    \/\/ on each iteration.\r\n    for (int i = 0; i &lt; elementsCount; i++)\r\n    {\r\n        result += _array48[i].N;\r\n    }\r\n \r\n    return result;\r\n}<\/pre>\n<p>And here are the results:<\/p>\n<pre class=\"lang:default decode:true \">Method | Mean | Scaled | \r\n-------------------------- |---------:|-------:| \r\nTestArray_48 | 258.3 us | 1.00 | \r\nTestListOfT_48 | 488.9 us | 1.89 | \r\nTestNaiveImmutableList_48 | 444.8 us | 1.72 | \r\n| | | \r\nTestArray_32 | 174.4 us | 1.00 | \r\nTestListOfT_32 | 233.8 us | 1.34 | \r\nTestNaiveImmutableList_32 
| 219.2 us | 1.26 | \r\n| | | \r\nTestArray_16 | 143.7 us | 1.00 | \r\nTestListOfT16 | 192.5 us | 1.34 | \r\nTestNaiveImmutableList16 | 167.8 us | 1.17 | \r\n| | | \r\nTestArray_4 | 121.7 us | 1.00 | \r\nTestListOfT_4 | 174.7 us | 1.44 | \r\nTestNaiveImmutableList_4 | 133.1 us | 1.09 |<\/pre>\n<p>Apparently, something is wrong! For the larger structs, our <code>NaiveImmutableList&lt;T&gt;<\/code> has effectively the same performance characteristics as <code>List&lt;T&gt;<\/code>. What happened?<\/p>\n<h4>Readonly ref returns under the hood<\/h4>\n<p>As you may have noticed, the indexer of <code>NaiveImmutableList&lt;T&gt;<\/code> returns a readonly reference via <code>ref readonly<\/code>. This makes perfect sense because we want to restrict our clients from mutating the underlying state of the immutable collection. But the structs we&#8217;ve been using in our benchmarks are regular non-readonly structs.<\/p>\n<p>The following test will help us understand the underlying behavior:<\/p>\n<pre class=\"lang:default decode:true \">[Test]\r\npublic void CheckMutabilityForNaiveImmutableList()\r\n{\r\n    var ml = new NaiveImmutableList&lt;Mutable&gt;(new Mutable(1));\r\n    ml[0].IncrementX();\r\n    \/\/ X has been changed, right?\r\n    Assert.That(ml[0].X, Is.EqualTo(2));\r\n}<\/pre>\n<p>The test fails! Why? Because &#8220;readonly references&#8221; are similar to <code>in<\/code>-modifiers and <code>readonly<\/code> fields with respect to structs: for a non-readonly struct, the compiler emits a defensive copy every time a struct member is used. It means that <code>ml[0].IncrementX()<\/code> still operates on a copy of the first element; the copy is made not by the indexer but at the call site.<\/p>\n<p>In fact, the behavior is very reasonable. 
The C# compiler supports passing arguments by value, by reference, and by &#8220;readonly reference&#8221; using the <code>in<\/code>-modifier (for more details see my post <a href=\"https:\/\/blogs.msdn.microsoft.com\/seteplia\/2018\/03\/07\/the-in-modifier-and-the-readonly-structs-in-c\/\">The <code>in<\/code>-modifier and the readonly structs in C#<\/a>). And now the compiler supports three different ways of returning a value from a method: by value, by reference and by readonly reference.<\/p>\n<p>The two kinds of &#8220;readonly references&#8221; are so similar that the compiler reuses the same <code>InAttribute<\/code> to distinguish readonly and non-readonly return values:<\/p>\n<pre class=\"lang:default decode:true \">private int _n;\r\npublic ref readonly int ByReadonlyRef() =&gt; ref _n;<\/pre>\n<p>In this case the method <code>ByReadonlyRef<\/code> is effectively compiled to:<\/p>\n<pre class=\"lang:default decode:true \">[InAttribute]\r\n[return: IsReadOnly]\r\npublic int* ByReadonlyRef()\r\n{\r\n    return ref this._n;\r\n}<\/pre>\n<p>The similarity between the <code>in<\/code>-modifier and readonly references means that these features are not friendly to regular structs and could cause performance issues. Here is an example:<\/p>\n<pre class=\"lang:default decode:true \">public struct BigStruct\r\n{\r\n    \/\/ Other fields\r\n    public int X { get; }\r\n    public int Y { get; }\r\n}\r\n \r\nprivate BigStruct _bigStruct;\r\npublic ref readonly BigStruct GetBigStructByRef() =&gt; ref _bigStruct;\r\n \r\nref readonly var bigStruct = ref GetBigStructByRef();\r\nint result = bigStruct.X + bigStruct.Y;<\/pre>\n<p>Apart from the weird syntax of the variable declaration for <code>bigStruct<\/code>, the code looks good. The intent is clear: <code>BigStruct<\/code> is returned by reference for performance reasons. Unfortunately, because <code>BigStruct<\/code> is a non-readonly struct, a defensive copy is created each time a member is accessed.<\/p>\n<h4>Using ref returns for indexers. 
Attempt #2<\/h4>\n<p>Let&#8217;s try the same set of benchmarks with <strong>readonly structs<\/strong> of different sizes:<\/p>\n<pre class=\"lang:default decode:true \">Method | Mean | Scaled | \r\n-------------------------- |---------:|-------:| \r\nTestArray_48 | 265.1 us | 1.00 | \r\nTestListOfT_48 | 490.6 us | 1.85 | \r\nTestNaiveImmutableList_48 | 300.6 us | 1.13 | \r\n| | | \r\nTestArray_32 | 177.8 us | 1.00 | \r\nTestListOfT_32 | 233.4 us | 1.31 | \r\nTestNaiveImmutableList_32 | 218.0 us | 1.23 | \r\n| | | \r\nTestArray_16 | 144.7 us | 1.00 | \r\nTestListOfT16 | 191.8 us | 1.33 | \r\nTestNaiveImmutableList16 | 168.8 us | 1.17 | \r\n| | | \r\nTestArray_4 | 121.3 us | 1.00 | \r\nTestListOfT_4 | 178.9 us | 1.48 | \r\nTestNaiveImmutableList_4 | 145.3 us | 1.20 |<\/pre>\n<p>Now the results make much more sense. The time still grows for bigger structs, but that is expected because iterating over 100K structs of a bigger size takes longer. But now the timings for <code>NaiveImmutableList&lt;T&gt;<\/code> are very close to those of <code>T[]<\/code> and reasonably faster than those of <code>List&lt;T&gt;<\/code>.<\/p>\n<h4>Conclusion<\/h4>\n<ul>\n<li>Be cautious with ref returns because they can break encapsulation.<\/li>\n<li>Be cautious with readonly ref returns because they&#8217;re more performant only for readonly structs and could cause performance issues for regular structs.<\/li>\n<li>Be cautious with readonly ref locals because they can also cause performance issues for non-readonly structs, causing a defensive copy each time the variable is used.<\/li>\n<\/ul>\n<p>Ref locals and ref returns are useful features for library authors and developers working on infrastructure code. 
But in the case of library code, these features are quite dangerous: in order to use a collection that returns elements by readonly reference efficiently, every library user should know the implications: a readonly reference to a non-readonly struct causes a defensive copy &#8220;at the call site&#8221;. At best this can negate all performance gains; at worst it can cause severe perf degradation when a readonly ref local variable is accessed multiple times.<\/p>\n<p>P.S. Readonly references are coming to the BCL. The following PR for the corefx repo (<a href=\"https:\/\/github.com\/dotnet\/corefx\/pull\/25738\/files#diff-fa508ecac55e620b269a8853de2cfd66\">Implementing ItemRef API Proposal<\/a>) introduced readonly ref methods to access the elements of immutable collections. So it is quite important for everyone to understand the implications of these features and to know how and when to use them.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The C# language from the very first version supported passing arguments by value or by reference. But before C# 7 the C# compiler supported only one way of returning a value from a method (or a property) &#8211; returning by value. This has been changed in C# 7 with two new features: ref returns and [&hellip;]<\/p>\n","protected":false},"author":4004,"featured_media":37840,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[6699,128],"tags":[6695],"class_list":["post-1165","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-c","category-performance","tag-seteplia"],"acf":[],"blog_post_summary":"<p>The C# language from the very first version supported passing arguments by value or by reference. But before C# 7 the C# compiler supported only one way of returning a value from a method (or a property) &#8211; returning by value. 
This has been changed in C# 7 with two new features: ref returns and [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts\/1165","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/users\/4004"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/comments?post=1165"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts\/1165\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/media\/37840"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/media?parent=1165"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/categories?post=1165"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/tags?post=1165"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}