{"id":1035,"date":"2017-10-16T11:43:23","date_gmt":"2017-10-16T03:43:23","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/seteplia\/?p=1035"},"modified":"2019-06-11T22:26:22","modified_gmt":"2019-06-12T05:26:22","slug":"dissecting-the-pattern-matching-in-c-7","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/premier-developer\/dissecting-the-pattern-matching-in-c-7\/","title":{"rendered":"Dissecting the pattern matching in C# 7"},"content":{"rendered":"<p>C# 7 finally introduced a long-awaited feature called &#8220;pattern matching&#8221;. If you&#8217;re familiar with functional languages like F# you may be slightly disappointed with this feature in its current state, but even today it can simplify your code in a variety of different scenarios.<\/p>\n<p>Every new feature is fraught with danger for a developer working on a performance critical application. New levels of abstractions are good but in order to use them effectively, you should know what is happening under the hood. 
Today we&#8217;re going to explore pattern matching and look under the covers to understand how it is implemented.<\/p>\n<p>The C# language introduced the notion of a pattern that can be used in an <code>is<\/code>-expression and inside a <code>case<\/code> block of a <code>switch<\/code> statement.<\/p>\n<p>There are 3 types of patterns:<\/p>\n<ul>\n<li>The const pattern<\/li>\n<li>The type pattern<\/li>\n<li>The <code>var<\/code> pattern<\/li>\n<\/ul>\n<h4>Pattern matching in <code>is<\/code>-expressions<\/h4>\n<pre class=\"lang:default decode:true \">public void IsExpressions(object o)\r\n{\r\n    \/\/ Alternative way of checking for null\r\n    if (o is null) Console.WriteLine(\"o is null\");\r\n \r\n    \/\/ Const pattern can refer to a constant value\r\n    const double value = double.NaN;\r\n    if (o is value) Console.WriteLine(\"o is value\");\r\n \r\n    \/\/ Const pattern can use a string literal\r\n    if (o is \"o\") Console.WriteLine(\"o is \\\"o\\\"\");\r\n \r\n    \/\/ Type pattern\r\n    if (o is int n) Console.WriteLine(n);\r\n \r\n    \/\/ Type pattern and compound expressions\r\n    if (o is string s &amp;&amp; s.Trim() != string.Empty)\r\n        Console.WriteLine(\"o is not blank\");\r\n}<\/pre>\n<p>An <code>is<\/code>-expression can check whether the value is equal to a constant, and a type check can optionally introduce a <strong>pattern variable<\/strong>.<\/p>\n<p>I&#8217;ve found a few interesting aspects related to pattern matching in <code>is<\/code>-expressions:<\/p>\n<ul>\n<li>A variable introduced in an <code>if<\/code> statement is lifted to the outer scope.<\/li>\n<li>A variable introduced in an <code>if<\/code> statement is definitely assigned only when the pattern is matched.<\/li>\n<li>The current implementation of const pattern matching in <code>is<\/code>-expressions is not very efficient.<\/li>\n<\/ul>\n<p>Let&#8217;s check the first two cases first:<\/p>\n<pre class=\"lang:default decode:true \">public void ScopeAndDefiniteAssigning(object o)\r\n{\r\n    if (o is string s &amp;&amp; s.Length != 0)\r\n    {\r\n        Console.WriteLine(\"o is not empty string\");\r\n    }\r\n \r\n    \/\/ Can't use 's' any more. 's' is already declared in the current scope.\r\n    if (o is int n || (o is string s2 &amp;&amp; int.TryParse(s2, out n)))\r\n    {\r\n        Console.WriteLine(n);\r\n    }\r\n}<\/pre>\n<p>The first <code>if<\/code> statement introduces a variable <code>s<\/code>, and the variable is visible throughout the enclosing block, which here is the whole method. This is reasonable, but it complicates the logic if other if-statements in the same block try to reuse the same name. In this case, you <strong>have<\/strong> to use another name to avoid the collision.<\/p>\n<p>The variable introduced in the <code>is<\/code>-expression is definitely assigned only when the predicate is <code>true<\/code>. This means that the <code>n<\/code> variable in the second if-statement is not assigned in the right operand, but because the variable is already declared, we can use it as the <code>out<\/code> variable in the <code>int.TryParse<\/code> method.<\/p>\n<p>The third aspect mentioned above is the most concerning one. Consider the following code:<\/p>\n<pre class=\"lang:default decode:true \">public void BoxTwice(int n)\r\n{\r\n    if (n is 42) Console.WriteLine(\"n is 42\");\r\n}<\/pre>\n<p>In most cases the <code>is<\/code>-expression is translated to a call to <code>object.Equals(constValue, variable)<\/code> (even though the spec says that <code>operator==<\/code> should be used for primitive types):<\/p>\n<pre class=\"lang:default decode:true \">public void BoxTwice(int n)\r\n{\r\n    if (object.Equals(42, n))\r\n    {\r\n        Console.WriteLine(\"n is 42\");\r\n    }\r\n}<\/pre>\n<p>This code causes 2 boxing allocations (one for the constant and one for <code>n<\/code>) that can noticeably affect performance if used in the application&#8217;s critical path. 
It used to be the case that <code>o is null<\/code> caused a boxing allocation when <code>o<\/code> was a nullable value type (see <a href=\"https:\/\/github.com\/dotnet\/roslyn\/issues\/13247\">Suboptimal code for e is null<\/a>), so I really hope that this behavior will be fixed as well (here is <a href=\"https:\/\/github.com\/dotnet\/roslyn\/issues\/20642\">an issue on GitHub<\/a>).<\/p>\n<p>If the variable is of type <code>object<\/code> (call it <code>o<\/code>), then <code>o is 42<\/code> will cause one boxing allocation (for the literal <code>42<\/code>), even though similar switch-based code would not cause any allocations.<\/p>\n<h4>The <code>var<\/code> patterns in <code>is<\/code>-expressions<\/h4>\n<p>The <code>var<\/code> pattern is a special case of the type pattern with one major distinction: the pattern will match any value, even if the value is <code>null<\/code>.<\/p>\n<pre class=\"lang:default decode:true \">public void IsVar(object o)\r\n{\r\n    if (o is var x) Console.WriteLine($\"x: {x}\");\r\n}<\/pre>\n<p><code>o is object<\/code> is <code>true<\/code> when <code>o<\/code> is not <code>null<\/code>, but <code>o is var x<\/code> is always <code>true<\/code>. The compiler knows this, and in Release mode (*) it removes the if-clause altogether and leaves just the <code>Console<\/code> method call. Unfortunately, the compiler does not warn you that the code is unreachable in the following case: <code>if (!(o is var x)) Console.WriteLine(\"Unreachable\")<\/code>. Hopefully, this will be fixed as well.<\/p>\n<p>(*) It is not clear why the behavior is different in Release mode only. But I think all the issues fall into the same bucket: the initial implementation of the feature is suboptimal. 
But based on <a href=\"https:\/\/github.com\/dotnet\/roslyn\/issues\/22654#issuecomment-336329881\">this comment<\/a> by Neal Gafter, this is going to change: &#8220;The pattern-matching lowering code is being rewritten from scratch (to support recursive patterns, too). I expect most of the improvements you seek here will come for &#8220;free&#8221; in the new code. But it will be some time before that rewrite is ready for prime time.&#8221;.<\/p>\n<p>The lack of a <code>null<\/code> check makes this case very special and potentially dangerous. But if you know exactly what is going on, you may find this pattern useful. It can be used to introduce a temporary variable inside an expression:<\/p>\n<pre class=\"lang:default decode:true \">public void VarPattern(IEnumerable&lt;string&gt; s)\r\n{\r\n    if (s.FirstOrDefault(o =&gt; o != null) is var v\r\n        &amp;&amp; int.TryParse(v, out var n))\r\n    {\r\n        Console.WriteLine(n);\r\n    }\r\n}<\/pre>\n<h4><code>Is<\/code>-expression meets the &#8220;Elvis&#8221; operator<\/h4>\n<p>There is another use case that I&#8217;ve found very useful. The type pattern matches the value only when the value is not <code>null<\/code>. 
We can use this &#8220;filtering&#8221; logic with the null-propagating operator to make the code easier to read:<\/p>\n<pre class=\"lang:default decode:true \">public void WithNullPropagation(IEnumerable&lt;string&gt; s)\r\n{\r\n    if (s?.FirstOrDefault(str =&gt; str.Length &gt; 10)?.Length is int length)\r\n    {\r\n        Console.WriteLine(length);\r\n    }\r\n \r\n    \/\/ Similar to\r\n    if (s?.FirstOrDefault(str =&gt; str.Length &gt; 10)?.Length is var length2 &amp;&amp; length2 != null)\r\n    {\r\n        Console.WriteLine(length2);\r\n    }\r\n \r\n    \/\/ And similar to\r\n    var length3 = s?.FirstOrDefault(str =&gt; str.Length &gt; 10)?.Length;\r\n    if (length3 != null)\r\n    {\r\n        Console.WriteLine(length3);\r\n    }\r\n}<\/pre>\n<p>Note that the same pattern can be used for both value types and reference types.<\/p>\n<h4>Pattern matching in the <code>case<\/code> blocks<\/h4>\n<p>C# 7 extends the switch statement to use patterns in the case clauses:<\/p>\n<pre class=\"lang:default decode:true \">public static int Count&lt;T&gt;(this IEnumerable&lt;T&gt; e)\r\n{\r\n    switch (e)\r\n    {\r\n        case ICollection&lt;T&gt; c: return c.Count;\r\n        case IReadOnlyCollection&lt;T&gt; c: return c.Count;\r\n        \/\/ Matches concurrent collections\r\n        case IProducerConsumerCollection&lt;T&gt; pc: return pc.Count;\r\n        \/\/ Matches if e is not null; calls LINQ explicitly so that\r\n        \/\/ this extension method does not bind to itself\r\n        case IEnumerable&lt;T&gt; _: return Enumerable.Count(e);\r\n        \/\/ Default case is handled when e is null\r\n        default: return 0;\r\n    }\r\n}<\/pre>\n<p>The example shows the first set of changes to the switch statement.<\/p>\n<ol>\n<li>A variable of any type may be used in a switch statement.<\/li>\n<li>A case clause can specify a pattern.<\/li>\n<li>The order of the case clauses matters. 
The compiler emits an error if the previous clause matches a base type and the next clause matches a derived type.<\/li>\n<li>Non-default clauses have an implicit null check (**). In the example above, the very last case clause is valid because it matches only when the argument is not <code>null<\/code>.<\/li>\n<\/ol>\n<p>(**) The very last case clause shows another feature added to C# 7 called the &#8220;discard&#8221; pattern. The name <code>_<\/code> is special and tells the compiler that the variable is not needed. The type pattern in a case clause requires an alias, and if you don&#8217;t need it you can ignore it using <code>_<\/code>.<\/p>\n<p>The next snippet shows another feature of switch-based pattern matching: the ability to use predicates:<\/p>\n<pre class=\"lang:default decode:true \">public static void FizzBuzz(object o)\r\n{\r\n    switch (o)\r\n    {\r\n        case string s when s.Contains(\"Fizz\") || s.Contains(\"Buzz\"):\r\n            Console.WriteLine(s);\r\n            break;\r\n        case int n when n % 5 == 0 &amp;&amp; n % 3 == 0:\r\n            Console.WriteLine(\"FizzBuzz\");\r\n            break;\r\n        case int n when n % 5 == 0:\r\n            Console.WriteLine(\"Fizz\");\r\n            break;\r\n        case int n when n % 3 == 0:\r\n            Console.WriteLine(\"Buzz\");\r\n            break;\r\n        case int n:\r\n            Console.WriteLine(n);\r\n            break;\r\n    }\r\n}<\/pre>\n<p>This is a weird version of the <a href=\"http:\/\/wiki.c2.com\/?FizzBuzzTest\">FizzBuzz<\/a> problem that processes an <code>object<\/code> instead of just a number.<\/p>\n<p>A switch can have more than one case clause with the same type. 
If this happens, the compiler groups together all type checks to avoid redundant computations:<\/p>\n<pre class=\"lang:default decode:true \">public static void FizzBuzz(object o)\r\n{\r\n    \/\/ All cases can match only if the value is not null\r\n    if (o != null)\r\n    {\r\n        if (o is string s &amp;&amp;\r\n            (s.Contains(\"Fizz\") || s.Contains(\"Buzz\")))\r\n        {\r\n            Console.WriteLine(s);\r\n            return;\r\n        }\r\n \r\n        bool isInt = o is int;\r\n        int num = isInt ? ((int)o) : 0;\r\n        if (isInt)\r\n        {\r\n            \/\/ The type check and unboxing happen only once per group\r\n            if (num % 5 == 0 &amp;&amp; num % 3 == 0)\r\n            {\r\n                Console.WriteLine(\"FizzBuzz\");\r\n                return;\r\n            }\r\n            if (num % 5 == 0)\r\n            {\r\n                Console.WriteLine(\"Fizz\");\r\n                return;\r\n            }\r\n            if (num % 3 == 0)\r\n            {\r\n                Console.WriteLine(\"Buzz\");\r\n                return;\r\n            }\r\n \r\n            Console.WriteLine(num);\r\n        }\r\n    }\r\n}<\/pre>\n<p>But there are two things to keep in mind:<\/p>\n<ol>\n<li>The compiler groups together only consecutive type checks, so if you intermix cases for different types, the compiler will generate less optimal code:<\/li>\n<\/ol>\n<pre class=\"lang:default decode:true \">switch (o)\r\n{\r\n    \/\/ The generated code is less optimal:\r\n    \/\/ If o is int, then more than one type check and unboxing operation\r\n    \/\/ may happen.\r\n    case int n when n == 1: return 1;\r\n    case string s when s == \"\": return 2;\r\n    case int n when n == 2: return 3;\r\n    default: return -1;\r\n}<\/pre>\n<p>The compiler will effectively translate it to the following:<\/p>\n<pre class=\"lang:default decode:true \">if (o is int n &amp;&amp; n == 1) return 1;\r\nif (o is string s &amp;&amp; s == \"\") return 2;\r\nif (o is int n2 &amp;&amp; n2 == 2) return 3;\r\nreturn -1;<\/pre>\n<ol start=\"2\">\n<li>The compiler tries its best to prevent common ordering issues.<\/li>\n<\/ol>\n<pre class=\"lang:default decode:true \">switch (o)\r\n{\r\n    case int n: return 1;\r\n    \/\/ Error: The switch case has already been handled by a previous case.\r\n    case int n when n == 1: return 2;\r\n}<\/pre>\n<p>But the compiler doesn&#8217;t know when one predicate is stronger than another and effectively supersedes the cases that follow:<\/p>\n<pre class=\"lang:default decode:true \">switch (o)\r\n{\r\n    case int n when n &gt; 0: return 1;\r\n    \/\/ Will never match, but the compiler won't warn you about it\r\n    case int n when n &gt; 1: return 2;\r\n}<\/pre>\n<h4>Pattern matching 101<\/h4>\n<ul>\n<li>C# 7 introduced the following patterns: the const pattern, the type pattern, the var pattern and the discard pattern.<\/li>\n<li>Patterns can be used in <code>is<\/code>-expressions and in case blocks.<\/li>\n<li>The implementation of the const pattern in <code>is<\/code>-expressions for value types is far from perfect from the performance point of view.<\/li>\n<li>The <code>var<\/code> pattern always matches, so you should be careful with it.<\/li>\n<li>A switch statement can be used for a set of type checks with additional predicates in <code>when<\/code> clauses.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p>Discussions on <a href=\"https:\/\/www.reddit.com\/r\/programming\/comments\/76o02y\/dissecting_the_pattern_matching_in_c_7\/\">Reddit<\/a> and <a href=\"https:\/\/news.ycombinator.com\/item?id=15480734\">Hacker News<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>C# 7 finally introduced a long-awaited feature called &#8220;pattern matching&#8221;. If you&#8217;re familiar with functional languages like F# you may be slightly disappointed with this feature in its current state, but even today it can simplify your code in a variety of different scenarios. 
Every new feature is fraught with danger for a developer working [&hellip;]<\/p>\n","protected":false},"author":4004,"featured_media":37840,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[6699],"tags":[6695],"class_list":["post-1035","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-c","tag-seteplia"],"acf":[],"blog_post_summary":"<p>C# 7 finally introduced a long-awaited feature called &#8220;pattern matching&#8221;. If you&#8217;re familiar with functional languages like F# you may be slightly disappointed with this feature in its current state, but even today it can simplify your code in a variety of different scenarios. Every new feature is fraught with danger for a developer working [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts\/1035","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/users\/4004"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/comments?post=1035"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts\/1035\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/media\/37840"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/media?parent=1035"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/categories?post=1035
"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/tags?post=1035"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}