{"id":48604,"date":"2023-11-06T10:05:00","date_gmt":"2023-11-06T18:05:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/dotnet\/?p=48604"},"modified":"2023-11-06T19:24:11","modified_gmt":"2023-11-07T03:24:11","slug":"the-convenience-of-system-io","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/dotnet\/the-convenience-of-system-io\/","title":{"rendered":"The convenience of System.IO"},"content":{"rendered":"<p>Reading and writing files is very common, just like other forms of I\/O. File APIs are needed for reading application configuration, caching content, and loading data (from disk) into memory to do some computation like (today&#8217;s topic) word counting. <code>File<\/code>, <code>FileInfo<\/code>, <code>FileStream<\/code>, and related types do a lot of the heavy lifting for .NET developers needing to access files. In this post, we\u2019re going to look at the convenience and performance of reading text files with <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.io\">System.IO<\/a>, with some help from <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.text\">System.Text<\/a> APIs.<\/p>\n<p>We recently kicked off a series on the <a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/the-convenience-of-dotnet\/\">Convenience of .NET<\/a> that describes our approach for providing convenient solutions to common tasks. <a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/the-convenience-of-system-text-json\/\">The convenience of System.Text.Json<\/a> is another post in the series, about reading and writing JSON documents. <a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/why-dotnet\/\">Why .NET?<\/a> describes the architectural choices that enable the solutions covered in these posts.<\/p>\n<p>This post analyzes the convenience and performance of file I\/O and text APIs being put to the task of counting lines, words, and bytes in a large novel. 
The results show that the high-level APIs are straightforward to use and deliver great performance, while the lower-level APIs require a little more effort and deliver excellent results. You&#8217;ll also see how <a href=\"https:\/\/learn.microsoft.com\/dotnet\/core\/deploying\/native-aot\">native AOT<\/a> shifts .NET into a new class of performance for application startup.<\/p>\n<h2>The APIs<\/h2>\n<p>The following <code>File<\/code> APIs (with their companions) are used in the benchmarks.<\/p>\n<ol>\n<li><a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.io.file.openhandle\"><code>File.OpenHandle<\/code><\/a> with <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.io.randomaccess.read\"><code>RandomAccess.Read<\/code><\/a><\/li>\n<li><a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.io.file.open\"><code>File.Open<\/code><\/a> with <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.io.filestream.read\"><code>FileStream.Read<\/code><\/a><\/li>\n<li><a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.io.file.opentext\"><code>File.OpenText<\/code><\/a> with <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.io.streamreader.read\"><code>StreamReader.Read<\/code><\/a> and <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.io.streamreader.readline\"><code>StreamReader.ReadLine<\/code><\/a><\/li>\n<li><a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.io.file.readlines\"><code>File.ReadLines<\/code><\/a> with <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.collections.generic.ienumerable-1\"><code>IEnumerable&lt;string&gt;<\/code><\/a><\/li>\n<li><a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.io.file.readalllines\"><code>File.ReadAllLines<\/code><\/a> with <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.array\"><code>string[]<\/code><\/a><\/li>\n<\/ol>\n<p>The APIs are listed from highest-control to most convenient. It&#8217;s OK if they are new to you. 
It should still be an interesting read.<\/p>\n<p>The lower-level benchmarks rely on the following <code>System.Text<\/code> types:<\/p>\n<ul>\n<li><a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.text.encoding\">Encoding<\/a><\/li>\n<li><a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.text.rune\">Rune<\/a><\/li>\n<\/ul>\n<p>I also used the new <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.buffers.searchvalues\"><code>SearchValues<\/code><\/a> class to see if it provided a significant benefit over passing a <code>Span&lt;char&gt;<\/code> to <code>Span&lt;char&gt;.IndexOfAny<\/code>. It pre-computes the search strategy to avoid the upfront costs of <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.memoryextensions.indexofany\"><code>IndexOfAny<\/code><\/a>. Spoiler: the impact is dramatic.<\/p>\n<p>Next, we\u2019ll look at an app that has been implemented multiple times \u2014 for each of those APIs \u2014 testing approachability and efficiency.<\/p>\n<h2>The App<\/h2>\n<p>The <a href=\"https:\/\/github.com\/richlander\/convenience\/tree\/main\/wordcount\/wordcount\">app<\/a> counts lines, words, and bytes in a text file. It is modeled on the behavior of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Wc_(Unix)\"><code>wc<\/code><\/a>, a popular tool available on Unix-like systems.<\/p>\n<p>Word counting is an algorithm that requires looking at every character in a file. The counting is done by counting spaces and line breaks.<\/p>\n<blockquote>\n<p>A word is a non-zero-length sequence of printable characters delimited by white space.<\/p>\n<\/blockquote>\n<p>That&#8217;s from <code>wc --help<\/code>. The app code needs to follow that recipe. Seems straightforward.<\/p>\n<p>The benchmarks count words in <a href=\"https:\/\/en.wikipedia.org\/wiki\/Clarissa\">Clarissa Harlowe; or the history of a young lady<\/a> by Samuel Richardson. 
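The `wc` recipe above boils down to a single scan of the text. Here's a minimal sketch of the counting loop in C# (my own illustration, not the benchmark code; it assumes an ASCII-style whitespace set, while the real benchmarks also handle Unicode whitespace):

```csharp
using System;
using System.Buffers;

// Word-separator set; an assumption for this sketch (ASCII whitespace only).
SearchValues<char> whitespace = SearchValues.Create(" \t\n\r\f\v");

// Counts lines and words per the wc recipe: a word is a non-zero-length
// run of non-whitespace characters; a line ends at '\n'.
(int Lines, int Words) Count(ReadOnlySpan<char> text)
{
    int lines = 0, words = 0;
    bool inWord = false;

    foreach (char c in text)
    {
        if (c == '\n') lines++;

        if (whitespace.Contains(c))
        {
            inWord = false;
        }
        else if (!inWord)
        {
            // First non-whitespace char after a separator starts a new word.
            words++;
            inWord = true;
        }
    }

    return (lines, words);
}

Console.WriteLine(Count("A word is a sequence of printable characters\n")); // prints (1, 8)
```

`SearchValues.Create` precomputes the membership test once, which is the same idea the `SearchValues`-based benchmarks rely on.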
This text was chosen because it is apparently one of the longest books in the English language and is <a href=\"https:\/\/www.gutenberg.org\/ebooks\/author\/1959\">freely available on Project Gutenberg<\/a>. There&#8217;s even a <a href=\"https:\/\/www.imdb.com\/title\/tt0101066\/\">BBC TV adaptation<\/a> of it from 1991.<\/p>\n<p>I also did some testing with <a href=\"https:\/\/dev.gutenberg.org\/ebooks\/author\/85\">Les Miserables<\/a>, another long text. Sadly, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Jean_Valjean\"><code>24601<\/code><\/a> didn&#8217;t come up as a word count.<\/p>\n<h2>Results<\/h2>\n<p>Each implementation is measured in terms of:<\/p>\n<ul>\n<li>Lines of code<\/li>\n<li>Speed of execution<\/li>\n<li>Memory use<\/li>\n<\/ul>\n<p>I&#8217;m using a build of .NET 8 very close to the final GA build. While writing this post, I saw that there was another .NET 8 build still coming; however, the build I used is probably within the last two or three builds of final for the release.<\/p>\n<p>I used <a href=\"https:\/\/benchmarkdotnet.org\/index.html\">BenchmarkDotNet<\/a> for performance testing. It&#8217;s a great tool, if you&#8217;ve never used it. Writing a benchmark is similar to writing a unit test.<\/p>\n<p>The following use of <code>wc<\/code> lists the cores on my Linux machine. Each core gets its own line in the <code>\/proc\/cpuinfo<\/code> file, with &#8220;model name&#8221; appearing in each of those lines; <code>-l<\/code> counts lines.<\/p>\n<pre><code class=\"language-bash\">$ cat \/proc\/cpuinfo | grep \"model name\" | wc -l\n8\n$ cat \/proc\/cpuinfo | grep \"model name\" | head -n 1\nmodel name  : Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz\n$ cat \/etc\/os-release | head -n 1\nNAME=\"Manjaro Linux\"<\/code><\/pre>\n<p>I used that machine for the performance testing in this post. You can see I&#8217;m using Manjaro Linux, which is part of the Arch Linux family. 
.NET 8 is already available in the <a href=\"https:\/\/aur.archlinux.org\/packages\/dotnet-sdk-preview-bin\">Arch User Repository<\/a> (which is also available to Manjaro users).<\/p>\n<h3>Lines of code<\/h3>\n<p>I love solutions that are easy and approachable. Lines of code is our best proxy metric for that.<\/p>\n<p><img decoding=\"async\" title=\"File API lines of code metric\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2023\/11\/file-api-approachability-loc.png\" width=\"75%\" \/><\/p>\n<p>There are two clusters in this chart, at ~35 and ~75 lines. You&#8217;ll see that these benchmarks boil down to two algorithms with some small differences to accommodate the different APIs. In contrast, the <a href=\"https:\/\/github.com\/coreutils\/coreutils\/blob\/master\/src\/wc.c\"><code>wc<\/code> implementation<\/a> is quite a bit longer, nearing 1000 lines. It does more, however.<\/p>\n<p>I used <code>wc<\/code> to calculate the benchmark line counts, again with <code>-l<\/code>.<\/p>\n<pre><code class=\"language-bash\">$ wc -l *Benchmark.cs\n      73 FileOpenCharSearchValuesBenchmark.cs\n      71 FileOpenHandleAsciiCheatBenchmark.cs\n      74 FileOpenHandleCharSearchValuesBenchmark.cs\n      60 FileOpenHandleRuneBenchmark.cs\n      45 FileOpenTextCharBenchmark.cs\n      65 FileOpenTextCharIndexOfAnyBenchmark.cs\n      84 FileOpenTextCharLinesBenchmark.cs\n      65 FileOpenTextCharSearchValuesBenchmark.cs\n      34 FileOpenTextReadLineBenchmark.cs\n      36 FileOpenTextReadLineSearchValuesBenchmark.cs\n      32 FileReadAllLinesBenchmark.cs\n      32 FileReadLinesBenchmark.cs\n     671 total<\/code><\/pre>\n<p>I wrote several benchmarks. I summarized those in the image above, using the best-performing benchmark for each file API (and then shortened the name for simplicity). 
The full set of benchmarks will get covered later.<\/p>\n<h3>Functional parity with <code>wc<\/code><\/h3>\n<p>Let&#8217;s validate that my C# implementation matches <code>wc<\/code>.   <\/p>\n<p><code>wc<\/code>:<\/p>\n<pre><code class=\"language-bash\">$  wc ..\/Clarissa_Harlowe\/*\n  11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n  12124  110407  610557 ..\/Clarissa_Harlowe\/clarissa_volume2.txt\n  11961  109622  606948 ..\/Clarissa_Harlowe\/clarissa_volume3.txt\n  12168  111908  625888 ..\/Clarissa_Harlowe\/clarissa_volume4.txt\n  12626  108592  614062 ..\/Clarissa_Harlowe\/clarissa_volume5.txt\n  12434  107576  607619 ..\/Clarissa_Harlowe\/clarissa_volume6.txt\n  12818  112713  628322 ..\/Clarissa_Harlowe\/clarissa_volume7.txt\n  12331  109785  611792 ..\/Clarissa_Harlowe\/clarissa_volume8.txt\n  11771  104934  598265 ..\/Clarissa_Harlowe\/clarissa_volume9.txt\n      9     153    1044 ..\/Clarissa_Harlowe\/summary.md\n 109958  985713 5515012 total<\/code><\/pre>\n<p>And with <a href=\"https:\/\/github.com\/richlander\/convenience\/tree\/main\/wordcount\/count\"><code>count<\/code><\/a>, a standalone copy of <a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenHandleCharSearchValuesBenchmark.cs\"><code>FileOpenHandleCharSearchValuesBenchmark<\/code><\/a>:<\/p>\n<pre><code class=\"language-bash\">$ dotnet run ..\/Clarissa_Harlowe\/\n    11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n    12124  110407  610557 ..\/Clarissa_Harlowe\/clarissa_volume2.txt\n    11961  109622  606948 ..\/Clarissa_Harlowe\/clarissa_volume3.txt\n    12168  111908  625888 ..\/Clarissa_Harlowe\/clarissa_volume4.txt\n    12626  108593  614062 ..\/Clarissa_Harlowe\/clarissa_volume5.txt\n    12434  107576  607619 ..\/Clarissa_Harlowe\/clarissa_volume6.txt\n    12818  112713  628322 ..\/Clarissa_Harlowe\/clarissa_volume7.txt\n    12331  109785  611792 ..\/Clarissa_Harlowe\/clarissa_volume8.txt\n    11771  
104934  598265 ..\/Clarissa_Harlowe\/clarissa_volume9.txt\n        9     153    1044 ..\/Clarissa_Harlowe\/summary.md\n   109958  985714  5515012 total<\/code><\/pre>\n<p>The results are effectively identical, with a one-word difference in total word count. Here, you are seeing a Linux version of <code>wc<\/code>. The macOS version reported <code>985716<\/code> words, a three-word difference from the Linux implementation. I noticed that there were some special characters in two of the files that were causing these differences. I didn&#8217;t spend more time investigating them since it&#8217;s outside the scope of this post.<\/p>\n<h3>Scan the summary (in 10 microseconds)<\/h3>\n<p>I started by testing a <a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/Clarissa_Harlowe\/summary.md\">short summary<\/a> of the novel. It&#8217;s just 1 kilobyte (with 9 lines and 153 words).<\/p>\n<pre><code class=\"language-bash\">$ dotnet run ..\/Clarissa_Harlowe\/summary.md \n        9     153    1044 ..\/Clarissa_Harlowe\/summary.md<\/code><\/pre>\n<p>Let&#8217;s count some words.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2023\/11\/file-api-speed-small-document.png\" title=\"Speed metrics for reading small document using .NET File I\/O APIs\" width=\"75%\" \/><\/p>\n<p>I&#8217;m going to call this result a tie. There are not a lot of apps where a 1 <a href=\"https:\/\/en.wikipedia.org\/wiki\/Unit_of_time#List\">microsecond<\/a> gap in performance matters. 
I wouldn&#8217;t write tens of additional lines of code for (only) that win.<\/p>\n<h3>Team <code>byte<\/code> wins the memory race against Team <code>string<\/code><\/h3>\n<p>Let&#8217;s look at memory usage for the same small document.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2023\/11\/file-api-memory-small-document.png\" title=\"Memory metrics for reading small document using .NET File I\/O APIs\" width=\"75%\" \/><\/p>\n<p>Note: <code>1_048_576<\/code> bytes is 1 <a href=\"https:\/\/en.wikipedia.org\/wiki\/Megabyte\">megabyte (mebibyte)<\/a>; <code>10_000<\/code> bytes is roughly 1% of that. I&#8217;m using the <a href=\"https:\/\/learn.microsoft.com\/dotnet\/csharp\/language-reference\/builtin-types\/integral-numeric-types#integer-literals\">integer literal<\/a> format.<\/p>\n<p>You are seeing one cluster of APIs that return bytes and another that returns heap-allocated strings. In the middle, <code>File.OpenText<\/code> returns <code>char<\/code> values. <\/p>\n<p><code>File.OpenText<\/code> relies on the <code>StreamReader<\/code> and <code>FileStream<\/code> classes to do the required processing. The <code>string<\/code>-returning APIs rely on the same types. The <code>StreamReader<\/code> object used by these APIs allocates several buffers, including a 1k buffer. It also creates a <code>FileStream<\/code> object, which by default allocates a 4k buffer. 
For <code>File.OpenText<\/code> (when using <code>StreamReader.Read<\/code>), those buffers are a fixed cost, while <code>File.ReadLines<\/code> and <code>File.ReadAllLines<\/code> also allocate strings (one per line; a variable cost).<\/p>\n<h2>Speed reading the book (in 1 millisecond)<\/h2>\n<p>Let&#8217;s see how long it takes to count lines, words, and bytes in <a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/Clarissa_Harlowe\/clarissa_volume1.txt\">Clarissa_Harlowe volume one<\/a>.<\/p>\n<pre><code class=\"language-bash\">$ dotnet run ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n    11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt<\/code><\/pre>\n<p>Perhaps we&#8217;ll see a larger separation in performance, grinding through <code>610_515<\/code> bytes of text.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2023\/11\/file-api-speed-large-document.png\" title=\"Speed metrics for reading large document using .NET File I\/O APIs\" width=\"75%\" \/><\/p>\n<p>And we do. The <code>byte<\/code>- and <code>char<\/code>-returning APIs cluster together, just above 1ms. We&#8217;re also now seeing a difference between <code>File.ReadLines<\/code> and <code>File.ReadAllLines<\/code>. However, we should put in perspective that the gap is only 2ms for 600k of text. The high-level APIs are doing a great job of delivering competitive performance with much simpler algorithms (in the code I wrote).<\/p>\n<p>The difference between <code>File.ReadLines<\/code> and <code>File.ReadAllLines<\/code> is worth a bit more explanation.<\/p>\n<ul>\n<li>All the APIs start with bytes. 
<a href=\"https:\/\/github.com\/dotnet\/runtime\/blob\/b6b00ec9f606eca0c47e01b30e74ee0d37d561ab\/src\/libraries\/System.Private.CoreLib\/src\/System\/IO\/File.cs#L770-L775\"><code>File.ReadLines<\/code><\/a> reads <a href=\"https:\/\/github.com\/dotnet\/runtime\/blob\/b6b00ec9f606eca0c47e01b30e74ee0d37d561ab\/src\/libraries\/System.Private.CoreLib\/src\/System\/IO\/StreamReader.cs#L644\">bytes into <code>char<\/code> values<\/a>, <a href=\"https:\/\/github.com\/dotnet\/runtime\/blob\/b6b00ec9f606eca0c47e01b30e74ee0d37d561ab\/src\/libraries\/System.Private.CoreLib\/src\/System\/IO\/StreamReader.cs#L815\">looks for the next line break<\/a>, then <a href=\"https:\/\/github.com\/dotnet\/runtime\/blob\/b6b00ec9f606eca0c47e01b30e74ee0d37d561ab\/src\/libraries\/System.Private.CoreLib\/src\/System\/IO\/StreamReader.cs#L819-L827\">converts that block of text to a <code>string<\/code><\/a>, <a href=\"https:\/\/github.com\/dotnet\/runtime\/blob\/b6b00ec9f606eca0c47e01b30e74ee0d37d561ab\/src\/libraries\/System.Private.CoreLib\/src\/System\/IO\/ReadLinesIterator.cs#L50-L52\">returning one at a time<\/a>.<\/li>\n<li><a href=\"https:\/\/github.com\/dotnet\/runtime\/blob\/b6b00ec9f606eca0c47e01b30e74ee0d37d561ab\/src\/libraries\/System.Private.CoreLib\/src\/System\/IO\/File.cs#L751-L765\"><code>File.ReadAllLines<\/code><\/a> does the same and additionally creates all <code>string<\/code> lines at once and packages them all up into a <code>string[]<\/code>. That&#8217;s a LOT of upfront work, requiring a lot of additional memory, that frequently offers no additional value.<\/li>\n<\/ul>\n<p><code>File.OpenText<\/code> returns a <code>StreamReader<\/code>, which exposes <code>ReadLine<\/code> and <code>Read<\/code> APIs. The former returns a <code>string<\/code> and the latter one or more <code>char<\/code> values. The <code>ReadLine<\/code> option is very similar to using <code>File.ReadLines<\/code>, which is built on the same API. 
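As a sketch of the `Read` approach, here's a simplified line counter built on `File.OpenText` and `StreamReader.Read` (my own illustration, not the benchmark code; the 4096-char buffer size is an arbitrary choice):

```csharp
using System;
using System.IO;

// Count line breaks by pulling a buffer of chars at a time through
// StreamReader.Read, avoiding a string allocation per line.
int CountLines(string path)
{
    using StreamReader reader = File.OpenText(path);
    char[] buffer = new char[4096]; // buffer size is illustrative
    int lines = 0;
    int read;

    // Read returns the number of chars produced, or 0 at end of stream.
    while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        for (int i = 0; i < read; i++)
        {
            if (buffer[i] == '\n') lines++;
        }
    }

    return lines;
}

string path = Path.GetTempFileName();
File.WriteAllText(path, "one\ntwo\nthree\n");
Console.WriteLine(CountLines(path)); // prints 3
File.Delete(path);
```

The buffer is reused across iterations, so the per-call cost is just copying chars, not allocating strings.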
In the chart, I&#8217;ve shown <code>File.OpenText<\/code> using <code>StreamReader.Read<\/code>. It&#8217;s a lot more efficient.<\/p>\n<h2>Memory: It&#8217;s best to read a page at a time<\/h2>\n<p>Based on the speed differences, we&#8217;re likely to see big memory differences, too.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2023\/11\/file-api-memory-large-document.png\" title=\"Memory metrics for reading large document using .NET File I\/O APIs\" width=\"75%\" \/><\/p>\n<p>Let&#8217;s be char-itable. That&#8217;s a dramatic difference. The low-level APIs have a fixed cost, while the memory requirements of the <code>string<\/code> APIs scale with the size of the document.<\/p>\n<p>The <code>FileOpenHandle<\/code> and <code>FileOpen<\/code> benchmarks I wrote use <code>ArrayPool<\/code> arrays, whose cost doesn&#8217;t show up in the benchmark.<\/p>\n<pre><code class=\"language-csharp\">Encoding encoding = Encoding.UTF8;\nDecoder decoder = encoding.GetDecoder();\n\/\/ BenchmarkValues.Size = 4 * 1024\n\/\/ charBufferSize = 4097\nint charBufferSize = encoding.GetMaxCharCount(BenchmarkValues.Size);\n\nchar[] charBuffer = ArrayPool&lt;char&gt;.Shared.Rent(charBufferSize);\nbyte[] buffer = ArrayPool&lt;byte&gt;.Shared.Rent(BenchmarkValues.Size);<\/code><\/pre>\n<p>This code shows the two <code>ArrayPool<\/code> arrays that are used (and their sizes). Based on observation, there is a significant performance benefit with a 4k buffer and limited (or no) benefit past that. A 4k buffer seems reasonable to process a 600k file.<\/p>\n<p>I could have used private arrays (or accepted a buffer from the caller). My use of <code>ArrayPool<\/code> arrays demonstrates the memory use difference in the underlying APIs. 
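One detail worth calling out about `ArrayPool`: rented arrays should be returned to the pool when the work is done, typically in a `finally` block. Here's a minimal sketch of the rent/return pattern (the size is illustrative):

```csharp
using System;
using System.Buffers;

// Rent a buffer from the shared pool. Rent may return an array larger
// than requested, so code should track the logical size separately.
byte[] buffer = ArrayPool<byte>.Shared.Rent(4 * 1024);
try
{
    Console.WriteLine(buffer.Length >= 4096); // prints True
    // ... read file bytes into the buffer and process them here ...
}
finally
{
    // Return the buffer so later Rent calls can reuse the memory.
    ArrayPool<byte>.Shared.Return(buffer);
}
```

Because the pool reuses arrays across rentals, steady-state allocations drop to near zero, which is why the pooled buffers don't show up in the benchmark memory numbers.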
As you can see, the cost of <code>File.Open<\/code> and <code>File.OpenHandle<\/code> is effectively zero (at least, relatively).<\/p>\n<p>All that said, the memory use of my <code>FileOpen<\/code> and <code>FileOpenHandle<\/code> benchmarks would show up as very similar to <code>FileOpenText<\/code> if I wasn&#8217;t using <code>ArrayPool<\/code>. That should give you the idea that <code>FileOpenText<\/code> is pretty good (when not using <code>StreamReader.ReadLine<\/code>). Certainly, my implementations could be updated to use much smaller buffers, but they would run slower.<\/p>\n<h2>Performance parity with <code>wc<\/code><\/h2>\n<p>I&#8217;ve demonstrated that <code>System.IO<\/code> can be used to produce the same results as <code>wc<\/code>. I should similarly compare performance, using my best-performing benchmark. Here, I&#8217;ll use the <code>time<\/code> command to record the entire invocation (process start to termination), processing both a single volume (of the novel) and all volumes. You&#8217;ll see that the entirety of the novel (all 9 volumes) comprises &gt;5MB of text and just shy of 1M words. 
<\/p>\n<p>Let&#8217;s start with <code>wc<\/code>.<\/p>\n<pre><code class=\"language-bash\">$ time wc ..\/Clarissa_Harlowe\/clarissa_volume1.txt \n 11716 110023 610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n\nreal    0m0.009s\nuser    0m0.006s\nsys 0m0.003s\n$ time wc ..\/Clarissa_Harlowe\/*\n  11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n  12124  110407  610557 ..\/Clarissa_Harlowe\/clarissa_volume2.txt\n  11961  109622  606948 ..\/Clarissa_Harlowe\/clarissa_volume3.txt\n  12168  111908  625888 ..\/Clarissa_Harlowe\/clarissa_volume4.txt\n  12626  108592  614062 ..\/Clarissa_Harlowe\/clarissa_volume5.txt\n  12434  107576  607619 ..\/Clarissa_Harlowe\/clarissa_volume6.txt\n  12818  112713  628322 ..\/Clarissa_Harlowe\/clarissa_volume7.txt\n  12331  109785  611792 ..\/Clarissa_Harlowe\/clarissa_volume8.txt\n  11771  104934  598265 ..\/Clarissa_Harlowe\/clarissa_volume9.txt\n      9     153    1044 ..\/Clarissa_Harlowe\/summary.md\n 109958  985713 5515012 total\n\nreal    0m0.026s\nuser    0m0.026s\nsys 0m0.000s<\/code><\/pre>\n<p>That&#8217;s pretty fast. 
That&#8217;s 9 and 26 milliseconds.<\/p>\n<p>Let&#8217;s try with .NET, using my <code>FileOpenHandleCharSearchValuesBenchmark<\/code> implementation.<\/p>\n<pre><code class=\"language-bash\">$ time .\/app\/count ..\/Clarissa_Harlowe\/clarissa_volume1.txt \n    11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n\nreal    0m0.070s\nuser    0m0.033s\nsys 0m0.016s\n$ time .\/app\/count ..\/Clarissa_Harlowe\/\n    11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n    12124  110407  610557 ..\/Clarissa_Harlowe\/clarissa_volume2.txt\n    11961  109622  606948 ..\/Clarissa_Harlowe\/clarissa_volume3.txt\n    12168  111908  625888 ..\/Clarissa_Harlowe\/clarissa_volume4.txt\n    12626  108593  614062 ..\/Clarissa_Harlowe\/clarissa_volume5.txt\n    12434  107576  607619 ..\/Clarissa_Harlowe\/clarissa_volume6.txt\n    12818  112713  628322 ..\/Clarissa_Harlowe\/clarissa_volume7.txt\n    12331  109785  611792 ..\/Clarissa_Harlowe\/clarissa_volume8.txt\n    11771  104934  598265 ..\/Clarissa_Harlowe\/clarissa_volume9.txt\n        9     153    1044 ..\/Clarissa_Harlowe\/summary.md\n   109958  985714  5515012 total\n\nreal    0m0.124s\nuser    0m0.095s\nsys 0m0.010s<\/code><\/pre>\n<p>That&#8217;s no good! Wasn&#8217;t even close.<\/p>\n<p>That&#8217;s 70 and 124 milliseconds with .NET compared to 9 and 26 milliseconds with <code>wc<\/code>. It&#8217;s really interesting that the duration doesn&#8217;t scale with the size of the content, particularly with the .NET implementation. The runtime startup cost is clearly dominant.<\/p>\n<p>Everyone knows that a managed language runtime cannot keep up with native code on startup. The numbers validate that. If only we had a <em>native<\/em> managed runtime.<\/p>\n<p>Oh! We do. We have <a href=\"https:\/\/learn.microsoft.com\/dotnet\/core\/deploying\/native-aot\/\">native AOT<\/a>. 
Let&#8217;s try it.<\/p>\n<p>Since I enjoy using containers, I used one of our SDK container images (with volume mounting) to do the compilation so that I don&#8217;t have to <a href=\"https:\/\/learn.microsoft.com\/dotnet\/core\/deploying\/native-aot\/?tabs=net7%2Clinux-ubuntu#prerequisites\">install a native toolchain<\/a> on my machine.<\/p>\n<pre><code class=\"language-bash\">$ docker run --rm mcr.microsoft.com\/dotnet\/nightly\/sdk:8.0-jammy-aot dotnet --version\n8.0.100-rtm.23523.2\n$ docker run --rm -v $(pwd):\/source -w \/source mcr.microsoft.com\/dotnet\/nightly\/sdk:8.0-jammy-aot dotnet publish -o \/source\/napp\n$ ls -l napp\/\ntotal 4936\n-rwxr-xr-x 1 root root 1944896 Oct 30 11:57 count\n-rwxr-xr-x 1 root root 3107720 Oct 30 11:57 count.dbg<\/code><\/pre>\n<p>If you look closely, you&#8217;ll see the benchmark app compiles down to &lt; 2MB (1_944_896 bytes) with native AOT. That&#8217;s the runtime, libraries, and app code. Everything. In fact, the symbols file (<code>count.dbg<\/code>) is larger. 
I can take that executable to an Ubuntu 22.04 x64 machine, for example, and just run it.<\/p>\n<p>Let&#8217;s test native AOT.<\/p>\n<pre><code class=\"language-bash\">$ time .\/napp\/count ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n    11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n\nreal    0m0.004s\nuser    0m0.005s\nsys 0m0.000s\n$ time .\/napp\/count ..\/Clarissa_Harlowe\/\n    11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n    12124  110407  610557 ..\/Clarissa_Harlowe\/clarissa_volume2.txt\n    11961  109622  606948 ..\/Clarissa_Harlowe\/clarissa_volume3.txt\n    12168  111908  625888 ..\/Clarissa_Harlowe\/clarissa_volume4.txt\n    12626  108593  614062 ..\/Clarissa_Harlowe\/clarissa_volume5.txt\n    12434  107576  607619 ..\/Clarissa_Harlowe\/clarissa_volume6.txt\n    12818  112713  628322 ..\/Clarissa_Harlowe\/clarissa_volume7.txt\n    12331  109785  611792 ..\/Clarissa_Harlowe\/clarissa_volume8.txt\n    11771  104934  598265 ..\/Clarissa_Harlowe\/clarissa_volume9.txt\n        9     153    1044 ..\/Clarissa_Harlowe\/summary.md\n   109958  985714  5515012 total\n\nreal    0m0.022s\nuser    0m0.025s\nsys 0m0.007s<\/code><\/pre>\n<p>That&#8217;s 4 and 22 milliseconds with native AOT compared to 9 and 26 with <code>wc<\/code>. Those are excellent results and quite competitive! The numbers are so good that I&#8217;d almost have to double-check, but the counts validate the computation.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2023\/11\/native-code-results.png\" title=\"Results of comparing wc and native aot\" width=\"75%\" \/><\/p>\n<p>Note: I <a href=\"https:\/\/learn.microsoft.com\/dotnet\/core\/deploying\/native-aot\/optimizing#optimize-for-size-or-speed\">configured the app<\/a> with <code>&lt;OptimizationPreference&gt;Speed&lt;\/OptimizationPreference&gt;<\/code>. 
It provided a small benefit.<\/p>\n<h2>Text, Runes, and Unicode<\/h2>\n<p>Text is everywhere. In fact, you are reading it right now. .NET includes multiple types for processing and storing text, including <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.char\"><code>Char<\/code><\/a>, <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.text.encoding\"><code>Encoding<\/code><\/a>, <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.text.rune\"><code>Rune<\/code><\/a>, and <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.string\"><code>String<\/code><\/a>.<\/p>\n<p><a href=\"https:\/\/en.wikipedia.org\/wiki\/Unicode\">Unicode<\/a> encodes over a million characters, including <a href=\"https:\/\/home.unicode.org\/emoji\/about-emoji\/\">emoji<\/a>. The first 128 characters of <a href=\"https:\/\/en.wikipedia.org\/wiki\/ASCII\">ASCII<\/a> and Unicode match. There are three <a href=\"https:\/\/en.wikipedia.org\/wiki\/Comparison_of_Unicode_encodings\">Unicode encodings<\/a>: UTF8, UTF16, and UTF32, with varying numbers of bytes used to encode each character.<\/p>\n<p>Here&#8217;s some (semi-relevant) text from The Hobbit.<\/p>\n<blockquote>\n<p>\u201cMoon-letters are rune-letters, but you cannot see them,\u201d said Elrond<\/p>\n<\/blockquote>\n<p>I cannot help but think that moon-letters are fantastic <a href=\"https:\/\/en.wikipedia.org\/wiki\/Whitespace_character\">whitespace characters<\/a>.<\/p>\n<p>Here are the results of a <a href=\"https:\/\/github.com\/richlander\/convenience\/tree\/main\/wordcount\/codepoints\">small utility<\/a> that prints out information about each Unicode character, using that text. 
The byte length and bytes are specific to the UTF8 representation.<\/p>\n<pre><code class=\"language-bash\">$ dotnet run elrond.txt | head -n 16\nchar, codepoint, byte-length, bytes, notes\n\u201c,   8220, 3, 11100010_10000000_10011100,\nM,     77, 1, 01001101,\no,    111, 1, 01101111,\no,    111, 1, 01101111,\nn,    110, 1, 01101110,\n-,     45, 1, 00101101,\nl,    108, 1, 01101100,\ne,    101, 1, 01100101,\nt,    116, 1, 01110100,\nt,    116, 1, 01110100,\ne,    101, 1, 01100101,\nr,    114, 1, 01110010,\ns,    115, 1, 01110011,\n ,     32, 1, 00100000,whitespace\na,     97, 1, 01100001,<\/code><\/pre>\n<p>The opening <a href=\"https:\/\/util.unicode.org\/UnicodeJsps\/character.jsp?a=201c&amp;B1=Show\">quotation mark<\/a> character requires three bytes to encode. The remaining characters all require one byte since they are within the ASCII character range. We also see one whitespace character, the space character.<\/p>\n<p>The binary representation of the characters that use the one-byte encoding exactly matches their codepoint integer values. For example, the binary representation of the character &#8220;M&#8221; (codepoint 77) is <code>0b01001101<\/code>, the same as integer 77. In contrast, the binary representation of integer <code>8220<\/code> is <code>0b_100000_00011100<\/code>, not the three-byte binary value we see above for <code>\u201c<\/code>. 
That&#8217;s because Unicode encodings <a href=\"https:\/\/stackoverflow.com\/questions\/5290182\/how-many-bytes-does-one-unicode-character-take\">describe more than just the codepoint value<\/a>.<\/p>\n<p>Here&#8217;s <a href=\"https:\/\/github.com\/richlander\/convenience\/tree\/main\/wordcount\/printchars\">another program<\/a> that should provide even <a href=\"https:\/\/learn.microsoft.com\/dotnet\/standard\/base-types\/character-encoding-introduction\">more insight<\/a>.<\/p>\n<pre><code class=\"language-csharp\">using System.Text;\n\nchar englishLetter = 'A';\nchar fancyQuote =  '\u201c';\n\/\/ char emoji = (char)0x1f600; \/\/ won't compile\nstring emoji = \"\\U0001f600\";\nEncoding encoding = Encoding.UTF8;\n\nPrintChar(englishLetter);\nPrintChar(fancyQuote);\nPrintChar(emoji[0]);\nPrintUnicodeCharacter(emoji);\n\nvoid PrintChar(char c)\n{\n    int value = (int)c;\n    \/\/ Rune rune = new Rune(c); \/\/ will throw since emoji[0] is an invalid rune\n    Console.WriteLine($\"{c}; bytes: {encoding.GetByteCount([c])}; integer value: {(int)c}; round-trip: {(char)value}\");\n}\n\nvoid PrintUnicodeCharacter(string s)\n{\n    char[] chars = s.ToCharArray();\n    int value = char.ConvertToUtf32(s, 0);\n    Rune r1 = (Rune)value;\n    Rune r2 = new Rune(chars[0], chars[1]);\n    Console.WriteLine($\"{s}; chars: {chars.Length}; bytes: {encoding.GetByteCount(chars)}; integer value: {value}; round-trip {char.ConvertFromUtf32(value)};\");\n    Console.WriteLine($\"{s}; Runes match: {r1 == r2 &amp;&amp; r1.Value == value}; {nameof(Rune.Utf8SequenceLength)}: {r1.Utf8SequenceLength}; {nameof(Rune.Utf16SequenceLength)}: {r1.Utf16SequenceLength}\");\n}<\/code><\/pre>\n<p>It prints out the following:<\/p>\n<pre><code class=\"language-bash\">A; bytes: 1; integer value: 65; round-trip: A\n\u201c; bytes: 3; integer value: 8220; round-trip: \u201c\n\ufffd; bytes: 3; integer value: 55357; round-trip: \ufffd\n\ud83d\ude00; chars: 2; bytes: 4; integer value: 128512; round-trip 
\ud83d\ude00;\n\ud83d\ude00; Runes match: True; Utf8SequenceLength: 4; Utf16SequenceLength: 2<\/code><\/pre>\n<p>I can run the app again with the UTF16 encoding by switching the value of <code>encoding<\/code> to <code>Encoding.Unicode<\/code>.<\/p>\n<pre><code class=\"language-bash\">A; bytes: 2; integer value: 65; round-trip: A\n\u201c; bytes: 2; integer value: 8220; round-trip: \u201c\n\ufffd; bytes: 2; integer value: 55357; round-trip: \ufffd\n\ud83d\ude00; chars: 2; bytes: 4; integer value: 128512; round-trip \ud83d\ude00;\n\ud83d\ude00; Runes match: True; Utf8SequenceLength: 4; Utf16SequenceLength: 2<\/code><\/pre>\n<p>That tells us a few things:<\/p>\n<ul>\n<li>The UTF8 encoding has a non-uniform byte encoding.<\/li>\n<li>The UTF16 encoding is more uniform.<\/li>\n<li>Characters that require a single codepoint can interoperate with <code>int<\/code>, enabling patterns like <code>(char)8220<\/code> or <code>(char)0x201C<\/code>.<\/li>\n<li>Characters that require two codepoints can be stored in a <code>string<\/code>, as a (UTF32) integer value, or as a <code>Rune<\/code>, enabling patterns like <code>(Rune)128512<\/code>.<\/li>\n<li>It is easy to write software with bugs if the code directly handles characters or (even worse) bytes. 
For example, imagine writing a text search algorithm that supports <a href=\"https:\/\/www.unicode.org\/emoji\/charts\/full-emoji-list.html\">emoji<\/a> search terms.<\/li>\n<li>Multi-codepoint characters are enough to rune any developer.<\/li>\n<li>My terminal supports emoji (and I&#8217;m very happy about that).<\/li>\n<\/ul>\n<p>We can connect those Unicode concepts back to .NET types.<\/p>\n<ul>\n<li><code>string<\/code> and <code>char<\/code> use the UTF16 encoding.<\/li>\n<li><code>Encoding<\/code> classes enable processing text between the encodings and <code>byte<\/code> values.<\/li>\n<li><code>string<\/code> supports Unicode characters that require one or two codepoints.<\/li>\n<li><a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.text.rune#rune-in-net-vs-other-languages\"><code>Rune<\/code><\/a> can represent all Unicode characters (including <a href=\"https:\/\/en.wikipedia.org\/wiki\/Universal_Character_Set_characters#Surrogates\">surrogate pairs<\/a>), unlike <code>char<\/code>.<\/li>\n<\/ul>\n<p>All of these types are used in the benchmarks. 
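<\/p>\n<p>For instance, here&#8217;s a minimal sketch (not from the benchmark repo) that shows how these types relate for a string containing an emoji:<\/p>\n<pre><code class=\"language-csharp\">using System.Text;\n\nstring text = \"Hi \ud83d\ude00\";\n\n\/\/ string.Length counts UTF16 chars; the emoji is a surrogate pair\nConsole.WriteLine(text.Length); \/\/ 5\n\n\/\/ EnumerateRunes yields one Rune per Unicode character (4 here)\nforeach (Rune rune in text.EnumerateRunes())\n{\n    Console.WriteLine($\"{rune} {rune.Value} {rune.Utf8SequenceLength}\");\n}\n\n\/\/ 1 + 1 + 1 + 4 bytes in UTF8\nConsole.WriteLine(Encoding.UTF8.GetByteCount(text)); \/\/ 7<\/code><\/pre>\n<p>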
All of the benchmarks (except one that cheats) properly use these types so that Unicode text is correctly processed.<\/p>\n<p>Let&#8217;s look at the benchmarks.<\/p>\n<h2><code>File.ReadLines<\/code> and <code>File.ReadAllLines<\/code><\/h2>\n<p>The following benchmarks implement a high-level algorithm based on <code>string<\/code> lines:<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileReadLinesBenchmark.cs\"><code>FileReadLines<\/code><\/a><\/li>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileReadAllLinesBenchmark.cs\"><code>FileReadAllLinesBenchmark<\/code><\/a><\/li>\n<\/ul>\n<p>The performance charts in the results section include both of these benchmarks so there is no need to show these results again.<\/p>\n<p>The <code>FileReadLines<\/code> benchmark sets the baseline for our analysis. It uses <code>foreach<\/code> over an <code>IEnumerable&lt;string&gt;<\/code>.<\/p>\n<pre><code class=\"language-csharp\">public static Count Count(string path)\n{\n   long wordCount = 0, lineCount = 0, charCount = 0;\n\n   foreach (string line in File.ReadLines(path))\n   {\n      lineCount++;\n      charCount += line.Length;\n      bool wasSpace = true;\n\n      foreach (char c in line)\n      {\n            bool isSpace = char.IsWhiteSpace(c);\n\n            if (!isSpace &amp;&amp; wasSpace)\n            {\n               wordCount++;\n            }\n\n            wasSpace = isSpace;\n      }\n   }\n\n   return new(lineCount, wordCount, charCount, path);\n}<\/code><\/pre>\n<p>The code counts lines and characters via the outer <code>foreach<\/code>. The inner <code>foreach<\/code> counts a word at each transition from whitespace to non-whitespace, looking at every character in the line. It uses <code>char.IsWhiteSpace<\/code> to determine if a character is whitespace. 
This algorithm is about as simple as it gets for word counting.<\/p>\n<pre><code class=\"language-bash\">$ wc ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n   11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n$ dotnet run -c Release 1 11\nFileReadLinesBenchmark\n11716 110023 587080 \/Users\/rich\/git\/convenience\/wordcount\/wordcount\/bin\/Release\/net8.0\/Clarissa_Harlowe\/clarissa_volume1.txt<\/code><\/pre>\n<p>Note: The benchmarks launch the app in several different ways, for my own testing. That&#8217;s the reason for the strange command-line arguments.<\/p>\n<p>The results largely match the <code>wc<\/code> tool. The byte counts don&#8217;t match since this code works on characters, not bytes. That means that <a href=\"https:\/\/en.wikipedia.org\/wiki\/Byte_order_mark\">byte order marks<\/a>, multi-byte encodings, and line termination characters have been hidden from view. I could have added +1 to the <code>charCount<\/code> per line, but that didn&#8217;t seem useful to me, particularly since there are multiple <a href=\"https:\/\/en.wikipedia.org\/wiki\/Newline\">newline schemes<\/a>. I decided to accurately count characters or bytes and not attempt to approximate the differences between them.<\/p>\n<blockquote>\n<p>Bottom line: These APIs are great for small documents or when memory use isn&#8217;t a strong constraint. I&#8217;d only use <code>File.ReadAllLines<\/code> if my algorithm relied on knowing the number of lines in a document up front, and only for small documents. For larger documents, I&#8217;d adopt a better algorithm to count line break characters to avoid using that API. 
<\/p>\n<\/blockquote>\n<h2><code>File.OpenText<\/code><\/h2>\n<p>The following benchmarks implement a variety of approaches, all based on <code>StreamReader<\/code>, which <code>File.OpenText<\/code> is <a href=\"https:\/\/github.com\/dotnet\/runtime\/blob\/b6b00ec9f606eca0c47e01b30e74ee0d37d561ab\/src\/libraries\/System.Private.CoreLib\/src\/System\/IO\/File.cs#L30-L31\">simply a wrapper around<\/a>. Some of the <code>StreamReader<\/code> APIs expose <code>string<\/code> lines and others expose <code>char<\/code> values. This is where we&#8217;re going to see a larger separation in performance.<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenTextReadLineBenchmark.cs\"><code>FileOpenTextReadLineBenchmark<\/code><\/a><\/li>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenTextReadLineSearchValuesBenchmark.cs\"><code>FileOpenTextReadLineSearchValuesBenchmark<\/code><\/a><\/li>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenTextCharBenchmark.cs\"><code>FileOpenTextCharBenchmark<\/code><\/a><\/li>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenTextCharLinesBenchmark.cs\"><code>FileOpenTextCharLinesBenchmark<\/code><\/a><\/li>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenTextCharIndexOfAnyBenchmark.cs\"><code>FileOpenTextCharIndexOfAnyBenchmark<\/code><\/a><\/li>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenTextCharSearchValuesBenchmark.cs\"><code>FileOpenTextCharSearchValuesBenchmark<\/code><\/a><\/li>\n<\/ul>\n<p>The goal of these benchmarks is to determine the benefit of <code>SearchValues<\/code> and of <code>char<\/code> vs <code>string<\/code>. 
I also included <code>FileReadLinesBenchmark<\/code> as a baseline from the previous set of benchmarks.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2023\/11\/file-open-text-speed-large-document.png\" title=\"Speed metrics for reading large document using File.OpenText.\" width=\"75%\" \/><\/p>\n<p>You might wonder about memory. The memory use with <code>StreamReader<\/code> is a function of <code>char<\/code> vs <code>string<\/code>, which you can see in the initial memory charts earlier in the post. The differences in these algorithms affect speed, but not memory use.<\/p>\n<p>The <code>FileOpenTextReadLineBenchmark<\/code> benchmark is effectively identical to <code>FileReadLines<\/code>, only without the <code>IEnumerable&lt;string&gt;<\/code> abstraction.<\/p>\n<p>The <code>FileOpenTextReadLineSearchValuesBenchmark<\/code> benchmark starts to get a bit more fancy.<\/p>\n<pre><code class=\"language-csharp\">public static Count Count(string path)\n{\n   long wordCount = 0, lineCount = 0, charCount = 0;\n   using StreamReader stream = File.OpenText(path);\n\n   string? line = null;\n   while ((line = stream.ReadLine()) is not null)\n   {\n      lineCount++;\n      charCount += line.Length;\n      ReadOnlySpan&lt;char&gt; text = line.AsSpan().TrimStart();\n\n      if (text.Length is 0)\n      {\n            continue;\n      }\n\n      int index = 0;\n      while ((index = text.IndexOfAny(BenchmarkValues.WhitespaceSearchValuesNoLineBreak)) &gt; 0)\n      {\n            wordCount++;\n            text = text.Slice(index).TrimStart();\n      }\n\n      wordCount++;\n   }\n\n   return new(lineCount, wordCount, charCount, path);\n}<\/code><\/pre>\n<p>This benchmark is simply counting spaces (that it doesn&#8217;t trim). It is taking advantage of the new <code>SearchValues<\/code> type, which can speed up <code>IndexOfAny<\/code> when searching for more than just a few values. 
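<\/p>\n<p>The <code>BenchmarkValues<\/code> helper isn&#8217;t shown in the post; here&#8217;s a minimal sketch of how such a <code>SearchValues&lt;char&gt;<\/code> might be built (the exact character set in the repo may differ):<\/p>\n<pre><code class=\"language-csharp\">using System.Buffers;\n\nstatic class BenchmarkValues\n{\n    \/\/ Whitespace characters, minus the line break characters\n    \/\/ that StreamReader.ReadLine already strips\n    public static SearchValues&lt;char&gt; WhitespaceSearchValuesNoLineBreak { get; } =\n        SearchValues.Create(\" \\t\\v\\f\\u00a0\\u1680\\u2000\\u2001\\u2002\\u2003\\u2004\\u2005\\u2006\\u2007\\u2008\\u2009\\u200a\\u202f\\u205f\\u3000\");\n}<\/code><\/pre>\n<p>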
The <code>SearchValues<\/code> object is constructed with <a href=\"https:\/\/util.unicode.org\/UnicodeJsps\/list-unicodeset.jsp?a=%5B:White_Space=Yes:%5D\">whitespace characters<\/a> except (most) line break characters. We can assume that line break characters are no longer present, since the code is relying on <code>StreamReader.ReadLine<\/code> for that.<\/p>\n<p>I could have used this same algorithm for the previous benchmark implementations; however, I wanted to match the most approachable APIs with the most approachable benchmark implementations.<\/p>\n<p>A big part of the reason that <code>IndexOfAny<\/code> performs so well is vectorization.<\/p>\n<pre><code class=\"language-bash\">$ dotnet run -c Release 2\nVector64.IsHardwareAccelerated: False\nVector128.IsHardwareAccelerated: True\nVector256.IsHardwareAccelerated: True\nVector512.IsHardwareAccelerated: False<\/code><\/pre>\n<p>.NET 8 includes vector APIs all the way up to 512 bits. You can use them in your own algorithms or rely on built-in APIs like <code>IndexOfAny<\/code> to take advantage of the improved processing power. The handy <code>IsHardwareAccelerated<\/code> API tells you how large the vector registers are on a given CPU. This is the result on my Intel machine. I experimented with some newer Intel hardware available in Azure, which reported <code>Vector512.IsHardwareAccelerated<\/code> as <code>True<\/code>. My MacBook M1 machine reports <code>Vector128.IsHardwareAccelerated<\/code> as the highest available.<\/p>\n<p>We can now leave the land of <code>string<\/code> and switch to <code>char<\/code> values. There are two expected big benefits. The first is that the underlying API doesn&#8217;t need to read ahead to find a line break character; the second is that there won&#8217;t be any more strings to heap allocate and garbage collect. 
We should see a marked improvement in speed and we already know from previous charts that there is a significant reduction in memory.<\/p>\n<p>I constructed the following benchmarks to tease apart the value of various strategies.<\/p>\n<ul>\n<li><code>FileOpenTextCharBenchmark<\/code> &#8212; Same basic algorithm as <code>FileReadLines<\/code> with the addition of a check for line breaks.<\/li>\n<li><code>FileOpenTextCharLinesBenchmark<\/code> &#8212; An attempt to simplify the core algorithm by synthesizing lines of chars.<\/li>\n<li><code>FileOpenTextCharSearchValuesBenchmark<\/code> &#8212; Similar use of <code>SearchValues<\/code> as <code>FileOpenTextReadLineSearchValuesBenchmark<\/code> to speed up the space searching, but without pre-computed lines.<\/li>\n<li><code>FileOpenTextCharIndexOfAnyBenchmark<\/code> &#8212; Exact same algorithm but uses <code>IndexOfAny<\/code> with a <code>Span&lt;char&gt;<\/code> instead of the new <code>SearchValues<\/code> types.<\/li>\n<\/ul>\n<p>These benchmarks (as demonstrated in the chart above) tell us that <code>IndexOfAny<\/code> with <code>SearchValues&lt;char&gt;<\/code> is very beneficial. It&#8217;s interesting to see how poorly <code>IndexOfAny<\/code> does when given so many values (25) to check. It&#8217;s a lot slower than simply iterating over every character with a <code>char.IsWhiteSpace<\/code> check. These results should give you pause if you are using a large set of search terms with <code>IndexOfAny<\/code>.<\/p>\n<p>I did some testing on some other machines. I noticed that <code>FileOpenTextCharLinesBenchmark<\/code> performed quite well on an AVX512 machine (with a lower clock speed). 
That&#8217;s possibly because it is relying more heavily on <code>IndexOfAny<\/code> (with only two search terms) and is otherwise a pretty lean algorithm.<\/p>\n<p>Here&#8217;s the <code>FileOpenTextCharSearchValuesBenchmark<\/code> implementation.<\/p>\n<pre><code class=\"language-csharp\">public static Count Count(string path)\n{\n   long wordCount = 0, lineCount = 0, charCount = 0;\n   bool wasSpace = true;\n\n   char[] buffer = ArrayPool&lt;char&gt;.Shared.Rent(BenchmarkValues.Size);\n   using StreamReader reader = File.OpenText(path);\n\n   int count = 0;\n   while ((count = reader.Read(buffer)) &gt; 0)\n   {\n      charCount += count;\n      Span&lt;char&gt; chars = buffer.AsSpan(0, count);\n\n      while (chars.Length &gt; 0)\n      {\n            if (char.IsWhiteSpace(chars[0]))\n            {\n               if (chars[0] is '\\n')\n               {\n                  lineCount++;                      \n               }\n\n               wasSpace = true;\n               chars = chars.Slice(1);\n               continue;\n            }\n            else if (wasSpace)\n            {\n               wordCount++;\n               wasSpace = false;\n               chars = chars.Slice(1);\n            }\n\n            int index = chars.IndexOfAny(BenchmarkValues.WhitespaceSearchValues);\n\n            if (index &gt; -1)\n            {\n               if (chars[index] is '\\n')\n               {\n                  lineCount++;       \n               }\n\n               wasSpace = true;\n               chars = chars.Slice(index + 1);\n            }\n            else\n            {\n               wasSpace = false;\n               chars = [];\n            }\n      }\n   }\n\n   ArrayPool&lt;char&gt;.Shared.Return(buffer);\n   return new(lineCount, wordCount, charCount, path);\n}<\/code><\/pre>\n<p>It isn&#8217;t that different to the original implementation. The first block needs to account for line breaks within the <code>char.IsWhiteSpace<\/code> check. 
After that, <code>IndexOfAny<\/code> is used with a <code>SearchValues&lt;char&gt;<\/code> to find the next whitespace character so that the next check can be done. If <code>IndexOfAny<\/code> returns <code>-1<\/code>, we know that there are no more whitespace characters so there is no need to read any further into the buffer.<\/p>\n<p><code>Span&lt;T&gt;<\/code> is used pervasively in this implementation. Spans provide a cheap way of creating a window on an underlying array. They are so cheap that it&#8217;s fine for the implementation to continue slicing until <code>chars.Length &gt; 0<\/code> is no longer true. I only used that approach with algorithms that required slices of more than one character at a time. Otherwise, I used a for loop to iterate over a <code>Span<\/code>, which was faster.<\/p>\n<p>Note: Visual Studio will suggest that <code>chars.Slice(1)<\/code> can be simplified to <code>chars[1..]<\/code>. I discovered that the <a href=\"https:\/\/github.com\/dotnet\/roslyn\/issues\/47629\">simplification isn&#8217;t equivalent<\/a> and shows up as a performance regression in benchmarks. It&#8217;s much less likely to be a problem in apps.<\/p>\n<pre><code class=\"language-bash\">$ wc ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n   11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n$ dotnet run -c Release 1 4\nFileOpenTextCharBenchmark\n11716 110023 610512 \/Users\/rich\/git\/convenience\/wordcount\/wordcount\/bin\/Release\/net8.0\/Clarissa_Harlowe\/clarissa_volume1.txt<\/code><\/pre>\n<p>The <code>FileOpenTextChar*<\/code> benchmarks are a lot closer to matching <code>wc<\/code> for the byte results (for ASCII text). The Byte Order Mark (BOM) is consumed before these APIs start returning values. As a result, the byte counts for the <code>char<\/code>-returning APIs are consistently off by three bytes (the size of the BOM). 
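<\/p>\n<p>You can see those three bytes directly; this little check (not part of the benchmarks) prints the UTF8 byte order mark that <code>StreamReader<\/code> silently consumes:<\/p>\n<pre><code class=\"language-csharp\">using System.Text;\n\n\/\/ The UTF8 preamble is the 3-byte BOM: EF BB BF\nbyte[] bom = Encoding.UTF8.GetPreamble();\nConsole.WriteLine(Convert.ToHexString(bom)); \/\/ EFBBBF<\/code><\/pre>\n<p>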
Unlike the <code>string<\/code>-returning APIs, all of the line break characters are counted.<\/p>\n<blockquote>\n<p>Bottom line: <code>StreamReader<\/code> (which is the basis of <code>File.OpenText<\/code>) offers a flexible set of APIs covering a broad range of approachability and performance. For most use cases (if <code>File.ReadLines<\/code> isn&#8217;t appropriate), <code>StreamReader<\/code> is a great default choice.<\/p>\n<\/blockquote>\n<h2><code>File.Open<\/code> and <code>File.OpenHandle<\/code><\/h2>\n<p>The following benchmarks implement the lowest-level algorithms, based on bytes. <code>File.Open<\/code> is a wrapper around <code>FileStream<\/code>. <code>File.OpenHandle<\/code> returns an operating system handle, which requires <code>RandomAccess.Read<\/code> to access.<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenCharSearchValuesBenchmark.cs\"><code>FileOpenCharSearchValuesBenchmark<\/code><\/a><\/li>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenHandleCharSearchValuesBenchmark.cs\"><code>FileOpenHandleCharSearchValuesBenchmark<\/code><\/a><\/li>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenHandleRuneBenchmark.cs\"><code>FileOpenHandleRuneBenchmark<\/code><\/a><\/li>\n<li><a href=\"https:\/\/github.com\/richlander\/convenience\/blob\/wordcount\/wordcount\/wordcount\/FileOpenHandleAsciiCheatBenchmark.cs\"><code>FileOpenHandleAsciiCheatBenchmark<\/code><\/a><\/li>\n<\/ul>\n<p>These APIs offer a lot more control. Lines and chars are now gone and we&#8217;re left with bytes. 
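<\/p>\n<p>Before getting to the benchmarks, here&#8217;s the basic shape of a handle-based read loop, as a stripped-down sketch (the file name and buffer size are placeholders):<\/p>\n<pre><code class=\"language-csharp\">using Microsoft.Win32.SafeHandles;\n\nusing SafeFileHandle handle = File.OpenHandle(\"book.txt\");\nbyte[] buffer = new byte[16 * 1024];\nlong offset = 0;\nint read;\n\nwhile ((read = RandomAccess.Read(handle, buffer, offset)) &gt; 0)\n{\n    \/\/ RandomAccess has no cursor; the caller tracks the file offset\n    offset += read;\n}<\/code><\/pre>\n<p>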
The goal of these benchmarks is to get the best performance possible and to explore the facilities for correctly reading Unicode text given that the APIs return bytes.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2023\/11\/file-open-speed-large-document.png\" title=\"Speed metrics for reading large document using File.Open.\" width=\"75%\" \/><\/p>\n<p>One last attempt to match the results of <code>wc<\/code>.<\/p>\n<pre><code class=\"language-bash\">$ wc ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n   11716  110023  610515 ..\/Clarissa_Harlowe\/clarissa_volume1.txt\n$ dotnet run -c Release 1 0\nFileOpenHandleCharSearchValuesBenchmark\n11716 110023 610515 \/Users\/rich\/git\/convenience\/wordcount\/wordcount\/bin\/Release\/net8.0\/Clarissa_Harlowe\/clarissa_volume1.txt<\/code><\/pre>\n<p>The byte counts now match. We&#8217;re now looking at every byte in a given file.<\/p>\n<p>The <code>FileOpenHandleCharSearchValuesBenchmark<\/code> adds some new concepts. 
<code>FileOpenCharSearchValuesBenchmark<\/code> is effectively identical.<\/p>\n<pre><code class=\"language-csharp\">public static Count Count(string path)\n{\n   long wordCount = 0, lineCount = 0, byteCount = 0;\n   bool wasSpace = true;\n\n   Encoding encoding = Encoding.UTF8;\n   Decoder decoder = encoding.GetDecoder();\n   int charBufferSize = encoding.GetMaxCharCount(BenchmarkValues.Size);\n\n   char[] charBuffer = ArrayPool&lt;char&gt;.Shared.Rent(charBufferSize);\n   byte[] buffer = ArrayPool&lt;byte&gt;.Shared.Rent(BenchmarkValues.Size);\n   using Microsoft.Win32.SafeHandles.SafeFileHandle handle = File.OpenHandle(path, FileMode.Open, FileAccess.Read, FileShare.Read, FileOptions.SequentialScan);\n\n   \/\/ Read content in chunks, in buffer, at count length, starting at byteCount\n   int count = 0;\n   while ((count = RandomAccess.Read(handle, buffer, byteCount)) &gt; 0)\n   {\n      byteCount += count;\n      int charCount = decoder.GetChars(buffer.AsSpan(0, count), charBuffer, false);\n      ReadOnlySpan&lt;char&gt; chars = charBuffer.AsSpan(0, charCount);\n\n      while (chars.Length &gt; 0)\n      {\n            if (char.IsWhiteSpace(chars[0]))\n            {\n               if (chars[0] is '\\n')\n               {\n                  lineCount++;                      \n               }\n\n               wasSpace = true;\n               chars = chars.Slice(1);\n               continue;\n            }\n            else if (wasSpace)\n            {\n               wordCount++;\n               wasSpace = false;\n               chars = chars.Slice(1);\n            }\n\n            int index = chars.IndexOfAny(BenchmarkValues.WhitespaceSearchValues);\n\n            if (index &gt; -1)\n            {\n               if (chars[index] is '\\n')\n               {\n                  lineCount++;       \n               }\n\n               wasSpace = true;\n               chars = chars.Slice(index + 1);\n            }\n            else\n            {\n               
wasSpace = false;\n               chars = [];\n            }\n      }\n   }\n\n   ArrayPool&lt;char&gt;.Shared.Return(charBuffer);\n   ArrayPool&lt;byte&gt;.Shared.Return(buffer);\n   return new(lineCount, wordCount, byteCount, path);\n}<\/code><\/pre>\n<p>The body of this algorithm is effectively identical to the <code>FileOpenTextCharSearchValuesBenchmark<\/code> implementation we just saw. What&#8217;s different is the initial setup.<\/p>\n<p>The following two blocks of code are new.<\/p>\n<pre><code class=\"language-csharp\">Encoding encoding = Encoding.UTF8;\nDecoder decoder = encoding.GetDecoder();\nint charBufferSize = encoding.GetMaxCharCount(BenchmarkValues.Size);<\/code><\/pre>\n<p>This code gets a UTF8 decoder for converting bytes to chars. It also gets the maximum number of characters that the decoder might produce given the size of the byte buffer that will be used. This implementation is hard-coded to use UTF8. It could be made dynamic (by reading the byte order mark) to use other Unicode encodings.<\/p>\n<pre><code class=\"language-csharp\">int charCount = decoder.GetChars(buffer.AsSpan(0, count), charBuffer, false);\nReadOnlySpan&lt;char&gt; chars = charBuffer.AsSpan(0, charCount);<\/code><\/pre>\n<p>This block decodes the buffer of bytes into the character buffer. Both buffers are correctly sized (with <code>AsSpan<\/code>) per the reported <code>byte<\/code> and <code>char<\/code> count values. After that, the code adopts a more familiar <code>char<\/code>-based algorithm. There is no obvious way to use <code>SearchValues&lt;byte&gt;<\/code> that plays nicely with the multi-byte Unicode encodings. This approach works fine, so that doesn&#8217;t matter much.<\/p>\n<p>This post is about convenience. I found <code>Decoder.GetChars<\/code> to be incredibly convenient. It&#8217;s a perfect example of a low-level API that does exactly what is needed and sort of saves the day down in the trenches. 
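<\/p>\n<p>What makes <code>Decoder<\/code> (rather than a one-shot <code>Encoding.GetChars<\/code> call) the right tool is that it carries state between calls, so a multi-byte sequence split across two buffered reads still decodes correctly. A small demonstration of that behavior (separate from the benchmark code):<\/p>\n<pre><code class=\"language-csharp\">using System.Text;\n\nDecoder decoder = Encoding.UTF8.GetDecoder();\nbyte[] quote = Encoding.UTF8.GetBytes(\"\u201c\"); \/\/ E2 80 9C\nchar[] chars = new char[2];\n\n\/\/ Feed the three bytes in two chunks, as chunked file reads might\nint first = decoder.GetChars(quote.AsSpan(0, 2), chars, flush: false);  \/\/ 0 chars yet\nint second = decoder.GetChars(quote.AsSpan(2, 1), chars, flush: false); \/\/ 1 char: \u201c\nConsole.WriteLine($\"{first} {second} {chars[0]}\");<\/code><\/pre>\n<p>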
I found this pattern by reading how <code>File.ReadLines<\/code> (indirectly) solves this same problem. <a href=\"https:\/\/github.com\/dotnet\/runtime\/tree\/main\/src\/libraries\/System.Private.CoreLib\/src\/System\/IO\">All that code<\/a> is there to be read. It&#8217;s open source!<\/p>\n<p><code>FileOpenHandleRuneBenchmark<\/code> uses the <code>Rune<\/code> class instead of <code>Encoding<\/code>. It turns out to be slower, in part because I returned to a more basic algorithm. It wasn&#8217;t obvious how to use <code>IndexOfAny<\/code> or <code>SearchValues<\/code> with <code>Rune<\/code>, in part because there is no analog to <code>decoder.GetChars<\/code> for <code>Rune<\/code>.<\/p>\n<pre><code class=\"language-csharp\">public static Count Count(string path)\n{\n   long wordCount = 0, lineCount = 0, byteCount = 0;\n   bool wasSpace = true;\n\n   byte[] buffer = ArrayPool&lt;byte&gt;.Shared.Rent(BenchmarkValues.Size);\n   using Microsoft.Win32.SafeHandles.SafeFileHandle handle = File.OpenHandle(path, FileMode.Open, FileAccess.Read, FileShare.Read, FileOptions.SequentialScan);\n   int index = 0;\n\n   \/\/ Read content in chunks, in buffer, at count length, starting at byteCount\n   int count = 0;\n   while ((count = RandomAccess.Read(handle, buffer.AsSpan(index), byteCount)) &gt; 0 || index &gt; 0)\n   {\n      byteCount += count;\n      Span&lt;byte&gt; bytes = buffer.AsSpan(0, count + index);\n      index = 0;\n\n      while (bytes.Length &gt; 0)\n      {\n            OperationStatus status = Rune.DecodeFromUtf8(bytes, out Rune rune, out int bytesConsumed);\n\n            \/\/ bad read due to low buffer length\n            if (status == OperationStatus.NeedMoreData &amp;&amp; count &gt; 0)\n            {\n               bytes[..bytesConsumed].CopyTo(buffer); \/\/ move the partial Rune to the start of the buffer before next read\n               index = bytesConsumed;\n               break;\n            }\n\n            if (Rune.IsWhiteSpace(rune))\n        
    {\n               if (rune.Value is '\\n')\n               {\n                  lineCount++;\n               }\n\n               wasSpace = true;\n            }\n            else if (wasSpace)\n            {\n               wordCount++;\n               wasSpace = false;\n            }\n\n            bytes = bytes.Slice(bytesConsumed);\n      }\n   }\n\n   ArrayPool&lt;byte&gt;.Shared.Return(buffer);\n   return new(lineCount, wordCount, byteCount, path);\n}<\/code><\/pre>\n<p>There isn&#8217;t a lot different here and that&#8217;s a good thing. <code>Rune<\/code> is largely a drop-in replacement for <code>char<\/code>.<\/p>\n<p>This line is the key difference.<\/p>\n<pre><code class=\"language-csharp\">var status = Rune.DecodeFromUtf8(bytes, out Rune rune, out int bytesConsumed);<\/code><\/pre>\n<p>I wanted an API that returns a Unicode character from a <code>Span&lt;byte&gt;<\/code> and reports how many bytes were read. It could be 1 to 4 bytes. <code>Rune.DecodeFromUtf8<\/code> does precisely that. For my purposes, I don&#8217;t care if I get a <code>Rune<\/code> or a <code>char<\/code> back. They are both structs.<\/p>\n<p>I left <code>FileOpenHandleAsciiCheatBenchmark<\/code> for last. I wanted to see how much faster the code could be made to run if it could apply the maximum number of assumptions. 
In short, what would an ASCII-only algorithm look like?<\/p>\n<pre><code class=\"language-csharp\">public static Count Count(string path)\n{\n   const byte NEWLINE = (byte)'\\n';\n   const byte SPACE = (byte)' ';\n   ReadOnlySpan&lt;byte&gt; searchValues = [SPACE, NEWLINE];\n\n   long wordCount = 0, lineCount = 0, byteCount = 0;\n   bool wasSpace = true;\n\n   byte[] buffer = ArrayPool&lt;byte&gt;.Shared.Rent(BenchmarkValues.Size);\n   using Microsoft.Win32.SafeHandles.SafeFileHandle handle = File.OpenHandle(path, FileMode.Open, FileAccess.Read, FileShare.Read, FileOptions.SequentialScan);\n\n   \/\/ Read content in chunks, in buffer, at count length, starting at byteCount\n   int count = 0;\n   while ((count = RandomAccess.Read(handle, buffer, byteCount)) &gt; 0)\n   {\n      byteCount += count;\n      Span&lt;byte&gt; bytes = buffer.AsSpan(0, count);\n\n      while (bytes.Length &gt; 0)\n      {\n            \/\/ what's this character?\n            if (bytes[0] &lt;= SPACE)\n            {\n               if (bytes[0] is NEWLINE)\n               {\n                  lineCount++;\n               }\n\n               wasSpace = true;\n               bytes = bytes.Slice(1);\n               continue;\n            }\n            else if (wasSpace)\n            {\n               wordCount++;\n            }\n\n            \/\/ Look ahead for next space or newline\n            \/\/ this logic assumes that preceding char was non-whitespace\n            int index = bytes.IndexOfAny(searchValues);\n\n            if (index &gt; -1)\n            {\n               if (bytes[index] is NEWLINE)\n               {\n                  lineCount++;\n               }\n\n               wasSpace = true;\n               bytes = bytes.Slice(index + 1);\n            }\n            else\n            {\n               wasSpace = false;\n               bytes = [];\n            }\n      }\n   }\n\n   ArrayPool&lt;byte&gt;.Shared.Return(buffer);\n   return new(lineCount, wordCount, byteCount, 
path);\n}<\/code><\/pre>\n<p>This code is nearly identical to what you&#8217;ve seen before except it searches for far fewer characters, which &#8212; SURPRISE &#8212; speeds up the algorithm. You can see that in the chart earlier in this section. <code>SearchValues<\/code> isn&#8217;t used here since it&#8217;s not optimized for only two values.<\/p>\n<pre><code class=\"language-bash\">$ dotnet run -c Release 1 3\nFileOpenHandleAsciiCheatBenchmark\n11716 110023 610515 \/Users\/rich\/git\/convenience\/wordcount\/wordcount\/bin\/Release\/net8.0\/Clarissa_Harlowe\/clarissa_volume1.txt<\/code><\/pre>\n<p>This algorithm is still able to produce the expected results. That&#8217;s only because the text file satisfies the assumption of the code.<\/p>\n<blockquote>\n<p>Bottom line: <code>File.Open<\/code> and <code>File.OpenHandle<\/code> offer the highest control and performance. In the case of text data, it&#8217;s not obvious that it is worth the extra effort over <code>File.OpenText<\/code> (with <code>char<\/code>) even though they can deliver higher performance. In this case, these APIs were required to match the byte count baseline. For non-text data, these APIs are a more obvious choice. <\/p>\n<\/blockquote>\n<h2>Summary<\/h2>\n<p><code>System.IO<\/code> provides effective APIs that cover many use cases. I like how easy it is to create straightforward algorithms with <code>File.ReadLines<\/code>. It works very well for content that is line-based. <code>File.OpenText<\/code> enables writing faster algorithms without a big step in complexity. Last, <code>File.Open<\/code> and <code>File.OpenHandle<\/code> are great for getting access to the binary content of files and to enable writing the most high-performance and accurate algorithms.<\/p>\n<p>I didn&#8217;t set out to explore .NET globalization APIs or Unicode in quite so much depth. I&#8217;d used the encoding APIs before, but never tried <code>Rune<\/code>. 
I was impressed to see how well those APIs suited my project and how well they were able to perform. These APIs were a surprise case-in-point example of the convenience premise of the post. Convenience doesn&#8217;t mean &#8220;high-level&#8221;, but &#8220;right and approachable tool for the job&#8221;.<\/p>\n<p>Another insight was that for this problem, the high-level APIs were approachable and effective, however, only the low-level APIs were up to the task of exactly matching the results of <code>wc<\/code>. I didn&#8217;t understand that dynamic when I started the project, however, I was happy that the required APIs were well within reach.<\/p>\n<p>Thanks to <a href=\"https:\/\/github.com\/GrabYourPitchforks\">Levi Broderick<\/a> for reviewing the benchmarks and helping me understand the finer points of Unicode a little better. Thanks to <a href=\"https:\/\/github.com\/davidfowl\">David Fowler<\/a>, <a href=\"https:\/\/github.com\/jkotas\">Jan Kotas<\/a>, and <a href=\"https:\/\/github.com\/stephentoub\">Stephen Toub<\/a> for their help contributing to this series.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>File I\/O APIs are used pervasively in apps. .NET has great API for reading and writing files. They are a great example of the convenience of .NET.<\/p>\n","protected":false},"author":1312,"featured_media":48728,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[685,756,3009],"tags":[7755],"class_list":["post-48604","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-dotnet","category-csharp","category-performance","tag-convenience-of-dotnet"],"acf":[],"blog_post_summary":"<p>File I\/O APIs are used pervasively in apps. .NET has great API for reading and writing files. 
They are a great example of the convenience of .NET.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/48604","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/users\/1312"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/comments?post=48604"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/48604\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media\/48728"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media?parent=48604"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/categories?post=48604"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/tags?post=48604"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}