{"id":34139,"date":"2021-09-01T09:00:48","date_gmt":"2021-09-01T16:00:48","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/dotnet\/?p=34139"},"modified":"2021-09-15T12:37:50","modified_gmt":"2021-09-15T19:37:50","slug":"file-io-improvements-in-dotnet-6","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/dotnet\/file-io-improvements-in-dotnet-6\/","title":{"rendered":"File IO improvements in .NET 6"},"content":{"rendered":"<p>For .NET 6, we have made <code>FileStream<\/code> much faster and more reliable, thanks to an almost complete rewrite. In some cases, the async implementation is now a few times faster!<\/p>\n<p>We also recognized the need for more high-performance file IO features, such as concurrent reads and writes and scatter\/gather IO, and introduced new APIs for them.<\/p>\n<h2>TL;DR<\/h2>\n<blockquote><p>File I\/O is better, stronger, faster! &#8211; <a href=\"https:\/\/twitter.com\/Fahrni\/status\/1429848474112069632\">Rob Fahrni<\/a><\/p><\/blockquote>\n<p>If you are not into details, please see <a href=\"#summary\">Summary<\/a> for a short recap of what was changed.<\/p>\n<h2>Introduction to FileStream<\/h2>\n<p>Before we dive into the details, we need to explain a few concepts crucial to understanding what was changed. Let&#8217;s first take a look at the two most flexible <code>FileStream<\/code> constructors and discuss their arguments:<\/p>\n<pre><code class=\"language-csharp\">public FileStream(string path, FileMode mode, FileAccess access, FileShare share, int bufferSize, FileOptions options)\r\npublic FileStream(SafeFileHandle handle, FileAccess access, int bufferSize, bool isAsync)<\/code><\/pre>\n<ul>\n<li><code>path<\/code> is a relative or absolute path to a file. 
The file can be a:\n<ul>\n<li>Regular file (the most common use case).<\/li>\n<li>Symbolic link &#8211; <code>FileStream<\/code> dereferences symbolic links and opens the target instead of the link itself.<\/li>\n<li>Pipe or a socket &#8211; for which <code>CanSeek<\/code> returns <code>false<\/code>, while <code>Position<\/code> and <code>Seek()<\/code> just throw.<\/li>\n<li>Character, block file and more.<\/li>\n<\/ul>\n<\/li>\n<li><code>handle<\/code> is a handle or file descriptor provided by the caller.<\/li>\n<li><code>mode<\/code> is an enumeration that tells <code>FileStream<\/code> whether the given file should be opened, created, replaced, truncated or opened for appending.<\/li>\n<li><code>access<\/code> specifies the intent: reading, writing, or both.<\/li>\n<li><code>share<\/code> describes whether we want to have exclusive access to the file (<code>FileShare.None<\/code>), share it for reading, writing or even for deleting the file. If you have ever observed a &#8220;<em>file in use<\/em>&#8221; error, it means that someone opened the file first and locked it using this enumeration.<\/li>\n<li><code>bufferSize<\/code> sets the size of the <code>FileStream<\/code> private buffer used for <strong>buffering<\/strong>. When a user requests a read of <code>n<\/code> bytes, and <code>n<\/code> is less than <code>bufferSize<\/code>, <code>FileStream<\/code> is going to try to fetch <code>bufferSize<\/code>-many bytes from the operating system, store them in its private buffer and return only the requested <code>n<\/code> bytes. The next read operation is going to return from the remaining buffered bytes and ask the OS for more, only if needed. This performance optimization allows <code>FileStream<\/code> to reduce the number of expensive sys-calls, as copying bytes is simply cheaper.\n<ul>\n<li>Buffering is also applied to all <code>Write*()<\/code> methods. 
That is why calling <code>Write*()<\/code> doesn&#8217;t guarantee that the data is immediately saved to the file and we need to call <code>Flush*()<\/code> to flush the buffer. On top of that, every operating system implements buffering to reduce disk activity. So most of the sys-calls don&#8217;t perform actual disk operations, but copy memory from user to kernel space. If we want to force the OS to flush the data to the disk, we need to call <code>Flush(flushToDisk: true)<\/code>.<\/li>\n<li><strong>Buffering is enabled by default<\/strong> (the default for <code>bufferSize<\/code> is 4096).<\/li>\n<li>To <strong>disable the <code>FileStream<\/code> buffering<\/strong>, just pass <code>1<\/code> (works for every .NET) or <code>0<\/code> (works for .NET 6 preview 6+) as <code>bufferSize<\/code>.<\/li>\n<li>If you ever needed to disable the OS buffering in a .NET app, please provide your feedback in <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/27408\">#27408<\/a> which would help us to prioritize the feature request.<\/li>\n<\/ul>\n<\/li>\n<li><code>isAsync<\/code> allows for controlling whether the file should be opened for asynchronous or synchronous IO. <strong>The default value is <code>false<\/code>, which translates to synchronous IO<\/strong>. If you open <code>FileStream<\/code> for synchronous IO, but later use any of its <code>*Async()<\/code> methods, they are going to perform synchronous IO (no cancellation support) on a <code>ThreadPool<\/code> thread which might not scale up as well as if the <code>FileStream<\/code> was opened for asynchronous IO. 
The opposite is also an issue on Windows: if you open <code>FileStream<\/code> for asynchronous IO, but call a synchronous method, it&#8217;s going to start an asynchronous IO operation and <strong>block waiting for it to complete<\/strong>.<\/li>\n<li><code>options<\/code> is a flags enumeration that supports further configuration of behaviors, including <code>isAsync<\/code>.<\/li>\n<\/ul>\n<h2>Benchmarks<\/h2>\n<h3>Environment<\/h3>\n<p>We have used <code>FileStream<\/code> benchmarks from the <a href=\"https:\/\/github.com\/dotnet\/performance\/blob\/main\/src\/benchmarks\/micro\/libraries\/System.IO.FileSystem\/Perf.FileStream.cs\">dotnet\/performance repository<\/a>. The harness was obviously <a href=\"https:\/\/github.com\/dotnet\/BenchmarkDotNet\">BenchmarkDotNet<\/a> (version <code>0.13.1<\/code>). See the <a href=\"https:\/\/pvscmdupload.blob.core.windows.net\/reports\/allTestHistory\/TestHistoryIndexIndex.html\">full historical results from our lab<\/a>.<\/p>\n<p>For the purpose of this blog post, we have run the benchmarks on an <code>x64<\/code> machine (Intel Xeon CPU E5-1650, 1 CPU, 12 logical and 6 physical cores) with an SSD drive. The machine was configured for dual boot of Windows 10 (10.0.18363.1621) and Ubuntu 18.04. The results can&#8217;t be used for absolute numbers comparison, as on Windows the disk encryption was enabled, using <a href=\"https:\/\/docs.microsoft.com\/windows\/security\/information-protection\/bitlocker\/bitlocker-overview\">BitLocker<\/a>. 
For the sake of simplicity we refer to Ubuntu results using the term &#8220;Unix&#8221; as all non-Windows optimizations apply to all Unix-like operating systems.<\/p>\n<h3>How to read the Results<\/h3>\n<p>Legend for reading the tables with benchmark results:<\/p>\n<ul>\n<li>options, share, fileSize, userBufferSize: Value of the <code>options|share|fileSize|userBufferSize<\/code> parameter.<\/li>\n<li>Mean: Arithmetic mean of all measurements.<\/li>\n<li>Ratio: Mean of the ratio distribution (in this case it&#8217;s always [.NET 6]\/[.NET 5]).<\/li>\n<li>Allocated: Allocated memory per single operation (managed only, inclusive, <code>1KB = 1024B<\/code>).<\/li>\n<li>1 ns: 1 Nanosecond (0.000000001 sec).<\/li>\n<\/ul>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>Runtime<\/th>\n<th>share<\/th>\n<th style=\"text-align: right\">Mean<\/th>\n<th style=\"text-align: right\">Ratio<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>GetLength<\/td>\n<td>.NET 5.0<\/td>\n<td>Read<\/td>\n<td style=\"text-align: right\">1,932.00 ns<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<\/tr>\n<tr>\n<td>GetLength<\/td>\n<td>.NET 6.0<\/td>\n<td>Read<\/td>\n<td style=\"text-align: right\">58.52 ns<\/td>\n<td style=\"text-align: right\">0.03<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>For the table presented above, we can see that the <code>GetLength<\/code> benchmark was taking <code>1932 ns<\/code> to execute on average with .NET 5, and only <code>58.52 ns<\/code> with .NET 6. The ratio column tells us that .NET 6 took on average 3% of the total execution time of .NET 5. 
We can also say that .NET 6 is about 33 (<code>1.00 \/ 0.03<\/code>) times faster than .NET 5 for this particular benchmark and environment.<\/p>\n<p>With that in mind, let&#8217;s take a look at what we have changed.<\/p>\n<h2>Performance improvements<\/h2>\n<p>Based on feedback from our customers (<a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/16354\">#16354<\/a>, <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/25905\">#25905<\/a>) and some additional profiling with Visual Studio Profiler, we have identified key CPU bottlenecks of the Windows implementation.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2021\/08\/writeasync_cpu_before_callstack.png\" alt=\".NET 5 sys-calls\" \/><\/p>\n<p>By using <a href=\"https:\/\/docs.microsoft.com\/visualstudio\/profiling\/memory-usage?view=vs-2019\">Visual Studio Memory Profiler<\/a> we have tracked down all allocations:<\/p>\n<p><img decoding=\"async\" class=\"alignnone\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2021\/08\/memory_profiling_before.png\" alt=\".NET 5 Visual Studio Profiler CallTree, File IO improvements in .NET 6\" width=\"912\" height=\"678\" \/><\/p>\n<h3>Seek and Position<\/h3>\n<p>After a fair amount of brainstorming, we came to the conclusion that all performance bottlenecks in the <code>FileStream.ReadAsync()<\/code> and <code>FileStream.WriteAsync()<\/code> methods were caused by the fact that when <code>FileStream<\/code> was opened for asynchronous IO, it was synchronizing the file offset with Windows for every asynchronous operation. 
A <a href=\"https:\/\/docs.microsoft.com\/archive\/blogs\/winserverperformance\/designing-applications-for-high-performance-part-iii-2\">blog post<\/a> from the Windows Server Performance Team calls the API that allows for doing that (<code>SetFilePointer()<\/code> method) an <em>anachronism<\/em>:<\/p>\n<blockquote><p>The old DOS SetFilePointer API is an anachronism. One should specify the file offset in the overlapped structure even for synchronous I\/O. It should never be necessary to resort to the hack of having private file handles for each thread.<\/p><\/blockquote>\n<p>We decided to stop doing that for seekable files and simply track the offset only in memory, and use sys-calls that always require us to provide the file offset in an explicit way. We have done that for both Windows and Unix implementation. We discuss this breaking change <a href=\"#Tracking-file-offset-only-in-memory\">later<\/a> in this post.<\/p>\n<pre><code class=\"language-csharp\">[Benchmark]\r\n[Arguments(OneKibibyte, FileOptions.None)]\r\n[Arguments(OneKibibyte, FileOptions.Asynchronous)]\r\npublic void SeekForward(long fileSize, FileOptions options)\r\n{\r\n    string filePath = _sourceFilePaths[fileSize];\r\n    using (FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read, FourKibibytes, options))\r\n    {\r\n        for (long offset = 0; offset &lt; fileSize; offset++)\r\n        {\r\n            fileStream.Seek(offset, SeekOrigin.Begin);\r\n        }\r\n    }\r\n}\r\n\r\n[Benchmark]\r\n[Arguments(OneKibibyte, FileOptions.None)]\r\n[Arguments(OneKibibyte, FileOptions.Asynchronous)]\r\npublic void SeekBackward(long fileSize, FileOptions options)\r\n{\r\n    string filePath = _sourceFilePaths[fileSize];\r\n    using (FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read, FourKibibytes, options))\r\n    {\r\n        for (long offset = -1; offset &gt;= -fileSize; offset--)\r\n        {\r\n            
fileStream.Seek(offset, SeekOrigin.End);\r\n        }\r\n    }\r\n}<\/code><\/pre>\n<h4>Windows<\/h4>\n<p>Since <a href=\"https:\/\/docs.microsoft.com\/dotnet\/api\/system.io.filestream.seek\">FileStream.Seek()<\/a> and <a href=\"https:\/\/docs.microsoft.com\/dotnet\/api\/system.io.filestream.position\">FileStream.Position<\/a> are no longer performing a sys-call, but just access the position stored in memory, we can observe an improvement from 10x to 100x.<\/p>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>Runtime<\/th>\n<th>fileSize<\/th>\n<th>options<\/th>\n<th style=\"text-align: right\">Mean<\/th>\n<th style=\"text-align: right\">Ratio<\/th>\n<th style=\"text-align: right\">Allocated<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>SeekForward<\/td>\n<td>.NET 5.0<\/td>\n<td>1024<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">580.88 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">168 B<\/td>\n<\/tr>\n<tr>\n<td>SeekForward<\/td>\n<td>.NET 6.0<\/td>\n<td>1024<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">56.01 \u03bcs<\/td>\n<td style=\"text-align: right\">0.10<\/td>\n<td style=\"text-align: right\">240 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>SeekBackward<\/td>\n<td>.NET 5.0<\/td>\n<td>1024<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">2,273.19 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">169 B<\/td>\n<\/tr>\n<tr>\n<td>SeekBackward<\/td>\n<td>.NET 6.0<\/td>\n<td>1024<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">60.67 \u03bcs<\/td>\n<td style=\"text-align: right\">0.03<\/td>\n<td style=\"text-align: right\">240 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td 
style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>SeekForward<\/td>\n<td>.NET 5.0<\/td>\n<td>1024<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">2,623.50 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">200 B<\/td>\n<\/tr>\n<tr>\n<td>SeekForward<\/td>\n<td>.NET 6.0<\/td>\n<td>1024<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">61.30 \u03bcs<\/td>\n<td style=\"text-align: right\">0.02<\/td>\n<td style=\"text-align: right\">272 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>SeekBackward<\/td>\n<td>.NET 5.0<\/td>\n<td>1024<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">5,354.25 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">200 B<\/td>\n<\/tr>\n<tr>\n<td>SeekBackward<\/td>\n<td>.NET 6.0<\/td>\n<td>1024<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">66.63 \u03bcs<\/td>\n<td style=\"text-align: right\">0.01<\/td>\n<td style=\"text-align: right\">272 B<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The increased amount of allocated memory comes from the abstraction layer that we have <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/47128\">introduced<\/a> to support the .NET 5 Compatibility mode, which also helped increase the code maintainability: we now have a few separate <code>FileStream<\/code> <a href=\"https:\/\/en.wikipedia.org\/wiki\/Strategy_pattern\">strategy<\/a> implementations instead of one with <em>a lot<\/em> of <code>if<\/code> blocks.<\/p>\n<h4>Unix<\/h4>\n<p>Unix implementation is no longer performing the <code>lseek<\/code> sys-call and we can observe a very nice <code>x25<\/code> improvement for the <code>SeekForward<\/code> 
benchmark.<\/p>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>Runtime<\/th>\n<th>fileSize<\/th>\n<th>options<\/th>\n<th style=\"text-align: right\">Mean<\/th>\n<th style=\"text-align: right\">Ratio<\/th>\n<th style=\"text-align: right\">Allocated<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>SeekForward<\/td>\n<td>.NET 5.0<\/td>\n<td>1024<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">447.915 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">161 B<\/td>\n<\/tr>\n<tr>\n<td>SeekForward<\/td>\n<td>.NET 6.0<\/td>\n<td>1024<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">19.083 \u03bcs<\/td>\n<td style=\"text-align: right\">0.04<\/td>\n<td style=\"text-align: right\">232 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>SeekForward<\/td>\n<td>.NET 5.0<\/td>\n<td>1024<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">453.645 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">281 B<\/td>\n<\/tr>\n<tr>\n<td>SeekForward<\/td>\n<td>.NET 6.0<\/td>\n<td>1024<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">19.511 \u03bcs<\/td>\n<td style=\"text-align: right\">0.04<\/td>\n<td style=\"text-align: right\">232 B<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>There is no improvement for <code>SeekBackward<\/code> benchmark, which <a href=\"https:\/\/github.com\/dotnet\/performance\/blob\/686f552943ccef56e8404ed0a060fcc1e689a2a0\/src\/benchmarks\/micro\/libraries\/System.IO.FileSystem\/Perf.FileStream.cs#L117\">uses<\/a> <code>SeekOrigin.End<\/code> which requires the file length to be obtained.<\/p>\n<h3>Length<\/h3>\n<p>We have <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/49541\">noticed<\/a> that when the file is opened for reading, we can cache the file length as long as it&#8217;s not shared for 
writing (<code>FileShare.Write<\/code>). In such scenarios, nobody can write to the given file and the file length can&#8217;t change until the file is closed.<\/p>\n<p>This change (<a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/49975\">#49975<\/a>) was included in .NET 6 Preview 4.<\/p>\n<pre><code class=\"language-csharp\">[Benchmark(OperationsPerInvoke = OneKibibyte)]\r\n[Arguments(FileShare.Read)]\r\n[Arguments(FileShare.Write)]\r\npublic long GetLength(FileShare share)\r\n{\r\n    string filePath = _sourceFilePaths[OneKibibyte];\r\n    using (FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read, share, FourKibibytes, FileOptions.None))\r\n    {\r\n        long length = 0;\r\n        for (long i = 0; i &lt; OneKibibyte; i++)\r\n        {\r\n            length = fileStream.Length;\r\n        }\r\n        return length;\r\n    }\r\n}<\/code><\/pre>\n<h4>Windows<\/h4>\n<p>The first access has not changed, but every subsequent call to <a href=\"https:\/\/docs.microsoft.com\/dotnet\/api\/system.io.filestream.length\">FileStream.Length<\/a> can be up to a few dozen times faster.<\/p>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>Runtime<\/th>\n<th>share<\/th>\n<th style=\"text-align: right\">Mean<\/th>\n<th style=\"text-align: right\">Ratio<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>GetLength<\/td>\n<td>.NET 5.0<\/td>\n<td>Read<\/td>\n<td style=\"text-align: right\">1,932.00 ns<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<\/tr>\n<tr>\n<td>GetLength<\/td>\n<td>.NET 6.0<\/td>\n<td>Read<\/td>\n<td style=\"text-align: right\">58.52 ns<\/td>\n<td style=\"text-align: right\">0.03<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Kudos to <a href=\"https:\/\/github.com\/pentp\">@pentp<\/a> who also made this possible for <code>FileStream<\/code> opened with <code>FileShare.Delete<\/code> (<a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/56465\">#56465<\/a>).<\/p>\n<h4>Unix<\/h4>\n<p>In contrast to Windows, we can&#8217;t cache file length on 
Unix-like operating systems where the file locking is only <a href=\"https:\/\/en.wikipedia.org\/wiki\/File_locking#In_Unix-like_systems\">advisory<\/a>, and there is no guarantee that the file length won&#8217;t be changed by others.<\/p>\n<p>Speaking of file locking, in .NET 6 preview 7 we have <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/55256\">added<\/a> a possibility to disable it on Unix. This can be done by using <code>System.IO.DisableFileLocking<\/code> app context switch or <code>DOTNET_SYSTEM_IO_DISABLEFILELOCKING<\/code> environment variable.<\/p>\n<h3>WriteAsync<\/h3>\n<h4>Windows<\/h4>\n<p>After <a href=\"https:\/\/github.com\/benaadams\">@benaadams<\/a> reported <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/25905\">#25905<\/a> we have very carefully studied the profiles and <a href=\"https:\/\/docs.microsoft.com\/windows\/win32\/api\/fileapi\/nf-fileapi-writefile\">WriteFile<\/a> docs and got to the conclusion that we don&#8217;t need to extend the file before performing every async write operation. <code>WriteFile()<\/code> extends the file if needed.<\/p>\n<p>In the past, we were doing that because we were thinking that <code>SetFilePointer<\/code> (Windows sys-call used to set file position) could not be pointing to a non-existing offset (<code>offset &gt; endOfFile<\/code>). <a href=\"https:\/\/docs.microsoft.com\/windows\/win32\/api\/fileapi\/nf-fileapi-setfilepointer#remarks\">Docs<\/a> helped us to invalidate that assumption:<\/p>\n<blockquote><p>It is not an error to set a file pointer to a position beyond the end of the file. The size of the file does not increase until you call the SetEndOfFile, WriteFile, or WriteFileEx function. 
A write operation increases the size of the file to the file pointer position plus the size of the buffer written, which results in the intervening bytes uninitialized.<\/p><\/blockquote>\n<p>Some pseudocode to show the difference:<\/p>\n<pre><code class=\"language-csharp\">public class FileStream\r\n{\r\n    long _position;\r\n    SafeFileHandle _handle;\r\n\r\n    async ValueTask WriteAsyncBefore(ReadOnlyMemory&lt;byte&gt; buffer)\r\n    {\r\n        long oldEndOfFile = GetFileLength(_handle); \/\/ 1st sys-call\r\n        long newEndOfFile = _position + buffer.Length;\r\n\r\n        if (newEndOfFile &gt; oldEndOfFile) \/\/ this was true for EVERY write to an empty file\r\n        {\r\n            ExtendTheFile(_handle, newEndOfFile); \/\/ 2nd sys-call\r\n\r\n            SetFilePosition(_handle, newEndOfFile); \/\/ 3rd sys-call\r\n            _position += buffer.Length;\r\n        }\r\n\r\n        await WriteFile(_handle, buffer); \/\/ 4th sys-call\r\n    }\r\n\r\n    async ValueTask WriteAsyncAfter(ReadOnlyMemory&lt;byte&gt; buffer)\r\n    {\r\n        await WriteFile(_handle, buffer, _position); \/\/ the ONLY sys-call\r\n        _position += buffer.Length;\r\n    }\r\n}<\/code><\/pre>\n<p>Once we had achieved a single sys-call per <code>WriteAsync<\/code> call, we worked on the memory aspect. In <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/50802\">#50802<\/a> we have switched from <code>TaskCompletionSource<\/code> to <a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/understanding-the-whys-whats-and-whens-of-valuetask\/#implementing-ivaluetasksource-ivaluetasksourcetgt\">IValueTaskSource<\/a>. By doing that, we were able to get rid of the <code>Task<\/code> allocation for <code>ValueTask<\/code>-returning method overloads. In <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/51363\">#51363<\/a> we have started re-using <code>IValueTaskSource<\/code> instances and eliminated the task source allocation. 
In the very same PR we have also changed the ownership of <code>OverlappedData<\/code> (<a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/25074\">#25074<\/a>) and eliminated the remaining two most common allocations: <code>OverlappedData<\/code> and <code>ThreadPoolBoundHandleOverlapped<\/code>.<\/p>\n<p><img decoding=\"async\" class=\"alignnone\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2021\/08\/top_allocations.png\" alt=\"Visual Studio Profiler Allocations, File IO improvements in .NET 6\" width=\"732\" height=\"257\" \/><\/p>\n<p>All aforementioned changes were included in .NET 6 Preview 4. Let&#8217;s use the following benchmarks to measure the difference:<\/p>\n<pre><code class=\"language-csharp\">[Benchmark]\r\n[ArgumentsSource(nameof(AsyncArguments))]\r\npublic Task WriteAsync(long fileSize, int userBufferSize, FileOptions options)\r\n    =&gt; WriteAsync(fileSize, userBufferSize, options, streamBufferSize: FourKibibytes);\r\n\r\n[Benchmark]\r\n[ArgumentsSource(nameof(AsyncArguments_NoBuffering))]\r\npublic Task WriteAsync_NoBuffering(long fileSize, int userBufferSize, FileOptions options)\r\n    =&gt; WriteAsync(fileSize, userBufferSize, options, streamBufferSize: 1);\r\n\r\nasync Task WriteAsync(long fileSize, int userBufferSize, FileOptions options, int streamBufferSize)\r\n{\r\n    CancellationToken cancellationToken = CancellationToken.None;\r\n    Memory&lt;byte&gt; userBuffer = new Memory&lt;byte&gt;(_userBuffers[userBufferSize]);\r\n\r\n    using (FileStream fileStream = new FileStream(_destinationFilePaths[fileSize], FileMode.Create, FileAccess.Write, \r\n        FileShare.Read, streamBufferSize, options))\r\n    {\r\n        for (int i = 0; i &lt; fileSize \/ userBufferSize; i++)\r\n        {\r\n            await fileStream.WriteAsync(userBuffer, cancellationToken);\r\n        }\r\n    
}\r\n}<\/code><\/pre>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>Runtime<\/th>\n<th>fileSize<\/th>\n<th>userBufferSize<\/th>\n<th>options<\/th>\n<th style=\"text-align: right\">Mean<\/th>\n<th style=\"text-align: right\">Ratio<\/th>\n<th style=\"text-align: right\">Allocated<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1024<\/td>\n<td>1024<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">433.01 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">4,650 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1024<\/td>\n<td>1024<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">402.73 \u03bcs<\/td>\n<td style=\"text-align: right\">0.93<\/td>\n<td style=\"text-align: right\">4,689 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">9,140.81 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">41,608 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">5,762.94 \u03bcs<\/td>\n<td style=\"text-align: right\">0.63<\/td>\n<td style=\"text-align: right\">5,425 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">21,214.05 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">80,320 
B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">4,711.63 \u03bcs<\/td>\n<td style=\"text-align: right\">0.22<\/td>\n<td style=\"text-align: right\">940 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">6,866.69 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">20,416 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">2,056.75 \u03bcs<\/td>\n<td style=\"text-align: right\">0.31<\/td>\n<td style=\"text-align: right\">782 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">2,613,446.73 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">7,987,648 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">425,094.18 \u03bcs<\/td>\n<td style=\"text-align: right\">0.16<\/td>\n<td style=\"text-align: right\">2,272 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 
5.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">773,901.50 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">1,997,248 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">141,073.78 \u03bcs<\/td>\n<td style=\"text-align: right\">0.19<\/td>\n<td style=\"text-align: right\">1,832 B<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>As you can see, <a href=\"https:\/\/docs.microsoft.com\/dotnet\/api\/system.io.filestream.writeasync\">FileStream.WriteAsync()<\/a> can now be up to a few times faster!<\/p>\n<h4>Unix<\/h4>\n<p>Unix-like systems don&#8217;t expose async file IO APIs (except for the new <code>io_uring<\/code> which we talk about <a href=\"#Whats-Next\">later<\/a>). Any time a user asks <code>FileStream<\/code> to perform an async file IO operation, a synchronous IO operation is scheduled on the Thread Pool. Once it&#8217;s dequeued, the blocking operation is performed on a dedicated thread.<\/p>\n<p>In the case of <code>WriteAsync<\/code>, the Unix implementation was already performing a single sys-call per invocation. But that does not mean there was no room for other improvements! In <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/55123\">#55123<\/a> the amazing <a href=\"https:\/\/github.com\/teo-tsirpanis\">@teo-tsirpanis<\/a> has combined the concept of <a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/understanding-the-whys-whats-and-whens-of-valuetask\/#implementing-ivaluetasksource-ivaluetasksourcetgt\">IValueTaskSource<\/a> and <a href=\"https:\/\/docs.microsoft.com\/dotnet\/api\/system.threading.ithreadpoolworkitem\">IThreadPoolWorkItem<\/a> into a single type. 
By implementing <code>IThreadPoolWorkItem<\/code> interface, the type gained the possibility of queueing itself on the Thread Pool (which normally requires an allocation of a <code>ThreadPoolWorkItem<\/code>). By re-using it, <a href=\"https:\/\/github.com\/teo-tsirpanis\">@teo-tsirpanis<\/a> achieved amortized allocation-free file operations (per <code>SafeFileHandle<\/code>, when used non-concurrently). The optimization applied also to Windows implementation for synchronous file handles, but let&#8217;s focus on the Unix results:<\/p>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>Runtime<\/th>\n<th>fileSize<\/th>\n<th>userBufferSize<\/th>\n<th>options<\/th>\n<th style=\"text-align: right\">Mean<\/th>\n<th style=\"text-align: right\">Ratio<\/th>\n<th style=\"text-align: right\">Allocated<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1024<\/td>\n<td>1024<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">53.002 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">4,728 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1024<\/td>\n<td>1024<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">34.615 \u03bcs<\/td>\n<td style=\"text-align: right\">0.65<\/td>\n<td style=\"text-align: right\">4,424 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1024<\/td>\n<td>1024<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">34.020 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">4,400 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1024<\/td>\n<td>1024<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">33.699 \u03bcs<\/td>\n<td style=\"text-align: right\">0.99<\/td>\n<td 
style=\"text-align: right\">4,424 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">5,531.106 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">234,004 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">2,133.012 \u03bcs<\/td>\n<td style=\"text-align: right\">0.39<\/td>\n<td style=\"text-align: right\">5,002 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">2,447.687 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">33,211 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">2,121.449 \u03bcs<\/td>\n<td style=\"text-align: right\">0.87<\/td>\n<td style=\"text-align: right\">5,009 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">2,296.017 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">29,170 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 
6.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">1,889.585 \u03bcs<\/td>\n<td style=\"text-align: right\">0.83<\/td>\n<td style=\"text-align: right\">712 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">2,024.704 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">18,986 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">1,897.600 \u03bcs<\/td>\n<td style=\"text-align: right\">0.94<\/td>\n<td style=\"text-align: right\">712 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">1,659.638 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">7,666 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">1,519.658 \u03bcs<\/td>\n<td style=\"text-align: right\">0.92<\/td>\n<td style=\"text-align: right\">558 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td 
style=\"text-align: right\">1,634.240 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">7,698 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">1,503.478 \u03bcs<\/td>\n<td style=\"text-align: right\">0.92<\/td>\n<td style=\"text-align: right\">558 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">152,306.164 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">2,867,840 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">141,986.988 \u03bcs<\/td>\n<td style=\"text-align: right\">0.93<\/td>\n<td style=\"text-align: right\">1,256 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">149,759.617 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">1,438,392 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">147,565.278 \u03bcs<\/td>\n<td style=\"text-align: right\">0.99<\/td>\n<td style=\"text-align: right\">1,064 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td 
style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">122,106.778 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">717,584 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">114,989.749 \u03bcs<\/td>\n<td style=\"text-align: right\">0.94<\/td>\n<td style=\"text-align: right\">1,392 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">117,631.183 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">717,472 B<\/td>\n<\/tr>\n<tr>\n<td>WriteAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">113,781.015 \u03bcs<\/td>\n<td style=\"text-align: right\">0.97<\/td>\n<td style=\"text-align: right\">1,392 B<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>In the benchmark where we take advantage of <code>FileStream<\/code> buffering (<code>fileSize==1048576 &amp;&amp; userBufferSize==512<\/code>) we can observe some additional memory allocation improvements that come from <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/56095\">#56095<\/a> where we started to pool the async method builder by just annotating async methods with <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/49903\">PoolingAsyncValueTaskMethodBuilder<\/a> attributes that have been introduced in .NET 6.<\/p>\n<pre><code 
class=\"language-csharp\">[AsyncMethodBuilder(typeof(PoolingAsyncValueTaskMethodBuilder&lt;&gt;))]\r\nasync ValueTask&lt;int&gt; ReadAsync(Task semaphoreLockTask, Memory&lt;byte&gt; buffer, CancellationToken cancellationToken)\r\n\r\n[AsyncMethodBuilder(typeof(PoolingAsyncValueTaskMethodBuilder))]\r\nasync ValueTask WriteAsync(Task semaphoreLockTask, ReadOnlyMemory&lt;byte&gt; source, CancellationToken cancellationToken)<\/code><\/pre>\n<h3>ReadAsync<\/h3>\n<h4>Windows<\/h4>\n<p>Initially, <a href=\"https:\/\/docs.microsoft.com\/dotnet\/api\/system.io.filestream.readasync\">FileStream.ReadAsync()<\/a> benefited a lot from file length caching and the lack of file offset synchronization (.NET 6 Preview 4). But since the length can&#8217;t be cached for files opened with <code>FileAccess.ReadWrite<\/code> or <code>FileShare.Write<\/code>, we decided to also limit it to a single sys-call (<code>ReadFile<\/code>). After <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/56531\">#56531<\/a> got merged (.NET 6 Preview 7), <code>ReadAsync<\/code> ensures that the position is correct after the operation finishes, without fetching the file length before the read operation starts. 
Some pseudocode:<\/p>\n<pre><code class=\"language-csharp\">public class FileStream\r\n{\r\n    long _position;\r\n    SafeFileHandle _handle;\r\n\r\n    async ValueTask&lt;int&gt; ReadAsyncBefore(Memory&lt;byte&gt; buffer)\r\n    {\r\n        long fileOffset = _position;\r\n        long endOfFile = GetFileLength(_handle); \/\/ 1st sys-call\r\n\r\n        if (fileOffset + buffer.Length &gt; endOfFile) \/\/ read beyond EOF\r\n        {\r\n            buffer = buffer.Slice(0, (int)(endOfFile - fileOffset));\r\n        }\r\n\r\n        _position = SetFilePosition(_handle, fileOffset + buffer.Length); \/\/ 2nd sys-call\r\n\r\n        return await ReadFile(_handle, buffer); \/\/ 3rd sys-call\r\n    }\r\n\r\n    async ValueTask&lt;int&gt; ReadAsyncAfter(Memory&lt;byte&gt; buffer)\r\n    {\r\n        int bytesRead = await ReadFile(_handle, buffer, _position); \/\/ the ONLY sys-call\r\n        _position += bytesRead;\r\n\r\n        return bytesRead;\r\n    }\r\n}<\/code><\/pre>\n<p>The reduced number of sys-calls and memory allocations (which were exactly the same as for <code>WriteAsync<\/code> described above) has clearly paid off for <a href=\"https:\/\/docs.microsoft.com\/dotnet\/api\/system.io.filestream.readasync\">FileStream.ReadAsync()<\/a>, which is now <strong>up to a few times faster<\/strong>, depending on the file size, the user buffer size and the <code>FileOptions<\/code> used for the creation of <code>FileStream<\/code>:<\/p>\n<pre><code class=\"language-csharp\">[Benchmark]\r\n[ArgumentsSource(nameof(AsyncArguments))]\r\npublic Task&lt;long&gt; ReadAsync(long fileSize, int userBufferSize, FileOptions options)\r\n    =&gt; ReadAsync(fileSize, userBufferSize, options, streamBufferSize: FourKibibytes);\r\n\r\n[Benchmark]\r\n[ArgumentsSource(nameof(AsyncArguments_NoBuffering))]\r\npublic Task&lt;long&gt; ReadAsync_NoBuffering(long fileSize, int userBufferSize, FileOptions options)\r\n    =&gt; ReadAsync(fileSize, userBufferSize, options, streamBufferSize: 1);\r\n\r\nasync 
Task&lt;long&gt; ReadAsync(long fileSize, int userBufferSize, FileOptions options, int streamBufferSize)\r\n{\r\n    CancellationToken cancellationToken = CancellationToken.None;\r\n    Memory&lt;byte&gt; userBuffer = new Memory&lt;byte&gt;(_userBuffers[userBufferSize]);\r\n    long bytesRead = 0;\r\n    using (FileStream fileStream = new FileStream(_sourceFilePaths[fileSize], FileMode.Open, FileAccess.Read, FileShare.Read, streamBufferSize, options))\r\n    {\r\n        while (bytesRead &lt; fileSize)\r\n        {\r\n            bytesRead += await fileStream.ReadAsync(userBuffer, cancellationToken);\r\n        }\r\n    }\r\n\r\n    return bytesRead;\r\n}<\/code><\/pre>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>Runtime<\/th>\n<th>fileSize<\/th>\n<th>userBufferSize<\/th>\n<th>options<\/th>\n<th style=\"text-align: right\">Mean<\/th>\n<th style=\"text-align: right\">Ratio<\/th>\n<th style=\"text-align: right\">Allocated<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">5,163.71 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">41,479 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">3,406.73 \u03bcs<\/td>\n<td style=\"text-align: right\">0.66<\/td>\n<td style=\"text-align: right\">5,233 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">6,575.26 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">80,320 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 
6.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">2,873.59 \u03bcs<\/td>\n<td style=\"text-align: right\">0.44<\/td>\n<td style=\"text-align: right\">936 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">1,915.17 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">20,420 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">856.61 \u03bcs<\/td>\n<td style=\"text-align: right\">0.45<\/td>\n<td style=\"text-align: right\">782 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">714,699.30 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">7,987,648 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">297,675.86 \u03bcs<\/td>\n<td style=\"text-align: right\">0.42<\/td>\n<td style=\"text-align: right\">2,272 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 
5.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">192,485.40 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">1,997,248 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">93,350.07 \u03bcs<\/td>\n<td style=\"text-align: right\">0.49<\/td>\n<td style=\"text-align: right\">1,040 B<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><strong>Note:<\/strong> As you can see, the size of the user buffer (the <code>Memory&lt;byte&gt;<\/code> passed to <code>FileStream.ReadAsync<\/code>) has a great impact on the total execution time. On .NET 6, reading a 1 MB file took 3,406.73 \u03bcs on average with a 512-byte buffer, 2,873.59 \u03bcs with a 4 kB buffer and 856.61 \u03bcs with a 16 kB buffer. By using the right buffer size, we can speed up the read operation by more than 3x! Would you be interested in reading <em>.NET File IO performance guidelines<\/em>? 
(We want to know it before we invest our time in writing them).<\/p>\n<h4>Unix<\/h4>\n<p><code>ReadAsync<\/code> implementation has benefited from the optimizations described for <code>WriteAsync<\/code> Unix implementation:<\/p>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>Runtime<\/th>\n<th>fileSize<\/th>\n<th>userBufferSize<\/th>\n<th>options<\/th>\n<th style=\"text-align: right\">Mean<\/th>\n<th style=\"text-align: right\">Ratio<\/th>\n<th style=\"text-align: right\">Allocated<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">3,550.898 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">233,997 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">674.037 \u03bcs<\/td>\n<td style=\"text-align: right\">0.19<\/td>\n<td style=\"text-align: right\">5,019 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">744.525 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">35,369 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">663.037 \u03bcs<\/td>\n<td style=\"text-align: right\">0.91<\/td>\n<td style=\"text-align: right\">5,019 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 
5.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">537.004 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">29,169 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">375.843 \u03bcs<\/td>\n<td style=\"text-align: right\">0.72<\/td>\n<td style=\"text-align: right\">706 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">499.676 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">31,249 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">398.217 \u03bcs<\/td>\n<td style=\"text-align: right\">0.81<\/td>\n<td style=\"text-align: right\">706 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">187.578 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">7,664 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">154.951 \u03bcs<\/td>\n<td style=\"text-align: right\">0.83<\/td>\n<td style=\"text-align: right\">553 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: 
right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">189.687 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">8,208 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">158.541 \u03bcs<\/td>\n<td style=\"text-align: right\">0.84<\/td>\n<td style=\"text-align: right\">553 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">49,196.600 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">2,867,768 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">41,890.758 \u03bcs<\/td>\n<td style=\"text-align: right\">0.85<\/td>\n<td style=\"text-align: right\">1,124 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">48,793.215 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">3,072,600 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">42,725.572 \u03bcs<\/td>\n<td 
style=\"text-align: right\">0.88<\/td>\n<td style=\"text-align: right\">1,124 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">23,819.030 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">717,354 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>None<\/td>\n<td style=\"text-align: right\">18,961.480 \u03bcs<\/td>\n<td style=\"text-align: right\">0.80<\/td>\n<td style=\"text-align: right\">644 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">21,595.085 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">768,557 B<\/td>\n<\/tr>\n<tr>\n<td>ReadAsync_NoBuffering<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>16384<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">18,861.580 \u03bcs<\/td>\n<td style=\"text-align: right\">0.87<\/td>\n<td style=\"text-align: right\">668 B<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Read &amp; Write<\/h3>\n<p><a href=\"https:\/\/docs.microsoft.com\/dotnet\/api\/system.io.filestream.read\">FileStream.Read()<\/a> and <a href=\"https:\/\/docs.microsoft.com\/dotnet\/api\/system.io.filestream.write\">FileStream.Write()<\/a> implementation for files opened for synchronous IO was already optimal for all OSes. 
But that was not true for files opened for async file IO on Windows.<\/p>\n<pre><code class=\"language-csharp\">[Benchmark]\r\n[ArgumentsSource(nameof(SyncArguments))]\r\npublic long Read(long fileSize, int userBufferSize, FileOptions options)\r\n    =&gt; Read(fileSize, userBufferSize, options, streamBufferSize: FourKibibytes);\r\n\r\nprivate long Read(long fileSize, int userBufferSize, FileOptions options, int streamBufferSize)\r\n{\r\n    byte[] userBuffer = _userBuffers[userBufferSize];\r\n    long bytesRead = 0;\r\n    using (FileStream fileStream = new FileStream(\r\n        _sourceFilePaths[fileSize], FileMode.Open, FileAccess.Read, FileShare.Read, streamBufferSize, options))\r\n    {\r\n        while (bytesRead &lt; fileSize)\r\n        {\r\n            bytesRead += fileStream.Read(userBuffer, 0, userBuffer.Length);\r\n        }\r\n    }\r\n\r\n    return bytesRead;\r\n}\r\n\r\n[Benchmark]\r\n[ArgumentsSource(nameof(SyncArguments))]\r\npublic void Write(long fileSize, int userBufferSize, FileOptions options)\r\n    =&gt; Write(fileSize, userBufferSize, options, streamBufferSize: FourKibibytes);\r\n\r\nprivate void Write(long fileSize, int userBufferSize, FileOptions options, int streamBufferSize)\r\n{\r\n    byte[] userBuffer = _userBuffers[userBufferSize];\r\n    using (FileStream fileStream = new FileStream(_destinationFilePaths[fileSize], FileMode.Create, FileAccess.Write, FileShare.Read, streamBufferSize, options))\r\n    {\r\n        for (int i = 0; i &lt; fileSize \/ userBufferSize; i++)\r\n        {\r\n            fileStream.Write(userBuffer, 0, userBuffer.Length);\r\n        }\r\n    }\r\n}<\/code><\/pre>\n<h4>Windows<\/h4>\n<p>Prior to .NET 6 Preview 6, sync file operations on async file handles simply started an async file operation by scheduling a new work item to the Thread Pool and blocked the current thread until the work was finished. 
Since we did not know whether the given span pointed to stack-allocated memory, <code>FileStream<\/code> was also <a href=\"https:\/\/github.com\/dotnet\/runtime\/blob\/3d5d26181cd7bc07ea6c6f87710c14ccd043b415\/src\/libraries\/System.Private.CoreLib\/src\/System\/IO\/Stream.cs#L317-L330\">performing a copy<\/a> of the given memory buffer to a managed array rented from <code>ArrayPool<\/code>.<\/p>\n<p>This was changed in .NET 6 Preview 6 (<a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/54266\">#54266<\/a>). Now sync operations on a <code>FileStream<\/code> opened for async file IO on Windows just allocate a dedicated <a href=\"https:\/\/docs.microsoft.com\/dotnet\/standard\/threading\/eventwaithandle\">wait handle<\/a>, start an async operation using the <code>ReadFile<\/code> or <code>WriteFile<\/code> sys-call and wait for the completion to be signaled by the OS.<\/p>\n<p>Some pseudocode:<\/p>\n<pre><code class=\"language-csharp\">public class FileStream\r\n{\r\n    long _position;\r\n    SafeFileHandle _handle;\r\n\r\n    int ReadBefore(Span&lt;byte&gt; buffer)\r\n    {\r\n        if (_handle.IsAsync)\r\n        {\r\n            byte[] managed = ArrayPool&lt;byte&gt;.Shared.Rent(buffer.Length);\r\n\r\n            int bytesRead = ReadAsync(managed).GetAwaiter().GetResult();\r\n            managed.AsSpan(0, bytesRead).CopyTo(buffer); \/\/ extra copy back to the caller's span\r\n\r\n            ArrayPool&lt;byte&gt;.Shared.Return(managed);\r\n\r\n            return bytesRead;\r\n        }\r\n    }\r\n\r\n    int ReadAfter(Span&lt;byte&gt; buffer)\r\n    {\r\n        if (_handle.IsAsync)\r\n        {\r\n            var waitHandle = new EventWaitHandle(initialState: false, EventResetMode.ManualReset);\r\n            ReadFile(_handle, buffer, _position, waitHandle);\r\n\r\n            waitHandle.WaitOne();\r\n\r\n            int bytesRead = GetOverlappedResult(_handle);\r\n\r\n            return bytesRead;\r\n        }\r\n    }\r\n}<\/code><\/pre>\n<p>And again the reads and writes became up to a few times 
faster!<\/p>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>Runtime<\/th>\n<th>fileSize<\/th>\n<th>userBufferSize<\/th>\n<th>options<\/th>\n<th style=\"text-align: right\">Mean<\/th>\n<th style=\"text-align: right\">Ratio<\/th>\n<th style=\"text-align: right\">Allocated<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Read<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">5,138.52 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">41,435 B<\/td>\n<\/tr>\n<tr>\n<td>Read<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">2,333.87 \u03bcs<\/td>\n<td style=\"text-align: right\">0.45<\/td>\n<td style=\"text-align: right\">59,687 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>Write<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">7,122.63 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">41,421 B<\/td>\n<\/tr>\n<tr>\n<td>Write<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>512<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">4,163.27 \u03bcs<\/td>\n<td style=\"text-align: right\">0.59<\/td>\n<td style=\"text-align: right\">59,692 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>Read<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">5,260.75 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">80,075 
B<\/td>\n<\/tr>\n<tr>\n<td>Read<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">2,113.04 \u03bcs<\/td>\n<td style=\"text-align: right\">0.40<\/td>\n<td style=\"text-align: right\">55,567 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>Write<\/td>\n<td>.NET 5.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">20,534.30 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">80,093 B<\/td>\n<\/tr>\n<tr>\n<td>Write<\/td>\n<td>.NET 6.0<\/td>\n<td>1048576<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">3,788.95 \u03bcs<\/td>\n<td style=\"text-align: right\">0.19<\/td>\n<td style=\"text-align: right\">55,572 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>Read<\/td>\n<td>.NET 5.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">537,752.97 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">7,990,536 B<\/td>\n<\/tr>\n<tr>\n<td>Read<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">232,123.57 \u03bcs<\/td>\n<td style=\"text-align: right\">0.43<\/td>\n<td style=\"text-align: right\">5,530,632 B<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<td style=\"text-align: right\"><\/td>\n<\/tr>\n<tr>\n<td>Write<\/td>\n<td>.NET 
5.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">2,486,838.27 \u03bcs<\/td>\n<td style=\"text-align: right\">1.00<\/td>\n<td style=\"text-align: right\">8,003,016 B<\/td>\n<\/tr>\n<tr>\n<td>Write<\/td>\n<td>.NET 6.0<\/td>\n<td>104857600<\/td>\n<td>4096<\/td>\n<td>Asynchronous<\/td>\n<td style=\"text-align: right\">328,680.68 \u03bcs<\/td>\n<td style=\"text-align: right\">0.13<\/td>\n<td style=\"text-align: right\">5,530,632 B<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h4>Unix<\/h4>\n<p>Unix has no separation for async and sync file handles (the <code>O_ASYNC<\/code> flag passed to <a href=\"https:\/\/man7.org\/linux\/man-pages\/man2\/open.2.html\">open()<\/a> has no effect for regular files as of today) so we could not apply a similar optimization to Unix implementation.<\/p>\n<h2>Thread-Safe File IO<\/h2>\n<p>We recognized the need for <strong>thread-safe File IO<\/strong>. To make this possible, stateless and offset-based <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/24847#issuecomment-848063507\">APIs<\/a> have been introduced in <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/53669\">#53669<\/a> which was part of .NET 6 Preview 7:<\/p>\n<pre><code class=\"language-csharp\">namespace System.IO\r\n{\r\n    public static class RandomAccess\r\n    {\r\n        public static int Read(SafeFileHandle handle, Span&lt;byte&gt; buffer, long fileOffset);\r\n        public static void Write(SafeFileHandle handle, ReadOnlySpan&lt;byte&gt; buffer, long fileOffset);\r\n\r\n        public static ValueTask&lt;int&gt; ReadAsync(SafeFileHandle handle, Memory&lt;byte&gt; buffer, long fileOffset, CancellationToken cancellationToken = default);\r\n        public static ValueTask WriteAsync(SafeFileHandle handle, ReadOnlyMemory&lt;byte&gt; buffer, long fileOffset, CancellationToken cancellationToken = default);\r\n\r\n        public static long GetLength(SafeFileHandle handle);\r\n    }\r\n\r\n    partial class 
File\r\n    {\r\n        public static SafeFileHandle OpenHandle(string filePath, FileMode mode = FileMode.Open, FileAccess access = FileAccess.Read, \r\n            FileShare share = FileShare.Read, FileOptions options = FileOptions.None, long preallocationSize = 0);\r\n    }\r\n}<\/code><\/pre>\n<p>By always requesting the file offset, we can use offset-based sys-calls (<a href=\"https:\/\/man7.org\/linux\/man-pages\/man2\/pwrite.2.html\">pread()|pwrite()<\/a> on Unix, <a href=\"https:\/\/docs.microsoft.com\/windows\/win32\/api\/fileapi\/nf-fileapi-readfile\">ReadFile()|WriteFile()<\/a> on Windows) that don&#8217;t modify the current offset for the given file handle. It allows for <strong>thread-safe<\/strong> reads and writes.<\/p>\n<p>Sample usage:<\/p>\n<pre><code class=\"language-csharp\">async Task ThreadSafeAsync(string path, IReadOnlyList&lt;ReadOnlyMemory&lt;byte&gt;&gt; buffers)\r\n{\r\n    using SafeFileHandle handle = File.OpenHandle( \/\/ new API (preview 6)\r\n        path, FileMode.Create, FileAccess.Write, FileShare.None, FileOptions.Asynchronous); \r\n\r\n    long offset = 0;\r\n    for (int i = 0; i &lt; buffers.Count; i++)\r\n    {\r\n        await RandomAccess.WriteAsync(handle, buffers[i], offset); \/\/ new API (preview 7)\r\n        offset += buffers[i].Length;\r\n    }\r\n}<\/code><\/pre>\n<p><strong>Note:<\/strong> the new APIs don&#8217;t support Pipes and Sockets, as they don&#8217;t have the concept of Offset (Position).<\/p>\n<h2>Scatter\/Gather IO<\/h2>\n<p><a href=\"https:\/\/en.wikipedia.org\/wiki\/Vectored_I\/O\">Scatter\/Gather IO<\/a> allows reducing the number of expensive sys-calls by passing multiple buffers in a single sys-call. 
This is another high-performance feature that has been implemented for .NET 6 Preview <strong>7<\/strong>:<\/p>\n<pre><code class=\"language-csharp\">namespace System.IO\r\n{\r\n    public static class RandomAccess\r\n    {\r\n        public static long Read(SafeFileHandle handle, IReadOnlyList&lt;Memory&lt;byte&gt;&gt; buffers, long fileOffset);\r\n        public static void Write(SafeFileHandle handle, IReadOnlyList&lt;ReadOnlyMemory&lt;byte&gt;&gt; buffers, long fileOffset);\r\n\r\n        public static ValueTask&lt;long&gt; ReadAsync(SafeFileHandle handle, IReadOnlyList&lt;Memory&lt;byte&gt;&gt; buffers, long fileOffset, CancellationToken cancellationToken = default);\r\n        public static ValueTask WriteAsync(SafeFileHandle handle, IReadOnlyList&lt;ReadOnlyMemory&lt;byte&gt;&gt; buffers, long fileOffset, CancellationToken cancellationToken = default);\r\n    }\r\n}<\/code><\/pre>\n<p>These methods map to the following sys-calls:<\/p>\n<ul>\n<li>Unix: if possible <a href=\"https:\/\/man7.org\/linux\/man-pages\/man2\/readv.2.html\">preadv()|pwritev()<\/a>, otherwise <code>n<\/code> calls to <code>pread|pwrite<\/code>.<\/li>\n<li>Windows: if possible <a href=\"https:\/\/docs.microsoft.com\/windows\/win32\/api\/fileapi\/nf-fileapi-readfilescatter\">ReadFileScatter()|WriteFileGather()<\/a>, otherwise <code>n<\/code> calls to <code>ReadFile|WriteFile<\/code>.<\/li>\n<\/ul>\n<pre><code class=\"language-csharp\">async Task OptimalSysCallsAsync(string path, IReadOnlyList&lt;ReadOnlyMemory&lt;byte&gt;&gt; buffers)\r\n{\r\n    using SafeFileHandle handle = File.OpenHandle(\r\n        path, FileMode.Create, FileAccess.Write, FileShare.None, FileOptions.Asynchronous); \r\n\r\n    await RandomAccess.WriteAsync(handle, buffers, fileOffset: 0); \/\/ new API (preview 7)\r\n}<\/code><\/pre>\n<h3>Benchmark<\/h3>\n<p>How does performing fewer sys-calls affect performance?<\/p>\n<pre><code class=\"language-csharp\">const int FileSize = 100_000_000;\r\nstring _filePath = 
Path.Combine(Path.GetTempPath(), Path.GetTempFileName());\r\nbyte[] _buffer = new byte[16000];\r\n\r\n[Benchmark]\r\npublic void Write()\r\n{\r\n    byte[] userBuffer = _buffer;\r\n    using SafeFileHandle fileHandle = File.OpenHandle(_filePath, FileMode.Create, FileAccess.Write, FileShare.Read, FileOptions.DeleteOnClose);\r\n\r\n    long bytesWritten = 0;\r\n    for (int i = 0; i &lt; FileSize \/ userBuffer.Length; i++)\r\n    {\r\n        RandomAccess.Write(fileHandle, userBuffer, bytesWritten);\r\n        bytesWritten += userBuffer.Length;\r\n    }\r\n}\r\n\r\n[Benchmark]\r\npublic void WriteGather()\r\n{\r\n    byte[] userBuffer = _buffer;\r\n    IReadOnlyList&lt;ReadOnlyMemory&lt;byte&gt;&gt; buffers = new ReadOnlyMemory&lt;byte&gt;[] { _buffer, _buffer, _buffer, _buffer };\r\n    using SafeFileHandle fileHandle = File.OpenHandle(_filePath, FileMode.Create, FileAccess.Write, FileShare.Read, FileOptions.DeleteOnClose);\r\n\r\n    long bytesWritten = 0;\r\n    for (int i = 0; i &lt; FileSize \/ (userBuffer.Length * 4); i++)\r\n    {\r\n        RandomAccess.Write(fileHandle, buffers, bytesWritten);\r\n        bytesWritten += userBuffer.Length * 4;\r\n    }\r\n}<\/code><\/pre>\n<h4>Windows<\/h4>\n<p>The very strict <a href=\"https:\/\/docs.microsoft.com\/windows\/win32\/api\/fileapi\/nf-fileapi-writefilegather#parameters\">WriteFileGather()<\/a> requirements were not met, and we have not observed any gains.<\/p>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th style=\"text-align: right\">Median<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Write<\/td>\n<td style=\"text-align: right\">78.48 ms<\/td>\n<\/tr>\n<tr>\n<td>WriteGather<\/td>\n<td style=\"text-align: right\">77.88 ms<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h4>Ubuntu<\/h4>\n<p>We have observed a gain of about 7% due to fewer sys-calls:<\/p>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th style=\"text-align: right\">Median<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Write<\/td>\n<td style=\"text-align: right\">96.62 
ms<\/td>\n<\/tr>\n<tr>\n<td>WriteGather<\/td>\n<td style=\"text-align: right\">89.87 ms<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Preallocation Size<\/h2>\n<p>We have <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/51111\/\">implemented<\/a> one more performance and reliability feature that allows users to specify the file preallocation size: <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/45946\">#45946<\/a>.<\/p>\n<p>When <code>PreallocationSize<\/code> is specified, .NET asks the OS to ensure that disk space of the given size is allocated in advance. From a performance perspective, the write operations don&#8217;t need to extend the file and it&#8217;s less likely that the file is going to be fragmented. From a reliability perspective, write operations will no longer fail due to running out of space since the space has already been reserved.<\/p>\n<p>On Unix, it&#8217;s mapped to <a href=\"https:\/\/man7.org\/linux\/man-pages\/man3\/posix_fallocate.3.html\">posix_fallocate()<\/a> or <a href=\"https:\/\/developer.apple.com\/library\/archive\/documentation\/System\/Conceptual\/ManPages_iPhoneOS\/man2\/fcntl.2.html\">fcntl(F_PREALLOCATE)<\/a> or fcntl(F_ALLOCSP). On Windows, it&#8217;s mapped to <a href=\"https:\/\/docs.microsoft.com\/windows\/win32\/api\/fileapi\/nf-fileapi-setfileinformationbyhandle\">SetFileInformationByHandle(FILE_ALLOCATION_INFO)<\/a>.<\/p>\n<pre><code class=\"language-csharp\">async Task AllOrNothingAsync(string path, IReadOnlyList&lt;ReadOnlyMemory&lt;byte&gt;&gt; buffers)\r\n{\r\n    using SafeFileHandle handle = File.OpenHandle(\r\n        path, FileMode.Create, FileAccess.Write, FileShare.None, FileOptions.Asynchronous,\r\n        preallocationSize: buffers.Sum(buffer =&gt; buffer.Length)); \/\/ new API (preview 6)\r\n\r\n    await RandomAccess.WriteAsync(handle, buffers, fileOffset: 0); \r\n}<\/code><\/pre>\n<p>If there is not enough disk space, or the file is too large for the given filesystem, an <code>IOException<\/code> is thrown. 
For Pipes and Sockets, the <code>preallocationSize<\/code> is ignored.<\/p>\n<p>The example above uses the new <code>File.OpenHandle<\/code> API, but it&#8217;s also supported by <code>FileStream<\/code>.<\/p>\n<h3>Benchmark<\/h3>\n<p>Let&#8217;s measure how specifying the <code>preallocationSize<\/code> affects writing to a <code>100 MB<\/code> file using a <code>16 KB<\/code> buffer:<\/p>\n<pre><code class=\"language-csharp\">const int FileSize = 100_000_000;\r\nstring _filePath = Path.Combine(Path.GetTempPath(), Path.GetTempFileName());\r\nbyte[] _buffer = new byte[16000];\r\n\r\n[Benchmark]\r\n[Arguments(true)]\r\n[Arguments(false)]\r\npublic void PreallocationSize(bool specifyPreallocationSize)\r\n{\r\n    byte[] userBuffer = _buffer;\r\n    using SafeFileHandle fileHandle = File.OpenHandle(_filePath, FileMode.Create, FileAccess.Write, FileShare.Read, FileOptions.DeleteOnClose, \r\n        specifyPreallocationSize ? FileSize : 0); \/\/ the difference\r\n\r\n    long bytesWritten = 0;\r\n    for (int i = 0; i &lt; FileSize \/ userBuffer.Length; i++)\r\n    {\r\n        RandomAccess.Write(fileHandle, userBuffer, bytesWritten);\r\n        bytesWritten += userBuffer.Length;\r\n    }\r\n}<\/code><\/pre>\n<h4>Windows<\/h4>\n<p>We can observe a very nice performance gain (around 20% in this case) and a flatter distribution.<\/p>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>specifyPreallocationSize<\/th>\n<th style=\"text-align: right\">Median<\/th>\n<th style=\"text-align: right\">Min<\/th>\n<th style=\"text-align: right\">Max<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>PreallocationSize<\/td>\n<td>False<\/td>\n<td style=\"text-align: right\">77.07 ms<\/td>\n<td style=\"text-align: right\">75.78 ms<\/td>\n<td style=\"text-align: right\">91.90 ms<\/td>\n<\/tr>\n<tr>\n<td>PreallocationSize<\/td>\n<td>True<\/td>\n<td style=\"text-align: right\">61.93 ms<\/td>\n<td style=\"text-align: right\">61.46 ms<\/td>\n<td style=\"text-align: right\">63.86 
ms<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h4>Ubuntu<\/h4>\n<p>In the case of Ubuntu, we can observe an even more impressive perf win (more than 50% in this case):<\/p>\n<table>\n<thead>\n<tr>\n<th>Method<\/th>\n<th>specifyPreallocationSize<\/th>\n<th style=\"text-align: right\">Median<\/th>\n<th style=\"text-align: right\">Min<\/th>\n<th style=\"text-align: right\">Max<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>PreallocationSize<\/td>\n<td>False<\/td>\n<td style=\"text-align: right\">96.65 ms<\/td>\n<td style=\"text-align: right\">95.41 ms<\/td>\n<td style=\"text-align: right\">101.78 ms<\/td>\n<\/tr>\n<tr>\n<td>PreallocationSize<\/td>\n<td>True<\/td>\n<td style=\"text-align: right\">43.70 ms<\/td>\n<td style=\"text-align: right\">43.07 ms<\/td>\n<td style=\"text-align: right\">51.50 ms<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>FileStreamOptions<\/h2>\n<p>To improve the user experience of creating new <code>FileStream<\/code> instances, we are introducing a new type called <code>FileStreamOptions<\/code> which is an implementation of the <a href=\"https:\/\/docs.microsoft.com\/dotnet\/core\/extensions\/options\">Options pattern<\/a>. 
It&#8217;s part of <strong>.NET 6.0 Preview 5<\/strong>.<\/p>\n<pre><code class=\"language-csharp\">namespace System.IO\r\n{\r\n    public sealed class FileStreamOptions\r\n    {\r\n        public FileStreamOptions() {}\r\n        public FileMode Mode { get; set; }\r\n        public FileAccess Access { get; set; } = FileAccess.Read;\r\n        public FileShare Share { get; set; } = FileShare.Read;\r\n        public FileOptions Options { get; set; }\r\n        public int BufferSize { get; set; } = 4096;\r\n        public long PreallocationSize { get; set; }\r\n    }\r\n\r\n    public class FileStream : Stream\r\n    {\r\n        public FileStream(string path, FileStreamOptions options);\r\n    }\r\n}<\/code><\/pre>\n<p>Sample usage:<\/p>\n<pre><code class=\"language-csharp\">var openForReading = new FileStreamOptions { Mode = FileMode.Open };\r\nusing FileStream source = new FileStream(\"source.txt\", openForReading);\r\n\r\nvar createForWriting = new FileStreamOptions\r\n{\r\n    Mode = FileMode.CreateNew,\r\n    Access = FileAccess.Write,\r\n    Options = FileOptions.WriteThrough,\r\n    BufferSize = 0, \/\/ disable FileStream buffering\r\n    PreallocationSize = source.Length \/\/ specify size up-front\r\n};\r\nusing FileStream destination = new FileStream(\"destination.txt\", createForWriting);\r\nsource.CopyTo(destination);<\/code><\/pre>\n<p>Moreover, we have <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/52587\">hidden<\/a> the <code>[Obsolete]<\/code> <code>FileStream<\/code> constructors to exclude them from IntelliSense.<\/p>\n<h2>Breaking changes<\/h2>\n<h3>Synchronization of async operations when buffering is enabled<\/h3>\n<p>On Windows, for <code>FileStream<\/code> opened for asynchronous IO with buffering enabled, calls to <code>ReadAsync()<\/code> (<a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/16341\">#16341<\/a>) and <code>FlushAsync()<\/code> (<a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/27643\">#27643<\/a>) were 
performing <strong>blocking<\/strong> calls when they were filling or flushing the buffer.<\/p>\n<p>In order to fix that, we have introduced synchronization in <a href=\"https:\/\/github.com\/dotnet\/runtime\/pull\/48813\">#48813<\/a>. <strong>When buffering is enabled, all async operations are serialized<\/strong>.<\/p>\n<p>This introduced the first breaking change: <code>FileStream.Position<\/code> is now updated <strong>after<\/strong> the asynchronous operation completes, not before the operation is started. <code>FileStream<\/code> has never been thread-safe, but for those of you who were starting multiple asynchronous operations and not awaiting them, you won&#8217;t observe an updated <code>Position<\/code> before awaiting the operations (the order of the operations is not going to change). The recommended approach is to use the new APIs for Scatter\/Gather IO.<\/p>\n<p>Since this is an anti-pattern, and should be rare, we won&#8217;t go into details, but you can read more about it in <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/50858\">#50858<\/a>.<\/p>\n<h3>Tracking file offset only in memory<\/h3>\n<p>None of the <code>FileStream.Read*()<\/code> and <code>FileStream.Write*()<\/code> operations synchronize the offset with the OS anymore. The current offset can be obtained <strong>only<\/strong> with a call to <code>FileStream.Position<\/code> or <code>FileStream.Seek(0, SeekOrigin.Current)<\/code>. If you obtain the handle by calling <code>FileStream.SafeFileHandle<\/code>, and ask the OS for the current offset for the given handle by using the <code>SetFilePointerEx<\/code> or <code>lseek<\/code> sys-call, it won&#8217;t always return the same value as <code>FileStream.Position<\/code>. It works in the other direction as well: if you obtain the <code>FileStream.SafeFileHandle<\/code> and use a sys-call that modifies the offset, <code>FileStream.Position<\/code> won&#8217;t reflect the change. 
Since we believe that this is a very niche scenario, we won&#8217;t describe the breaking change in detail here. For more details, please refer to <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/50860\">#50860<\/a>.<\/p>\n<h3>.NET 5 compatibility mode<\/h3>\n<p>You can request .NET 5 compatibility mode in <code>runtimeconfig.json<\/code>:<\/p>\n<pre><code class=\"language-json\">{\r\n    \"configProperties\": {\r\n        \"System.IO.UseNet5CompatFileStream\": true\r\n    }\r\n}<\/code><\/pre>\n<p>Or using the following environment variable:<\/p>\n<pre><code class=\"language-cmd\">set DOTNET_SYSTEM_IO_USENET5COMPATFILESTREAM=1<\/code><\/pre>\n<p>This mode is going to be <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/55196\">removed<\/a> in .NET 7. Please let us know if there is any scenario that can&#8217;t be implemented in .NET 6 without using the .NET 5 compatibility mode.<\/p>\n<h2>What&#8217;s Next?<\/h2>\n<p>We have already started working on adding support for a new <code>FileMode<\/code> that is going to allow for atomic appends to the end of a file: <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/53432#issuecomment-902478772\">#53432<\/a>.<\/p>\n<p>We are considering adding <code>io_uring<\/code> support as part of our .NET 7 planning. If you would benefit from <code>FileStream<\/code> performance improvements on 5.5+ kernels, please share your scenario on the <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/51985\">dotnet\/runtime#51985<\/a> issue to help us prioritize that effort.<\/p>\n<h2>Summary<\/h2>\n<p>In .NET 6, we&#8217;ve made several improvements to file IO:<\/p>\n<ul>\n<li>Async file IO can now be up to a few times faster and allocation-free.<\/li>\n<li>Async file IO on Windows is not using blocking APIs anymore.<\/li>\n<li>New stateless and offset-based APIs for thread-safe file IO have been introduced. 
Some overloads accept multiple buffers at a time, which reduces the number of sys-calls.<\/li>\n<li>New APIs for specifying file preallocation size have been introduced. Both performance and reliability can be improved by using them.<\/li>\n<li><code>FileStream.Position<\/code> is not synchronized with the OS anymore (it&#8217;s tracked only in memory).<\/li>\n<li><code>FileStream.Position<\/code> is updated after the async operation has completed, not before it was started.<\/li>\n<li>Users can request .NET 5 compatibility mode using a configuration file or an environment variable.<\/li>\n<li><code>FileStream<\/code> behavior for edge cases has been aligned for both Windows and Unix.<\/li>\n<\/ul>\n<p>Let us know what you think!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>High-performance File IO<\/p>\n","protected":false},"author":60716,"featured_media":34140,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[685,3012,3009],"tags":[],"class_list":["post-34139","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-dotnet","category-internals","category-performance"],"acf":[],"blog_post_summary":"<p>High-performance File 
IO<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/34139","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/users\/60716"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/comments?post=34139"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/34139\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media\/34140"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media?parent=34139"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/categories?post=34139"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/tags?post=34139"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}