{"id":44863,"date":"2015-01-21T07:00:00","date_gmt":"2015-01-21T22:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2015\/01\/21\/why-does-my-synchronous-overlapped-readfile-return-false-when-the-end-of-the-file-is-reached\/"},"modified":"2019-03-13T12:12:14","modified_gmt":"2019-03-13T19:12:14","slug":"20150121-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20150121-00\/?p=44863","title":{"rendered":"Why does my synchronous overlapped ReadFile return FALSE when the end of the file is reached?"},"content":{"rendered":"<p>A customer reported that the behavior of <code>Read&shy;File<\/code> was not what they were expecting. <\/p>\n<blockquote CLASS=\"q\"><p>We have a synchronous file handle (not created with <code>FILE_FLAG_OVERLAPPED<\/code>), but we issue reads against it with an <code>OVERLAPPED<\/code> structure. We find that when we read past the end of the file, the <code>Read&shy;File<\/code> returns <code>FALSE<\/code> even though the documentation says it should return <code>TRUE<\/code>. <\/p><\/blockquote>\n<p>They were kind enough to <a HREF=\"http:\/\/blogs.msdn.com\/b\/oldnewthing\/archive\/2013\/10\/18\/10457796.aspx\">include a simple program that demonstrates the problem<\/a>. <\/p>\n<pre>\n#include &lt;windows.h&gt;\n\nint __cdecl wmain(int, wchar_t **)\n{\n \/\/ Create a zero-length file. 
This succeeds.\n HANDLE h = CreateFileW(L\"test\", GENERIC_READ | GENERIC_WRITE,\n               0, nullptr, CREATE_ALWAYS,\n               FILE_ATTRIBUTE_NORMAL, nullptr);\n\n \/\/ Read past EOF.\n char buffer[10];\n DWORD cb;\n OVERLAPPED o = { 0 };\n ReadFile(h, buffer, 10, &amp;cb, &amp;o); \/\/ returns FALSE\n GetLastError(); \/\/ returns ERROR_HANDLE_EOF\n\n return 0;\n}\n<\/pre>\n<p>The customer quoted this section from <a HREF=\"http:\/\/msdn.microsoft.com\/library\/aa365467\">the documentation for <code>Read&shy;File<\/code><\/a>: <\/p>\n<blockquote CLASS=\"m\">\n<p>Considerations for working with synchronous file handles: <\/p>\n<ul>\n<li>If <i>lpOverlapped<\/i> is <b>NULL<\/b>,     the read operation starts at the current file position and     <b>Read&shy;File<\/b> does not return until the operation     is complete,     and the system updates the file pointer before <b>Read&shy;File<\/b>     returns. \n<li>If <i>lpOverlapped<\/i> is not <b>NULL<\/b>,     the read operation starts at the offset that is specified     in the <b>OVERLAPPED<\/b> structure and <b>Read&shy;File<\/b>     does not return until the read operation is complete.     The system updates the <b>OVERLAPPED<\/b> offset before     <b>Read&shy;File<\/b> returns. \n<li>When a synchronous read operation reads the end of a file,     <b>Read&shy;File<\/b> returns <b>TRUE<\/b> and sets     <code>*lpNumberOfBytesRead<\/code> to zero. <\/ul>\n<\/blockquote>\n<p>and then added <\/p>\n<blockquote CLASS=\"q\"><p>According to the third bullet point, the <code>Read&shy;File<\/code> should return <code>TRUE<\/code>, but in practice it returns <code>FALSE<\/code> and the error code is <code>ERROR_HANDLE_EOF<\/code>. <\/p><\/blockquote>\n<p>The problem is that there are two concepts here, and confusingly, both use the word <i>synchronous<\/i>. <\/p>\n<ul>\n<li>A synchronous file handle is a handle opened without     <code>FILE_FLAG_OVERLAPPED<\/code>.     
All I\/O to a synchronous file handle is serialized     and synchronous. \n<li>A synchronous I\/O operation is an I\/O operation issued with     <code>lpOverlapped == NULL<\/code>. <\/ul>\n<p>The sample program issues an asynchronous read against a synchronous handle. The third bullet point applies only to synchronous reads. <\/p>\n<p>The documentation would have been clearer if it hadn&#8217;t switched terminology midstream: <\/p>\n<blockquote CLASS=\"m\">\n<ul>\n<li>If <i>lpOverlapped<\/i> is <b>NULL<\/b>,     the read operation starts at the current file position and     <b>Read&shy;File<\/b> does not return until the operation     is complete,     and the system updates the file pointer before <b>Read&shy;File<\/b>     returns. \n<li>If <i>lpOverlapped<\/i> is not <b>NULL<\/b>,     the read operation starts at the offset that is specified     in the <b>OVERLAPPED<\/b> structure and <b>Read&shy;File<\/b>     does not return until the read operation is complete.     The system updates the <b>OVERLAPPED<\/b> offset before     <b>Read&shy;File<\/b> returns. \n<li><u>If <i>lpOverlapped<\/i> is <b>NULL<\/b> and<\/u>     the read operation reads the end of a file,     <b>Read&shy;File<\/b> returns <b>TRUE<\/b> and sets     <code>*lpNumberOfBytesRead<\/code> to zero. <\/ul>\n<\/blockquote>\n<p>We asked what the customer was doing that caused them to trip over this confusion in the documentation. <\/p>\n<blockquote CLASS=\"q\">\n<p>The customer&#8217;s original code opened a file (synchronously) and read from it (synchronously). The customer is parallelizing the computation in a way that will read that single file from multiple threads. A single file pointer is therefore not suitable, because different threads will want to read from different positions. <\/p>\n<p>One idea would be to have each thread call <code>Create&shy;File<\/code> so that each handle has its own file position. 
Unfortunately, this won&#8217;t work for the customer because the sharing mode on the file handle denies read sharing. <\/p>\n<p>The solution they came up with was to open the file synchronously (without <code>FILE_FLAG_OVERLAPPED<\/code>) but to read asynchronously (by using an <code>OVERLAPPED<\/code> structure). The <code>OVERLAPPED<\/code> structure lets you specify where you want to read from, so multiple threads can issue reads against the file position they want. <\/p>\n<p>This solution works, but the customer is concerned because this hybrid model is not well documented in MSDN. (They found <a HREF=\"http:\/\/blogs.msdn.com\/b\/oldnewthing\/archive\/2012\/04\/05\/10290954.aspx\">a blog entry that discusses it<\/a>, but even that blog entry does not discuss what happens in the multithreaded case.) In particular, they are seeing that the end-of-file behavior acts according to asynchronous rather than synchronous rules. <\/p>\n<p>Any advice you have on how we can pursue this model would be appreciated. Another concern is that since we do not set the <code>hEvent<\/code> in the <code>OVERLAPPED<\/code> structure, the file handle itself is used as the signal that I\/O has completed, and this will cause problems if multiple I\/O&#8217;s are active simultaneously. <\/p>\n<\/blockquote>\n<p>The problem is that the customer confused the two senses of synchronous, one when applied to files and one when applied to I\/O operations. Since they opened a synchronous file handle, all I\/O operations are serialized and execute synchronously. Passing an <code>OVERLAPPED<\/code> structure requests an asynchronous I\/O, but since the underlying handle is synchronous, the I\/O is serialized and synchronous. The customer&#8217;s code therefore is not actually performing I\/O asynchronously; its request for asynchronous I\/O is overridden by the fact that the underlying handle is synchronous. <\/p>\n<p>The hybrid model doesn&#8217;t actually realize any gains of asynchronous I\/O. 
The use of the <code>OVERLAPPED<\/code> structure merely provides the convenience of combining the seek and read operations into a single call. Since the benefit is rather meager, the hybrid model is not commonly used, and consequently it is not covered in depth in the documentation. (The facts are still there, but there is relatively little discussion and elaboration.) <\/p>\n<p>Based on this feedback, the customer considered switching to using an asynchronous file handle and setting the <code>hEvent<\/code> in the <code>OVERLAPPED<\/code> structure so that each thread can wait for its specific I\/O to complete. In the end, however, they decided to stick with the hybrid model because switching to an asynchronous handle was too disruptive to their code base. They are satisfied with the <code>OVERLAPPED<\/code> technique that lets them perform the equivalent of an atomic <code>Set&shy;File&shy;Pointer<\/code> + <code>Read&shy;File<\/code> (even if the I\/O is synchronous and serialized). 
<\/p>\n","protected":false},"excerpt":{"rendered":"<p>That&#8217;s what the documentation says, in a way that could be made more clear.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-44863","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>That&#8217;s what the documentation says, in a way that could be made more clear.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/44863","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=44863"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/44863\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=44863"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=44863"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=44863"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}