{"id":8023,"date":"2012-03-23T07:00:00","date_gmt":"2012-03-23T07:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2012\/03\/23\/why-is-the-heap32next-function-incredibly-slow-on-windows-7\/"},"modified":"2021-03-22T20:59:35","modified_gmt":"2021-03-23T03:59:35","slug":"20120323-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20120323-00\/?p=8023","title":{"rendered":"Why is the Heap32Next function incredibly slow on Windows 7?"},"content":{"rendered":"<p>A customer observed that the <code>Heap32Function<\/code> runs much slower on Windows\u00a07 compared to previous versions of Windows. What happened?<\/p>\n<p>Set the wayback machine to 1992. The product is Windows\u00a03.1. One of the new components available in Windows\u00a03.1 went by the name <code>TOOLHELP<\/code>. It let you snoop around the low-level guts of the Windows\u00a03.1 kernel, and the feature that is relevant here is walking the heaps. Since Windows\u00a03.1 was a cooperatively multitasking system, you could ensure that the heap was stable during your calls to <code>Heap\u00adFirst<\/code> and <code>Heap\u00adNext<\/code> by the simple process of not yielding control.<\/p>\n<p>Mind you, ToolHelp was not part of the kernel itself. It was bolted onto the side. As I recall, ToolHelp arrived late on the scene, and the kernel folks didn&#8217;t want to destabilize the kernel with any ToolHelp-related changes, so all the work done by ToolHelp was done &#8220;from the outside&#8221;.<\/p>\n<p>Windows\u00a095 introduced 32-bit versions of the ToolHelp functions. I&#8217;m not sure why.<\/p>\n<p>Where was I? Oh right, <code>Heap32\u00adNext<\/code>.<\/p>\n<p>In the 32-bit version of ToolHelp, you could walk the heap of a process by calling <code>Heap32\u00adFirst<\/code> and <code>Heap32\u00adNext<\/code>. As implemented in Windows\u00a095, the <code>Heap32\u00adFirst<\/code> function allocated some memory to keep track of the state of the heap walk and stored it in the <code>HEAPENTRY32.dwResvd<\/code> field. The <code>Heap32\u00adNext<\/code> used this state to find the next heap block, and it finally freed the memory when the end of the heap was reached (and <code>Heap32\u00adNext<\/code> returned <code>FALSE<\/code>). This means that if you call <code>Heap32\u00adFirst<\/code> and do not walk the heap to completion but rather abandon the walk partway through, you leaked some memory. Unlike functions like <code>Find\u00adFirst\u00adFile<\/code> which have an explicit <code>Find\u00adClose<\/code> function to indicate that you are done with the enumeration (and allow the operating system to free the tracking state), there was no corresponding <code>Heap32\u00adClose<\/code> function. The only way to free the memory was to walk the heap to the end. The Windows\u00a095 implementation also didn&#8217;t handle the case that the heap changed while you were walking it.<\/p>\n<p>But since the toolhelp library was intended for diagnostic purposes anyway (I mean, it&#8217;s right there in the name: <i>tool help<\/i>), these weren&#8217;t considered serious problems. Your debugging plug-in might use it to walk the heap looking for memory leaks, but you wouldn&#8217;t deploy it in production, right?<\/p>\n<p>The Windows\u00a0NT folks didn&#8217;t like that there was a memory leak built into the design. Since there was no way to ensure that the memory allocated by <code>Heap32\u00adFirst<\/code> was freed in the event the application wanted to abandon the heap walk, their solution was simply to free all allocated memory before returning from <code>Heap32\u00adFirst<\/code> and <code>Heap32\u00adNext<\/code>. If an application asks to walk the heap, the <code>Heap32\u00adFirst<\/code> takes a snapshot of the heap, returns the first heap block, then frees the snapshot. When the application calls <code>Heap32\u00adNext<\/code>, it takes a snapshot of the heap, returns the second heap block, then frees the snapshot. On the second call to <code>Heap32\u00adNext<\/code>, it takes a snapshot of the heap, returns the third heap block, then frees the snapshot. You get the idea.<\/p>\n<p>As a result, walking the heap via <code>Heap32\u00adFirst<\/code>\/<code>Heap32\u00adNext<\/code> is an <i>O<\/i>(<i>n<\/i>\u00b2) operation.<\/p>\n<p>So why did this become slower in Windows\u00a07?<\/p>\n<p>Prior to Windows\u00a07, the snapshot was taken in a fixed-size buffer that held information for around a quarter million heap entries. As a result, there was a hard limit on the worst-case cost of walking the heap via the toolhelp functions. In Windows\u00a07, this hard limit was lifted because there were <a href=\"http:\/\/technet.microsoft.com\/en-us\/sysinternals\/dd535533.aspx\"> some diagnostic tools<\/a> which were bumping into this limit. The kernel folks decided that it was better to have the functions be slow but correct rather than fast and incomplete. Since the limit was lifted, so too was the cap on the worst-case cost of walking the heap with <code>Heap32\u00adFirst<\/code>\/<code>Heap32\u00adNext<\/code>.<\/p>\n<p>Toolhelp was designed back in the days of co-operative multitasking and hasn&#8217;t aged well. At this point, he&#8217;s sort of this unwanted distant relative in the kernel. Nobody actually likes him, but when he shows up at the family reunion, you have to let him in.<\/p>\n<p>By the way, the recommended way to walk the contents of the heap is to use the <code>Heap\u00adWalk<\/code> function. The <code>Heap\u00adWalk<\/code> function does not suffer from this problem; enumerating the entire heap via repeated calls to <code>Heap\u00adWalk<\/code> has total running time proportional to the number of heap blocks. Note that <code>Heap\u00adWalk<\/code> can only enumerate heap blocks from the current process. If you&#8217;re doing cross-process heap walking for diagnostic purposes, then you&#8217;re stuck with <code>Heap32\u00adFirst<\/code>\/<code>Heap32\u00adNext<\/code>, but since you&#8217;re just doing it for diagnostic purposes, correctness should be more important to you than performance.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Improved correctness but at a price.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-8023","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Improved correctness but at a price.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/8023","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=8023"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/8023\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=8023"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=8023"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=8023"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}