Why doesn't Windows use the 64-bit virtual address space below 0x00000000`7ffe0000?

A customer used VMMap and observed that for all of their 64-bit processes, nothing was allocated at any addresses below 0x00000000`7ffe0000. Why does the virtual address space start at 0x00000000`7ffe0000? Is it to make it easier to catch pointer truncation bugs? And what’s so special about 0x00000000`7ffe0000?

Okay, let’s go through the questions one at a time.

First, is it even true that the virtual address space starts at 0x00000000`7ffe0000?

No. The virtual address space starts at the 64KB boundary. You can confirm this by calling GetSystemInfo and checking the lpMinimumApplicationAddress member of the SYSTEM_INFO structure it fills in. It will be 0x00000000`00010000.
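Here is a minimal sketch of that check, written in Python with ctypes so it can be pasted into an interpreter. The helper name `minimum_application_address` is made up for illustration, and the structure layout is transcribed from the documented SYSTEM_INFO definition, so treat it as an assumption to verify against the headers. On anything other than Windows the helper simply returns None.

```python
import ctypes
import sys


class SYSTEM_INFO(ctypes.Structure):
    # Field order transcribed from the documented SYSTEM_INFO structure.
    # The leading union (dwOemId) is represented by its two WORD members.
    _fields_ = [
        ("wProcessorArchitecture", ctypes.c_uint16),
        ("wReserved", ctypes.c_uint16),
        ("dwPageSize", ctypes.c_uint32),
        ("lpMinimumApplicationAddress", ctypes.c_void_p),
        ("lpMaximumApplicationAddress", ctypes.c_void_p),
        ("dwActiveProcessorMask", ctypes.c_size_t),
        ("dwNumberOfProcessors", ctypes.c_uint32),
        ("dwProcessorType", ctypes.c_uint32),
        ("dwAllocationGranularity", ctypes.c_uint32),
        ("wProcessorLevel", ctypes.c_uint16),
        ("wProcessorRevision", ctypes.c_uint16),
    ]


def minimum_application_address():
    """Return lpMinimumApplicationAddress, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    get_system_info = ctypes.windll.kernel32.GetSystemInfo
    get_system_info.argtypes = [ctypes.POINTER(SYSTEM_INFO)]
    get_system_info.restype = None  # GetSystemInfo returns void
    info = SYSTEM_INFO()
    get_system_info(ctypes.byref(info))
    return info.lpMinimumApplicationAddress


if __name__ == "__main__":
    address = minimum_application_address()
    print("not running on Windows" if address is None else hex(address))
```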

If the address space starts at 64KB, why is the lower 2GB pretty much ignored?

Because it turns out that the total address space is really big.

Address Space Layout Randomization (ASLR) tries to put things at unpredictable addresses. The full user-mode address space on x86-64 is 128TB, and a randomly-generated 47-bit address is very unlikely to begin with 16 consecutive zero bits. The first 2GB of address space is only about 0.0015% of the total available address space, so it’s a pretty small target.
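For a sense of scale, the proportion can be worked out directly. This sketch assumes a 128TB user-mode space is exactly 2^47 bytes and that the low region is exactly 2GB, or 2^31 bytes; the function name is just for illustration.

```python
USER_SPACE_BITS = 47  # 128TB = 2**47 bytes
LOW_REGION_BITS = 31  # 2GB = 2**31 bytes


def low_region_share(user_bits=USER_SPACE_BITS, low_bits=LOW_REGION_BITS):
    """Return (leading zero bits needed, fraction of the space below 2GB)."""
    leading_zero_bits = user_bits - low_bits
    fraction = (1 << low_bits) / (1 << user_bits)
    return leading_zero_bits, fraction


if __name__ == "__main__":
    zero_bits, fraction = low_region_share()
    print(f"leading zero bits required: {zero_bits}")
    print(f"share of address space: {fraction * 100:.4f}%")
```

In other words, a uniformly random address in that space lands below 2GB about once in 65,536 tries.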

But why is there a page of memory consistently allocated at exactly 0x00000000`7ffe0000?

That is a special page of memory that is mapped read-only into user mode from kernel mode, and it contains things like the current time, so that applications can get this information quickly without having to take a kernel transition. This page is at a fixed location for performance reasons.

If a 64-bit application is not marked /LARGEADDRESSAWARE, then it receives a shrunken address space of only 2GB so that it doesn’t have to deal with any “large addresses” (namely, those above 2GB). These “64-bit processes that don’t understand addresses above 2GB” (also known in my kitchen as “really stupid 64-bit processes”) still need to access the shared data, so the shared data must go below the 2GB boundary.

Since virtual address space granularity on Windows is 64KB, reserving a single page actually reserves an entire 64KB block, from 0x00000000`7ffe0000 to 0x00000000`7ffeffff. Also in this range is a second read-only page mapped from kernel mode, this time for sharing information with the hypervisor, specifically the partition reference time stamp counter. This second page is placed at a random location inside the 64KB range, not so much for ASLR reasons (since there are only 15 choices, which isn’t very random), but just to make sure that nobody accidentally takes a dependency on a fixed address. As noted in the documentation, you find this partition reference time stamp counter page by reading the HV_X64_MSR_REFERENCE_TSC model-specific register.
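The arithmetic behind the 64KB block and the "15 choices" can be sketched as follows. The 4KB page size is an assumption (it is the usual x86-64 page size, but the text does not state it), and the function names are illustrative only.

```python
ALLOCATION_GRANULARITY = 0x10000  # 64KB, as stated in the text
PAGE_SIZE = 0x1000                # assumption: 4KB pages
SHARED_PAGE = 0x7FFE0000


def containing_block(address):
    """Return the (first, last) byte address of the 64KB block holding address."""
    start = address & ~(ALLOCATION_GRANULARITY - 1)
    return start, start + ALLOCATION_GRANULARITY - 1


def other_pages_in_block(address):
    """List the page addresses in the same 64KB block, excluding address's own page."""
    start, end = containing_block(address)
    own_page = address & ~(PAGE_SIZE - 1)
    return [page for page in range(start, end + 1, PAGE_SIZE) if page != own_page]


if __name__ == "__main__":
    first, last = containing_block(SHARED_PAGE)
    print(hex(first), hex(last))
    print(len(other_pages_in_block(SHARED_PAGE)), "candidate pages remain")
```

With one of the sixteen pages taken by the fixed shared page, fifteen remain for the second page.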

Okay, but why are these special shared pages in the range 0x00000000`7ffe0000 to 0x00000000`7ffeffff? Why not put them right up at the 2GB boundary of 0x00000000`7fff0000 to 0x00000000`7fffffff?

One reason is that the 64KB regions immediately above and below the 2GB boundary are marked permanently invalid in order to simplify address validity checks. On 32-bit systems, the 2GB boundary marks the traditional boundary between user mode and kernel mode. If you put a “no man’s land” between user mode and kernel mode, then you can validate a memory range by checking that it starts in user mode, and verifying that every page is accessible. You’ll run into the inaccessible page before you get to the kernel mode addresses.
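To see why the no man’s land simplifies the check, here is a toy model, not the actual kernel logic. It pretends every page is accessible except the 64KB on either side of the 2GB boundary, and it assumes 4KB pages. All names are hypothetical. Note that the validator never compares the end of the range against the boundary; the guard region does that work.

```python
PAGE_SIZE = 0x1000                 # assumption: 4KB pages
USER_KERNEL_BOUNDARY = 0x80000000  # 2GB boundary on the 32-bit layout
GUARD_SIZE = 0x10000               # 64KB on each side of the boundary


def page_is_accessible(page_address):
    """Toy model: only the guard regions around the boundary are inaccessible."""
    guard_start = USER_KERNEL_BOUNDARY - GUARD_SIZE
    guard_end = USER_KERNEL_BOUNDARY + GUARD_SIZE
    return not (guard_start <= page_address < guard_end)


def validate_range(start, length):
    """Check that the range starts in user mode, then probe each page in order."""
    if length <= 0 or start >= USER_KERNEL_BOUNDARY:
        return False
    page = start & ~(PAGE_SIZE - 1)
    last_page = (start + length - 1) & ~(PAGE_SIZE - 1)
    while page <= last_page:
        if not page_is_accessible(page):
            return False  # hit the guard region before reaching kernel mode
        page += PAGE_SIZE
    return True


if __name__ == "__main__":
    print(validate_range(0x7FFD0000, 0x1000))   # stays in user mode
    print(validate_range(0x7FFD0000, 0x40000))  # would run past the boundary
```

Even in this model, where kernel pages are deliberately treated as accessible, a range that starts in user mode and runs toward kernel mode is rejected at the guard region.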

Okay, that explains why there’s a no man’s land on 32-bit systems, but why do we also have it on 64-bit systems?

On the Alpha AXP, most 32-bit constants can be generated in at most two instructions. But there’s a range of values that requires three instructions: 0x7fff8000 to 0x7fffffff. Blocking off that region means that you never have to provide a relocation fixup that targets the problematic memory range. And the initial target of 64-bit Windows was the Alpha AXP.
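The problematic range falls out of some sign-extension arithmetic. The article does not name the instructions, so this sketch rests on two assumptions. First, the two-instruction sequence adds a sign-extended 16-bit high part shifted left by 16 (as the Alpha LDAH instruction does) and then a sign-extended 16-bit low part (as LDA does). Second, 32-bit values are held in registers in sign-extended form. The function names are illustrative.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed 16-bit integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def two_instruction_split(value32):
    """Return (high, low) signed 16-bit parts with (high << 16) + low equal to
    the sign-extended 32-bit value, or None if no such pair exists."""
    signed = value32 - (1 << 32) if value32 & 0x80000000 else value32
    low = sign_extend_16(signed)
    high = (signed - low) >> 16  # exact: signed - low is a multiple of 65536
    if -0x8000 <= high <= 0x7FFF:
        return high, low
    return None


if __name__ == "__main__":
    for constant in (0x7FFE0000, 0x7FFF7FFF, 0x7FFF8000, 0x7FFFFFFF, 0x80000000):
        print(hex(constant), two_instruction_split(constant))
```

For values from 0x7fff8000 to 0x7fffffff, the low half is negative once sign-extended, so the high half would have to be 0x8000, which does not fit in a signed 16-bit field. That is why a third instruction is needed there.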

Author

Raymond Chen

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

7 comments

Robert Burke December 18, 2022

Given that most applications see a 10%ish speedup from large pages, is there any plan to make large pages usable for applications that do not run as administrator or applications that might not run continually from startup on Windows, like Microsoft Excel or World of Warcraft?

Andrew Kent-Morris December 18, 2022

I just wanted to thank you for the Cheers reference.

Stevie White December 17, 2022

Great write up Raymond, I had a feeling this was going to be an interesting one. The comments are also very interesting. 👍

Joshua Hudson December 17, 2022 · Edited

“really stupid 64 bit processes” Does that mean I can make an x32 process on Windows?

An x32 process is one compiled targeting the amd64 instruction set but with sizeof(void*) = 4. This decreases cache pressure, at the cost of being unable to exceed 4GB of RAM (well, in this case 2GB, unless VirtualAlloc will take a fixed address above 2GB despite no /LARGEADDRESSAWARE).
- Jan Ringoš December 17, 2022 · Edited
  
  Only sort of.
  There is no x32 ABI on Windows, so pointers interacting with the Windows API, and basically all other libraries, always need to be 64-bit.
  
  But your own pointers absolutely can be 32-bit, and that indeed improves performance. See this synthetic benchmark and results I did some time ago.
  - Henke37 December 18, 2022
    
    Yes there is? How else do you think that 32 bit applications work? Sure, it’s mostly trivial leaf functions and wrappers around the 64 bit version, but the api exists.
    
    There’s a fair amount of 3rd party research into the internals. Some have even managed to run their own code of the wrong bitness in a process.
  - Simon Farnsworth December 21, 2022
    
    x32 is a hybrid ABI; like amd64, it uses the processor in long mode (“64 bit mode”), but it restricts pointers to 32 bits. The result is something that’s similar to amd64, except for pointer size.
    
    Windows has a 32 bit x86 ABI, but that uses the processor in protected or compatibility mode, not long mode, and is not quite the same as x32. In particular, amd64 has the extra 8 registers, which are useful in system call conventions, and x32 can use those while 32-bit x86 can’t.
