An analogy about register preservation rules in calling conventions

Raymond Chen

A common problem beginning programmers have with assembly language is understanding calling conventions and register preservation rules, because those are not rules imposed by the architecture, but are rather created by convention.

Imagine that you are an instructor at a university. In the United States, classrooms are typically not dedicated to a single class or a single instructor, but rather, classes and instructors are assigned to rooms on a case-by-case basis, and it is common for instructors to have to teach classes in multiple rooms.

The convention at the school might be that the instructor is free to use the chalkboard however they like, and they are allowed to erase any content left on the chalkboard when they arrive for the start of class, with no responsibility to return the chalkboard to its original state when the class ends.

On the other hand, the convention is that the desks and chairs in the classroom must be returned to their original places at the end of class. The instructor is permitted to move the desks and chairs during class, but they have to put them back in the original locations when the class is over.

The chalkboard is like the registers we call “scratch” or “volatile”: You are welcome to use them all you want, and you are not under any obligation to restore them to their original values when you are done.

The desks and chairs are like the registers we call “preserved” or “non-volatile”: You are welcome to use them all you want, but you are required to put them “back where you found them” when you are done. This is typically done by saving their initial state somewhere when you start (for desks and chairs, maybe by marking their locations with tape on the floor; for registers, by saving their values to memory) and restoring them before you finish (moving the desks and chairs back to the markers; for registers, reloading the values from memory).
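The save-and-restore pattern can be sketched in a toy model. This is only an illustration under stated assumptions: the `Registers` struct, `sum_of_squares`, and `caller_example` are hypothetical names, and the two-plus-two register split does not correspond to any real calling convention.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy register file for illustration only; it does not match any real ABI.
struct Registers {
    std::array<std::uint64_t, 2> scratch{};    // the chalkboard: callee may clobber
    std::array<std::uint64_t, 2> preserved{};  // the desks: callee must restore
};

// Hypothetical callee that follows the convention.
std::uint64_t sum_of_squares(Registers& regs, std::uint64_t a, std::uint64_t b) {
    // Prologue: mark the desk positions (save the preserved registers we will use).
    const std::array<std::uint64_t, 2> saved = regs.preserved;

    // Body: use any register freely.
    regs.preserved[0] = a * a;
    regs.preserved[1] = b * b;
    regs.scratch[0] = regs.preserved[0] + regs.preserved[1];
    const std::uint64_t result = regs.scratch[0];

    // Epilogue: put the desks back. The scratch registers are left as they are.
    regs.preserved = saved;
    return result;
}

// Hypothetical caller: relies on the preserved registers surviving the call
// and assumes nothing about the scratch registers afterward.
std::uint64_t caller_example() {
    Registers regs;
    regs.preserved = {333, 444};  // values the caller wants to keep across the call
    regs.scratch = {111, 222};    // values the caller is prepared to lose

    const std::uint64_t result = sum_of_squares(regs, 3, 4);

    assert(regs.preserved[0] == 333 && regs.preserved[1] == 444);
    return result;
}
```

The caller never inspects the scratch registers after the call, which is the point: the convention promises nothing about them.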

There might be some fixtures in the classroom that may not be moved at all. You cannot unscrew the light bulbs or disable the fire alarm. These are “immutable” registers, which you aren’t allowed to modify, not even temporarily.

Now, maybe your class is a lab class that has been given a dedicated room for an entire week, so you don’t have to reset the room after each session and can instead just resume from where the previous session left off. But you don’t have class on Wednesday, so for Wednesday, another class uses the room. What can you expect?

Well, the conventions say that the chalkboard is available for use. So don’t leave anything important on the chalkboard at the end of the Tuesday class, because it will be gone when you come back on Thursday.

On the other hand, the conventions also say that the desks should be returned to their original locations, so the desk clumps you created for your lab class will still be there. The desks may very well be moved around by the Wednesday class, but they will return them to the original positions before they are done.

Now, the school may have other conventions regarding the classroom. For example, maybe in addition to the convention that says that you have to restore the desks to their original locations, they also say that if you do move any desks, you have to take a photo of the classroom to show the original desk positions, and you have to post the photo on the bulletin board next to the door (and take it away when your class is over). The school requires this because your class might be cancelled after the Thursday session, so you won’t be around on Friday to reset the desks. The facilities staff needs to use that photo you took in order to restore the desks to their original positions for the class that had the room before you.

When people write code in assembly language, they often forget about the conventions about taking photos of things before you move them because those photos (unwind codes) are not explicitly in the code itself, but are instead stored in another part of the executable, to be consulted only in the (hopefully rare) case that the system needs to reason over the stacks of active threads. This primarily happens during exception handling.¹ But since the unwind codes are not immediately visible, people usually forget that they need them.
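To make the “photo” idea concrete, here is a minimal sketch of how recorded unwind codes let somebody else undo a prologue without running the function's own epilogue. Everything in it is an assumption made for illustration: `ToyMachine`, `UnwindOp`, `UnwindCode`, `prologue`, and `unwind` are hypothetical, and the layout is far simpler than the unwind data a real executable carries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy machine: one preserved register and a stack that grows toward index 0.
struct ToyMachine {
    std::uint64_t preserved = 0;
    std::size_t sp = 0;
    std::vector<std::uint64_t> stack;
};

enum class UnwindOp { PushPreserved, AllocSlots };

struct UnwindCode {
    UnwindOp op;
    std::size_t slots;  // number of stack slots the operation consumed
};

// Hypothetical prologue: push the preserved register, then reserve 4 local slots.
void prologue(ToyMachine& m) {
    assert(m.sp >= 5);              // needs room for 1 push + 4 locals
    m.stack[--m.sp] = m.preserved;  // save the caller's value
    m.sp -= 4;                      // allocate locals
}

// The "photo": a description of the prologue, listed in the order it ran.
// In this toy model it lives beside the code, not inside it.
const std::vector<UnwindCode> kPrologueUnwindCodes = {
    {UnwindOp::PushPreserved, 1},
    {UnwindOp::AllocSlots, 4},
};

// The "facilities staff": undo the prologue by replaying the codes backward.
void unwind(ToyMachine& m, const std::vector<UnwindCode>& codes) {
    for (auto it = codes.rbegin(); it != codes.rend(); ++it) {
        switch (it->op) {
        case UnwindOp::AllocSlots:
            m.sp += it->slots;            // release the locals
            break;
        case UnwindOp::PushPreserved:
            m.preserved = m.stack[m.sp];  // reload the caller's value
            m.sp += it->slots;            // pop it
            break;
        }
    }
}
```

If the function is abandoned partway through its body, calling `unwind` with the recorded codes brings the preserved register and the stack pointer back to their values at function entry, even though the function never got to run its epilogue.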

Related reading: Zero-cost exceptions aren’t actually zero cost.

¹ You might say, “But my code doesn’t throw exceptions, why do I need to worry about exception handling?” The system internally uses exceptions for normal everyday activities. For example, guard page exceptions are used to indicate that the stack needs to grow. Also, debug messages like those generated by OutputDebugString are communicated by raising an exception and immediately catching it, with the expectation that if a debugger is connected, it will intercept and handle the exception.

If you fail to declare unwind codes for your function, the system assumes that you are a lightweight leaf function. This means that all nonvolatile registers and the stack pointer are unchanged from their function entry, and that the return address is in its default location (on the top of the stack for x86 family systems, in the link register for branch-and-link architectures). If you don’t follow the rules for a lightweight leaf function (for example, if you push registers onto the stack), then the lightweight leaf function unwinding produces invalid values, and if you’re lucky, the system terminates your process immediately for having a corrupted stack. If you’re unlucky, the system unwinds the stack incorrectly, and a function higher up the stack runs with corrupted registers and behaves unpredictably. This is not just bad for your program (which no longer operates predictably), but also bad for your users (because an attacker might induce a stack unwind and control enough of the unintended execution to gain remote code execution).
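The failure mode can be shown with a toy stack. This is a sketch under stated assumptions, not a model of the real unwinder: `ToyStack` and `assumed_return_address` are hypothetical names, and the addresses are made-up values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stack that grows toward index 0; sp is the index of the top slot.
struct ToyStack {
    std::vector<std::uint64_t> slots;
    std::size_t sp;

    explicit ToyStack(std::size_t size) : slots(size, 0), sp(size) {}

    void push(std::uint64_t value) {
        assert(sp > 0);  // no room left would be a stack overflow
        slots[--sp] = value;
    }
};

// What an unwinder does in this toy model when a function has no unwind
// codes: it assumes a lightweight leaf function, so the return address
// must be whatever sits on the top of the stack right now.
std::uint64_t assumed_return_address(const ToyStack& s) {
    assert(s.sp < s.slots.size());  // the stack must not be empty
    return s.slots[s.sp];
}
```

Suppose the caller's `call` instruction pushes the return address `0x401000`. At that moment `assumed_return_address` yields `0x401000`, which is correct. If the function then pushes a register holding, say, `0xDEADBEEF` without declaring unwind codes, the same query yields `0xDEADBEEF`, and the unwinder treats a saved register value as the place to resume.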

Author

Raymond Chen

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

5 comments


Dmitry December 11, 2024 · Edited

Well, yes, I fully understand this. But the post talks about code that doesn’t throw exceptions itself and only does things that involve throwing and catching exceptions internally. Say, if the fun() function in your example just touches a guard page or calls OutputDebugString, I wouldn’t expect the filter expression to be evaluated, since I wouldn’t expect any exception to escape from fun().

In fact, it seems to be even more difficult. What if a guard page gets touched while building a new stack frame for a function? What if the guard page touch occurs while building a new SEH frame on the stack? I used to think that guard page exceptions were handled in kernel mode, with the bookkeeping code running on the kernel stack, since this involves changing system structures that shouldn’t normally be accessible to user-mode code, and so would be transparent to the user-mode application. And I wouldn’t expect a call to OutputDebugString to let its internal exception escape the function itself.

P.S. Comments are totally broken. Replies come out as separate comments.

Dmitry December 6, 2024

Are you trying to say that, given a call chain of functions A -> B -> C, a guard page exception or a call to OutputDebugString in function C might cause stack unwinding that might end up in function A? And that any function that changes non-volatile registers should build a stack frame with exception-handling related stuff? Sounds wrong to me.
- Me Gusta December 8, 2024 · Edited
  As a potential hint, consider the context that the filter expression has to be handled in.
  
  ```cpp
  #include <Windows.h>
  #include <cstdio>

  int filter(unsigned int code, EXCEPTION_POINTERS *info, wchar_t c)
  {
      putwc(c, stdout);
      return EXCEPTION_CONTINUE_EXECUTION;
  }

  void fun()
  {
      RaiseException(1, 0, 0, nullptr);
  }

  int wmain()
  {
      wchar_t c = L'a';
      __try
      {
          fun();
      }
      __except (filter(GetExceptionCode(), GetExceptionInformation(), c))
      {
      }
      return 0;
  }
  ```
  
  This is a small and silly example, but what would you expect the putwc to print? Because it is a stack variable, RBP/RSP (or platform equivalent) has to be set correctly to read it. So the search for a block that is capable of handling an exception also has to undo the context, even if the stack isn’t unwound on the first pass.
  
  Had to edit this because the < and > were stripped out of the code block.
Joshua Hudson December 6, 2024

[Comments appear to have been broken for two weeks–reposting]

So I’m going to be that guy.

We have one active case remaining where the assembly code is called without an unwind frame, and that’s because the unwind frame is inexpressible and can’t be changed to be expressible.

Easy mode version: Assuming this implementation of FreeLibraryAndExitThread:

```cpp
volatile void WINAPI FreeLibraryAndExitThread(HINSTANCE hLibModule, DWORD dwExitCode)
{
    FreeLibrary(hLibModule);
    ExitThread(dwExitCode);
}
```

The assembly code would look something like this:

```asm
FreeLibraryAndExitThread:
    sub rsp, 40
    mov [rsp + 32], edx
    call FreeLibrary@8
    mov ecx, [rsp + 32]
    call ExitThread@8
```

The trouble in writing down the unwind codes is that after the FreeLibrary function returns, unwinding is not possible. The calling function’s unwind codes are gone.

I have essentially the same thing only it does a few more calls between FreeLibrary and ExitThread.

Marcel Kilgus December 6, 2024
Many years ago, I wrote my own crash reporter using a separate reporter process and SetUnhandledExceptionFilter in the main process. When I ported the code to 64-bit, it just didn’t work: the unhandled exception filter was never invoked. VERY long story short, the (commercial, non-MS) C compiler I’m using emits wrong code in the 64-bit case. The function stack frames use a granularity of 8 bytes (like “lea rbp,[rsp+28h]”), but Windows’ exception unwind data structures for SET_FPREG can only encode multiples of 10h:
```
Unwind codes: 
      0A: SET_FPREG, register=rbp, offset=0x20 
```
resulting in an off-by-8-bytes stack unwind issue and thus “missing” the uppermost exception filter. What a pain to find out, and since the vendor resolved the bug as WONTFIX, I had to write a patch utility that fixes this problem post-build.