What to do when you have a crash in the runtime control flow guard check

Windows Control Flow Guard (CFG) is a defense-in-depth feature that validates indirect call targets. The idea is that each module that is enabled for CFG provides a bitmap that describes which addresses in the module are intended to be targets of indirect calls. When CFG is enabled in a process, indirect function calls are checked against this bitmap, and if the address is deemed invalid, the process terminates itself, and the Watson service records the details for future investigation.

If you are studying a crash in the control flow guard validator¹ you may want to pick out the failed address so you can understand better what went wrong and use it to guide the next step of your debugging. (Was it a bad address? Was the DLL unloaded? Was it a garbage value due to use-after-free?)

In general, the control flow guard validator takes a function address in some register, performs shifting and masking operations using that register as a source (to calculate the bit position in the call target bitmap), and then tests a bit. The source register is left unchanged so that the caller, on success, can use the validated address as a jump target.

Let’s practice. Here’s one of the control flow guard validator functions for x86-64, which Windows often calls x64. Try to spot the register that holds the address being validated.

ntdll!LdrpValidateUserCallTarget:
    mov     rdx,qword ptr [ntdll!................]
    mov     rax,rcx                  
    shr     rax,9                     ; shift
    mov     rdx,qword ptr [rdx+rax*8] ; crash here
    mov     rax,rcx
    shr     rax,3
    test    cl,0Fh
    jne     @1
    bt      rdx,rax
    jae     @2
    ret
@1: btr     rax,0
    bt      rdx,rax
    jae     @3
@2: or      rax,1
    bt      rdx,rax
    jae     @3
    ret
@3: mov     rax,rcx
    xor     r10d,r10d
    jmp     ntdll!LdrpHandleInvalidUserCallTarget

We see that the value in rcx gets moved into rax, and then rax gets shifted. So the address being validated is in rcx. The marked instruction is the only one whose memory access depends on that address (the first instruction also reads memory, but from a fixed location), so if there’s a crash, it’ll happen there. The rest of the function is just bit twiddling.
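To make the arithmetic concrete, here is a minimal Python sketch of the address calculation in the x86-64 listing above. It models only the two instructions that lead up to the marked load. The function name and the sample values are made up for illustration, and the bitmap base stands in for whatever value the first instruction loaded into rdx.

```python
MASK64 = (1 << 64) - 1


def x64_bitmap_word_address(bitmap_base, target):
    """Model `shr rax,9` followed by the load from `[rdx+rax*8]`.

    bitmap_base: hypothetical value that the first instruction put in rdx.
    target:      the value in rcx, i.e. the address being validated.
    """
    index = (target & MASK64) >> 9               # shr rax,9
    return (bitmap_base + index * 8) & MASK64    # effective address rdx+rax*8


# Illustrative values only: a made-up bitmap base and two made-up targets.
base = 0x00007DF500000000
for target in (0x00007FF612345670, 0x4141414141414141):
    print(hex(target), "->", hex(x64_bitmap_word_address(base, target)))
```

The second sample is the kind of garbage value you might find after memory corruption. The point is only that the address of the load is a simple function of rcx, so a nonsensical rcx leads to a nonsensical load address.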

Let’s do the same exercise for x86-32, which Windows often just calls x86.

ntdll!LdrpValidateUserCallTarget:
    mov     edx,dword ptr [ntdll!........]
    mov     eax,ecx                  
    shr     eax,8                     ; shift
    mov     edx,dword ptr [edx+eax*4] ; crash here
    mov     eax,ecx
    shr     eax,3
    test    cl,0Fh
    jne     @1
    bt      edx,eax
    jae     ...
    ret
@1: btr     eax,0
    bt      edx,eax
    jae     ...
    or      eax,1
    bt      edx,eax
    jae     ...
    ret

This time, it’s the value in ecx that gets moved into eax, and then eax gets shifted. The address being validated is therefore in ecx. Again, the marked instruction is the only one whose memory access depends on the address being validated.

One more: This time, it’s 32-bit ARM, which Windows calls simply arm.

ntdll!LdrpValidateUserCallTarget:
    mov         r3,#0x.... 
    movt        r3,#0x.... 
    ldr         r3,[r3]    

    lsrs        r2,r0,#6    ; shift
    ubfx        r1,r0,#3,#3
    ldrb        r2,[r3,r2]  ; crash here

    mov         r3,r0
    and         r0,r0,#0xF
    subs        r0,r0,#1
    bne         ...

There are two load instructions this time. The first loads from a fixed address (built into r3 in two instructions), so it corresponds to the first instruction of the x86-32 and x86-64 versions; it’s just that x86 can load from many fixed addresses in a single instruction.

The second group of instructions is the interesting one. It shifts the value in r0 and puts the result in r2. It also uses r0 as the source for a bit extraction operation that puts the result in r1, and then it accesses some memory. So it looks like r0 is the address, since it’s the source of the shift instruction.

Mind you, this code modifies r0 later on, so the value in r0 doesn’t hold the address through the entire function. It got copied into r3 for safekeeping, so if you break in later in the function, you’ll want to look to r3 for the address. But if you crash on the memory access, the address is in r0.
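As a rough illustration, the second group of instructions in the 32-bit ARM listing can be modeled in Python like this. The function name is hypothetical, and since the listing stops before showing how r1 is used, the sketch merely reports the extracted field without interpreting it.

```python
def arm_bitmap_lookup(target):
    """Model `lsrs r2,r0,#6` and `ubfx r1,r0,#3,#3` for a 32-bit r0.

    Returns (byte_offset, bit_field):
      byte_offset is what gets added to r3 in `ldrb r2,[r3,r2]`;
      bit_field is the 3-bit value that ubfx leaves in r1.
    """
    target &= 0xFFFFFFFF              # r0 is a 32-bit register
    byte_offset = target >> 6         # lsrs r2,r0,#6
    bit_field = (target >> 3) & 0x7   # ubfx r1,r0,#3,#3: 3 bits starting at bit 3
    return byte_offset, bit_field


# Made-up sample address, purely for illustration.
offset, field = arm_bitmap_lookup(0x77001238)
print(hex(offset), field)
```

Both results are derived from r0 alone, which is another way of seeing that r0 is the register to inspect when the ldrb faults.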

Our last example is AArch64, which Windows usually calls arm64.

ntdll!LdrpValidateUserCallTarget:
    adrp        xip0,ntdll!....   
    ldr         xip0,[xip0,#0x598]

    lsr         xip1,x15,#6      ; shift
    tst         x15,#0xF        
    ldrb        wip1,[xip0,xip1] ; crash here
    ubfx        xip0,x15,#3,#3
    bne         @2

    lsr         xip1,xip1,xip0
    tbz         wip1,#0,@3
@1: ret

@2: and         xip0,xip0,#-2
    lsr         xip1,xip1,xip0
    tbz         wip1,#0,@4
@3: tbnz        wip1,#1,@1
@4: mov         xip0,#0
    b           @5
@5: b           ntdll!LdrpHandleInvalidUserCallTarget

Again, we start by loading an address from memory, and then we shift a register, this time the x15 register. There is a bit test instruction whose result is used later, and then we perform a memory access (which could crash). From inspection, we therefore see that the address being validated is in x15.
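If you want to double-check a guess, you can test whether a candidate register actually explains the faulting address. The following Python sketch does that using the shift amounts and scale factors read off the four listings in this article. The function name, the table, and all of the sample values are assumptions for illustration; in practice you would plug in the register values and the faulting address reported by your debugger.

```python
# (shift, scale, register width) as they appear in the listings above.
LOOKUPS = {
    "x86-64": (9, 8, 64),   # shr rax,9   / [rdx+rax*8]
    "x86-32": (8, 4, 32),   # shr eax,8   / [edx+eax*4]
    "arm":    (6, 1, 32),   # lsrs ..,#6  / ldrb [r3,r2]
    "arm64":  (6, 1, 64),   # lsr ..,#6   / ldrb [xip0,xip1]
}


def explains_fault(arch, candidate, bitmap_base, fault_address):
    """Return True if shifting `candidate` reproduces the faulting address."""
    shift, scale, bits = LOOKUPS[arch]
    mask = (1 << bits) - 1
    predicted = (bitmap_base + ((candidate & mask) >> shift) * scale) & mask
    return predicted == fault_address


# Hypothetical crash: a made-up bitmap base (the value in xip0), a made-up
# faulting address, and two registers we suspect might hold the target.
base = 0x00007DF500000000
fault = base + (0x4141414141414141 >> 6)
registers = {"x14": 0x10, "x15": 0x4141414141414141}
for name, value in registers.items():
    print(name, explains_fault("arm64", value, base, fault))
```

A match does not prove anything about why the address was bad, but it does confirm that you are looking at the right register before you start chasing where the value came from.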

The point of this exercise is not to memorize the registers that each architecture uses for control flow guard,³ but rather to take a little information about the design of control flow guard (checking a bit in a bitmap, using the address passed in a register to calculate the index),² and use that to figure out on the fly which register you need to look at based on the code surrounding the crashing access.

¹ Usually, these crashes occur because the address that got passed in is so invalid that there is no memory at the location where the bit in the validation bitmap is supposed to be, resulting in an access violation.

² You don’t even have to know the precise meaning of the bits in the bitmap. All you have to remember is that the address is used to determine the bit to check.

³ I sure don’t have them memorized. Each time it happens, I just re-derive it from the instructions around the crash.
