February 20th, 2023

The case of the mysterious "out of bounds" error from CreateUri and memmove

A customer was trying to understand why their program was crashing with an E_BOUNDS error in what appears to be a call to CreateUri.

combase!RoOriginateErrorW+0x50
wincorlib!Platform::Details::ReCreateFromException+0x40
contoso!`__abi_translateCurrentException'::`1'::catch$0+0x10
contoso!memmove+0x217f4
contoso!Windows::Foundation::IUriRuntimeClassFactory::CreateUri+0x44
contoso!Contoso::DashboardView::DashboardView_obj1_Bindings::Update_ViewModel_Layout_Groups+0x50
contoso!Contoso::DashboardView::DashboardView_obj1_Bindings::Update_ViewModel_Layout+0xe4
contoso!Contoso::DashboardView::DashboardView_obj1_Bindings::PropertyChanged+0x1134
contoso!XamlBindingInfo::XamlBindingTrackingBase::PropertyChanged+0x30

From the stack, it looks like memmove threw an E_BOUNDS C++/CX exception, which doesn’t make sense. Even more mysteriously, the memmove was called from CreateUri, but their DashboardView doesn’t manipulate URIs in any obvious way. It’s just a stack trace of nonsense.

Let’s try to unwind the nonsense.

As for the mysterious memmove, notice that the offset is 0x217f4. It’s unlikely that the memmove function is over 100KB in size. This is probably just some code that has been shunted into a rarely-used code page far, far away from the rest of the code, and the nearest symbol to it happens to be memmove. Let’s see what’s really going on there.

    xor     ecx,ecx
    call    contoso!__abi_translateCurrentException
    int     3       ; memmove+0x217f4

Yup, this is an exception rethrow. Since exceptions are rare, profile-guided optimization puts all the exception-handling nonsense into faraway pages so that they don’t consume valuable space in the hot code pages.
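To see how a name like memmove+0x217f4 can come about, here is a minimal C++ sketch of nearest-symbol lookup. It assumes the debugger knows only where each symbol starts. The SymbolTable type, the Symbolize function, and every address in it are made up for illustration; this is not how any particular debugger is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical symbol table: start address -> symbol name.
using SymbolTable = std::map<std::uint64_t, std::string>;

// Describe an address as "nearest preceding symbol + hex offset".
// Code that has no symbol of its own gets attributed to whatever
// symbol happens to start before it, however far away that is.
std::string Symbolize(const SymbolTable& symbols, std::uint64_t address) {
    auto it = symbols.upper_bound(address);  // first symbol starting after address
    if (it == symbols.begin()) {
        return "<unknown>";                  // nothing starts at or before address
    }
    --it;                                    // nearest symbol at or before address
    std::ostringstream out;
    out << it->second << "+0x" << std::hex << (address - it->first);
    return out.str();
}
```

With a table that places memmove at 0x1000, asking for address 0x1000 + 0x217f4 produces memmove+0x217f4, even though that address is 137,204 bytes (about 134KB) past the start of memmove and has nothing to do with it.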

So why is CreateUri throwing an “out of bounds” exception?

Well, are you sure it’s really CreateUri?

I looked a frame higher on the stack. “Why is data binding calling CreateUri?”

The data binding code is autogenerated by the XAML compiler; it’s not checked into the source tree. Instead of trying to figure out how to build their project (so I can extract the autogenerated file), maybe I can infer what’s going on from the source.

One basic assumption that you make about code in general is that people who write code are doing the best they can, rather than being sadists. This means that function names will generally be descriptive of what they do, variable names will generally be descriptive of what they represent, and so on. So when I see a class called DashboardView_obj1_Bindings, I’m going to assume that this class is for dealing with the bindings of some object in DashboardView, and since it has a method called Update_ViewModel_Layout_Groups, it probably has something to do with updating the binding of something whose name involves the words ViewModel, Layout, and Groups.

I looked at DashboardView.xaml and searched for the word ViewModel in elements that appeared to be involved with binding.

<ContentControl
    Grid.Row="0"
    x:Name="TogglesGroup"
    IsTabStop="False"
    Width="360"
    Content="{x:Bind ViewModel.Layout.Groups[0], Mode=OneWay}"
    ContentTemplateSelector="{StaticResource DashboardGroupTemplateSelector}"/>

Now, this wasn’t the first use of x:Bind in the XAML markup, so that doesn’t line up with obj1, but the other parts do line up (the Layout and Groups), so I chalked this up to “Maybe the XAML compiler generates bindings in some order other than the order they appear in the markup.”

How could this binding raise an “out of bounds” exception? Well, there’s a subscript operation, so maybe the Groups collection is empty.

I looked at the Update_ViewModel_Layout_Groups method to see if that theory lined up.

Update_ViewModel_Layout_Groups:
    test    rdx,rdx
    je      ...
    mov     qword ptr [rsp+8],rbx
    mov     qword ptr [rsp+18h],rbp

    push    rsi
    push    rdi
    push    r14
    sub     rsp,20h
    mov     rbp,rdx
    mov     rsi,rcx
    test    r8d,0C0000001h
    je      ...

    xor     edx,edx
    mov     rcx,rbp
    call    contoso!Windows::Foundation::IUriRuntimeClassFactory::CreateUri

The function starts with a shrink-wrapped early-out if the first parameter is zero. (This is a C++ method, so rcx contains this and rdx contains the first formal parameter.) I don’t know how binding works, but presumably this is just a binding thing.

If the parameter is nonzero, then we build a proper stack frame, test some bits in the second formal parameter (r8d), and if any of them are set, we call, um, CreateUri with nullptr? That makes no sense. The XAML isn’t asking for a URI, and why is this code trying to create a URI from an empty string?

But then you realize that you’ve been faked out by COMDAT folding. The this parameter for the call to CreateUri is supposed to be the IUriRuntimeClassFactory, but that’s not what we’re passing; we’re passing the first formal parameter.

Really, this is a call to IVector::GetAt, and the parameter is zero, indicating that we want the object at index zero. The functions IVector::GetAt and CreateUri were folded because they happen to be byte-for-byte identical. They are both “Call the method at index 6 in the vtable with one parameter.” For IUriRuntimeClassFactory, that method is CreateUri and the parameter is a string. For IVector that method is GetAt and the parameter is an index.
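Here is a rough sketch of the effect in portable C++. It models an object as a pointer to a table of function pointers. Object, Slot, CallSlot6, ToyVector, and the two wrapper names are hypothetical stand-ins, not the real WinRT declarations, and in the real program the folding is done by the linker rather than written by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical model of a COM-style object: its first field points to a
// table of function pointers (the vtable).
struct Object;
using Slot = std::intptr_t (*)(Object* self, std::intptr_t arg);
struct Object {
    const Slot* vtable;
};

// The entire body of both wrappers: call vtable slot 6 with one argument.
inline std::intptr_t CallSlot6(Object* self, std::intptr_t arg) {
    return self->vtable[6](self, arg);
}

// Two source-level names for what ends up as the same machine code.
// After folding, only one copy survives, and a stack trace shows whichever
// name the tools picked for it.
std::intptr_t Factory_CreateUri(Object* factory, std::intptr_t uriString) {
    return CallSlot6(factory, uriString);
}
std::intptr_t Vector_GetAt(Object* vector, std::intptr_t index) {
    return CallSlot6(vector, index);
}

// A toy vector object whose slot 6 is its GetAt method.
struct ToyVector : Object {
    std::vector<std::intptr_t> items;
};

std::intptr_t ToyVector_GetAt(Object* self, std::intptr_t index) {
    auto* v = static_cast<ToyVector*>(self);
    if (index < 0 || static_cast<std::size_t>(index) >= v->items.size()) {
        throw std::out_of_range("simulated E_BOUNDS");
    }
    return v->items[static_cast<std::size_t>(index)];
}

std::intptr_t Unused(Object*, std::intptr_t) { return 0; }

// Slots 0 through 5 stand in for the IUnknown and IInspectable methods.
const Slot kToyVectorVtable[7] = {
    Unused, Unused, Unused, Unused, Unused, Unused, ToyVector_GetAt,
};

ToyVector MakeToyVector(std::vector<std::intptr_t> items) {
    ToyVector v;
    v.vtable = kToyVectorVtable;
    v.items = std::move(items);
    return v;
}
```

In this sketch, calling Factory_CreateUri on an empty ToyVector with an argument of zero throws the simulated out-of-bounds error. What actually runs is decided by the object that was passed in, not by the name attached to the wrapper.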

With this explanation, the customer realized that they did have an outstanding bug that said, “If our settings file is corrupted, we end up with no groups,” and this bug is likely an alternate manifestation of that bug.

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

9 comments

  • 紅樓鍮

    Can you also talk about the GetKnownFolder bug that was present in Windows builds 25290 and 25295?

  • Greedy Greedo · Edited

    Another great post it's crazy the optimisation of memory layout and one thing masquerading as another. It reminds me of another question I've had I don't think you've covered in any blog, I'm not sure how best to ask you. I'm curious about DECIMAL in tagVARIANT struct. I noticed that the declaration for decimal overlaps the VarType flag completely. This means to me that the `.vt` member of a variant containing a decimal is undefined....

    • Me Gusta · Edited

      I did a bit of digging around in the Visual C++ headers, and those two bytes at the start of DECIMAL don't coincidentally line up with the vt member of VARIANT.
      While there isn't any interesting internal Microsoft history in there, you can infer a lot from just seeing some changes.

      Anyway, the first time you see VARIANT in the Visual C++ headers is Visual C++ 1.5. This doesn't mean that this is the first time...

      • Kristof Roomp (Microsoft employee)

        I guess they wanted DECIMAL to line up to 16 bytes if you had an array of DECIMALs, but on the other hand it had to fit into a VARIANT. Make the first 2 bytes of DECIMAL reserved and then they can overlap without a problem.

      • skSdnW

        Almost without a problem. A C++ language-lawyer might claim that you cannot access both the .vt member and the decimal, only one of the union members is valid. If you are manually writing the decimal, you have to set the reserved member, not vt when you fill it in.

      • Me Gusta · Edited

        This area is always a pain.
        However, VARIANT, and therefore DECIMAL and the variant sub-object itself have a very strong property, they are all POD types. This basically means that for C++, they are seen as standard layout and trivial. This is a very strong property to have in this situation.
        You see, if two standard layout non-union class types have common non-static members at the start of the class type then it is valid...

      • Greedy Greedo

        I guess what confuses me is why decimal is this:

        <code>

        Not this:
        <code>

        Or part of Variant like this

        <code>

        With DECIMAL being 14 bytes not 16 and not having the reserved stuff at the start but explicitly getting the vt there rather than just by convention

      • Me Gusta · Edited

        The change in variant is easy, structure packing. The vt member is 2 bytes and the union that contains the reserved stuff and the DECIMAL would be 14 bytes. The default packing for 32 bit Windows is 4 bytes, this means that VARIANT would end up being 20 bytes with padding between vt and the union and padding after the union. It is potentially possible to deal with this with structure packing, but that goes...
