A customer was trying to understand why their program was crashing with an E_BOUNDS error in what appears to be a call to CreateUri.
combase!RoOriginateErrorW+0x50
wincorlib!Platform::Details::ReCreateFromException+0x40
contoso!`__abi_translateCurrentException'::`1'::catch$0+0x10
contoso!memmove+0x217f4
contoso!Windows::Foundation::IUriRuntimeClassFactory::CreateUri+0x44
contoso!Contoso::DashboardView::DashboardView_obj1_Bindings::Update_ViewModel_Layout_Groups+0x50
contoso!Contoso::DashboardView::DashboardView_obj1_Bindings::Update_ViewModel_Layout+0xe4
contoso!Contoso::DashboardView::DashboardView_obj1_Bindings::PropertyChanged+0x1134
contoso!XamlBindingInfo::XamlBindingTrackingBase::PropertyChanged+0x30
From the stack, it looks like memmove threw an E_BOUNDS C++/CX exception, which doesn’t make sense. Even more mysteriously, the memmove was called from CreateUri, but their DashboardView doesn’t manipulate URIs in any obvious way. It’s just a stack trace of nonsense.
Let’s try to unwind the nonsense.
As for the mysterious memmove, notice that the offset is 0x217f4. It’s unlikely that the memmove function is over 100KB in size. Let’s see what’s really going on there. This is just some code that has probably been shunted into a rarely-used code page far, far away from the rest of the code, and the nearest symbol to it happens to be memmove.
    xor     ecx,ecx
    call    contoso!__abi_translateCurrentException
    int     3                   ; memmove+0x217f4
Yup, this is an exception rethrow. Since exceptions are rare, profile-guided optimization puts all the exception-handling nonsense into faraway pages so that they don’t consume valuable space in the hot code pages.
So why is CreateUri throwing an “out of bounds” exception?
Well, are you sure it’s really CreateUri?
I looked a frame higher on the stack. “Why is data binding calling CreateUri?”
The data binding code is autogenerated by the XAML compiler; it’s not checked into the source tree. Instead of trying to figure out how to build their project (so I can extract the autogenerated file), maybe I can infer what’s going on from the source.
One basic assumption that you make about code in general is that people who write code are doing the best they can, rather than being sadists. This means that function names will generally be descriptive of what they do, variable names will generally be descriptive of what they represent, and so on. So when I see a class called DashboardView_obj1_Bindings, I’m going to assume that this class is for dealing with the bindings of some object in DashboardView, and since it has a method called Update_ViewModel_Layout_Groups, it probably has something to do with updating the binding of something whose name involves the words ViewModel, Layout, and Groups.
I looked at DashboardView.xaml and searched for the word ViewModel in elements that appeared to be involved with binding.
<ContentControl Grid.Row="0"
    x:Name="TogglesGroup"
    IsTabStop="False"
    Width="360"
    Content="{x:Bind ViewModel.Layout.Groups[0], Mode=OneWay}"
    ContentTemplateSelector="{StaticResource DashboardGroupTemplateSelector}"/>
Now, this wasn’t the first use of x:Bind in the XAML markup, so that doesn’t line up with obj1, but the other parts do line up (the Layout and Groups), so I chalked this up to “Maybe the XAML compiler generates bindings in some order other than the order they appear in the markup.”
How could this binding raise an “out of bounds” exception? Well, there’s a subscript operation, so maybe the Groups collection is empty.
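To make the suspected failure concrete, here is a minimal sketch in standard C++ rather than C++/CX. The names first_group, try_first_group, and indexing_throws are hypothetical: std::vector::at stands in for the bounds-checked IVector::GetAt, and std::out_of_range stands in for E_BOUNDS. It is an analogy for the theory, not the customer's code.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the generated binding code that evaluates Groups[0].
// std::vector::at is bounds-checked, much as IVector::GetAt is.
std::string first_group(const std::vector<std::string>& groups) {
    return groups.at(0);  // throws std::out_of_range when groups is empty
}

// A defensive variant: returns nullptr for "no groups" instead of throwing.
const std::string* try_first_group(const std::vector<std::string>& groups) {
    return groups.empty() ? nullptr : &groups.front();
}

// Reports whether evaluating Groups[0] on the given collection throws.
bool indexing_throws(const std::vector<std::string>& groups) {
    try {
        (void)first_group(groups);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With an empty collection, indexing_throws reports true, which is the shape of the crash being investigated; the defensive variant shows one way the caller could detect the empty case first.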
I looked at the Update_ViewModel_Layout_Groups method to see if that theory lined up.
Update_ViewModel_Layout_Groups:
    test    rdx,rdx
    je      ...
    mov     qword ptr [rsp+8],rbx
    mov     qword ptr [rsp+18h],rbp
    push    rsi
    push    rdi
    push    r14
    sub     rsp,20h
    mov     rbp,rdx
    mov     rsi,rcx
    test    r8d,0C0000001h
    je      ...
    xor     edx,edx
    mov     rcx,rbp
    call    contoso!Windows::Foundation::IUriRuntimeClassFactory::CreateUri
The function starts with a shrink-wrapped early-out if the first parameter is zero. (This is a C++ method, so rcx contains this and rdx contains the first formal parameter.) I don’t know how binding works, but presumably this is just a binding thing.
If the parameter is nonzero, then we build a proper stack frame, test some bits in the third parameter, and if they’re set, we call, um, CreateUri with nullptr? That makes no sense. The XAML isn’t asking for a URI, and why is this code trying to create a URI from an empty string?
But then you realize that you’ve been faked out by COMDAT folding. The this parameter for the call to CreateUri is supposed to be the IUriRuntimeClassFactory, but that’s not what we’re passing; we’re passing the first formal parameter.
Really, this is a call to IVector::GetAt, and the parameter is zero, indicating that we want the object at index zero. The functions IVector::GetAt and CreateUri were folded because they happen to be byte-for-byte identical. They are both “Call the method at index 6 in the vtable with one parameter.” For IUriRuntimeClassFactory, that method is CreateUri and the parameter is a string. For IVector that method is GetAt and the parameter is an index.
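Here is a simplified sketch of why the two wrappers can come out identical. It models the vtable as a plain array of function pointers; Slot, Object, Vector_GetAt, UriFactory_CreateUri, and the toy vtable are all hypothetical names, and the real C++/CX wrappers also deal with an output parameter and HRESULT translation that this model leaves out.

```cpp
#include <cassert>
#include <cstdint>

// One pointer-sized argument, one pointer-sized result: a simplified model of
// "the method at index 6 in the vtable with one parameter".
using Slot = std::uintptr_t (*)(void* self, std::uintptr_t arg);

struct Object {
    const Slot* vtable;  // first field, as in a COM-style object
};

// Hypothetical wrapper named after IVector::GetAt.
std::uintptr_t Vector_GetAt(Object* self, std::uintptr_t index) {
    return self->vtable[6](self, index);
}

// Hypothetical wrapper named after IUriRuntimeClassFactory::CreateUri.
// The body is the same as Vector_GetAt, so a linker that folds identical
// functions may keep a single copy, and a debugger may show either name.
std::uintptr_t UriFactory_CreateUri(Object* self, std::uintptr_t uriString) {
    return self->vtable[6](self, uriString);
}

// Toy "vector": slots 0-5 are placeholders, slot 6 returns 100 + index.
std::uintptr_t Unused(void*, std::uintptr_t) { return 0; }
std::uintptr_t ToyGetAt(void*, std::uintptr_t index) { return 100 + index; }
const Slot kToyVectorVtable[7] = {Unused, Unused, Unused, Unused,
                                  Unused, Unused, ToyGetAt};

// Calls the toy vector through the wrapper with the "right" name.
std::uintptr_t call_via_get_at(std::uintptr_t index) {
    Object v{kToyVectorVtable};
    return Vector_GetAt(&v, index);
}

// Calls the same toy vector through the wrapper with the "wrong" name.
std::uintptr_t call_via_create_uri(std::uintptr_t index) {
    Object v{kToyVectorVtable};
    return UriFactory_CreateUri(&v, index);
}
```

Both helpers reach ToyGetAt, because which method runs is decided by the object's vtable and not by the name of the wrapper. That is why a frame labeled CreateUri can really be a GetAt call.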
With this explanation, the customer realized that they did have an outstanding bug that said, “If our settings file is corrupted, we end up with no groups,” and this bug is likely an alternate manifestation of that bug.
Can you also talk about the GetKnownFolder bug that was present in Windows builds 25290 and 25295?
Another great post. It’s crazy, the optimisation of memory layout and one thing masquerading as another. It reminds me of another question I’ve had that I don’t think you’ve covered in any blog, and I’m not sure how best to ask you. I’m curious about DECIMAL in the tagVARIANT struct. I noticed that the declaration for decimal overlaps the VarType flag completely. This means to me that the `.vt` member of a variant containing a decimal is undefined. However, it also “coincidentally” always equals VT_DECIMAL in practice. So why isn’t DECIMAL defined in the same union as everything else, just as 14 bytes with a 2-byte VT rather than 16 bytes? I asked this question on Stack Overflow, and someone gave an interesting theory, but it’s basically speculation. I thought you might be able to shed some light on the history of it?
How can someone know if the vt is a VARTYPE or just the upper 2 bytes of a decimal?
I did a bit of digging around in the Visual C++ headers, and those two bytes at the start of DECIMAL don’t coincidentally line up with the vt member of VARIANT.
While there isn’t any interesting internal Microsoft history in there, you can infer a lot from just seeing some changes.
Anyway, the first time you see VARIANT in the Visual C++ headers is Visual C++ 1.5. This doesn’t mean that this is the first time VARIANT appears, though. But the big thing that came along with Visual C++ 1.5 was OLE2 support. The definition of VARIANT in this was:
Visual C++ 1.5 was released in December 1993 IIRC, and set VARIANT’s size at 16 bytes.
DECIMAL first appeared in the Visual C++ headers in Visual C++ 5. I think this was released in May 1997, so VARIANT being 16 bytes had been around for a while and there had been plenty of time for people to take dependencies on the size of VARIANT. So what do you do if you want to add a new fixed point type to VARIANT? Naturally, you design it to work with VARIANT. VARIANT couldn’t be extended because of compatibility constraints, the structure doesn’t have a size member so there couldn’t be a versioned structure. The fourth member of VARIANT couldn’t change in size for the same reason, it had to stay 8 bytes. So beyond making a version 2 of the structure, the best option was to design DECIMAL to fit into the 16 bytes of the VARIANT structure and just type pun it. This allowed existing code to keep working, and it could just ignore DECIMAL if it didn’t understand it. Also, [MS-OAUT] 2.2.26 essentially tells users to initialise the reserved member to 0 and then never touch it again. So the specification of DECIMAL shows that it had the VARIANT memory layout in mind.
At this point, I would be very surprised if the design of DECIMAL was any kind of coincidence.
I guess they wanted DECIMAL to line up to 16 bytes if you had an array of DECIMALs, but on the other hand it had to fit into a VARIANT. Make the first 2 bytes of DECIMAL reserved and then they can overlap without a problem.
Almost without a problem. A C++ language-lawyer might claim that you cannot access both the .vt member and the decimal, only one of the union members is valid. If you are manually writing the decimal, you have to set the reserved member, not vt when you fill it in.
This area is always a pain.
However, VARIANT, and therefore DECIMAL and the variant sub-object itself have a very strong property, they are all POD types. This basically means that for C++, they are seen as standard layout and trivial. This is a very strong property to have in this situation.
You see, if two standard-layout non-union class types in a union share a common initial sequence of non-static data members, then it is valid to read those members through a non-active member of the union.
So in this case, DECIMAL and the variant sub-object of VARIANT have a typedef of unsigned short as the initial member, they both use the default compiler alignment and since they are both not bitfields then vt and wReserved are part of the common initial sequence. This means that if DECIMAL is the active member then it is valid to read the variant sub-object’s vt member. If the variant sub-object is the active member then it is valid to read the DECIMAL’s wReserved member.
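A minimal sketch of that rule, using hypothetical simplified shapes (TaggedValue, DecimalLike, and VariantLike are illustrations, not the real VARIANT or DECIMAL definitions from the Windows headers). The only value taken from the real headers is 14, the value of VT_DECIMAL.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical model of the tagged sub-object of VARIANT.
struct TaggedValue {
    unsigned short vt;
    unsigned short reserved1, reserved2, reserved3;
    long long value;
};

// Hypothetical model of DECIMAL: the first two bytes are reserved.
struct DecimalLike {
    unsigned short wReserved;
    unsigned char scale;
    unsigned char sign;
    unsigned int hi32;
    unsigned long long lo64;
};

union VariantLike {
    TaggedValue tagged;
    DecimalLike dec;
};

// Both structs are standard-layout and start with an unsigned short,
// so that first member forms their common initial sequence.
static_assert(std::is_standard_layout<TaggedValue>::value, "standard layout");
static_assert(std::is_standard_layout<DecimalLike>::value, "standard layout");
static_assert(offsetof(TaggedValue, vt) == 0, "tag at offset 0");
static_assert(offsetof(DecimalLike, wReserved) == 0, "reserved at offset 0");

// Writes the tag through the decimal's reserved member, then reads it back
// through the other union member's vt.
unsigned short read_tag_while_decimal_active(unsigned short tag) {
    VariantLike v;
    v.dec = DecimalLike{tag, 0, 0, 0, 0};  // dec is now the active member
    return v.tagged.vt;  // allowed: vt is in the common initial sequence
}
```

Note that the sketch writes through wReserved and only reads through vt, which matches the distinction drawn above between reading and writing.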
Of course, you are entirely correct in stating that when writing the DECIMAL you need to write to the DECIMAL’s wReserved member, however “access” implies both read and write, but reading is allowed.
I guess what confuses me is why decimal is this:
Not this:
Or part of Variant like this
With DECIMAL being 14 bytes not 16 and not having the reserved stuff at the start but explicitly getting the vt there rather than just by convention
The change in variant is easy, structure packing. The vt member is 2 bytes and the union that contains the reserved stuff and the DECIMAL would be 14 bytes. The default packing for 32 bit Windows is 4 bytes, this means that VARIANT would end up being 20 bytes with padding between vt and the union and padding after the union. It is potentially possible to deal with this with structure packing, but that goes further down the path of changing the definition of VARIANT in the wrong way.
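The padding arithmetic can be checked with a small sketch. Payload14 and TagThenUnion are hypothetical types, not Windows definitions, and the sketch assumes that something in the union requires 4-byte alignment, which is what the packing argument above relies on. The sizes asserted here hold on mainstream ABIs where a 32-bit integer is 4-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical union: 14 bytes of payload plus a member that needs
// 4-byte alignment (an assumption made for this illustration).
union Payload14 {
    unsigned char raw[14];
    std::uint32_t aligned4;
};

// Hypothetical "tag first, then a 14-byte union" layout.
struct TagThenUnion {
    std::uint16_t vt;
    Payload14 u;
};

static_assert(alignof(std::uint32_t) == 4, "sketch assumes 4-byte alignment");

// The union is rounded up from 14 to 16 bytes to keep arrays aligned.
static_assert(sizeof(Payload14) == 16, "2 bytes of tail padding");

// The union cannot start at offset 2, so 2 bytes of padding follow vt.
static_assert(offsetof(TagThenUnion, u) == 4, "padding after vt");

// 2 (vt) + 2 (padding) + 14 (payload) + 2 (tail padding) = 20.
static_assert(sizeof(TagThenUnion) == 20, "no longer 16 bytes");

// Returns how many bytes this layout adds beyond a 16-byte VARIANT.
std::size_t bytes_over_sixteen() {
    return sizeof(TagThenUnion) - 16;
}
```

Under these assumptions the structure grows by four bytes, which would break any code that depends on VARIANT being 16 bytes.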
For putting VARTYPE into DECIMAL, well, why would it have a member named vt? That would imply that it is relevant, but as stated, the specification of DECIMAL states that it isn’t. In fact, they actively tell you not to by using the reserved name, they are putting a big DO NOT USE label on the first two bytes. The use of reserved has been used quite extensively in the Windows headers as a way of saying “hands off”, and people did have a funny habit of trying to stuff bits into places where they shouldn’t go in the past.
The truth of the matter is that as long as the type is the same, the name of the member doesn’t matter with how VC is implemented. VARTYPE and USHORT are both typedefs of unsigned short, so what does it matter if they call it vt or reserved or meow or even insert_witty_name_here? It was more important to make people not use the member.