Managed object internals, Part 4. Fields layout

Sergey Tepliakov

In recent blog posts we’ve discussed the invisible part of the object layout in the CLR.

This time we’re going to focus on the layout of an instance itself, specifically, how instance fields are laid out in memory.

[Figure 1: fields layout]

There is no official documentation about the fields layout because the CLR authors reserved the right to change it in the future. But knowledge of the layout can be helpful if you’re curious or if you’re working on a performance-critical application.

How can we inspect the layout? We can look at raw memory in Visual Studio or use the !dumpobj command in the SOS Debugging Extension. These approaches are tedious and boring, so we’ll write a tool that prints an object layout at runtime.

If you’re not interested in the implementation details of the tool, feel free to jump to the section ‘Inspecting a value type layout at runtime’.

Getting the field offset at runtime

We’re not going to use unmanaged code or the Profiling API. Instead, we’ll use the power of the ldflda instruction. This IL instruction pushes the address of a field of a given object instance onto the evaluation stack. Unfortunately, the instruction is not exposed in the C# language, so we have to do some lightweight code generation to work around that limitation.

In “Dissecting the new() constraint in C#” we already did something similar. We’ll generate a DynamicMethod with the necessary IL instructions.

The method should do the following:

  • Create an array to hold the address of every field.
  • For each FieldInfo of the object, get the field’s address with the ldflda instruction.
  • Convert the result of the ldflda instruction to long and store it in the array.
  • Return the array.
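The steps above can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code behind the tool: the names FieldAddressInspector, Generate and TwoInts are hypothetical, and the unbox/castclass step before ldflda is an assumption added here so that the same delegate works for boxed structs and for classes.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class FieldAddressInspector
{
    // Builds a delegate that returns the raw address of every given instance field
    // of 'type'. The addresses are only meaningful relative to each other.
    public static Func<object, long[]> Generate(Type type, FieldInfo[] fields)
    {
        var method = new DynamicMethod(
            "GetFieldAddresses",
            typeof(long[]),
            new[] { typeof(object) },
            typeof(FieldAddressInspector).Module,
            true); // skip visibility checks so private fields can be inspected

        ILGenerator il = method.GetILGenerator();

        // long[] result = new long[fields.Length];
        il.DeclareLocal(typeof(long[]));
        il.Emit(OpCodes.Ldc_I4, fields.Length);
        il.Emit(OpCodes.Newarr, typeof(long));
        il.Emit(OpCodes.Stloc_0);

        for (int i = 0; i < fields.Length; i++)
        {
            il.Emit(OpCodes.Ldloc_0);    // the result array
            il.Emit(OpCodes.Ldc_I4, i);  // the index to store into
            il.Emit(OpCodes.Ldarg_0);    // the inspected instance
            // Assumption of this sketch: unbox structs, cast classes.
            il.Emit(type.IsValueType ? OpCodes.Unbox : OpCodes.Castclass, type);
            il.Emit(OpCodes.Ldflda, fields[i]); // address of the i-th field
            il.Emit(OpCodes.Conv_I8);           // managed pointer -> long
            il.Emit(OpCodes.Stelem_I8);         // result[i] = address
        }

        // return result;
        il.Emit(OpCodes.Ldloc_0);
        il.Emit(OpCodes.Ret);

        return (Func<object, long[]>)method.CreateDelegate(typeof(Func<object, long[]>));
    }
}

// Hypothetical sample type used only to exercise the sketch.
public sealed class TwoInts
{
    public int First;
    public int Second;
}
```

Subtracting the smallest address from the others turns the raw addresses into relative offsets. The object could in principle be moved by the GC between two reads, so a production tool may want to be more careful than this sketch.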

Now we can create a helper function that will provide the offsets for each field for a given type:

The function is pretty straightforward with one caveat: the ldflda instruction expects an object instance on the evaluation stack. For value types and for reference types with a default constructor, the solution is trivial: use Activator.CreateInstance(Type). But what if we want to inspect classes that don’t have a default constructor?

In this case we can use a lesser-known “generic factory” called FormatterServices.GetUninitializedObject(Type):
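A minimal sketch of such an instance factory might look like this. The names UninitializedFactory and NoDefaultCtor are hypothetical. FormatterServices is marked obsolete on recent .NET versions, hence the pragma; RuntimeHelpers.GetUninitializedObject is the newer equivalent there.

```csharp
using System;
using System.Runtime.Serialization;

public static class UninitializedFactory
{
    // Returns an instance suitable for layout inspection without requiring
    // the type to expose a parameterless constructor.
    public static object CreateInstance(Type type)
    {
        if (type.IsValueType)
        {
            // Every value type can be default-initialized; the result is boxed.
            return Activator.CreateInstance(type);
        }

#pragma warning disable SYSLIB0050 // FormatterServices is obsolete on .NET 8+
        // Allocates a zeroed object and runs no constructor at all.
        return FormatterServices.GetUninitializedObject(type);
#pragma warning restore SYSLIB0050
    }
}

// Hypothetical type without a default constructor.
public sealed class NoDefaultCtor
{
    public int Value;

    public NoDefaultCtor(int value)
    {
        Value = value;
    }
}
```

Because no constructor runs, the returned object has all fields zeroed. That is fine for reading field addresses, but such an instance should not be used for anything else.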

Let’s test GetFieldOffsets to get the layout for the following type:

The output is:

Interesting, but not sufficient. We can inspect offsets for each field, but it would be very helpful to know the size of each field to understand how efficient the layout is and how much empty space each instance has.

Computing the size for a type instance

And again, there is no “official” way to get the size of an object instance. The sizeof operator works only for primitive types and user-defined structs with no fields of reference types. Marshal.SizeOf returns the size of an object in unmanaged memory and is also not suitable for our needs.

We’ll compute the instance size for value types and objects separately. To compute the size of a struct we’ll rely on the CLR itself. We will create a simple generic type with two fields: the first field of the desired type and the second field that will be used to get the size of the first one.
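Here is one way to sketch the two-field trick. It makes two assumptions that are not from the text above: both fields have the same type T (so the result does not depend on the alignment of the second field or on field reordering), and the distance between the fields is measured with Unsafe.ByteOffset instead of the ldflda-based helper. The names StructSize and Pair<T> are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class StructSize
{
    // Two fields of the same type: the distance between them
    // is the size of T, including any trailing padding.
    private struct Pair<T>
    {
        public T First;
        public T Second;
    }

    public static int Of<T>() where T : struct
    {
        var pair = default(Pair<T>);

        // Byte distance from the first field to the second one.
        long distance = (long)Unsafe.ByteOffset(ref pair.First, ref pair.Second);

        // Math.Abs guards against the runtime reordering the two fields.
        return (int)Math.Abs(distance);
    }
}
```

For example, StructSize.Of<long>() yields 8 and StructSize.Of<Guid>() yields 16.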

To get the size of a reference type instance we will use another trick: we’ll get the max field offset, then add the size of that field and round the result up to a pointer-size boundary. We already know how to compute the size of a value type, and we know that every field whose type is a reference type (that is, every reference) occupies 4 or 8 bytes depending on the platform. So we’ve got everything we need:

We have enough information to get proper layout information for any type instance at runtime.
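The arithmetic can be sketched on its own, independent of how the offsets were obtained. InstanceSizeEstimator is a hypothetical name, the field list is passed in as (offset, size) pairs, and returning one pointer size for a type without fields is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class InstanceSizeEstimator
{
    // Size of the field area: end of the last field, rounded up to the pointer size.
    // 'pointerSize' must be a power of two (4 or 8 in practice).
    public static int FieldsSize(IReadOnlyList<(int Offset, int Size)> fields, int pointerSize)
    {
        if (fields.Count == 0)
        {
            // Assumption: treat a type without fields as one pointer-sized slot.
            return pointerSize;
        }

        // Find the field with the largest offset.
        (int Offset, int Size) last = fields[0];
        foreach (var field in fields)
        {
            if (field.Offset > last.Offset)
            {
                last = field;
            }
        }

        int sizeCandidate = last.Offset + last.Size;

        // Round up to the nearest pointer-size boundary.
        int mask = pointerSize - 1;
        return (sizeCandidate + mask) & ~mask;
    }
}
```

For instance, an int at offset 0 followed by a byte at offset 4 ends at byte 5, which rounds up to 8 on a 64-bit platform.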

Inspecting a value type layout at runtime

Let’s start with value types and inspect the following struct:

Here is the result of the TypeLayout.Print<NotAlignedStruct>() method call:

By default, a user-defined struct has the ‘sequential’ layout with Pack equal to 0. Here is the rule that the CLR follows:

Each field must align with fields of its own size (1, 2, 4, 8, etc., bytes) or the alignment of the type, whichever is smaller. Because the default alignment of the type is the size of its largest element, which is greater than or equal to all other field lengths, this usually means that fields are aligned by their size. For example, even if the largest field in a type is a 64-bit (8-byte) integer or the Pack field is set to 8, Byte fields align on 1-byte boundaries, Int16 fields align on 2-byte boundaries, and Int32 fields align on 4-byte boundaries.
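To make the rule concrete, here is a small sketch that applies it to a list of primitive field sizes. It is a simplified model, not the CLR’s actual algorithm: it assumes every field is a primitive whose size is a power of two, and the name SequentialLayoutSketch and the default pack value of 8 are assumptions of the example.

```csharp
using System;

public static class SequentialLayoutSketch
{
    // fieldSizes: sizes in bytes of primitive fields, in declaration order.
    // Returns the offset of every field and the total size of the struct.
    public static (int[] Offsets, int Size) LayOut(int[] fieldSizes, int pack = 8)
    {
        var offsets = new int[fieldSizes.Length];
        int offset = 0;
        int typeAlignment = 1;

        for (int i = 0; i < fieldSizes.Length; i++)
        {
            // A field aligns on its own size or on the pack size, whichever is smaller.
            int alignment = Math.Min(fieldSizes[i], pack);
            typeAlignment = Math.Max(typeAlignment, alignment);

            offset = RoundUp(offset, alignment);
            offsets[i] = offset;
            offset += fieldSizes[i];
        }

        // The total size is padded to the alignment of the whole type.
        return (offsets, RoundUp(offset, typeAlignment));
    }

    private static int RoundUp(int value, int alignment) =>
        (value + alignment - 1) / alignment * alignment;
}
```

For a hypothetical struct with fields byte, int, byte, short (sizes 1, 4, 1, 2), this model gives offsets 0, 4, 8, 10 and a total size of 12 bytes, of which only 8 hold data. With pack set to 1 the offsets become 0, 1, 5, 6 and the size drops to 8.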

In this case, the alignment is equal to 4, which causes a noticeable amount of padding overhead. We can change Pack to 1, but then we may see performance degradation due to unaligned memory operations. Instead, we can use LayoutKind.Auto to let the CLR figure out the best layout:

Keep in mind that the sequential layout for both value types and reference types is only possible if a type doesn’t have “pointers” in it. If a struct or a class has at least one field of a reference type, the layout is automatically changed to LayoutKind.Auto.
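As an illustration, the two structs below differ only in their StructLayout attribute. The type names are hypothetical, and the sizes are measured with Unsafe.SizeOf<T>() rather than with the tool described in this post. The exact size of the auto-layout struct is up to the runtime, so the sketch does not assume a particular value for it.

```csharp
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Fields stay in declaration order, with padding between them.
[StructLayout(LayoutKind.Sequential)]
public struct SequentialSample
{
    public byte Byte1;
    public int Int;
    public byte Byte2;
    public short Short;
}

// The runtime is free to reorder fields to reduce padding.
[StructLayout(LayoutKind.Auto)]
public struct AutoSample
{
    public byte Byte1;
    public int Int;
    public byte Byte2;
    public short Short;
}

public static class LayoutComparison
{
    // Returns the managed size of both structs so they can be compared.
    public static (int Sequential, int Auto) Sizes() =>
        (Unsafe.SizeOf<SequentialSample>(), Unsafe.SizeOf<AutoSample>());
}
```

The sequential version occupies 12 bytes; the auto-layout version should not be larger. Note that a struct with LayoutKind.Auto cannot be passed to unmanaged code.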

Inspecting a reference type layout at runtime

There are two main differences between the layout of a reference type and a value type. First, each “object” instance has a header and a method table pointer. Second, the default layout for “objects” is automatic, not sequential. And similar to value types, the sequential layout is possible only for classes that don’t have any fields of reference types.

The method TypeLayout.PrintLayout<T>(bool recursively = true) takes an argument that lets you print the layout of nested types as well.

The cost of wrapping a struct

Even though the type layout is pretty straightforward, I’ve found one interesting aspect.

I was investigating a memory issue in my project recently and I noticed something strange: the size of a managed object instance was larger than the sum of the sizes of all its fields. I roughly knew the rules for how the CLR lays out fields, so I was puzzled. I started working on this tool to understand that issue.

I’ve narrowed down the issue to the following case:

Even though the size of ByteWrapper is 1 byte, the CLR aligns each such field on a pointer-size boundary! If the type layout is LayoutKind.Auto, the CLR will pad each field of a custom value type. This means that if you have multiple structs that wrap just a single int or byte and they’re widely used in millions of objects, you could have a noticeable memory overhead due to padding!
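A quick way to check this on your own runtime is to measure the distance between two adjacent wrapper fields. This is only a sketch: ThreeWrappers and WrapperPadding are hypothetical names, the field name inside ByteWrapper is an assumption, and the distance is measured with Unsafe.ByteOffset. The result depends on the runtime version and platform, so no particular value is assumed here.

```csharp
using System;
using System.Runtime.CompilerServices;

// A struct that wraps a single byte.
public struct ByteWrapper
{
    public byte Value;
}

// Hypothetical class with several wrapper fields and the default (auto) layout.
public sealed class ThreeWrappers
{
    public ByteWrapper First;
    public ByteWrapper Second;
    public ByteWrapper Third;
}

public static class WrapperPadding
{
    // Number of bytes between the first two wrapper fields of an instance.
    public static long DistanceBetweenFirstTwoFields()
    {
        var instance = new ThreeWrappers();
        long distance = (long)Unsafe.ByteOffset(ref instance.First, ref instance.Second);
        return Math.Abs(distance);
    }
}
```

If the method returns IntPtr.Size instead of 1, each one-byte wrapper is taking a full pointer-sized slot in the containing object.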
