Introducing the Half type!

Prashanth Govindarajan

The IEEE 754 specification defines several floating-point formats, including binary16, binary32, binary64, and binary128. Most developers are familiar with binary32 (equivalent to float in C#) and binary64 (equivalent to double in C#). They provide a standard format to represent a wide range of values with a precision acceptable for many applications. .NET has always had float and double, and with .NET 5 Preview 7, we’ve added a new Half type (equivalent to binary16)!

A Half is a binary floating-point number that occupies 16 bits. With half as many bits as a float, a Half can represent values in the range ±65504. More formally, the Half type is defined as a base-2 16-bit interchange format meant to support the exchange of floating-point data between implementations. One of the primary use cases of the Half type is to save on storage space where the computed result does not need to be stored with full precision. Many computation workloads already take advantage of half precision: machine learning, graphics cards, the latest processors, native SIMD libraries, etc. With the new Half type, we expect to unlock many applications in these workloads.

Let’s explore the Half type:

The 16 bits in the Half type are split into:

  1. Sign bit: 1 bit
  2. Exponent bits: 5 bits
  3. Significand bits: 10 bits (with 1 implicit bit that is not stored)

Although the significand is made up of 10 bits, the total precision is really 11 bits. The format is assumed to have an implicit leading bit of value 1 (unless the exponent field is all zeros, in which case the leading bit has a value of 0). To represent the number 1 in the Half format, we’d use the bits:

0 01111 0000000000 = 1

The leading bit (our sign bit) is 0, indicating a positive number. The exponent bits are 01111, or 15 in decimal. However, the exponent bits don’t represent the exponent directly. Instead, an exponent bias is defined that lets the format represent both positive and negative exponents. For the Half type, that exponent bias is 15, and the true exponent is derived by subtracting 15 from the stored exponent. Therefore, 01111 represents the exponent e = 01111 (in binary) - 15 (the exponent bias) = 0. The significand is 0000000000, which is interpreted as the binary fraction 0.significand — 0 in our case. If, for example, the significand were 0000011010 (26 in decimal), we would divide its decimal value 26 by the number of values representable in 10 bits (1 << 10): the significand 0000011010 (in binary) is 26 / (1 << 10) = 26 / 1024 = 0.025390625 in decimal. Finally, because our stored exponent bits (01111) are not all 0, we have an implicit leading bit of 1. Therefore,

0 01111 0000000000 = 2^0 * (1 + 0/1024) = 1

In general, the 16 bits of a Half value are interpreted as (-1)^(sign bit) * 2^(storedExponent - 15) * (implicitBit + (significand/1024)). A special case exists for the stored exponent 00000. In this case, the bits are interpreted as (-1)^(sign bit) * 2^(-14) * (0 + (significand/1024)). Let’s look at the bit representations of some other numbers in the Half format:

Smallest positive non-zero value

0 00000 0000000001 = (-1)^0 * 2^(-14) * (0 + 1/1024) ≈ 0.000000059604645

(Note that the implicit bit is 0 here because the stored exponent bits are all 0.)

Largest normal number

0 11110 1111111111 = (-1)^0 * 2^(15) * (1 + 1023/1024) ≈ 65504

Negative Infinity

1 11111 0000000000 = -Infinity

A peculiarity of the format is that it defines both positive and negative zero:

1 00000 0000000000 = -0
0 00000 0000000000 = +0
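The interpretation rules above can be sketched as a small decoder. This is an illustrative helper — `DecodeHalfBits` is a hypothetical name, not part of the .NET API — that applies the sign/exponent/significand formulas to raw binary16 bits:

```csharp
using System;

// Hypothetical decoder applying the binary16 formulas described above.
// For illustration only; not part of the .NET Half API.
static double DecodeHalfBits(ushort bits)
{
    int sign = (bits >> 15) & 0x1;            // 1 sign bit
    int storedExponent = (bits >> 10) & 0x1F; // 5 exponent bits
    int significand = bits & 0x3FF;           // 10 significand bits
    double signFactor = sign == 0 ? 1.0 : -1.0;

    if (storedExponent == 0x1F)               // all ones: infinity or NaN
        return significand == 0 ? signFactor * double.PositiveInfinity : double.NaN;

    if (storedExponent == 0)                  // all zeros: implicit bit is 0
        return signFactor * Math.Pow(2, -14) * (significand / 1024.0);

    return signFactor * Math.Pow(2, storedExponent - 15) * (1 + significand / 1024.0);
}

Console.WriteLine(DecodeHalfBits(0b0_01111_0000000000)); // 1
Console.WriteLine(DecodeHalfBits(0b0_11110_1111111111)); // 65504
Console.WriteLine(DecodeHalfBits(0b0_00000_0000000001)); // smallest subnormal, ≈ 0.000000059604645
```

Running the decoder on the bit patterns shown above reproduces each of the values worked out by hand.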

Conversions to/from float/double

Half can be converted to/from a float/double by simply casting it:
float f = (float)half;
Half h = (Half)floatValue;

Because Half uses only 16 bits, any Half value can be represented as a float/double without loss of precision. The inverse, however, is not true: some precision may be lost when going from float/double to Half. In .NET 5.0, the Half type is primarily an interchange type with no arithmetic operators defined on it; it supports only parsing, formatting, and comparison operators. All arithmetic operations need an explicit conversion to a float/double. Future versions will consider adding arithmetic operators directly on Half.
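Since there are no arithmetic operators on Half in .NET 5, computations follow the pattern just described: widen to float, compute, then explicitly narrow back. A minimal sketch (the variable names are ours):

```csharp
using System;

// Half defines no arithmetic operators in .NET 5, so we widen to float,
// compute, and explicitly narrow the result back down to Half.
Half a = (Half)1.5f;
Half b = (Half)0.25f;

float sum = (float)a + (float)b; // widening Half -> float is always lossless
Half result = (Half)sum;         // narrowing float -> Half may round

Console.WriteLine(result);       // 1.75 (exactly representable in binary16)
```

Here 1.75 happens to be exactly representable in binary16, so no precision is lost on the narrowing cast; a value like 0.1f would be rounded.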

As library authors, one of the points to consider is that a language can add support for a type in the future. It is conceivable that C# adds a half type in the future. Language support would enable an identifier such as f16 (similar to the f that exists today) and implicit/explicit conversions. Thus, the library-defined type Half needs to be defined in a manner that does not result in any breaking changes if half becomes a reality. Specifically, we needed to be careful about adding operators to the Half type. Implicit conversions to float/double could lead to potential breaking changes if language support is added. On the other hand, having a Float/Double property on the Half type felt less than ideal. In the end, we decided to add explicit operators to convert to/from float/double. If C# does add support for half, no user code would break, since all casts would be explicit.

Adoption

We expect that Half will find its way into many codebases. The Half type plugs a gap in the .NET ecosystem, and we expect many numerics libraries to take advantage of it. In the open source arena, ML.NET is expected to start using Half, the Apache Arrow project’s C# implementation has an open issue for it, and the DataFrame library tracks a related issue as well. As more intrinsics are unlocked in .NET for x86 and ARM processors, we expect that computation with Half can be hardware-accelerated, resulting in more efficient code!

29 comments


  • Martin Sedlmair

    Nice work. I have a lot of questions 😉
    1. Will the Half be a primitive type (also visible in TypeCode.Half)?
    2. What will be the CLR type (Single, Double, …)?
    3. Will existing classes be updated (like Vector and Intrinsics)?
    4. Will there be a Math for it?
    5. Is the type blittable to a CUDA half-float?

    “We expect that Half will find its way into many codebases”
    The biggest problem is still that there is currently no support in C# for working with numeric types in a generic way. Our whole codebase is cluttered with code for getting around this problem. Just for basic math we have a library with 3000 lines of code that does nothing other than use a large if..then..else to switch between the datatypes. For me this means I have to modify around 1000 methods to bring in Half support.

    So the problem is that it can only move into many codebases if general generic support for numeric types becomes available.

    • Huo Yaoyuan

      Non-official answer based on my knowledge:
      1. Probably not. TypeCode and others are very difficult to extend.
      2. System.Half
      3. Probably, but I think built-in arithmetic support should be a precondition.
      4. Should also come after arithmetic support.
      5. Probably, if they are both IEEE 754.

  • Andrew Witte

    This is unexpected and awesome!
    What happens when the CPU doesn’t support half-precision floats?
    Is software emulation entirely used or will they get processed as floats and stored back into 16 bit halfs for some level of hardware acceleration?

    • Tanner Gooding (Microsoft employee)

      There is no hardware acceleration in .NET 5 and that is instead something we will be looking at for a future version of .NET.

      Provided the hardware can accelerate a given operation and the accelerated version outperforms the equivalent software fallback, then the hardware option will likely be used.

  • Larvoire, Jean-Francois

    This is good, except for the naming: wouldn’t it be clearer to just call these types Float16, Float32, and Float64?

    Also, why stop here? There’s an obvious need for a Float128.
    And on the other side, a Float8 type is still heavily used in telephony: despite the very different vocabulary used in their specs, the PCM codes used in digital telephony are just 8-bit floating-point numbers, with 1 sign bit, 3 exponent bits, and 4 significand bits. The difference between A-law and µ-law PCM is in the way they code 0 and the smallest values around it.

  • Darryl Skeard

    What is the main efficiency advantage of having a 16-bit floating point data type? Can you give an example of where it could improve the efficiency of code?

    • Matthew Crews

      There are many scenarios in training ML models where the 16-bit float is sufficiently accurate. The main advantage going from 32-bit to 16-bit is the reduced memory requirements. Models in ML can get very large and have many coefficients. If the coefficients go from 32-bit to 16-bit, they will take up half the space. This means more values on each cache line, more values in registers, less need to go to main memory. Smaller programs are generally faster on modern CPUs because more of the program fits in the CPU caches.