Going beyond the empty set: Embracing the power of other empty things

The empty set contains nothing. This sounds really silly, but it’s actually really nice.

The Windows Runtime has a policy that if a method returns a collection (such as an IVector), and the method produces no results, then it should return an empty collection, rather than a null reference. That way, consumers can just iterate over the collection without having to deal with a null test.

For example, suppose you have a method Widget::GetAssociatedDoodads which returns an IVectorView<Doodad> representing the Doodad objects that have been associated with a Widget object. If no Doodads have been associated with the Widget, then it should return an empty vector, not a null pointer. That allows developers to write the natural-looking code:

// C#
foreach (var doodad in widget.GetAssociatedDoodads()) {
    ⟦ process each doodad ⟧
}

// C++/WinRT
for (auto&& doodad : widget.GetAssociatedDoodads()) {
    ⟦ process each doodad ⟧
}

// JavaScript
widget.GetAssociatedDoodads().forEach(doodad =>
{
    ⟦ process each doodad ⟧
});

rather than having to insert a null test (which is easily forgotten):

// C#
var doodads = widget.GetAssociatedDoodads();
if (doodads != null) { // annoying null test
    foreach (var doodad in doodads) {
        ⟦ process each doodad ⟧
    }
}

// C++/WinRT
auto doodads = widget.GetAssociatedDoodads();
if (doodads) { // annoying null test
    for (auto&& doodad : doodads) {
        ⟦ process each doodad ⟧
    }
}

// JavaScript
var doodads = widget.GetAssociatedDoodads();
if (doodads) { // annoying null test
    doodads.forEach(doodad =>
    {
        ⟦ process each doodad ⟧
    });
}
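On the implementation side, the same policy means resisting the urge to hand back whatever the internal storage happens to be. Here is a minimal sketch in plain JavaScript, not the Windows Runtime. The Widget class, its associate method, and the lazily allocated doodads field are all hypothetical names made up for illustration.

```javascript
// Hypothetical stand-in for a Widget. Nothing here is a Windows Runtime API.
class Widget {
    constructor() {
        // Internal storage is allocated lazily, so it starts out null.
        this.doodads = null;
    }

    associate(doodad) {
        if (this.doodads === null) {
            this.doodads = [];
        }
        this.doodads.push(doodad);
    }

    // Always returns an array. "No doodads" is an empty array, never null.
    getAssociatedDoodads() {
        return this.doodads === null ? [] : [...this.doodads];
    }
}

const widget = new Widget();
let count = 0;
widget.getAssociatedDoodads().forEach(() => { count++; }); // no null test needed
console.log(count); // prints 0
```

The null lives only inside the class. Callers never see it, so they cannot forget to test for it.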

The principle of the empty collection applies to other types of collections, such as IMap<K, V> and arrays. You can also think of strings as collections of characters, and of memory buffers (such as IBuffer) as collections of bytes.

An example of a poor design is the CryptographicBuffer class. (Sorry, CryptographicBuffer, for throwing you under the bus.)

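Method	Expected Result	Actual Result
`buffer = ConvertStringToBinary("");`	`buffer != null` `buffer.Length == 0`	`buffer == null` `buffer.Length /* crashes */`
`buffer = CreateFromByteArray(new[] {});`	`buffer != null` `buffer.Length == 0`	`buffer == null` `buffer.Length /* crashes */`
`buffer = DecodeFromBase64String("");`	`buffer != null` `buffer.Length == 0`	`buffer == null` `buffer.Length /* crashes */`
`buffer = DecodeFromHexString("");`	`buffer != null` `buffer.Length == 0`	`buffer == null` `buffer.Length /* crashes */`
`buffer = GenerateRandom(0);`	`buffer != null` `buffer.Length == 0`	`buffer != null` `buffer.Length == 0`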

If ConvertStringToBinary, CreateFromByteArray, DecodeFromBase64String, or DecodeFromHexString is given an empty string or array, you expect it to produce an empty buffer, but instead it returns no buffer at all.

This means that code like this looks correct:

// Write the string to a file as UTF-8
var buffer = CryptographicBuffer.ConvertStringToBinary(
        message, BinaryStringEncoding.Utf8);
await FileIO.WriteBufferAsync(storageFile, buffer);

but then you discover (probably at a very inconvenient moment) that it crashes if the message is an empty string, because ConvertStringToBinary returned null (instead of a non-null reference to an empty buffer), and then WriteBufferAsync threw an invalid parameter exception because the buffer cannot be null.

On the other hand, if you ask GenerateRandom to generate zero random bytes, it correctly gives you an empty buffer, rather than a null pointer. So at least one of the methods in the CryptographicBuffer class understands how empty collections work.
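Given that four of the methods return null, callers can defend themselves by normalizing a missing buffer to an empty one at the call site. The sketch below shows the idea in plain JavaScript, with Uint8Array standing in for IBuffer. The function convertStringToBinaryStandIn is a made-up stand-in that merely imitates the null-for-empty behavior described above, and orEmptyBuffer is a hypothetical helper name.

```javascript
// Stand-in that imitates the behavior described above: it returns null,
// rather than an empty buffer, when given an empty string.
// This is not the real CryptographicBuffer method.
function convertStringToBinaryStandIn(text) {
    if (text.length === 0) {
        return null;
    }
    return new TextEncoder().encode(text); // UTF-8 bytes as a Uint8Array
}

// Hypothetical helper: turns "no buffer" into "empty buffer".
function orEmptyBuffer(buffer) {
    return buffer === null || buffer === undefined ? new Uint8Array(0) : buffer;
}

const safeBuffer = orEmptyBuffer(convertStringToBinaryStandIn(""));
console.log(safeBuffer.length); // prints 0
```

For comparison, the standard TextEncoder already follows the empty-collection policy: encoding an empty string produces a zero-length Uint8Array, not null.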

As a bonus insult, the CryptographicBuffer.Compare method requires that both buffers be non-null, so you can’t even do this:

// Do it twice and confirm the results are the same
var buffer1 = CryptographicBuffer.ConvertStringToBinary(
        message, BinaryStringEncoding.Utf8);
var buffer2 = CryptographicBuffer.ConvertStringToBinary(
        message, BinaryStringEncoding.Utf8);
if (CryptographicBuffer.Compare(buffer1, buffer2)) {
    // the buffers are equal
}

The code crashes if the message is an empty string because buffer1 and buffer2 will be null, which is not a valid parameter to CryptographicBuffer.Compare. It’s a bit ironic that the CryptographicBuffer can dish out null buffers but can’t take them.
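A caller stuck with null buffers can wrap the comparison so that a missing buffer counts as an empty one. Here is a rough sketch in plain JavaScript, again with Uint8Array standing in for IBuffer. The buffersEqual function is a hypothetical helper, not part of CryptographicBuffer, and treating null as equal to an empty buffer is an assumption of this sketch rather than anything the API promises.

```javascript
// Hypothetical helper: compares two byte buffers, treating null or
// undefined as an empty buffer instead of rejecting it.
function buffersEqual(a, b) {
    const left = a == null ? new Uint8Array(0) : a;
    const right = b == null ? new Uint8Array(0) : b;
    if (left.length !== right.length) {
        return false;
    }
    return left.every((byte, index) => byte === right[index]);
}

console.log(buffersEqual(null, null));                 // prints true
console.log(buffersEqual(null, new Uint8Array(0)));    // prints true
console.log(buffersEqual(new Uint8Array([1]), null));  // prints false
```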

Cryptography in general seems to have a hard time with the concept of zero. The UserDataProtectionManager.ProtectBufferAsync method, for example, rejects attempts to protect an empty buffer, so if you want to protect a buffer that might be empty, you need to special-case the empty buffer.

// This version crashes if the buffer is empty.
using System; // brings in the extension that lets you await WinRT operations
using System.Threading.Tasks;
using Windows.Security.DataProtection;
using Windows.Storage.Streams;

static class Protector
{
    static UserDataProtectionManager manager =
        UserDataProtectionManager.TryGetDefault();

    public static async Task<IBuffer> ProtectBufferAsync(IBuffer buffer)
    {
        if (manager != null) {
            return await manager.ProtectBufferAsync(buffer,
                    UserDataAvailability.AfterFirstUnlock);
        } else {
            // No protection available - leave unprotected.
            return buffer;
        }
    }

    public static async Task<IBuffer> UnprotectBufferAsync(IBuffer buffer)
    {
        if (manager != null) {
            // Checking result.Status is omitted here for brevity.
            var result = await manager.UnprotectBufferAsync(buffer);
            return result.UnprotectedBuffer;
        } else {
            // No protection available - it was left unprotected.
            return buffer;
        }
    }
}

A naïve way of fixing this is to detect an empty buffer and just skip the ProtectBufferAsync call, letting an empty buffer be its own protected buffer. This is a bad idea, however, because a bad guy who sees an empty protected buffer will know that this represents an empty unprotected buffer. If the buffer represents a password, then they will know that the password is blank!

If you choose some sentinel non-empty buffer value to represent an empty buffer, you then have to have some way of distinguishing this from a genuine non-empty buffer that happens to match your sentinel. In mathematical terms, your function that converts buffers to non-empty buffers needs to be injective. One way is to append a dummy byte to every buffer, and remove the dummy byte when unprotecting.

// C#

// Work around inability to protect empty buffers
// by appending a dummy byte to all buffers.
// (Needs: using System.Runtime.InteropServices.WindowsRuntime;)
var paddedBuffer = WindowsRuntimeBuffer.Create((int)buffer.Length + 1);
paddedBuffer.Length = paddedBuffer.Capacity;
buffer.CopyTo(paddedBuffer);
var protectedBuffer = await manager.ProtectBufferAsync(
    paddedBuffer, UserDataAvailability.AfterFirstUnlock);

// Reverse the workaround by removing the dummy byte
// after unprotecting.
var result = await manager.UnprotectBufferAsync(protectedBuffer);
if (result.Status == UserDataBufferUnprotectStatus.Succeeded)
{
    var trimmedBuffer = result.UnprotectedBuffer;
    trimmedBuffer.Length = trimmedBuffer.Length - 1;
    ⟦ do something with the trimmed buffer ⟧
}

// C++

// Work around inability to protect empty buffers
// by appending a dummy byte to all buffers.
auto length = buffer.Length();
auto paddedBuffer = winrt::Buffer(length + 1);
paddedBuffer.Length(length + 1);
memcpy_s(paddedBuffer.data(), length, buffer.data(), length);
auto protectedBuffer = co_await manager.ProtectBufferAsync(
    paddedBuffer, winrt::UserDataAvailability::AfterFirstUnlock);

// Reverse the workaround by removing the dummy byte
// after unprotecting.
auto result = co_await manager.UnprotectBufferAsync(protectedBuffer);
if (result.Status() == winrt::UserDataBufferUnprotectStatus::Succeeded) {
    auto trimmedBuffer = result.UnprotectedBuffer();
    trimmedBuffer.Length(trimmedBuffer.Length() - 1);
    ⟦ do something with the trimmed buffer ⟧
}
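The reason the dummy byte works can be checked without any protection API at all. The following sketch uses plain JavaScript with Uint8Array standing in for IBuffer. The names pad, unpad, and sentinelEncode are hypothetical. It contrasts the injective pad-every-buffer approach with a sentinel approach that is not injective, because it maps two different inputs to the same output.

```javascript
// Injective: every buffer, empty or not, grows by one dummy byte.
function pad(buffer) {
    const padded = new Uint8Array(buffer.length + 1); // zero-filled
    padded.set(buffer);                               // dummy byte stays 0
    return padded;
}

// Inverse of pad: drop the trailing dummy byte.
function unpad(padded) {
    return padded.slice(0, padded.length - 1);
}

// Not injective: only the empty buffer is replaced, by a one-byte sentinel,
// so it collides with a genuine buffer that happens to hold that same byte.
function sentinelEncode(buffer) {
    return buffer.length === 0 ? Uint8Array.of(0) : buffer;
}

const empty = new Uint8Array(0);
const singleZero = Uint8Array.of(0);

console.log(pad(empty).length);                  // prints 1
console.log(pad(singleZero).length);             // prints 2
console.log(unpad(pad(empty)).length);           // prints 0
console.log(sentinelEncode(empty).length);       // prints 1
console.log(sentinelEncode(singleZero).length);  // prints 1 (collision)
```

Since pad changes the length of every input by exactly one, two different inputs can never produce the same output, and unpad always recovers the original.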

The inability to handle zero-byte buffers makes everybody’s life harder.

Zero. It’s a valid number. Please support it.

Author

Raymond Chen

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

6 comments

Joe Beans September 26, 2024

I propose a corollary: any method that receives a collection should treat null as an empty collection.

Rob Paveza September 24, 2024

One question I have here is: how might you address fixing `CryptographicBuffer`?

I think that the four versions that produce the null buffer could likely just be changed to return the empty result, because you’re going from an error case to a success-but-empty case. But, it’s a functional change, and it makes me think back to all of the work that you’ve done historically to ensure compatibility.

Another option might be to deprecate the methods and introduce new ones. But all of the existing methods already have “the good names.”

Yet another is just to leave it as-is and say “this is inconsistent but sorry.” That might be the actual choice right now — but do you think it’s the best?

Martin Ba September 24, 2024

At least one of the things where the idioms in C++ still shine.
If a function returns a collection by value (or by const ref in case of getters), all it can do is return a collection, no null shenanigans.
And thanks to move-semantics and [N]RVO, it should generally be pretty cheap to return by value.

Kevin Norris September 23, 2024 · Edited

Personally, I’m a big fan of Rust’s decision to make Option and Result iterable. They both yield one object on the happy path, and zero objects on the unhappy path, which can then be composed with the rest of Rust’s iterator calculus. This may sound like an awkward and fiddly way of doing things… except in cases where the underlying object either is a collection, or you plan to convert it into a collection in some way. Then you call flatten() or flat_map() (respectively) and do more iterator calculus from there, exactly as if you really were handed an empty collection on the unhappy path in the first place.

The simplest example is (given value is an Option-al Iterable) value.into_iter().flatten().collect(), which peels off the Option wrapper, turns None into an empty collection, and optionally converts one collection type into another (e.g. converting Vec to boxed slice) if the type annotations indicate that this should be done. You can (and probably would, in practice) replace collect() with a more complicated chain of iterator adapters if desired, this is just a trivial example. You can also skip collect() and hand the whole thing directly to a for loop (but for simple operations, map() is probably easier to read than a for loop).

- 紅樓鍮 September 24, 2024 · Edited
  
  That’s not what Rust (like all strongly-typed functional programming languages) is fundamentally about, though. In Rust a value must be explicitly made “nullable” by giving it a type of Option<T>, and a value of type T can never be “null”. Good programming practice in Rust involves properly typing infallible computations as T (or at least Result<T, !>) as opposed to Option<T>. A well-designed Rust library will not have functions returning Options all over the place, regardless of how convenient it is to consume them.
  
  - Kevin Norris September 25, 2024
    
    It is not either-or. There are situations where the correct API type is an Option of some kind (to distinguish between e.g. “there are no active frobnicators right now” and “frobnicator support is disabled, so there is no list of active frobnicators”), but also a particular caller wants to map None to the empty collection (and a different caller might not want to do that). Option::into_iter() lets both callers get what they want with minimal fuss.
    