If you’re just going to sit there doing nothing, at least do nothing correctly

Raymond Chen

There may be times when you need to make an API do nothing. It’s important to have it do nothing in the correct way.

For example, Windows has an extensive printing infrastructure. But that infrastructure does not exist on Xbox. What should happen if an app tries to print on an Xbox?

Well, the wrong thing to do is to have the printing functions throw a NotSupportedException. The app that the user installed on the Xbox was probably tested primarily, if not exclusively, on a PC, where printing is always available. When run on an Xbox, the exception will probably go unhandled, and the app will crash. Even if the app tried to catch the exception, it would probably display a message like “Oops. That went badly. Call support and provide this incident code.”

A better design for “supporting” printing on Xbox is to have the printing functions succeed, but report that there are no printers installed. With this behavior, when the app tries to print, it will ask the user to select a printer, and show an empty list. The user realizes, “Oh, there are no printers,” and cancels the printing request.

To deal with apps that get fancy and say “Oh, you have no printers installed, let me help you install one,” the function for installing a printer can return immediately with a result code that means “The user cancelled the operation.”

The idea here is to have the printing functions all behave in a manner perfectly consistent with printing being fully supported, yet mysteriously there is never a printer to print to.
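As a rough sketch of that behavior, the following C++ shows an inert printing surface next to a naive caller. Every name here (PrintResult, EnumeratePrinters, InstallPrinter, TryPrint) is hypothetical and chosen only for illustration; the real printing API is not shown in this article.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result codes, standing in for whatever the real API documents.
enum class PrintResult { Ok, UserCancelled };

// Inert: succeeds and truthfully reports that no printers are installed.
PrintResult EnumeratePrinters(std::vector<std::wstring>* printers)
{
    assert(printers != nullptr);
    printers->clear();
    return PrintResult::Ok;
}

// Inert: returns immediately, as though the user dismissed the install dialog.
PrintResult InstallPrinter()
{
    return PrintResult::UserCancelled;
}

// A naive app written for PC. Returns true only if a job could be submitted.
bool TryPrint()
{
    std::vector<std::wstring> printers;
    if (EnumeratePrinters(&printers) != PrintResult::Ok) return false;

    if (printers.empty()) {
        // "You have no printers installed, let me help you install one."
        if (InstallPrinter() != PrintResult::Ok) return false; // user backed out
        if (EnumeratePrinters(&printers) != PrintResult::Ok) return false;
        if (printers.empty()) return false;
    }

    // A real app would submit the job to the selected printer here.
    return true;
}
```

The caller never sees an outcome it could not already see on a PC with no printers, so its existing "nothing to print to" and "user backed out" paths handle the situation.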

Now, you probably also want to add a function to check whether printing even works at all. Apps can use this function to hide the Print button from their UI if they are running on a system that doesn’t support printing at all. But naïve apps that assume that printing works will still behave in a reasonable manner: You’re just on a system that doesn’t have any printers and all attempts to install a printer are ineffective.
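A capability check of this kind might look like the sketch below. The names IsPrintingSupported, Platform, Toolbar, and BuildToolbar are assumptions made for the example; the article says only that such a function should exist, not what it is called.

```cpp
#include <cassert>

// Hypothetical platform tag, used only to make the sketch testable.
enum class Platform { Desktop, Console };

// Hypothetical capability query: does printing work at all on this system?
bool IsPrintingSupported(Platform platform)
{
    return platform == Platform::Desktop;
}

struct Toolbar
{
    bool showPrintButton = false;
};

// An app that knows about the query hides its Print button up front.
Toolbar BuildToolbar(Platform platform)
{
    Toolbar toolbar;
    toolbar.showPrintButton = IsPrintingSupported(platform);
    return toolbar;
}

// Usage: the console build never shows the button; the desktop build does.
void CheckToolbars()
{
    assert(!BuildToolbar(Platform::Console).showPrintButton);
    assert(BuildToolbar(Platform::Desktop).showPrintButton);
}
```

Apps that never call the query still work, because the rest of the printing surface stays inert rather than failing.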

The name we use to describe this “do nothing” behavior is “inert”.

The API surface still exists and functions according to its specification, but it also does nothing. The important thing is that it does nothing in a way that is consistent with its documentation and is least likely to create problems with existing code.

Another example is the retirement of an API that has a variety of functions for creating widget handles, other functions that accept widget handles, and a function for closing widget handles. The team that was doing the retirement originally proposed making the API inert as follows:

HRESULT CreateWidget(_Out_ HWIDGET* widget)
{
    *widget = nullptr;
    return S_OK;
}

// Every widget is documented to have at least one alias,
// so we have to produce one dummy alias (empty string).
HRESULT GetWidgetAliases(
    _Out_writes_to_(capacity, *actual) PWSTR* aliases,
    UINT capacity,
    _Out_ UINT* actual)
{
    *actual = 0;

    RETURN_HR_IF(
        HRESULT_FROM_WIN32(ERROR_MORE_DATA),
        capacity < 1);

    aliases[0] = make_cotaskmem_string_nothrow(L"").release();
    RETURN_IF_NULL_ALLOC(aliases[0]);

    *actual = 1;
    return S_OK;
}

// Inert widgets cannot be enabled or disabled.
HRESULT EnableWidget(HWIDGET widget, BOOL value)
{
    return E_HANDLE;
}

HRESULT Close(HWIDGET widget)
{
    RETURN_HR_IF(E_INVALIDARG, widget != nullptr);
    return S_OK;
}

I pointed out that having CreateWidget succeed but return a null pointer is going to confuse apps. “The call succeeded, but I didn’t get a valid handle back?” I even found some of their own test code that checked whether the handle was null to determine whether the call succeeded, rather than checking the return value.

I also pointed out that having EnableWidget return “invalid handle” is going to create more confusion. An app calls CreateWidget, and it succeeds, and it takes that handle (which is presumably valid) and tries to use it to enable a widget, and it’s told “That handle isn’t valid.” How can that be? “I asked for a widget, and you gave me one, and then when I showed it to you, you said, ‘That’s not a widget.’ This API is gaslighting me!”
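To make the two complaints concrete, here is a small sketch of two callers running against the originally proposed behavior. The HRESULT and HWIDGET aliases are stand-ins so the code builds without the Windows headers, and the Outcome enum and both SetUp functions are hypothetical callers invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins so the sketch builds without <windows.h>; the values match the SDK.
using HRESULT = std::int32_t;
using HWIDGET = void*;
constexpr HRESULT S_OK = 0;
constexpr HRESULT E_HANDLE = static_cast<HRESULT>(0x80070006u);
constexpr bool Succeeded(HRESULT hr) { return hr >= 0; }

// The originally proposed inert behavior.
HRESULT CreateWidget(HWIDGET* widget)
{
    assert(widget != nullptr);
    *widget = nullptr;
    return S_OK;
}

HRESULT EnableWidget(HWIDGET /*widget*/, bool /*value*/)
{
    return E_HANDLE;
}

// Hypothetical caller outcomes.
enum class Outcome { CreateFailed, EnableFailed, Ready };

// Caller A trusts the return value.
Outcome SetUpCheckingResult()
{
    HWIDGET widget = nullptr;
    if (!Succeeded(CreateWidget(&widget))) return Outcome::CreateFailed;
    // Creation "succeeded", yet the handle we were just given is rejected.
    if (!Succeeded(EnableWidget(widget, true))) return Outcome::EnableFailed;
    return Outcome::Ready;
}

// Caller B ignores the return value and tests the handle instead.
Outcome SetUpCheckingHandle()
{
    HWIDGET widget = nullptr;
    CreateWidget(&widget);
    if (widget == nullptr) return Outcome::CreateFailed;
    if (!Succeeded(EnableWidget(widget, true))) return Outcome::EnableFailed;
    return Outcome::Ready;
}
```

Caller A lands in a state it has no reason to expect (a freshly created widget that is immediately invalid), while caller B concludes that creation failed even though the function reported success. The two callers disagree about what happened.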

I looked through the existing documentation for their API and found that a documented return value is ERROR_CANCELLED to mean that the user cancelled the creation of the widget. Therefore, apps are already dealing with the possibility of widgets not being created due to conditions outside their control, so we can take advantage of that: Any time the app tries to create a widget, just say “Nope, the, uh, user cancelled, yeah, that’s what happened.”

HRESULT CreateWidget(_Out_ HWIDGET* widget)
{
    *widget = nullptr;
    return HRESULT_FROM_WIN32(ERROR_CANCELLED);
}

HRESULT GetWidgetAliases(
    _Out_writes_to_(capacity, *actual) PWSTR* aliases,
    UINT capacity,
    _Out_ UINT* actual)
{
    *actual = 0;
    return E_HANDLE;
}

HRESULT EnableWidget(HWIDGET widget, BOOL value)
{
    return E_HANDLE;
}

HRESULT Close(HWIDGET widget)
{
    return E_HANDLE;
}

Now we have a proper inert API surface.

If you try to create a widget, we tell you that we couldn’t because the user cancelled. Since all attempts to create a widget fail, there is no such thing as a valid widget handle, and any time you try to use one, we tell you that the handle is invalid.
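From the caller's side, that might play out as in the sketch below. The aliases and constants are stand-ins so the code builds without the Windows headers; E_CANCELLED is a made-up shorthand for HRESULT_FROM_WIN32(ERROR_CANCELLED), and Outcome and SetUpWidget are a hypothetical caller.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins so the sketch builds without <windows.h>.
using HRESULT = std::int32_t;
using HWIDGET = void*;
constexpr HRESULT S_OK = 0;
constexpr HRESULT E_HANDLE = static_cast<HRESULT>(0x80070006u);
// Shorthand for HRESULT_FROM_WIN32(ERROR_CANCELLED); ERROR_CANCELLED is 1223.
constexpr HRESULT E_CANCELLED = static_cast<HRESULT>(0x800704C7u);
constexpr bool Succeeded(HRESULT hr) { return hr >= 0; }

// The revised inert behavior.
HRESULT CreateWidget(HWIDGET* widget)
{
    assert(widget != nullptr);
    *widget = nullptr;
    return E_CANCELLED;
}

HRESULT EnableWidget(HWIDGET /*widget*/, bool /*value*/) { return E_HANDLE; }
HRESULT Close(HWIDGET /*widget*/) { return E_HANDLE; }

// Hypothetical caller outcomes.
enum class Outcome { UserCancelled, OtherFailure, Ready };

// An existing app that already handles the documented cancellation result.
Outcome SetUpWidget()
{
    HWIDGET widget = nullptr;
    HRESULT hr = CreateWidget(&widget);
    if (hr == E_CANCELLED) return Outcome::UserCancelled; // documented path
    if (!Succeeded(hr)) return Outcome::OtherFailure;

    // Never reached against the inert implementation.
    hr = EnableWidget(widget, true);
    Close(widget);
    return Succeeded(hr) ? Outcome::Ready : Outcome::OtherFailure;
}
```

A well-behaved app stops at the cancellation branch and never presents a handle to the other functions. An app that presents one anyway is told the handle is invalid, which is true.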

This also avoids the problem of having to produce dummy aliases for widgets. Since there are no widgets, there is no legitimate case where an app could ask a widget for its aliases.

Bonus chatter: To clear up some confusion: The idea here is that the printing API has always existed on desktop, where printing is supported, and the “get me the list of printers” function is documented not to throw an exception. If you want to port the printing API to Xbox, how do you do it in a way that allows existing desktop apps to continue to run on Xbox? The inert behavior is completely truthful: There are no printers on an Xbox. Nobody expects the answer to the question, “How many printers are there?” to be “How dare you ask me such a thing!”

Another scenario where you need to create an inert API surface is if you want to retire an existing API. How do you make the behavior of the API consistent with its contract while still doing nothing useful?

17 comments

  • Jonathan Potter 0

    I’d love to know where you find this API documentation utopia where all possible error codes are clearly documented 🙂

    • Tom Lint 0

      Do you mean for API implementation x or the list of possible error codes in Win32? The latter already exists, and I use it frequently to make sure my Win32 code returns the most intuitive error code for each scenario I encounter.

  • Lyubomir Lyubomirov 0

    I’m not sure who to ask, but why am I not receiving notifications when I have selected “Email me when there are new comments on posts where I have posted a comment”?

    • Scarlet Manuka 2

      Maybe this feature has been inerted out…

  • Danielix Klimax 0

    While incidental to your point, why retire API in such “unhelpful” way? Is backwards compatibility no longer in effect?

    • Simon Farnsworth 0

      If you can’t maintain backwards compatibility with the old API (e.g. you’re on a games console that has no support for printing at all, the API depends on hardware like the ISA bus or the FDC that’s no longer supported, the API was inherently insecure and there’s no way to support it and protect users from malware), this is a helpful way to retire it.

      Instead of applications refusing to start because an API has been removed, or crashing when they call the API (which may happen without the user being aware it’s done), the application runs without the functionality the API provided. Everything else is intact; so, for example, if you’ve removed support for POTS modems, the application can still run, can still display and edit your contacts book, it just can’t dial numbers for you any more – which lets you (e.g.) sync the contacts from the old application to a modern application that can send them to your mobile phone.

    • Lyubomir Lyubomirov 0

      I’m not sure I fully understand the point, but I think it’s to keep old apps running on new hardware (or lack of matching as the case may be) as much as possible. What useful way of “retirement” would you suggest for an application made to work with floppy disks, given that probably some of the younger colleagues don’t even know what a floppy disk is?

  • Mark Magagna 0

    Somewhat related to the Null Object pattern, where you return an actual object instead of null, to prevent null pointer exception errors down the road. It also allows customization of behavior.

    • Lyubomir Lyubomirov 0

      It seems to me that the author is talking about the exact opposite – to return NULL, but with an appropriate error code.

  • Shawn Van Ness 0

    Because all this code looks so COM-ish (HRESULT, E_HANDLE and S_OK etc) it really needs to be said .. this is the problem that QueryInterface() was intended to solve.

    Why no QueryInterface(ISupportPrinting) ?

    • Raymond Chen (Microsoft employee, Author) · Edited 1

      The Widget example is a flat API. It just happens to use HRESULTs as error codes, but Widgets are not COM objects.

      In the Printing example, the issue is that time travel has yet to be perfected. The original API was designed on PCs, and PCs always supported printing, so in the original API you just say “Hey, give me the printers”, and it always succeeded (possibly with an empty list, if no printers were installed). The problem is how to port this API to Xbox sanely. (If your answer is “Don’t port this API to Xbox,” then any app that has a Print option [even if not central to the app’s functionality] cannot run on Xbox. “Why can’t I play Candy Crush on Xbox?” “Oh, because there’s an Easter Egg after you beat level 500, where it shows you a ‘Print a certificate’ button. Without a printing API, this app can’t run.”)

  • Mohammad Ghasemi 0

    Very educational article. Thank you.

  • Lyubomir Lyubomirov 0

    In such situations I always check the returned handle. I would write the unused functions in the same way, except that instead of ERROR_CANCELLED I would use ERROR_CALL_NOT_IMPLEMENTED or ERROR_NOT_SUPPORTED.

    • Dustin Howett 1

      Unless you had previously documented your API as potentially returning those status codes, your “inert” shims will break applications that are written to the specification. That is the central thrust of this article: inert implementations comply with the API as documented and in so doing do not break existing applications.

      • Lyubomir Lyubomirov · Edited 0

        Of course I got the point of the article, it’s just that back in the day the Windows API used these error codes. So ancient that probably only Raymond remembers them. 🙂

  • Georg Rottensteiner 0

    Is that what went wrong at DirectSound when removing the hardware support? Or did I expect wrong?

    I recall my code failing when requesting a hardware device, a request that had pretty much succeeded everywhere, what with DirectSound devices being on board by default. Suddenly, with the removal, I had to change my code to fall back to a software device. I seem to remember that the samples also did it as simple hardware or nothing.

    Probably just wishful thinking on my side that I didn’t do it wrong 🙂

    Nitpicker’s Corner (what happened to that actually?): The on board sound drivers were really really bad, and probably the reason for removing their support.
    E.g. there was an API call to set an event once a sound buffer had completed playing, or reached a certain position, crucial for audio streaming. The on board sound driver sent that event for every sound buffer, not only the one I requested for.
