October 15th, 2025

Why can you increment a reference count with relaxed semantics, but you have to decrement with release semantics?

When managing reference counts, there is an asymmetry between incrementing and decrementing: Incrementing the reference count can use relaxed semantics, but decrementing requires release semantics (and destroying requires acquire semantics).

The asymmetry may strike you as odd, but maybe it shouldn’t. After all, it’s not surprising that it’s easier to pull your toys out than to put them away.

Incrementing a reference count can be done with relaxed semantics (no memory ordering with respect to other memory locations) because the object is not at risk of being destroyed, and any memory operations that occur after the increment may as well have occurred before the increment. Incrementing a reference count doesn’t really impose any ordering requirements on memory accesses to the object.
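To make the increment side concrete, here is a minimal sketch. The RefCounted type and its member names are made up for illustration and are not taken from any particular library. The key assumption is that the caller already holds a reference, which is what makes the relaxed increment safe.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical minimal reference-counted object (illustrative names only).
struct RefCounted
{
    // The creator holds the first reference.
    std::atomic<std::uint32_t> m_references{ 1 };

    // Assumes the caller already owns a reference, so the object cannot be
    // destroyed while we are here. The increment therefore needs atomicity
    // but no ordering with respect to other memory locations.
    std::uint32_t AddRef() noexcept
    {
        std::uint32_t const previous =
            m_references.fetch_add(1, std::memory_order_relaxed);
        assert(previous != 0); // incrementing from zero would mean a dead object
        return previous + 1;
    }
};
```

Note that the increment is still an atomic read-modify-write, so concurrent increments never lose a count. Relaxed only means that it doesn’t order any other memory accesses.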

Decrementing a reference count is a different story.

The danger with decrementing a reference count is that the object is destructed when the reference count goes to zero. Now, maybe you didn’t decrement the reference count to zero, but it’s possible that another thread decrements it to zero after you do. Therefore, any decrement must be done with release semantics so that any straggling writes to memory are visible to the destructing thread before it frees the memory. One reason is that you want the destructor to see a consistent object. And even if the delayed write doesn’t affect consistency, you don’t want it to complete after the memory is freed. That would be a use-after-free, which is undefined behavior. In practice, this will corrupt whatever object was allocated into the memory that was previously occupied by the destructed object.

Meanwhile, the thread that decrements the reference count to zero must perform an acquire to ensure that it doesn’t start destructing the object until all previous writes have drained.

There are two approaches to this double responsibility on the decrement.

One is to decrement with release semantics, and then establish an acquire fence if you realize that you are the one who performed the final decrement. This is the strategy employed by C++/WinRT:

static uint32_t __stdcall Release(fast_abi_forwarder* self) noexcept
{
    uint32_t const remaining = self->m_references.
        fetch_sub(1, std::memory_order_release) - 1;
    if (remaining == 0)
    {
        std::atomic_thread_fence(std::memory_order_acquire);
        delete self;
    }
    return remaining;
}
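Here is a self-contained sketch of the same release-then-fence pattern with two threads, to show what the ordering buys you. The Widget type and the run_two_owners function are hypothetical names invented for this example. Each owner makes one last write to the object and then drops its reference; whichever thread drops the final reference reads both fields before deleting.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical object shared by two owners (illustrative names only).
struct Widget
{
    std::atomic<std::uint32_t> refs{ 2 }; // one reference per owner thread
    int a = 0;
    int b = 0;
};

// Returns the value of a + b seen by the thread that dropped the last reference.
int run_two_owners()
{
    Widget* w = new Widget;
    std::atomic<int> observed{ -1 };

    auto owner = [&](int Widget::* field, int value) {
        w->*field = value; // the "straggling write" made just before releasing

        // Release: our write above is published before the count drops.
        std::uint32_t const previous =
            w->refs.fetch_sub(1, std::memory_order_release);
        assert(previous != 0);

        if (previous == 1)
        {
            // We dropped the last reference. Acquire: the other owner's
            // writes are visible to us before we read or free the object.
            std::atomic_thread_fence(std::memory_order_acquire);
            observed.store(w->a + w->b, std::memory_order_relaxed);
            delete w;
        }
    };

    std::thread t1(owner, &Widget::a, 1);
    std::thread t2(owner, &Widget::b, 2);
    t1.join();
    t2.join();
    return observed.load(std::memory_order_relaxed);
}
```

Whichever thread ends up doing the delete, it sees both writes, so the function returns 3. The fence is paid for only on the path that actually destroys the object.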

Another approach is to use an acquire-release on the decrement, thereby avoiding the need for a separate acquire when the reference count goes to zero. This is the strategy employed by Microsoft’s STL:

    void _Decref() noexcept { // decrement use count
        if (_MT_DECR(_Uses) == 0) {
            _Destroy();
            _Decwref();
        }
    }

    void _Decwref() noexcept { // decrement weak reference count
        if (_MT_DECR(_Weaks) == 0) {
            _Delete_this();
        }
    }

where _MT_DECR is defined as

#define _MT_DECR(x) _INTRIN_ACQ_REL(_InterlockedDecrement)(reinterpret_cast<volatile long*>(&x))

and _INTRIN_ACQ_REL performs an acquire-release atomic operation, or at least the closest version supported by the processor provided it is at least as strong as an acquire-release.
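For comparison, here is roughly what the acquire-release approach looks like in portable C++. This is a sketch, not the STL's code: the Counted type and the IncRef and DecRef helpers are hypothetical, and the destroyed flag exists only so that the effect can be observed from outside.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference-counted object (illustrative names only).
struct Counted
{
    std::atomic<long> uses{ 1 };   // the creator holds the first reference
    bool* destroyed;               // set by the destructor, for demonstration

    explicit Counted(bool* flag) noexcept : destroyed(flag) {}
    ~Counted() { *destroyed = true; }
};

// Relaxed increment: the caller already holds a reference.
inline void IncRef(Counted* p) noexcept
{
    p->uses.fetch_add(1, std::memory_order_relaxed);
}

// Acquire-release decrement. Returns true if this call destroyed the object.
inline bool DecRef(Counted* p) noexcept
{
    // The release half publishes our writes to whoever destroys the object.
    // The acquire half lets us see everyone else's writes if that is us.
    long const previous = p->uses.fetch_sub(1, std::memory_order_acq_rel);
    assert(previous > 0);
    if (previous == 1)
    {
        delete p; // no separate fence needed
        return true;
    }
    return false;
}
```

The trade-off is that every decrement pays for the acquire, even the ones that don't reach zero, in exchange for not needing a fence on the final one.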

LLVM's libc++ library (libcxx) also uses acquire-release, as does GCC's libstdc++ library.


Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

1 comment

  • Ben Craig

    There's an extra wrinkle / strategy used by shared_ptr and weak_ptr in LLVM's libc++. For the weak count, libc++ does a load acquire on the weak ref count, and if this is the last reference, it doesn't even do the decrement. If it's not the last one, then it does the acq_rel decrement. This saves a potentially expensive atomic store in the extremely common case of going from a ref count of 1 to 0, at the expense of unnecessary loads when there are a lot of weak_ptrs around. I even left a big comment...
