Lock-free reference-counting a TLS slot using atomics, part 1

Raymond Chen

Some time ago, we looked at various lock-free algorithms, one of which was the lock-free singleton constructor. But suppose you want your singleton to be reference-counted?

To make things concrete, let’s suppose that we want a class which manages a TLS slot, allocating it on demand, and freeing it when there are no longer any users.

Let’s start with a sketch of how we want this to work, but without worrying about atomicity yet.

// Note: Not finished yet
struct TlsManager
{
    DWORD m_count = 0;
    DWORD m_tls = TLS_OUT_OF_INDEXES;

    void Acquire()
    {
        if (++m_count == 1) {
            m_tls = TlsAlloc();
            THROW_LAST_ERROR_IF(m_tls == TLS_OUT_OF_INDEXES);
        }
    }

    void Release()
    {
        if (--m_count == 0) {
            TlsFree(std::exchange(m_tls, TLS_OUT_OF_INDEXES));
        }
    }
};

struct TlsUsage
{
    TlsUsage() = default;

    explicit TlsUsage(TlsManager& manager) :
        m_manager(&manager) { manager.Acquire(); }

    TlsUsage(TlsUsage&& other) noexcept :
        m_manager(std::exchange(other.m_manager, nullptr)) {}

    TlsUsage& operator=(TlsUsage&& other) noexcept {
        std::swap(m_manager, other.m_manager);
        return *this;
    }

    ~TlsUsage()
    {
        if (m_manager) m_manager->Release();
    }

    void* GetValue()
    {
        return TlsGetValue(m_manager->m_tls);
    }

    void SetValue(void* value)
    {
        TlsSetValue(m_manager->m_tls, value);
    }

    TlsManager* m_manager = nullptr;
};

The idea here is that a TlsManager is the object that manages access to a TLS slot. You call Acquire to start using the TLS slot (allocating it on demand), and you can use that slot until you call Release. When the last consumer of a slot calls Release, the slot is freed.

Instead of talking directly to the TlsManager, you use a TlsUsage, which is an RAII type that deals with the acquire/release protocol for you.
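To see the acquire/release protocol in action without needing Windows, here is a minimal single-threaded sketch. It is not the code from this article: FakeAlloc, FakeFree, Manager, Usage, and DemoLifetime are hypothetical stand-ins (FakeAlloc and FakeFree play the roles of TlsAlloc and TlsFree) that merely count how often a slot is allocated and freed.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins for TlsAlloc/TlsFree so the sketch builds anywhere.
constexpr unsigned kNoSlot = 0xFFFFFFFF;
static int g_allocCount = 0;
static int g_freeCount = 0;

unsigned FakeAlloc() { return static_cast<unsigned>(g_allocCount++); }
void FakeFree(unsigned) { ++g_freeCount; }

// Same shape as the unsynchronized TlsManager sketch.
struct Manager
{
    unsigned m_count = 0;
    unsigned m_slot = kNoSlot;

    void Acquire() { if (++m_count == 1) m_slot = FakeAlloc(); }
    void Release() { if (--m_count == 0) FakeFree(std::exchange(m_slot, kNoSlot)); }
};

// Same shape as TlsUsage: move-only RAII wrapper around Acquire/Release.
struct Usage
{
    Usage() = default;
    explicit Usage(Manager& manager) : m_manager(&manager) { manager.Acquire(); }
    Usage(Usage&& other) noexcept
        : m_manager(std::exchange(other.m_manager, nullptr)) {}
    Usage& operator=(Usage&& other) noexcept
    {
        std::swap(m_manager, other.m_manager);
        return *this;
    }
    ~Usage() { if (m_manager) m_manager->Release(); }

    Manager* m_manager = nullptr;
};

// Returns true if the slot lived exactly as long as there were users.
bool DemoLifetime()
{
    Manager manager;
    {
        Usage first(manager);                 // count 0 -> 1: slot allocated
        if (manager.m_slot == kNoSlot) return false;
        {
            Usage second(manager);            // count 1 -> 2: no new allocation
            Usage moved(std::move(second));   // ownership moves, count unchanged
            if (manager.m_count != 2) return false;
        }                                     // moved releases: count 2 -> 1
        if (manager.m_count != 1) return false;
    }                                         // first releases: count 1 -> 0, slot freed
    return manager.m_count == 0 && manager.m_slot == kNoSlot;
}
```

Moving a Usage transfers responsibility for the Release call rather than adding a new reference, which is why the count stays at 2 across the move.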

To make the TlsManager thread-safe, we can add locks:

struct TlsManager
{
    DWORD m_count = 0;
    DWORD m_tls = TLS_OUT_OF_INDEXES;
    std::mutex m_mutex;

    void Acquire()
    {
        auto lock = std::unique_lock(m_mutex);

        if (++m_count == 1) {
            m_tls = TlsAlloc();
            THROW_LAST_ERROR_IF(m_tls == TLS_OUT_OF_INDEXES);
        }
    }

    void Release()
    {
        auto lock = std::unique_lock(m_mutex);

        if (--m_count == 0) {
            TlsFree(std::exchange(m_tls, TLS_OUT_OF_INDEXES));
        }
    }
};
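As a sanity check on the locked design, the following portable sketch hammers a mutex-protected manager from several threads and verifies that the count returns to zero and every allocation is paired with a free. The names LockedManager, ScopedUse, and StressScopedUse are hypothetical, and the slot "allocation" is simulated with counters rather than real TlsAlloc and TlsFree calls.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

constexpr unsigned kInvalidSlot = 0xFFFFFFFF;

// Same locking pattern as the mutex-protected TlsManager, with the slot
// allocation replaced by counters (an assumption, to keep this portable).
struct LockedManager
{
    unsigned m_count = 0;
    unsigned m_slot = kInvalidSlot;
    int m_allocs = 0;   // only touched while holding m_mutex
    int m_frees = 0;    // only touched while holding m_mutex
    std::mutex m_mutex;

    void Acquire()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (++m_count == 1) {
            m_slot = static_cast<unsigned>(m_allocs++);
        }
    }

    void Release()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (--m_count == 0) {
            m_slot = kInvalidSlot;
            ++m_frees;
        }
    }
};

// Minimal RAII guard standing in for TlsUsage.
struct ScopedUse
{
    explicit ScopedUse(LockedManager& manager) : m_manager(manager) { m_manager.Acquire(); }
    ~ScopedUse() { m_manager.Release(); }
    ScopedUse(const ScopedUse&) = delete;
    ScopedUse& operator=(const ScopedUse&) = delete;

    LockedManager& m_manager;
};

// Each thread repeatedly acquires and releases; returns true if the
// bookkeeping is consistent once all threads have finished.
// Expects threadCount >= 1 and iterations >= 1.
bool StressScopedUse(int threadCount, int iterations)
{
    LockedManager manager;
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&manager, iterations] {
            for (int i = 0; i < iterations; ++i) {
                ScopedUse use(manager);   // takes the lock twice per iteration
            }
        });
    }
    for (auto& thread : threads) thread.join();

    // join() synchronizes with each thread, so these reads are safe.
    return manager.m_count == 0
        && manager.m_slot == kInvalidSlot
        && manager.m_allocs >= 1
        && manager.m_allocs == manager.m_frees;
}
```

Every ScopedUse takes the mutex once on construction and once on destruction, so all threads serialize on the same lock even when the slot is already allocated.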

Now, in practice, this might be efficient enough if TlsUsage objects are not frequently created and destroyed. But you might be in a case where your program is constantly creating and destroying Widget objects, and each Widget needs a TlsUsage. That lock might end up being a bottleneck. We’ll try to address this next time.

Update: TlsUsage move constructor and assignment fixed.

6 comments

  • 紅樓鍮 0

    Rant: To those who read the comment section, I highly recommend wrapping a std::unique_ptr with a custom deleter instead of hand-rolling your own RAII type. Correctly implementing all the methods an RAII type needs is very verbose and error-prone.

    Edit: The incorrect code below was in the article before it was edited, but I didn’t expect it to stir up this much confusion among commenters. Please do not follow up with more off-topic comments. (Please, DevBlogs, support more HTML tags…)

    As an example, if your move assignment operator is

    TlsUsage& operator=(TlsUsage&& other) {
        std::swap(*this, other);
    }

    it actually results in an infinite recursion, because std::swap is implemented in terms of the move assignment operator. (It’s also not noexcept, which mostly defeats the purpose of having a move assignment operator at all.)

    • Joshua Hudson 0

      My normal implementation of such methods looks like this: `void operator=(TlsUsage& other);`

      Thus trying to call it is a link time error; which is fine because it should never be called.

      It gets its standard constructor, its copy constructor (which is implemented as move), its destructor, and its link error assignment operator and nothing else. And half the time the copy constructor can be a link error copy constructor too.

  • LB 0

    The “fixed” move assignment operator now leaks by abandoning the original held state and overwriting it with the incoming state. Funny enough this same bug is in MSVC’s std::experimental::generator, so it seems to be a common mistake to make. Using std::swap instead was originally correct, it just wasn’t being passed the correct parameters. EDIT: Seems to have been updated to properly use std::swap now, yay 🙂

    • James Burgess 0

      Confused (as of Jun 14 11pm PST) TlsUsage::operator=() calls std::move, and it looks correct. Were there two updates?

      • 紅樓鍮 0

        The move assignment operator in the fixed code doesn’t seem wrong to me, it’s using std::swap correctly this time, and it doesn’t call std::move. I suspect both of you confused the move assignment operator with the move constructor.

      • LB 1

        Seems it was again updated after my comment, when I wrote my comment the move assignment operator had the same implementation as the move constructor, using std::exchange. It is now correctly using std::swap.
