May 7th, 2024

Awaiting a set of handles with a timeout, part 6: Capturing the handles efficiently

Last time, we created an awaiter that could await a [first, last) range of handles. It did so by pushing onto a vector for each such handle, but this is inefficient in the case where the number of handles is known in advance. Let’s add a constructor for the case where the iterators support the subtraction operator.¹

struct awaiter
{
    ⟦ data members unchanged ⟧

    template<typename Iter,                 
        typename = std::void_t<             
            decltype(std::declval<Iter>() - 
                     std::declval<Iter>())>>
    awaiter(Iter first, Iter last,          
        std::optional<TimeSpan> timeout,    
        int) :                              
        m_timeout(timeout)                  
    {                                       
        auto count = last - first;          
        m_states.resize(count);             
        for (auto& s : m_states) {          
            m_states.m_handle = *first;     
            ++first;                        
        }                                   
        create_waits();                     
    }                                       

    template<typename Iter, typename = void>
    awaiter(Iter first, Iter last,
        std::optional<TimeSpan> timeout,
        unsigned) :
        m_timeout(timeout)
    {
        std::transform(first, last, std::back_inserter(m_states),
            [](HANDLE h) { state s; s.m_handle = h; return s; });
        create_waits();
    }

    ⟦ other methods unchanged ⟧
};

template<typename Iter>
auto resume_on_all_signaled(Iter first, Iter last,
    std::optional<winrt::Windows::Foundation::TimeSpan> timeout
        = std::nullopt)
{
    return resume_all_awaiter(first, last, timeout, 0);
}

We add another constructor that is enabled via SFINAE if the iterator type supports subtraction. If so, then we use that subtraction operator to get the number of handles, then resize the vector directly to that size, so that we can fill in the handles without having to do any reallocating. This significantly reduces code size because the compiler doesn’t have to generate any resize logic.

Note that if the iterator supports subtraction, then both constructors become available, so we use an unused parameter to steer the compiler toward the int version if it is available, since int is considered a better match for 0 than unsigned.

¹ Why do we look for a subtraction operator, rather than checking the iterator category for random_access_iterator_tag? Because not all subtractable iterators satisfy the requirements of a random access iterator. In particular, dereferencing a random access iterator must produce a reference, which rules out things like IVectorView since the Get­At method returns a copy, not a reference. The C++ iterator library doesn’t have a built-in way to detect a random-access output iterator, so we have to make up our own.

Topics
Code

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

7 comments

Discussion is closed. Login to edit/delete existing comments.

Newest
Newest
Popular
Oldest
  • Pierre Bisaillon

    Couldn’t we simply use std::distance(first, last) and not have to worry about the type of the iterators ?

    • Raymond ChenMicrosoft employee Author

      Yes, but in the case of non-subtractable iterators, you’re probably better off just letting the vector resize instead of walking the collection twice (which will probably churn L1 cache).

  • Ismo Salonen

    How much space does std::vector preallocate ( if any, I’m too lazy to lookup myself ) ? Is the resize call necessary at all, you could just loop from start to end and push_back to the vector. If the preallocated space is enough then that might be even faster.

    • 紅樓鍮 · Edited

      The growth strategy of is implementation-defined. (If you mean how much space a default-constructed allocates: zero, but it will allocate an implementation-defined amount of space on the first push. This behavior is considered part of the growth strategy.)

      In any case it's normally impossible for a non-'ing implementation to beat a 'ing one, because the latter will always allocate exactly the right amount of memory up front unless the iterator's is wonky, whereas...

      Read more
  • Henry Skoglund

    When filling the handles in the new ctor, shouldn’t the loop be:

    for (auto& s : m_states) {
        s.m_handle = *first++;
    }
    
    • Csaba Varga

      It should be, if you’re playing code golf. Otherwise, you usually want to avoid the postfix ++ on iterators because it needs to return a new copy, which could be expensive. Or now that I think about it, there could be iterators where making a copy is impossible, so they don’t implement postfix ++ at all.

      • Henry Skoglund

        I was thinking more of the leftmost part (i.e. replacing m_states.m_handle with s..m_handle).

Feedback