A customer was using the v2 task_sequencer class we developed some time ago. (Here’s the v1 task sequencer.) They found that they occasionally suffered from stack overflow crashes.
QueueTaskAsync::<lambda_2>::operator()+0x714
std::coroutine_handle<void>::resume+0x6c
task_sequencer::chained_task::complete+0x88
task_sequencer::completer::~completer+0x58
QueueTaskAsync::<lambda_2>::operator()+0xaf8
std::coroutine_handle<void>::resume+0x6c
task_sequencer::chained_task::complete+0x88
task_sequencer::completer::~completer+0x58
QueueTaskAsync::<lambda_2>::operator()+0xaf8
std::coroutine_handle<void>::resume+0x6c
task_sequencer::chained_task::complete+0x88
task_sequencer::completer::~completer+0x58
QueueTaskAsync::<lambda_2>::operator()+0xaf8
std::coroutine_handle<void>::resume+0x6c
task_sequencer::chained_task::complete+0x88
task_sequencer::completer::~completer+0x58
QueueTaskAsync::<lambda_2>::operator()+0xaf8
std::coroutine_handle<void>::resume+0x6c
task_sequencer::chained_task::complete+0x88
task_sequencer::completer::~completer+0x58
...
Reading from the bottom up (to see the sequence chronologically), a coroutine completed, so we resumed the lambda coroutine inside QueueTaskAsync:
auto task = [](auto&& current, auto&& makerParam,
               auto&& contextParam, auto& suspend)
    -> Async
{
    completer completer{ std::move(current) };
    auto maker = std::move(makerParam);
    auto context = std::move(contextParam);
    co_await suspend;
    co_await context;
    co_return co_await maker();
}(current, std::forward<Maker>(maker),
  winrt::apartment_context(), suspend);
When one completer destructs, it resumes the co_await suspend in this lambda. The lambda then switches to the correct thread (which we don’t see on the stack because we are already on the correct thread), asks the maker to start the next coroutine (which we don’t see on the stack because it returned), and then awaits that coroutine. We don’t see that coroutine on the stack either, which means that it completed synchronously. And then that’s the end of the lambda, so its completer destructs, which resumes the next queued lambda, all without ever returning from the first completer’s destructor.
Therefore, we run into this problem if there is a long sequence of queued tasks, all of which complete synchronously: each completion resumes the next task on the same stack, and the stack never gets a chance to unwind.
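To make the failure mode concrete, here’s a hypothetical repro sketch (names invented, assumes C++/WinRT): park the first task on an event, queue a pile of synchronously-completing tasks behind it, and then release the first task.
winrt::handle g_gate{ CreateEventW(nullptr, TRUE, FALSE, nullptr) };

winrt::Windows::Foundation::IAsyncAction GateAsync()
{
    co_await winrt::resume_on_signal(g_gate.get()); // stays pending
}

winrt::Windows::Foundation::IAsyncAction NopAsync()
{
    co_return; // completes synchronously
}

void Repro(task_sequencer& sequencer)
{
    sequencer.QueueTaskAsync(GateAsync);
    for (int i = 0; i < 100000; i++) {
        sequencer.QueueTaskAsync(NopAsync);
    }
    // Every queued task now completes in a cascade on one stack.
    SetEvent(g_gate.get());
}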
So what can we do about it?
We could force the stack to unwind by throwing a co_await winrt::resume_background() into the lambda after the co_await suspend, so that the coroutine resumes on a background thread’s fresh stack, releasing the thread it was resumed on so it can unwind.
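Applied to the lambda above, the change would look something like this (a sketch; everything else stays the same):
auto task = [](auto&& current, auto&& makerParam,
               auto&& contextParam, auto& suspend)
    -> Async
{
    completer completer{ std::move(current) };
    auto maker = std::move(makerParam);
    auto context = std::move(contextParam);
    co_await suspend;
    co_await winrt::resume_background(); // fresh stack; old thread unwinds
    co_await context;
    co_return co_await maker();
}(current, std::forward<Maker>(maker),
  winrt::apartment_context(), suspend);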
This does soak up a threadpool thread in the case that the apartment_context refers to a single-threaded apartment, because IContextCallback::ContextCallback blocks the calling thread while the callback is running. Most people don’t worry about this problem, but I do, because I’ve had to debug deadlocks that trace back to threadpool exhaustion, where all the threads are just waiting for another thread to be ready or to finish doing something.
The customer noted that the task_sequencer is always used from the same thread, which happens to be a UI thread. So we can give the task sequencer a DispatcherQueue that it can use to get back to the UI thread asynchronously via TryEnqueue().
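Awaiting a dispatcher queue works roughly like this simplified awaiter (a sketch, not the actual C++/WinRT resume_foreground implementation): the resumption is posted via TryEnqueue, so the completing thread’s stack unwinds immediately instead of running the next task inline.
struct queue_awaiter
{
    winrt::Windows::System::DispatcherQueue queue;
    bool await_ready() const noexcept { return false; }
    bool await_suspend(std::coroutine_handle<> h) const
    {
        // Post the resumption to the queue's thread. If the queue
        // is shutting down, TryEnqueue fails (returns false) and
        // we resume synchronously.
        return queue.TryEnqueue([h] { h.resume(); });
    }
    void await_resume() const noexcept { }
};
Here’s the task sequencer, updated to accept an optional DispatcherQueue: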
struct task_sequencer
{
task_sequencer(
    winrt::Windows::System::DispatcherQueue const& queue = nullptr)
    : m_queue(queue) {}
task_sequencer(const task_sequencer&) = delete;
void operator=(const task_sequencer&) = delete;
private:
using coro_handle = std::coroutine_handle<>;
struct suspender
{
bool await_ready() const noexcept { return false; }
void await_suspend(coro_handle h)
noexcept { handle = h; }
void await_resume() const noexcept { }
coro_handle handle;
};
// Sentinel address meaning "this task has completed".
static void* completed()
{ return reinterpret_cast<void*>(1); }
struct chained_task
{
    chained_task(void* state = nullptr) : next(state) {}

    // Called by the next task: remember its handle, or resume it
    // immediately if this task has already completed.
    void continue_with(coro_handle h) {
        if (next.exchange(h.address(),
            std::memory_order_acquire) != nullptr) {
            h();
        }
    }

    // Called when this task completes: resume the next task, if
    // one has been chained on.
    void complete() {
        auto resume = next.exchange(completed());
        if (resume) {
            coro_handle::from_address(resume).resume();
        }
    }

    // nullptr = not completed, no continuation yet.
    // completed() = completed, no continuation yet.
    // otherwise = address of the chained continuation.
    std::atomic<void*> next;
};
struct completer
{
    // Runs when the task's coroutine finishes (even on
    // exceptional paths), unblocking the next task in the chain.
    ~completer()
    {
        chain->complete();
    }
    std::shared_ptr<chained_task> chain;
};
winrt::slim_mutex m_mutex;
// Where queued tasks should be started; nullptr = background thread.
winrt::Windows::System::DispatcherQueue m_queue{ nullptr };
std::shared_ptr<chained_task> m_latest =
    std::make_shared<chained_task>(completed());
public:
template<typename Maker>
auto QueueTaskAsync(Maker&& maker) -> decltype(maker())
{
    auto node = std::make_shared<chained_task>();
    suspender suspend;
    using Async = decltype(maker());
    // Everything the coroutine needs is passed as a parameter and
    // copied into the coroutine frame before the first suspension;
    // captures would dangle once this function returns.
    auto task = [](auto current, auto&& makerParam,
                   auto queue, auto& suspend)
        -> Async
    {
        completer completer{ std::move(current) };
        auto maker = std::move(makerParam);
        co_await suspend;
        if (queue == nullptr) {
            co_await winrt::resume_background();
        } else {
            co_await winrt::resume_foreground(queue);
        }
        co_return co_await maker();
    }(node, std::forward<Maker>(maker),
      m_queue, suspend);
    {
        winrt::slim_lock_guard guard(m_mutex);
        m_latest.swap(node); // node now holds the previous task
    }
    node->continue_with(suspend.handle);
    return task;
}
};
You provide a DispatcherQueue when you create the task_sequencer, so that the task sequencer knows which thread the tasks should be started on. If you pass nullptr (or don’t bother to provide a parameter at all), then they start on a background thread. Otherwise, they start on the thread corresponding to the dispatcher queue.
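For example, a hypothetical caller might look like this (names invented; assumes a C++/WinRT UI thread with a Windows::System dispatcher queue):
// e.g., members of some UI object, initialized on the UI thread
task_sequencer m_sequencer{
    winrt::Windows::System::DispatcherQueue::GetForCurrentThread() };

winrt::Windows::Foundation::IAsyncAction DoStepAsync(); // some async work

winrt::fire_and_forget OnClick()
{
    // Tasks start in queue order on the UI thread; the TryEnqueue
    // hop keeps a run of synchronous completions from recursing.
    co_await m_sequencer.QueueTaskAsync([] { return DoStepAsync(); });
}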