May 19th, 2022

Writing a sort comparison function, part 3: spaceships

Last time, we wrote a multi-level sort with deferred calculation of secondary keys. Having to compare everything twice got cumbersome. We can do better with the C++20 spaceship operator.

// three-way comparison
std::weak_ordering
compare_3way_for_sorting(T const& a, T const& b)
{
    // First compare by name
    std::weak_ordering cmp = a.name <=> b.name;
    if (cmp != 0) return cmp;

    // Names are equal, check connector names
    cmp = a.GetConnector() <=> b.GetConnector();
    if (cmp != 0) return cmp;

    // Names and connector names are equal,
    // manufacturing date is the last check.
    cmp = LookupManufacturingDate(a.part_number) <=>
          LookupManufacturingDate(b.part_number);
    return cmp;
}

// less-than comparison
bool compare_less_for_sorting(T const& a, T const& b)
{
    // First compare by name
    std::weak_ordering cmp = a.name <=> b.name;
    if (cmp != 0) return cmp < 0;

    // Names are equal, check connector names
    cmp = a.GetConnector() <=> b.GetConnector();
    if (cmp != 0) return cmp < 0;

    // Names and connector names are equal,
    // manufacturing date is the last check.
    cmp = LookupManufacturingDate(a.part_number) <=>
          LookupManufacturingDate(b.part_number);
    return cmp < 0;
}

Another way of writing this is

// three-way comparison
std::weak_ordering
compare_3way_for_sorting(T const& a, T const& b)
{
    std::weak_ordering cmp = a.name <=> b.name;
    if (cmp == 0) cmp = a.GetConnector() <=> b.GetConnector();
    if (cmp == 0) cmp = LookupManufacturingDate(a.part_number) <=>
                        LookupManufacturingDate(b.part_number);
    return cmp;
}

// less-than comparison
bool compare_less_for_sorting(T const& a, T const& b)
{
    // First compare by name
    std::weak_ordering cmp = a.name <=> b.name;
    if (cmp == 0) cmp = a.GetConnector() <=> b.GetConnector();
    if (cmp == 0) cmp = LookupManufacturingDate(a.part_number) <=>
                        LookupManufacturingDate(b.part_number);
    return cmp < 0;
}

And then we can use the std::compare_weak_order_fallback function to synthesize a missing three-way comparison:

std::weak_ordering
compare_3way_for_sorting(T const& a, T const& b)
{
    auto cmp = std::compare_weak_order_fallback(a.name, b.name);
    if (cmp == 0) cmp = std::compare_weak_order_fallback(a.GetConnector(), b.GetConnector());
    if (cmp == 0) cmp = std::compare_weak_order_fallback(LookupManufacturingDate(a.part_number),
                                                         LookupManufacturingDate(b.part_number));
    return cmp;
}

// less-than comparison
bool compare_less_for_sorting(T const& a, T const& b)
{
    auto cmp = std::compare_weak_order_fallback(a.name, b.name);
    if (cmp == 0) cmp = std::compare_weak_order_fallback(a.GetConnector(), b.GetConnector());
    if (cmp == 0) cmp = std::compare_weak_order_fallback(LookupManufacturingDate(a.part_number),
                                                         LookupManufacturingDate(b.part_number));
    return cmp < 0;
}

Next time, we’ll look at descending sorts.

Bonus chatter: There is high copy/pasta error risk because the left and right of the <=> operator differ only by the choice of a or b. Maybe we could use this helper:

template<typename T, typename U = T>
struct successive_comparisons
{
    successive_comparisons(T&& left, U&& right) :
        a(left), b(right) {}
    T&& a;
    U&& b;

    template<typename Lambda>
    auto compare(Lambda&& lambda)
    { return lambda(a) <=> lambda(b); }
};

// I don't know why these deduction guides are necessary, but
// it doesn't work if I omit them.
template<typename T, typename U = T>
successive_comparisons(T&, U&) -> successive_comparisons<T&, U&>;

template<typename T, typename U = T>
successive_comparisons(T&&, U&) -> successive_comparisons<T&, U>;

template<typename T, typename U = T>
successive_comparisons(T&, U&&) -> successive_comparisons<T, U&>;

template<typename T, typename U = T>
successive_comparisons(T&&, U&&) -> successive_comparisons<T, U>;

// Avoid copy/pasta risk.
std::weak_ordering
compare_3way_for_sorting(T const& a, T const& b)
{
    auto s = successive_comparisons(a, b);

    std::weak_ordering
        cmp = s.compare([](auto&& v) { return v.name; });
    if (cmp == 0)
        cmp = s.compare([](auto&& v) { return v.GetConnector(); });
    if (cmp == 0)
        cmp = s.compare([](auto&& v) { return LookupManufacturingDate(v.part_number); });
    return cmp;
}

Unfortunately, this copies the v.name unnecessarily since the deduced lambda return type is as if by auto, which is a copy, not a reference. To avoid the copy, you must ask for a reference return, which is either by naming the type explicitly or by using decltype(auto) with a parenthesized return value.

std::weak_ordering
compare_3way_for_sorting(T const& a, T const& b)
{
    auto s = successive_comparisons(a, b);

    std::weak_ordering
        cmp = s.compare([](auto&& v) -> decltype(auto) { return (v.name); });
    if (cmp == 0)
        cmp = s.compare([](auto&& v) { return v.GetConnector(); });
    if (cmp == 0)
        cmp = s.compare([](auto&& v) { return LookupManufacturingDate(v.part_number); });
    return cmp;
}

I’m not sure this is an improvement. It just replaces one copy/pasta risk for another.

Also, this helper class prevents you from reusing intermediates from previous comparison steps.

std::weak_ordering
compare_3way_for_sorting(T const& a, T const& b)
{
    auto s = successive_comparisons(a, b);

    std::weak_ordering
        cmp = s.compare([](auto&& v) -> decltype(auto) { return (v.name); });
    if (cmp == 0)
        cmp = s.compare([](auto&& v) { return v.GetConnector(); });
    if (cmp == 0)
        cmp = s.compare([](auto&& v) { return v.GetConnector().InstallDate(); });
    return cmp;
}

We have to call GetConnector() a second time in order to look up its installation date, even though we had already done so earlier. With the explicit cascading version, we could reuse the results of the previous lookup:

std::weak_ordering
compare_3way_for_sorting(T const& a, T const& b)
{
    std::weak_ordering cmp = a.name <=> b.name;
    if (cmp != 0) return cmp;

    auto connectorA = a.GetConnector();
    auto connectorB = b.GetConnector();
    cmp = connectorA <=> connectorB;
    if (cmp != 0) return cmp;
 
    return connectorA.InstallDate() <=> connectorB.InstallDate();
}
Topics
Code

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

3 comments

Discussion is closed. Login to edit/delete existing comments.

Newest
Newest
Popular
Oldest
  • Michiel de Vries

    It’s been a while since I last wrote copy-paste safe code or seen it discussed.

  • Mike Winterberg

    If successive_comparisons::compare were to use std::invoke rather than requiring a direct callable, the pitfalls of the name comparison can be lessened. It does require familiarity with pointers-to-members and won’t work if T and U are actually different types. But less familiarity with decltype(auto).

  • Roger B

    The Bonus Chatter section here is just peak c++ 🙁 Such a fragile mess.

Feedback