{"id":106664,"date":"2022-05-17T07:00:00","date_gmt":"2022-05-17T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=106664"},"modified":"2022-05-17T05:59:21","modified_gmt":"2022-05-17T12:59:21","slug":"20220517-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20220517-00\/?p=106664","title":{"rendered":"Writing a sort comparison function, part 1: basics"},"content":{"rendered":"<p>I&#8217;ve noted in the past that <a title=\"Writing a sort comparison function\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20031023-00\/?p=42063\"> a sort comparison function must follow certain rules<\/a>, and if you violate those rules, <a title=\"Writing a sort comparison function, redux\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20090508-00\/?p=18313\"> very strange things happen<\/a>. So what are some patterns for writing sort comparison functions that don&#8217;t break the rules?<\/p>\n<p>Most of the time, sorting can be reduced to key comparison:\u00b9 From each element being sorted, you generate a <i>sort key<\/i>, and those sort keys are compared against each other.\u00b2<\/p>\n<p><b>Related reading<\/b>: <a title=\"Sorting by indices, part 2: The Schwartzian transform\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20170106-00\/?p=95135\"> Sorting by indices, part 2: The Schwartzian transform<\/a>.<\/p>\n<p>The easiest way to write a sort comparison function is to generate the sort keys, and then compare the keys.<\/p>\n<pre>\/\/ Assume some function called \"key\" that produces\r\n\/\/ the sort key from an object.\r\n\r\n\/\/ three-way comparison\r\nint compare_3way_for_sorting(T const&amp; a, T const&amp; b)\r\n{\r\n    int a_key = key(a);\r\n    ant b_key = key(b);\r\n\r\n    if (a_key &lt; b_key) return -1;\r\n    if (a_key &gt; b_key) return +1;\r\n    return 0;\r\n}\r\n\r\n\/\/ less-than comparison\r\nbool compare_less_for_sorting(T const&amp; a, T const&amp; b)\r\n{\r\n    return key(a) &lt; key(b);\r\n}\r\n<\/pre>\n<p>This reduces the problem to writing a sort key. Here&#8217;s an example:<\/p>\n<pre>\/\/ Key generator: Sort by total cost (price + tax)\r\nauto key(T const&amp; t)\r\n{\r\n    return t.price + t.tax;\r\n}\r\n<\/pre>\n<p>For a multi-level sort, you can return a <code>std::pair<\/code> or <code>std::tuple<\/code> with the most significant key as the first element.<\/p>\n<pre>\/\/ Key generator: Sort by length, then by width\r\nauto key(T const&amp; t)\r\n{\r\n    return std::make_pair(t.length, t.width);\r\n}\r\n\r\n\/\/ Key generator: Sort by length, then by width,\r\n\/\/ then by weight\r\nauto key(T const&amp; t)\r\n{\r\n    return std::make_tuple(t.size, t.width, t.weight);\r\n}\r\n<\/pre>\n<p>The <code>make_tuple<\/code> function copies its arguments, which you may want to avoid if the members are expensive to copy. One solution is to wrap the member in a <code>std::ref<\/code>:<\/p>\n<pre>\/\/ Key generator: Sort by name, then age\r\nauto key(T const&amp; t)\r\n{\r\n    return std::make_tuple(std::ref(t.name), t.age);\r\n}\r\n<\/pre>\n<p>Another is to use <code>std::forward_<wbr \/>as_<wbr \/>tuple<\/code> to make everything a reference.<\/p>\n<pre>\/\/ Key generator: Sort by name, then age\r\nauto key(T const&amp; t)\r\n{\r\n    return std::forward_as_tuple(t.name, t.age);\r\n}\r\n<\/pre>\n<p>The <code>std::forward_<wbr \/>as_<wbr \/>tuple<\/code> approach is dangerous, however, if a portion of the key comes from a temporary:<\/p>\n<pre>\/\/ Key generator: Sort by name, then area\r\nauto key(T const&amp; t)\r\n{\r\n    \/\/ Don't do this!\r\n    <i>return std::forward_as_tuple(t.name, t.height * t.width);<\/i>\r\n}\r\n<\/pre>\n<p>We saw some time ago that <a title=\"What's up with std::piecewise_construct and std::forward_as_tuple?\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20220428-00\/?p=106540\"> <code>std::forward_<wbr \/>as_<wbr \/>tuple<\/code> captures everything as a reference<\/a>, so it will capture the second component as a reference to a temporary integer, and that temporary integer will be destroyed at the end of the full expression, leaving the reference dangling.<\/p>\n<p>If you inline the call to <code>std::forward_<wbr \/>as_<wbr \/>tuple<\/code> into <code>compare_<wbr \/>less_<wbr \/>for_<wbr \/>sorting<\/code>, then the temporary will survive past the comparison:<\/p>\n<pre>bool compare_less_for_sorting(T const&amp; a, T const&amp; b)\r\n{\r\n    return std::forward_as_tuple(a.name, a.height * a.width) &lt;\r\n    return std::forward_as_tuple(b.name, b.height * b.width);\r\n}\r\n<\/pre>\n<p>But this is itself a violation of the principle of Don&#8217;t Repeat Yourself (DRY). So I guess that means <code>make_<wbr \/>tuple<\/code> with judicious use of <code>std::ref<\/code> is the way to go.<\/p>\n<p>To avoid making the key generator visible outside the scope of the comparison function, you can declare it as a local captureless lambda:<\/p>\n<pre>bool compare_less_for_sorting(T const&amp; a, T const&amp; b)\r\n{\r\n    auto key = [](auto&amp;&amp; t) {\r\n        return std::make_tuple(std::ref(t.name), t.height * t.width);\r\n    };\r\n    return key(a) &lt; key(b);\r\n}\r\n<\/pre>\n<p>And then you can lambda-ize the comparison function, too:<\/p>\n<pre>void f(std::vector&lt;T&gt;&amp; v)\r\n{\r\n    auto key = [](auto&amp;&amp; t) {\r\n        return std::make_tuple(std::ref(t.name), t.height * t.width);\r\n    };\r\n    std::sort(v.begin(), v.end(), [key](auto&amp;&amp; t, auto&amp;&amp; b) {\r\n        return key(a) &lt; key(b);\r\n    });\r\n}\r\n<\/pre>\n<p>Note that even if you limit your use of <code>forward_<wbr \/>as_<wbr \/>tuple<\/code> to references to members of the objects being sorted, you have to consume the forwarded tuple immediately. Sorting may move the object, and once that happens, your forwarded references become invalid.<\/p>\n<p>Next time, we&#8217;ll look at what we could do if one of the sort keys is expensive to calculate.<\/p>\n<p>\u00b9 And in fact, if your sorting does <i>not<\/i> reduce to key comparison, then it&#8217;s a clue that your sort function might not be valid.<\/p>\n<p>\u00b2 Indeed, in the Python programming language, this is the <i>only<\/i> way to sort. The ability to provide a custom comparison function was removed in Python 3. Instead, you pass a function that produces a sort key, and the sort occurs on the keys.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Trying to avoid the pitfalls.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-106664","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Trying to avoid the pitfalls.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/106664","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=106664"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/106664\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=106664"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=106664"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=106664"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}