{"id":112227,"date":"2026-04-13T07:00:00","date_gmt":"2026-04-13T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=112227"},"modified":"2026-04-13T12:50:11","modified_gmt":"2026-04-13T19:50:11","slug":"20260413-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20260413-00\/?p=112227","title":{"rendered":"Finding a duplicated item in an array of <VAR>N<\/VAR> integers in the range 1 to <VAR>N<\/VAR> &minus; 1"},"content":{"rendered":"<p>A colleague told me that there was an <var>O<\/var>(<var>N<\/var>) algorithm for finding a duplicated item in an array of <var>N<\/var> integers in the range <var>1<\/var> to <var>N<\/var> \u2212 1. There must be a duplicate due to the pigeonhole principle. There might be more than one duplicated value; you merely have to find any duplicate.\u00b9<\/p>\n<p>The backstory behind this puzzle is that my colleague had thought this problem was solvable in <var>O<\/var>(<var>N<\/var> log <var>N<\/var>), presumably by sorting the array and then scanning for the duplicate. They posed this as an interview question, and the interviewee found an even better linear-time algorithm!<\/p>\n<p>My solution is to interpret the array as a linked list of 1-based indices, and borrow the sign bit of each integer as a flag to indicate that the slot has been visited. We start at index 1 and follow the indices until they either reach a value whose sign bit has already been set (which is our duplicate), or they return to index 1 (a cycle). If we find a cycle, then move to the next index which does not have the sign bit set, and repeat. At the end, you can restore the original values by clearing the sign bits.\u00b2<\/p>\n<p>I figured that modifying the values was acceptable given that the <var>O<\/var>(<var>N<\/var> log <var>N<\/var>) solution also modifies the array. At least my version restores the original values when it&#8217;s done!<\/p>\n<p>But it turns out the interview candidate found an even better <var>O<\/var>(<var>N<\/var>) algorithm, one that doesn&#8217;t modify the array at all.<\/p>\n<p>Again, view the array values as indices. You are looking for two nodes that point to the same destination. You already know that no array entry has the value <var>N<\/var>, so the entry at index <var>N<\/var> cannot be part of a cycle. Therefore, the chain that starts at <var>N<\/var> must eventually join an existing cycle, and that join point is a duplicate. Start at index <var>N<\/var> and use <a href=\"https:\/\/en.wikipedia.org\/wiki\/Cycle_detection#Floyd's_tortoise_and_hare\"> Floyd&#8217;s cycle detector algorithm<\/a> to find the start of the cycle in <var>O<\/var>(<var>N<\/var>) time.<\/p>\n<p>\u00b9 If you constrain the problem further to say that there is exactly one duplicate, then you can find the duplicate by summing all the values and then subtracting <var>N(N\u22121)\/2<\/var>.<\/p>\n<p>\u00b2 I&#8217;m pulling a fast one. This is really <var>O<\/var>(<var>N<\/var>) space because I&#8217;m using the sign bit as a convenient &#8220;initially zero&#8221; flag bit.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Taking advantage of special characteristics of the array.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-112227","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Taking advantage of special characteristics of the array.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/112227","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=112227"}],"version-history":[{"count":1,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/112227\/revisions"}],"predecessor-version":[{"id":112228,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/112227\/revisions\/112228"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=112227"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=112227"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=112227"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}