{"id":107107,"date":"2022-09-05T07:00:00","date_gmt":"2022-09-05T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=107107"},"modified":"2022-08-31T07:18:55","modified_gmt":"2022-08-31T14:18:55","slug":"20220905-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20220905-00\/?p=107107","title":{"rendered":"Is there any meaningful way to compare two Time Travel Debugging positions?"},"content":{"rendered":"<p><a href=\"https:\/\/docs.microsoft.com\/en-us\/windows-hardware\/drivers\/debugger\/time-travel-debugging-overview\"> Time Travel Debugging<\/a> uses <a href=\"https:\/\/docs.microsoft.com\/en-us\/windows-hardware\/drivers\/debugger\/time-travel-debugging-position-objects\"> position objects<\/a> to represents points in the recorded debugging trace. A position object consists of two numbers (represented in hexadecimal) separated by a colon. The first number is a 64-bit number known as the <i>sequence number<\/i>, and the second number is a 32-bit number known as the <i>step number<\/i>.<\/p>\n<p>So if you have two position objects, is there a way to determine which one occurred first in the trace?<\/p>\n<p>Generally speaking, the position objects can be sorted chronologically by sequence number, and then by step within a sequence number.\u00b9 This ordering is valid even across traces from different processes running simultaneously on the same system. (The consistent cross-process ordering is handy when you are debugging a problem that spans multiple processes.)<\/p>\n<p>Roughly speaking, what you can say is that the <i>first<\/i> step (step zero) of each sequence is chronological by sequence number. The remaining steps in the sequence proceed chronologically within the sequence, but chronological ordering between steps from different sequences is a bit fuzzier because multiprocessing is like that.<\/p>\n<p>For example, consider this timeline:<\/p>\n<div id=\"p20220905_head\" style=\"display: none;\">\u00a0<\/div>\n<table class=\"cp3\" style=\"border-collapse: collapse; text-align: center;\" border=\"0\" cellspacing=\"0\" cellpadding=\"3\">\n<tbody>\n<tr>\n<th>Thread 1<\/th>\n<td class=\"p20220905_border p20220905_color1bg\" colspan=\"1\">3:0<\/td>\n<td class=\"p20220905_border p20220905_color1bg\" colspan=\"2\">3:1<\/td>\n<td class=\"p20220905_border p20220905_color1bg\" colspan=\"2\">3:2<\/td>\n<td class=\"p20220905_border p20220905_color2bg\" colspan=\"1\">6:0<\/td>\n<td class=\"p20220905_border p20220905_color2bg\" colspan=\"1\">6:1<\/td>\n<\/tr>\n<tr>\n<td>&nbsp;<\/td>\n<\/tr>\n<tr>\n<th>Thread 2<\/th>\n<td class=\"p20220905_border p20220905_color3bg\" colspan=\"1\">4:0<\/td>\n<td class=\"p20220905_border p20220905_color3bg\" colspan=\"1\">4:1<\/td>\n<td class=\"p20220905_border p20220905_color3bg\" colspan=\"1\">4:2<\/td>\n<td class=\"p20220905_border p20220905_color3bg\" colspan=\"1\">4:3<\/td>\n<td class=\"p20220905_border p20220905_color4bg\" colspan=\"1\">5:0<\/td>\n<td class=\"p20220905_border p20220905_color4bg\" colspan=\"2\">5:1<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>At the start of the trace, thread 1 is assigned sequence number 3, and thread 2 is assigned sequence number 4. Thread 1 then executes instructions 3:0, 3:1, and 3:2. Meanwhile, thread 2 executes instructions 4:0, 4:1, and 4:2. Instructions do not all take the same amount of time to execute, so you can&#8217;t compare step numbers between sequences. In the above example, thread 1 executed some slow instructions, so it is stuck on step 2 (within sequence 3). while thread 2 is already on step 3 (within sequence 4).<\/p>\n<p>In the above trace, thread 2 concluded sequence 4 after four instructions, and its next sequence was assigned the number 5. A little while later, thread 1 concluded sequence 3 and it received sequence number 6 for its next sequence of instructions.<\/p>\n<p>So how does the Time Travel Debugger decide when to end a sequence and start a new one?<\/p>\n<p>Sequence boundaries occur at these points:<\/p>\n<ul>\n<li>The thread issues a synchronizing instruction, such as an interlocked memory access or a memory barrier.<\/li>\n<li>The thread transitions to kernel mode, say, because of a system call.<\/li>\n<li>At the Time Travel Debugger&#8217;s discretion. (Say, because the sequence is getting kind of long.)<\/li>\n<\/ul>\n<p>Don&#8217;t be surprised when you see the sequence number jump forward a lot when you step over a system call. After all, that system call could have taken a long time, and other threads got to run thousands of sequences while the thread you&#8217;re stepping through was stuck in the system call.<\/p>\n<p><b>Bonus chatter<\/b>: The Time Travel Debugging folks tell me that the realities of multiprocessing can result in apparent contradictions in the Time Travel Debugging trace. For example, if two threads access the same memory without synchronization, they may observe values that appear to be contradictory with respect to each other. You might observe one thread write a value to memory, and then another thread reads from that memory and gets the old value! This problem can be exacerbated by <a href=\"https:\/\/en.wikipedia.org\/wiki\/False_sharing\"> false sharing<\/a>.<\/p>\n<p>\u00b9 This is what we did some time ago when we <a title=\"The case of the mysterious over-release from deep inside the marshaling infrastructure\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20220624-00\/?p=106786\"> studied the case of the mysterious over-release from deep inside the marshaling infrastructure<\/a>.<\/p>\n<p>\n<script>\nwindow.addEventListener(\"load\", function() {\n  \/\/ break up \"style\" to prevent wordpress from injecting random junk\n  document.getElementById(\"p20220905_head\").innerHTML =\n`<s` + `tyle>\nhtml:not([theme=dark]) .p20220905_border { border: solid 1px black; }\nhtml:not([theme=dark]) .p20220905_color1bg { background-color: #E0BBE4; }\nhtml:not([theme=dark]) .p20220905_color2bg { background-color: #FFDFD3; }\nhtml:not([theme=dark]) .p20220905_color3bg { background-color: #C2B1A8; }\nhtml:not([theme=dark]) .p20220905_color4bg { background-color: #B3C28A; }\nhtml[theme=dark] .p20220905_border { border: solid 1px white; }\nhtml[theme=dark] .p20220905_color1bg { background-color: #56245B; }\nhtml[theme=dark] .p20220905_color2bg { background-color: #802200; }\nhtml[theme=dark] .p20220905_color3bg { background-color: #5A483F; }\nhtml[theme=dark] .p20220905_color4bg { background-color: #758646; }\n<\/s` + `tyle>`;\n});\n<\/script><\/p>\n","protected":false},"excerpt":{"rendered":"<p>They are basically in chronological order, subject to the uncertainty of multiprocessing.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-107107","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>They are basically in chronological order, subject to the uncertainty of multiprocessing.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/107107","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=107107"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/107107\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=107107"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=107107"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=107107"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}