Debugger Extension for DRED
When a TDR occurs in an app with DRED enabled, the runtime preserves the DRED output in the application's heap memory. With WinDbg attached to the process (or opened on a heap dump) and D3DDred.js loaded, you can view the DRED output by running !d3ddred from the WinDbg console.
The following is an example using a busted version of Microsoft’s D3D12 ModelViewer sample.
@$d3ddred() : [object Object] [Type: D3D12_DEVICE_REMOVED_EXTENDED_DATA1]
[<Raw View>] [Type: D3D12_DEVICE_REMOVED_EXTENDED_DATA1]
DeviceRemovedReason : 0x887a0006 (The GPU will not respond to more commands, most likely because of an invalid command passed by the calling applicat [Type: HRESULT]
AutoBreadcrumbNodes : Count: 1
PageFaultVA : 0x29b450000
ExistingAllocations : Count: 0
RecentFreedAllocations : Count: 2
In this example, there is only one AutoBreadcrumbNode object. Clicking on AutoBreadcrumbNodes shows:
(*((ModelViewer!D3D12_DEVICE_REMOVED_EXTENDED_DATA1 *)0x7fffee841a08)).AutoBreadcrumbNodes : Count: 1
[0x0] : 0x1e2ed2dcf58 : [object Object] [Type: D3D12_AUTO_BREADCRUMB_NODE *]
((ModelViewer!D3D12_AUTO_BREADCRUMB_NODE *)0x1e2ed2dcf58) : 0x1e2ed2dcf58 : [object Object] [Type: D3D12_AUTO_BREADCRUMB_NODE *]
[<Raw View>] [Type: D3D12_AUTO_BREADCRUMB_NODE]
CommandListDebugName : 0x1e2eceb04a0 : "ClearBufferCL" [Type: wchar_t *]
CommandQueueDebugName : 0x1e2ecead4a0 : "CommandListManager::m_CommandQueue" [Type: wchar_t *]
NumCompletedAutoBreadcrumbOps : 0x1
NumAutoBreadcrumbOps : 0x3
ReverseCompletedOps : [object Object]
OutstandingOps : [object Object]
This implies that queue “CommandListManager::m_CommandQueue” and command list “ClearBufferCL” contain the likely suspect operation.
The ReverseCompletedOps value is an array (in reverse order) of command list operations that completed without error:
((ModelViewer!D3D12_AUTO_BREADCRUMB_NODE *)0x1e2ed2dcf58)->ReverseCompletedOps : [object Object]
[0x0] : D3D12_AUTO_BREADCRUMB_OP_CLEARUNORDEREDACCESSVIEW (13) [Type: D3D12_AUTO_BREADCRUMB_OP]
This shows that only one operation completed before faulting. In this case it was a ClearUnorderedAccessView command.
The OutstandingOps value is an array (in normal forward order) of command list operations that are not guaranteed to have completed without error.
((ModelViewer!D3D12_AUTO_BREADCRUMB_NODE *)0x1e2ed2dcf58)->OutstandingOps : [object Object]
[0x0] : D3D12_AUTO_BREADCRUMB_OP_COPYRESOURCE (9) [Type: D3D12_AUTO_BREADCRUMB_OP]
[0x1] : D3D12_AUTO_BREADCRUMB_OP_RESOURCEBARRIER (15) [Type: D3D12_AUTO_BREADCRUMB_OP]
In most cases, the first outstanding operation is the strongest suspect. The outstanding CopyResource operation shown here is in fact the culprit.
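The relationship between the two arrays can be sketched in a few lines of C++. This is only an illustration of the idea, not the D3DDred.js implementation: the `BreadcrumbNode` and `BreadcrumbSplit` types, the `SplitBreadcrumbs` function, and the `MakeWalkthroughNode` helper are hypothetical stand-ins, assuming a node carries the recorded ops in order plus a count of how many completed. Only the numeric enum values are taken from the debugger output above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-ins for illustration; these are NOT the real d3d12.h types.
// The numeric values match the ones printed in the debugger output.
enum class BreadcrumbOp : std::uint32_t {
    CopyResource = 9,
    ClearUnorderedAccessView = 13,
    ResourceBarrier = 15,
};

// Hypothetical node: the ops recorded on a command list and how many finished.
struct BreadcrumbNode {
    std::vector<BreadcrumbOp> commandHistory;  // ops in the order they were recorded
    std::uint32_t completedCount = 0;          // number of ops the GPU completed
};

struct BreadcrumbSplit {
    std::vector<BreadcrumbOp> reverseCompletedOps;  // most recently completed op first
    std::vector<BreadcrumbOp> outstandingOps;       // recorded order; [0] is the first suspect
};

BreadcrumbSplit SplitBreadcrumbs(const BreadcrumbNode& node) {
    BreadcrumbSplit split;
    const std::size_t total = node.commandHistory.size();
    // Clamp defensively in case the counter exceeds the history length.
    const std::size_t completed = std::min<std::size_t>(node.completedCount, total);

    // Walk the completed prefix backwards so the newest completed op comes first.
    for (std::size_t i = completed; i > 0; --i) {
        split.reverseCompletedOps.push_back(node.commandHistory[i - 1]);
    }
    // Everything after the completed prefix is not guaranteed to have finished.
    for (std::size_t i = completed; i < total; ++i) {
        split.outstandingOps.push_back(node.commandHistory[i]);
    }
    return split;
}

// Rebuilds the node from the walkthrough: 3 ops recorded, 1 completed.
BreadcrumbNode MakeWalkthroughNode() {
    BreadcrumbNode node;
    node.commandHistory = {BreadcrumbOp::ClearUnorderedAccessView,
                           BreadcrumbOp::CopyResource,
                           BreadcrumbOp::ResourceBarrier};
    node.completedCount = 1;
    return node;
}
```

Calling `SplitBreadcrumbs(MakeWalkthroughNode())` leaves ClearUnorderedAccessView as the only completed op and puts CopyResource at the front of the outstanding list, which is the op to look at first.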
Notice that PageFaultVA is not zero in the initial !d3ddred output. This indicates that the GPU faulted due to a read or write error (and that the GPU supports reporting of page faults). Beneath PageFaultVA are ExistingAllocations and RecentFreedAllocations. These contain arrays of allocations that match the faulting virtual address. Since ExistingAllocations has a count of 0, it is not interesting in this case. However, RecentFreedAllocations has two entries that match the faulting VA:
(*((ModelViewer!D3D12_DEVICE_REMOVED_EXTENDED_DATA1 *)0x7fffee841a08)).RecentFreedAllocations : Count: 2
[0x0] : 0x1e2e2599120 : [object Object] [Type: D3D12_DRED_ALLOCATION_NODE *]
[0x1] : 0x1e2e25990b0 : [object Object] [Type: D3D12_DRED_ALLOCATION_NODE *]
Allocation [0x0] is an internal heap object, and thus is not very interesting. However, allocation [0x1] reveals:
((ModelViewer!D3D12_DRED_ALLOCATION_NODE *)0x1e2e25990b0) : 0x1e2e25990b0 : [object Object] [Type: D3D12_DRED_ALLOCATION_NODE *]
[<Raw View>] [Type: D3D12_DRED_ALLOCATION_NODE]
ObjectName : 0x1e2ed352730 : "UAVBuffer01" [Type: wchar_t *]
AllocationType : D3D12_DRED_ALLOCATION_TYPE_RESOURCE (34) [Type: D3D12_DRED_ALLOCATION_TYPE]
So, a buffer named “UAVBuffer01” that mapped to the faulting VA was recently deleted.
The verdict in this case is that the CopyResource operation on CommandList “ClearBufferCL” tried to access buffer “UAVBuffer01” after it had been deleted.
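The reasoning behind that verdict can be written down as a small check. The sketch below is a hedged illustration of the triage logic only: `PageFaultSummary`, `LikelyUseAfterFree`, and `MakeWalkthroughSummary` are hypothetical names, and the struct is a flattened stand-in rather than the real DRED page fault output type.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, flattened view of the page fault portion of the DRED output.
struct PageFaultSummary {
    std::uint64_t pageFaultVA = 0;                     // 0 means no page fault was reported
    std::vector<std::wstring> existingAllocations;     // live objects matching the VA
    std::vector<std::wstring> recentFreedAllocations;  // recently freed objects matching the VA
};

// Heuristic from the walkthrough: a non-zero faulting VA that matches no live
// allocation but does match a recently freed one points at a use-after-free.
bool LikelyUseAfterFree(const PageFaultSummary& summary) {
    return summary.pageFaultVA != 0 &&
           summary.existingAllocations.empty() &&
           !summary.recentFreedAllocations.empty();
}

// Rebuilds the values seen in the walkthrough. The first freed entry is the
// internal heap object; its name is not shown in the output, so it is left empty.
PageFaultSummary MakeWalkthroughSummary() {
    PageFaultSummary summary;
    summary.pageFaultVA = 0x29b450000ULL;
    summary.recentFreedAllocations = {L"", L"UAVBuffer01"};
    return summary;
}
```

A positive result is only a hint. The breadcrumb data is still needed to tell which operation touched the freed object.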
Unfortunately, the public symbols for D3D12 do not include the type data needed by the D3DDred.js extension (type information is typically stripped from public OS symbols). Fortunately, D3DDred.js can usually work around this by searching through other loaded modules for the DRED data types. However, since older SDKs do not have the DRED types, this workaround requires building the app with the Windows 10 May 2019 SDK. The good news is that this has been addressed in the next OS release, and we are currently working to update the public symbols for May 2019 with the DRED data types.
As of the May 2019 SDK, the most efficient way to enable DRED is by using the DRED APIs. DRED must be enabled before creating the D3D12 Device.
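A minimal sketch of what that might look like, assuming the May 2019 SDK headers and the `ID3D12DeviceRemovedExtendedDataSettings` interface. The `EnableDred` wrapper name is hypothetical, and the non-Windows branch exists only so the snippet compiles everywhere. On Windows, link against d3d12.lib.

```cpp
#include <cassert>

#ifdef _WIN32
#include <d3d12.h>
#include <wrl/client.h>
#endif

// Hypothetical helper: turns on DRED auto-breadcrumbs and page fault reporting.
// Call it before D3D12CreateDevice. Returns true if the settings were applied.
bool EnableDred() {
#ifdef _WIN32
    Microsoft::WRL::ComPtr<ID3D12DeviceRemovedExtendedDataSettings> dredSettings;
    if (FAILED(D3D12GetDebugInterface(IID_PPV_ARGS(&dredSettings)))) {
        return false;  // the runtime does not expose the DRED settings interface
    }
    // FORCED_ON enables the feature regardless of system-wide settings.
    dredSettings->SetAutoBreadcrumbsEnablement(D3D12_DRED_ENABLEMENT_FORCED_ON);
    dredSettings->SetPageFaultEnablement(D3D12_DRED_ENABLEMENT_FORCED_ON);
    return true;
#else
    return false;  // DRED is a D3D12 feature; there is nothing to enable elsewhere
#endif
}
```

Because the settings only apply to devices created afterwards, a natural place for the call is at the very start of renderer initialization.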
Thanks for reading:
If TDRs are keeping you up at night, you want to use DRED, and you should check out the D3DDred.js debugger extension. As always, we look forward to your feedback and suggestions.