June 26th, 2019

Debugger Extension for DRED

Bill Kristiansen
Principal Developer

Microsoft recently announced the release of DRED (Device Removed Extended Data) for D3D12 in the Windows 10 May 2019 Update (previously referred to as the Windows 10 19H1 Preview).  Buried in that post is a mention that Microsoft is working on a debugger extension to help simplify post-mortem analysis of DRED data.  Good news: that debugger extension is now available on GitHub.  D3DDred.js is a JavaScript debugger extension for WinDbg (available here).  This extension makes it possible to examine the DRED output with clear context and a human-readable layout.

Why WinDbg? Besides being a powerful, lightweight debugger, WinDbg supports JavaScript extensions.  There is no need to configure build tools or run any installers to use D3DDred.js.  Simply load the script into WinDbg and you are ready to roll.  Using the WinDbg console, type:

.scriptload c:\my-windbg-extensions\d3ddred.js

When a TDR (Timeout Detection and Recovery) event occurs in an app with DRED enabled, the runtime preserves the DRED output in the application's heap memory.  With D3DDred.js loaded and WinDbg either attached to the process or opened on a heap dump of it, the DRED output can be viewed by running !d3ddred from the WinDbg console.

Example:

The following is an example using a busted version of Microsoft’s D3D12 ModelViewer sample.

0:000> !d3ddred
@$d3ddred()                 : [object Object] [Type: D3D12_DEVICE_REMOVED_EXTENDED_DATA1]
    [<Raw View>]     [Type: D3D12_DEVICE_REMOVED_EXTENDED_DATA1]
    DeviceRemovedReason : 0x887a0006 (The GPU will not respond to more commands, most likely because of an invalid command passed by the calling applicat [Type: HRESULT]
    AutoBreadcrumbNodes : Count: 1
    PageFaultVA      : 0x29b450000
    ExistingAllocations : Count: 0
    RecentFreedAllocations : Count: 2

In this example, there is only one AutoBreadcrumbNode object.  Clicking on AutoBreadcrumbNodes shows:

(*((ModelViewer!D3D12_DEVICE_REMOVED_EXTENDED_DATA1 *)0x7fffee841a08)).AutoBreadcrumbNodes                 : Count: 1
    [0x0]            : 0x1e2ed2dcf58 : [object Object] [Type: D3D12_AUTO_BREADCRUMB_NODE *]

Click [0x0]:

((ModelViewer!D3D12_AUTO_BREADCRUMB_NODE *)0x1e2ed2dcf58) : 0x1e2ed2dcf58                 : [object Object] [Type: D3D12_AUTO_BREADCRUMB_NODE *]
    [<Raw View>]     [Type: D3D12_AUTO_BREADCRUMB_NODE]
    CommandListDebugName : 0x1e2eceb04a0 : "ClearBufferCL" [Type: wchar_t *]
    CommandQueueDebugName : 0x1e2ecead4a0 : "CommandListManager::m_CommandQueue" [Type: wchar_t *]
    NumCompletedAutoBreadcrumbOps : 0x1
    NumAutoBreadcrumbOps : 0x3
    ReverseCompletedOps : [object Object]
    OutstandingOps   : [object Object]

This implies that queue “CommandListManager::m_CommandQueue” and command list “ClearBufferCL” contain the likely suspect operation.

The ReverseCompletedOps value is an array (in reverse order) of command list operations that completed without error:

((ModelViewer!D3D12_AUTO_BREADCRUMB_NODE *)0x1e2ed2dcf58)->ReverseCompletedOps                 : [object Object]
    [0x0]            : D3D12_AUTO_BREADCRUMB_OP_CLEARUNORDEREDACCESSVIEW (13) [Type: D3D12_AUTO_BREADCRUMB_OP]

This shows that only one operation completed before faulting.  In this case it was a ClearUnorderedAccessView command.

The OutstandingOps value is an array (in normal forward order) of command list operations that are not guaranteed to have completed without error.

((ModelViewer!D3D12_AUTO_BREADCRUMB_NODE *)0x1e2ed2dcf58)->OutstandingOps                 : [object Object]
    [0x0]            : D3D12_AUTO_BREADCRUMB_OP_COPYRESOURCE (9) [Type: D3D12_AUTO_BREADCRUMB_OP]
    [0x1]            : D3D12_AUTO_BREADCRUMB_OP_RESOURCEBARRIER (15) [Type: D3D12_AUTO_BREADCRUMB_OP]

In most cases, the first outstanding operation is the strongest suspect.  The outstanding CopyResource operation shown here is in fact the culprit.
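
The relationship between the two views above can be sketched in a few lines of C++.  This is only an illustration, not the extension's actual code: it assumes the command list's recorded operations are available as a flat array together with a completed-operation count, and the BreadcrumbOp, BreadcrumbSplit and SplitBreadcrumbs names are hypothetical stand-ins rather than D3D12 types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a few D3D12_AUTO_BREADCRUMB_OP values (numbers taken from the output above).
enum class BreadcrumbOp : std::uint32_t {
    CopyResource = 9,
    ClearUnorderedAccessView = 13,
    ResourceBarrier = 15,
};

// Hypothetical result type mirroring the two views shown by !d3ddred.
struct BreadcrumbSplit {
    std::vector<BreadcrumbOp> reverseCompleted; // most recently completed op first
    std::vector<BreadcrumbOp> outstanding;      // recorded order; first entry is the prime suspect
};

// Split a command list's recorded op history at the completed-op count.
BreadcrumbSplit SplitBreadcrumbs(const std::vector<BreadcrumbOp>& history,
                                 std::size_t numCompleted)
{
    assert(numCompleted <= history.size());
    BreadcrumbSplit split;
    // Completed ops are history[0 .. numCompleted), reported newest first.
    for (std::size_t i = numCompleted; i > 0; --i)
        split.reverseCompleted.push_back(history[i - 1]);
    // Everything from numCompleted onward may not have finished.
    for (std::size_t i = numCompleted; i < history.size(); ++i)
        split.outstanding.push_back(history[i]);
    return split;
}
```

With the three operations from this example and a completed count of 1, reverseCompleted holds only ClearUnorderedAccessView, and outstanding begins with CopyResource, the operation to inspect first.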

Notice that PageFaultVA is not zero in the initial !d3ddred output.  This indicates that the GPU faulted due to a read or write error (and that the GPU supports reporting of page faults).  Beneath PageFaultVA are ExistingAllocations and RecentFreedAllocations.  These contain arrays of allocations that match the faulting virtual address.  Since the ExistingAllocations count is 0, it is not interesting in this case.  However, RecentFreedAllocations has two entries that match the faulting VA:

(*((ModelViewer!D3D12_DEVICE_REMOVED_EXTENDED_DATA1 *)0x7fffee841a08)).RecentFreedAllocations                 : Count: 2
    [0x0]            : 0x1e2e2599120 : [object Object] [Type: D3D12_DRED_ALLOCATION_NODE *]
    [0x1]            : 0x1e2e25990b0 : [object Object] [Type: D3D12_DRED_ALLOCATION_NODE *]

Allocation [0x0] is an internal heap object, and thus is not very interesting.  However, allocation [0x1] reveals:

((ModelViewer!D3D12_DRED_ALLOCATION_NODE *)0x1e2e25990b0)                 : 0x1e2e25990b0 : [object Object] [Type: D3D12_DRED_ALLOCATION_NODE *]
    [<Raw View>]     [Type: D3D12_DRED_ALLOCATION_NODE]
    ObjectName       : 0x1e2ed352730 : "UAVBuffer01" [Type: wchar_t *]
    AllocationType   : D3D12_DRED_ALLOCATION_TYPE_RESOURCE (34) [Type: D3D12_DRED_ALLOCATION_TYPE]

So, a buffer named “UAVBuffer01” that mapped to the faulting VA was recently deleted.

The verdict in this case is that the CopyResource operation on CommandList “ClearBufferCL” tried to access buffer “UAVBuffer01” after it had been deleted.
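
The same triage can be mimicked in code.  The sketch below assumes the freed allocations form a singly linked list of nodes, each carrying a name, a type and a next pointer; the AllocationNode struct, the Other enumerator and the FreedResourceNames helper are hypothetical stand-ins for illustration, not the D3D12 definitions.  It walks the list and collects the names of freed resources, skipping everything else.

```cpp
#include <string>
#include <vector>

// Stand-in allocation types. Resource uses the value shown in the output above;
// Other is a hypothetical placeholder for every non-resource type.
enum class AllocationType : int { Other = 0, Resource = 34 };

// Hypothetical stand-in for a DRED allocation node.
struct AllocationNode {
    const wchar_t* objectName;   // may be null for unnamed objects
    AllocationType type;
    const AllocationNode* next;  // null terminates the list
};

// Collect the debug names of all named resource allocations in the list.
std::vector<std::wstring> FreedResourceNames(const AllocationNode* head)
{
    std::vector<std::wstring> names;
    for (const AllocationNode* node = head; node != nullptr; node = node->next) {
        if (node->type == AllocationType::Resource && node->objectName != nullptr)
            names.emplace_back(node->objectName);
    }
    return names;
}
```

Fed a two-node list shaped like the example (a non-resource node followed by a resource named "UAVBuffer01"), the helper returns just that one name.  Naming D3D12 objects is what makes this kind of report readable.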

Symbols:

Unfortunately, the public symbols for D3D12 do not include the type data needed for the D3DDred.js extension (type information is typically stripped from public OS symbols).  Fortunately, D3DDred.js can usually work around this by searching through other loaded modules for the DRED data types.  However, since older SDKs do not have the DRED types, this workaround requires building the application with the Windows 10 May 2019 SDK.  The good news is that this has been addressed in the next OS release, and we are currently working to update the public symbols for May 2019 with the DRED data types.

Enabling DRED:

As of the May 2019 SDK, the most efficient way to enable DRED is by using the DRED APIs.  DRED must be enabled before creating the D3D12 Device.

// Requires <d3d12.h> and <atlbase.h> (for CComPtr).
CComPtr<ID3D12DeviceRemovedExtendedDataSettings> pDredSettings;
if (SUCCEEDED(D3D12GetDebugInterface(IID_PPV_ARGS(&pDredSettings))))
{
    // Turn on DRED auto-breadcrumbs and page fault reporting.
    pDredSettings->SetAutoBreadcrumbsEnablement(D3D12_DRED_ENABLEMENT_FORCED_ON);
    pDredSettings->SetPageFaultEnablement(D3D12_DRED_ENABLEMENT_FORCED_ON);
}

Thanks for reading:

If TDRs are keeping you up at night, you want to use DRED, and you should check out the D3DDred.js debugger extension.  As always, we look forward to your feedback and suggestions.

 
