PIX 2104.20: DirectX 12 Agility SDK Support, New Occupancy Graphs, Timing Capture Improvements
Today we released PIX version 2104.20, which can be downloaded here. This release coincides with the announcement of the DirectX 12 Agility SDK – read more about that announcement here.
We’ve added support for all new features in the Agility SDK, including Shader Model 6.6. We’ve added new occupancy graphs based on high frequency counter data for supported GPUs, and added support for Slim PDBs. We’ve also added an experimental new feature that allows you to attach PIX on Windows to an existing live process and take a GPU capture, and made improvements to the buffer formatter. Numerous improvements to Timing Captures are also provided, including enhancements to the Metrics View and the Sampling Profiler. Read on to learn more, and remember, as always, you can send us your feedback using the button in the top right corner of PIX!
Capture/Replay with the Agility SDK
Today at Game Stack Live, Microsoft announced the new DirectX 12 Agility SDK. With the Agility SDK, gamers and game developers have access to the latest and greatest DirectX features even sooner. This PIX release fully supports the Agility SDK and its features on all versions of Windows 10 that support the Agility SDK. That means you can take a GPU capture of any application that uses the Agility SDK and play it back in PIX.
HLSL Shader Model 6.6
Today’s Agility SDK release introduces HLSL Shader Model 6.6. Shader Model 6.6 introduces the ability to directly index into descriptor heaps from shaders without a root descriptor table, which PIX fully supports. PIX can even tell you which resources you accessed dynamically in your shaders:
PIX also supports other Shader Model 6.6 features, such as 64-bit integer and limited bitwise floating-point atomic operations – read more about the Shader Model 6.6 spec here.
This version of PIX on Windows also adds support for the new DXCompiler feature Slim PDBs. These optimized PDB files contain only the compiler version, shader sources, compile options, and defines, resulting in a 30% decrease in compile time vs -Zi and PDBs that are 89% smaller on average.
To enable shader debugging and viewing HLSL inline with GPU instructions, PIX has a new “Generate full PDB” option in the Shader “Commands” panel. This uses Edit-and-Continue to generate the lower-level debug information needed from the source code and compilation flags stored in the slim PDB. While this does take some time depending on the shader, we suspect the 30% decrease in shader cook time and the drastic decrease in PDB size are well worth it. PIX will report if the compiler shipped with PIX does not match the one used to generate the shader, and the matching compiler can be supplied if desired.
New Occupancy Graphs
We’ve added new occupancy graphs to PIX on Windows based on High Frequency Counter data. These are our long-term replacement for PIX’s existing Occupancy graph, which is being deprecated for several technical reasons.
The new graphs have two main features:
- A Wave Distribution graph, which helps you understand how many waves (or what percentage of possible waves) are active at any given point in time. On some hardware, this graph is broken down into per-shader-stage numbers.
- Additional graphs to help you understand what’s limiting your GPU’s occupancy at any given point in time.
Which counters are collected varies by GPU manufacturer. Right now, the new Occupancy Graphs are supported for NVIDIA and Intel, with support for more manufacturers coming in a future release. Many thanks to our IHV partners for making this possible!
Capture taken on an Intel® UHD Graphics 620
Capture taken on an NVIDIA RTX 3080
Attaching to Live Processes
Some users requested a way to launch their game through Visual Studio or another program, and then attach PIX to that live process to take a GPU capture. This is now possible – you can attach PIX on Windows to an existing live process and take a GPU capture if that process loaded the matching version of WinPixGpuCapturer.dll before creating its D3D12 device. Additional documentation can be found here.
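The requirement above (load WinPixGpuCapturer.dll before creating the D3D12 device) can be sketched roughly as follows. This is a minimal illustration, not PIX's documented method. `FindNewestGpuCapturer` and `LoadPixGpuCapturer` are hypothetical helper names, and the sketch assumes that each PIX version is installed into its own subfolder of a root directory and that folder names sort in version order (for example `2104.20` after `2010.26`). It also assumes you will attach with the newest installed PIX, since the DLL version must match the PIX you attach from. `LoadLibraryW` and `GetModuleHandleW` are standard Win32 calls.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>
#ifdef _WIN32
#include <windows.h>
#endif

namespace fs = std::filesystem;

// Hypothetical helper: returns the path to WinPixGpuCapturer.dll inside the
// subdirectory of `pixRoot` whose name sorts highest.
// Assumption: one subfolder per PIX version, with names that sort in version order.
std::optional<fs::path> FindNewestGpuCapturer(const fs::path& pixRoot)
{
    std::error_code ec;
    if (!fs::is_directory(pixRoot, ec)) return std::nullopt;

    std::optional<fs::path> newest;
    for (const auto& entry : fs::directory_iterator(pixRoot, ec)) {
        if (!entry.is_directory(ec)) continue;  // skip loose files
        if (!newest ||
            newest->filename().string() < entry.path().filename().string()) {
            newest = entry.path();
        }
    }
    if (!newest) return std::nullopt;
    return *newest / "WinPixGpuCapturer.dll";
}

// Hypothetical helper: call this before creating the D3D12 device.
// Returns true if the capturer DLL is loaded in this process.
bool LoadPixGpuCapturer(const fs::path& pixRoot)
{
    assert(!pixRoot.empty() && "pass the PIX install root");
#ifdef _WIN32
    // Nothing to do if the DLL is already present (e.g. the app was launched from PIX).
    if (GetModuleHandleW(L"WinPixGpuCapturer.dll") != nullptr) return true;
    auto dll = FindNewestGpuCapturer(pixRoot);
    return dll && LoadLibraryW(dll->c_str()) != nullptr;
#else
    return false;  // PIX on Windows only; other platforms are a no-op here
#endif
}
```

A game would call `LoadPixGpuCapturer` once during startup, ahead of device creation, typically only in development builds.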
We’ve added several buffer viewer enhancements, including:
- A new “globals” buffer that lets you define structures that are available to all buffers
- Import/export of buffer formats
- A better layout for the formats list and buttons, plus a splitter for size control
- The buffer format editor is now available in the Settings tab
- Auto-generated buffer viewer formatting for structured buffers
CPU Sampling Profiler C++ Source Code View
The Sampling Profiler that is built into Timing Captures now includes a C++ source view. The source view uses coloring to attribute the collected CPU samples to source lines.
Timing Capture Metrics View Improvements
The Metrics view in Timing Captures now contains a panel that displays all currently active metrics. The table of graphed metrics includes a checkbox that toggles whether a given metric is currently graphed, along with dropdowns to customize how individual metrics are graphed. A metric’s line style, color, and aggregation mode (minimum, maximum, average) can all be customized. You can also remove a metric from the active list.
In addition, the minimum and maximum y-axis values can now be specified per metric. The ability to customize the y-axis makes several analysis scenarios easier, including those in which a few outlying points obscure the differences in the majority of graph points.
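To see why a custom y-axis range helps when outliers are present, here is a small worked sketch. It is illustrative only and does not describe how PIX draws its graphs. `AxisRange`, `AutoFit`, and `Normalize` are hypothetical names, and the frame-time samples are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A y-axis range, in the metric's own units (e.g. milliseconds).
struct AxisRange { double min; double max; };

// Range a graph would use if it simply fit every sample.
// Assumption: a plain min/max fit, chosen only for illustration.
AxisRange AutoFit(const std::vector<double>& samples)
{
    assert(!samples.empty());
    auto [lo, hi] = std::minmax_element(samples.begin(), samples.end());
    return {*lo, *hi};
}

// Map a sample to a normalized vertical position in [0, 1].
// Samples outside the range are pinned to the nearest edge.
double Normalize(double value, AxisRange axis)
{
    assert(axis.max > axis.min);
    const double t = (value - axis.min) / (axis.max - axis.min);
    return std::clamp(t, 0.0, 1.0);
}
```

With samples `{16.1, 16.4, 16.8, 250.0}`, the auto-fit range is 16.1 to 250.0, so the 0.7 gap between 16.1 and 16.8 covers about 0.3% of the axis height. With a custom range of 16.0 to 17.0, the same gap covers about 70% of the axis, and the 250.0 outlier is pinned to the top edge.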
Timing Capture Migration Support
Previous versions of Timing Captures can now be migrated to the newest version. In some releases of PIX, the Timing Capture file format must be changed as new features are added. The 2104.20 release of PIX on Windows includes such a format change. PIX now includes support for migrating old captures forward to the current file format so they can be opened in the latest version of PIX. See the Convert menu on the Home tab:
- Add button to reload symbols to Timing Captures
- Fix UI hang when opening Timing Capture with very many PIX events
- Timing Captures: Add UI option to truncate module names in stack trees
- Timing Captures: Add option to clamp memory stack tree depth