PIX 2303.02: You asked, we listened! A bumper PIX release
This release has many new features and bug fixes in direct response to requests and feedback we’ve received from PIX users. Thank you for all of your feedback and suggestions so far, and please keep ’em coming! Ways to contact us include the DirectX Discord server (discord.gg/directx) and the feedback button in PIX.
This release includes:
- RayGen shader debugging
- DXR pipeline view improvements: see HLSL/DXIL and see resource accesses
- Plugins from AMD, NVIDIA, and Qualcomm supporting new features and new GPUs
- Capture/replay support for new D3D features
  - Enhanced Barriers
  - Independent Devices
  - Many other smaller features
- Misc new in-application PIX APIs (pix3.h)
- New image visualizers in the Texture Viewer
- Revamped child process debugging
- PIX event coloring in GPU Captures
- Event List and counter diff’ing between captures
- D3D12 resource/heap residency events
- Support for taking Timing Captures programmatically via pix3.h
- New Summary Layout page
- Metrics View Enhancements
  - Display units can now be customized
  - New histogram control
- C++ source code analysis for memory allocations
- Data type analysis in Timing Captures
- Tracking custom memory allocators using the PIXRecordMemoryAllocationEvent API
RayGen shader debugging
This release includes full support for debugging ray generation (DXR 1.0) shaders in PIX’s shader debugger. This builds on the preview build of PIX that we released in December, fixing the issues that many of you reported to us. Thank you again for your feedback, and please contact us again if you have any other issues or suggestions.
DXR pipeline view improvements
This release includes full support for “shader access tracking”, where PIX’s pipeline view tells you exactly which resources were accessed by each shader entry in your raytracing shader tables. This also builds on the preview build of PIX that we released in December, fixing the issues that many of you reported to us. Thank you again.
This release also includes the ability to see the HLSL and DXIL for your raygen and miss shaders.
Plugin support for new GPUs
This release includes new plugins from AMD, NVIDIA and Qualcomm, complementing the plugin from Intel that was updated in the last PIX release. Many thanks to our hardware partners for their ongoing support and collaboration!
The new AMD plugin adds support for the AMD Radeon RX 7900 XTX and AMD Radeon RX 7900 XT GPUs. The new NVIDIA plugin adds support for the NVIDIA 4000-series GPUs. The new plugins support all the latest PIX plugin features, including low-level hardware counters in the event list and hardware counter graphs in the PIX timeline.
The new Qualcomm plugin adds support for the GPU Power State Selection feature on Qualcomm Snapdragon 8cx Gen3 devices such as the Surface Pro 9 or the Windows Dev Kit 2023 Desktop PC.
The new NVIDIA plugin comes with some additional terms and restrictions. To view these, please read the EULA that you must accept when you install PIX.
Capture/replay support for new D3D features
In October we released a preview version of PIX with support for Enhanced Barriers and misc other D3D12 features. We’re pleased to say that today’s PIX release includes full support for Enhanced Barriers, those D3D12 features, and several newer D3D12 features too.
Here’s the full list of newly supported D3D12 features:
- Enhanced Barriers (ID3D12GraphicsCommandList7 and ID3D12Device8)
- Independent Devices (ID3D12DeviceFactory, ID3D12SDKConfiguration1, etc)
- ID3D12GraphicsCommandList8 and ID3D12GraphicsCommandList9
- ID3D12Device11 (CreateSampler2)
- Triangle fans
- Software command queues (D3D12_COMMAND_LIST_TYPE_NONE)
New In-Application APIs
This release also adds the following APIs that you can use in your game while taking programmatic GPU captures:
- PIXIsAttachedForGpuCapture(): this function returns true if the PIX UI (or pixtool) is running and is attached to the current process, and otherwise it returns false.
- PIXOpenCaptureInUI(): this function opens the specified file (either a GPU Capture or a Timing Capture) in the PIX UI.
This release also fixes some bugs in other PIX APIs. For example, it fixes a bug causing the HUD to be displayed on all windows even if SHOW_ON_TARGET_WINDOW is set.
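As a rough sketch of how these two calls might fit together, the example below takes a capture only when PIX is attached and then asks the PIX UI to open the result. The `PixHooks` wrapper, `CaptureOutcome`, and `CaptureAndOpen` are hypothetical names invented for this illustration. In a real title the wrapper members would forward to PIXIsAttachedForGpuCapture() and PIXOpenCaptureInUI(), and you should check pix3.h for their exact signatures and return types.

```cpp
#include <functional>
#include <string>

// Hypothetical wrapper around the two calls described above. In a real title these
// members would forward to PIXIsAttachedForGpuCapture() and PIXOpenCaptureInUI();
// the bool return types used here are an assumption made for the sketch.
struct PixHooks {
    std::function<bool()> isAttachedForGpuCapture;
    std::function<bool(const std::wstring&)> openCaptureInUI;
};

enum class CaptureOutcome { NotAttached, OpenFailed, Opened };

// Takes a capture only when PIX is attached, then asks the PIX UI to open it.
// `takeCapture` is a placeholder for the title's own programmatic capture code.
CaptureOutcome CaptureAndOpen(const PixHooks& pix,
                              const std::function<void(const std::wstring&)>& takeCapture,
                              const std::wstring& fileName) {
    if (!pix.isAttachedForGpuCapture()) {
        return CaptureOutcome::NotAttached;  // skip the capture work entirely
    }
    takeCapture(fileName);
    return pix.openCaptureInUI(fileName) ? CaptureOutcome::Opened
                                         : CaptureOutcome::OpenFailed;
}
```

Keeping the PIX calls behind a small wrapper like this is one way to let the same code path run in builds where PIX is not present.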
New image visualizers in the Texture Viewer
This release adds two new highly requested visualizers to the texture viewer. They are designed to make it easier to find your draw: “Highlight current draw” and “Clear before draw”.
We plan to add more image visualizers like this to PIX later in 2023. If you have specific visualizations that you would like us to add then please contact us!
Revamped child process debugging
If you launch an application for GPU Capture via PIX, then PIX will inject itself into any child processes that your application launches. This PIX release fixes some long-standing bugs in this area, and it improves the user experience by letting you select which child process you want to capture:
The paintbrush icon next to a process means that it’s created a D3D12 device and is potentially capturable.
This functionality will be particularly helpful for multi-process scenarios, such as games with launchers or applications such as Microsoft Edge.
Event Coloring for GPU Captures
The PIXBeginEvent() and PIXSetMarker() APIs include a UINT64 color parameter. PIX’s GPU Captures will now display the color next to each event.
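If you don't already assign colors to your events, a small helper can keep them consistent across subsystems. The sketch below is only an illustration: `PackEventColor` and the two palette constants are hypothetical names, and the 0xAARRGGBB layout with opaque alpha is an assumption that you should confirm against the color helpers in pix3.h.

```cpp
#include <cstdint>

// Hypothetical helper: packs 8-bit channels as 0xAARRGGBB with opaque alpha.
// The layout is an assumption; confirm it against the color helpers in pix3.h.
constexpr std::uint64_t PackEventColor(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return 0xFF000000ull
         | (static_cast<std::uint64_t>(r) << 16)
         | (static_cast<std::uint64_t>(g) << 8)
         |  static_cast<std::uint64_t>(b);
}

// Example palette: one color per subsystem so related events are easy to spot.
constexpr std::uint64_t kShadowPassColor   = PackEventColor(0x40, 0x40, 0xC0);
constexpr std::uint64_t kLightingPassColor = PackEventColor(0xFF, 0xA0, 0x00);
```

A value such as `kShadowPassColor` would then be passed as the color argument to PIXBeginEvent() or PIXSetMarker().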
This feature is turned off by default for now. To turn on the feature, click on Home->Settings and check this box:
Event List Diffing (GPU Captures)
We’ve added a new diff tool to help compare the timing data between two similar GPU Captures. To use the tool, you’ll want to:
- Open the first capture in PIX, start analysis, then collect timing data.
- Right click the first capture’s event list, go to “Compare”, and set it as the left side for comparison.
- On AMD hardware: you must close the first capture at this point. This requirement will be removed in a future version of PIX.
- Open the second capture in PIX, start analysis, then collect timing data.
- Right click the second capture’s event list, go to “Compare”, and click “Compare to…”.
The captures don’t need to contain the exact same events in the same order, since the diff tool will try to line up equivalent events between captures. However, the diff tool will work best if you try to minimize the differences between the captures.
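To give a feel for what "lining up equivalent events" can mean, here is a minimal sketch that aligns two lists of event names using a longest-common-subsequence table. This is not PIX's algorithm, which these notes don't describe. `AlignEvents` is a hypothetical function, and matching on exact event names is an assumption made to keep the example short.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative only: aligns two event-name lists with a longest-common-subsequence
// (LCS) table. Returns (left index, right index) pairs for events treated as equivalent.
std::vector<std::pair<std::size_t, std::size_t>> AlignEvents(
    const std::vector<std::string>& left, const std::vector<std::string>& right) {
    const std::size_t n = left.size();
    const std::size_t m = right.size();
    // lcs[i][j] = length of the LCS of left[i..] and right[j..].
    std::vector<std::vector<std::size_t>> lcs(n + 1, std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t j = m; j-- > 0;) {
            lcs[i][j] = (left[i] == right[j])
                            ? lcs[i + 1][j + 1] + 1
                            : std::max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    // Walk the table from the start, emitting matched pairs in order.
    std::vector<std::pair<std::size_t, std::size_t>> matches;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < n && j < m) {
        if (left[i] == right[j]) {
            matches.emplace_back(i, j);
            ++i;
            ++j;
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            ++i;  // left[i] has no counterpart in the right capture
        } else {
            ++j;  // right[j] has no counterpart in the left capture
        }
    }
    return matches;
}
```

The fewer events that differ between the two lists, the more pairs an alignment like this can find, which is one intuition for why minimizing differences between captures helps.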
We plan to expand this feature in future releases. Please get in touch with your thoughts, feedback, and feature requests!
Resource Residency Events
In a previous release we added support for viewing information about D3D12 resource and heap objects to Timing Captures. In this release we’ve added support to view residency events related to those allocations. Specifically, the D3D12 MakeResident and Evict calls are displayed as well as PageIn and PageOut events, which indicate when memory is paged between GPU and system memory.
To gather this information, you must enable GPU Resources in Timing Capture options. Then, in the Timing Capture Range Details View you’ll see a new Residence Operations option in the Items to Show dropdown. You will also see a new Residence Operations lane in the timeline that shows the residency events.
Programmatic Timing Captures
This release adds support for taking Timing Captures programmatically via the PIXBeginCapture/PIXEndCapture APIs. This complements PIX’s existing support for taking GPU Captures programmatically via the same APIs.
To take a programmatic Timing Capture, you must do the following:
- Run your application as an administrator (also known as “elevated”).
- Load WinPixTimingCapturer.dll out of the PIX installation directory.
  - We have added the PIXLoadLatestWinPixTimingCapturerLibrary() API to pix3.h to simplify this. The API will find your newest installation of PIX and load WinPixTimingCapturer.dll out of it.
- Call PIXBeginCapture(PIX_CAPTURE_TIMING, params) to begin your capture. The params parameter is used to specify the file name and to specify features you want to capture.
- Call PIXEndCapture(PIX_CAPTURE_TIMING) to end your capture.
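The steps above can be sketched as a small helper that enforces the order of operations. Everything named here is hypothetical: `TimingCaptureApi` and `CaptureWorkload` are invented for this illustration, and the bool return values stand in for whatever result codes the real APIs produce. In a real title the three members would forward to the pix3.h functions listed above, so take the actual signatures and parameter structures from that header.

```cpp
#include <functional>
#include <string>

// Hypothetical wrapper over the calls listed in the steps above. The members are
// stand-ins for loading WinPixTimingCapturer.dll, beginning a timing capture, and
// ending it; their signatures here are assumptions, not the pix3.h declarations.
struct TimingCaptureApi {
    std::function<bool()> loadCapturerLibrary;
    std::function<bool(const std::wstring&)> beginCapture;
    std::function<bool()> endCapture;
};

// Runs `workload` inside a timing capture written to `fileName`.
// Returns false if any step fails; the workload runs only if the capture started.
bool CaptureWorkload(const TimingCaptureApi& pix, const std::wstring& fileName,
                     const std::function<void()>& workload) {
    if (!pix.loadCapturerLibrary()) {
        return false;  // the capturer DLL must be loaded before beginning a capture
    }
    if (!pix.beginCapture(fileName)) {
        return false;  // may fail if a prerequisite above is not met
    }
    workload();
    return pix.endCapture();
}
```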
New Summary Layout in Timing Captures
Timing Captures include a new Summary layout which provides a set of capture statistics and highlights aspects of performance that are likely candidates for additional investigation, such as the longest PIX events.
Metrics View Enhancements
Display units can now be customized
The display units for PIX events and for all memory-related metrics can be customized. For example, the duration, execution and stalled time for PIX events can now be graphed in milliseconds instead of the default nanoseconds.
New histogram control
When a metric is graphed in the Timing Capture Metrics View, a histogram is now created that shows the distribution of the values of the metric over the duration of the capture. Users can navigate from the histogram to the Timeline view for deeper analysis of the metrics for each histogram bucket.
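For readers who want a concrete picture of what a histogram of metric values is, the following sketch counts samples into equal-width buckets. It is an illustration only: `BuildHistogram` is a hypothetical function, and the bucket count and edge handling are assumptions rather than a description of how PIX builds its histogram.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative sketch: counts metric samples into equal-width buckets that span
// the range from the smallest to the largest sample.
std::vector<std::size_t> BuildHistogram(const std::vector<double>& samples,
                                        std::size_t bucketCount) {
    std::vector<std::size_t> buckets(bucketCount, 0);
    if (samples.empty() || bucketCount == 0) {
        return buckets;
    }
    const auto range = std::minmax_element(samples.begin(), samples.end());
    const double lowest = *range.first;
    const double width = (*range.second - lowest) / static_cast<double>(bucketCount);

    for (double value : samples) {
        // If every sample is equal the width is 0, so everything goes in bucket 0.
        std::size_t index =
            (width > 0.0) ? static_cast<std::size_t>((value - lowest) / width) : 0;
        // The largest sample falls on the upper edge; clamp it into the last bucket.
        index = std::min(index, bucketCount - 1);
        ++buckets[index];
    }
    return buckets;
}
```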
C++ source code analysis for memory allocations
The memory profiling features in PIX Timing Captures now include a source analysis tab that displays a source code listing that shows which lines within a selected function allocated memory.
Data type analysis in Timing Captures
The memory profiling features in PIX Timing Captures now include an analysis of all memory allocations by data type. The analysis shows the amount of memory allocated per data type, the amount of padding present in each type, and the overall amount of memory taken up by that padding. You can use the padding figures to find opportunities to eliminate unnecessary padding, thereby shrinking the size of the data type and reducing the amount of memory used.
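To illustrate the kind of padding this analysis can surface, here is a small sketch with two hypothetical structs that hold the same members in a different order. The type names and the padding constants are invented for this example, and the exact sizes depend on your compiler and target ABI.

```cpp
#include <cstddef>
#include <cstdint>

// Members ordered so that the compiler typically inserts padding around `timestamp`.
struct ParticleLoose {
    std::uint8_t kind;       // 1 byte, usually followed by padding to align the double
    double       timestamp;  // 8 bytes
    std::uint8_t flags;      // 1 byte, usually followed by tail padding
};

// Same members, largest first, so the two small fields can share one aligned slot.
struct ParticleTight {
    double       timestamp;
    std::uint8_t kind;
    std::uint8_t flags;
};

// Bytes in each struct that are not occupied by member data.
constexpr std::size_t kMemberBytes  = sizeof(double) + 2 * sizeof(std::uint8_t);
constexpr std::size_t kLoosePadding = sizeof(ParticleLoose) - kMemberBytes;
constexpr std::size_t kTightPadding = sizeof(ParticleTight) - kMemberBytes;
```

On a typical 64-bit ABI the first struct occupies 24 bytes and the second 16, so reordering the members saves 8 bytes per instance without removing any data.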
Tracking custom memory allocators using the PIXRecordMemoryAllocationEvent API
The new PIXRecordMemoryAllocationEvent and PIXRecordMemoryFreeEvent APIs provide the data that PIX needs to display information in Timing Captures about all memory allocations made from within your title’s custom memory allocators. Once you provide this data, PIX shows all the same data for your custom allocators that it does for calls to XMemVirtualAlloc/VirtualFree and HeapAlloc/HeapFree. These new APIs are part of the latest version of the WinPIXEventRuntime.
The following code snippet provides an example of using the new APIs to instrument your custom allocator.
void* TitleAllocate(size_t size, UINT64 metadata)
{
    void* pAddress = layer_allocate(size);
    if (pAddress != NULL) // record only allocations that succeeded
        PIXRecordMemoryAllocationEvent(TITLE_ALLOCATOR, pAddress, size, metadata);
    return pAddress;
}
void TitleFree(void* baseAddress, size_t size, UINT64 metadata)
{
    PIXRecordMemoryFreeEvent(TITLE_ALLOCATOR, baseAddress, size, metadata);
    // Release baseAddress through the underlying allocator here.
}
To see memory events corresponding to your calls to PIXRecordMemoryAllocationEvent and PIXRecordMemoryFreeEvent, select the Custom Allocator events check box before starting a timing capture as shown in the following figure.
New PresentMon counters in System Monitor
This version of PIX includes the latest version of PresentMon. This enables PIX to display far more PresentMon information in PIX’s System Monitor view: for example, you can now see which presentation mode was used by the target application and see graphs of important presentation statistics. Many thanks to the folks who maintain PresentMon on GitHub!
Other improvements and bug fixes
- GPU Captures:
- Show surrounding PIX Event Name in Resource History entries
- Show PSO, RTPSO, Root Signature, Local Root Signature in the Pipeline view
- Add more “Play” buttons to make it easier to start shader debugging
- Fix misc issues capture/replaying >4GB buffers
- Improve capture-time performance for multi-frame captures
- Improve stability while repeatedly taking programmatic GPU Captures
- Fix bug resulting in PIX sometimes not showing all valid vertex slot bindings
- Misc fixes to the Texture Viewer when using “Flip” checkboxes to flip the texture
- Avoid crash during SetStateObject() by avoiding modifying the app’s state object desc
- Improve timing data accuracy around command list boundaries
- Improve performance when copying data out of buffer viewer
- Mesh Viewer: Recenter the arcball camera when the window size changes
- Mesh Viewer: Support Azerty keyboard layouts in camera controls
- Fix capture/replay of video apps that leave resources in video-specific states
- WriteToSubresource(): fix capture-time box issue for mips > 0
- WriteToSubresource(): fix capture-time block-compressed issues
- Reduce chance of ETW hang while collecting Timing Data
- Fix the UV Atlas panel
- Fix WinPixGpuCapturer.dll mismatch error when attaching to a process in some arm64 scenarios
- Fix attach to processes for GPU Capture in some arm64 scenarios
- Support ID3D12DeviceRemovedExtendedData2
- Export to C++
- Fix null terminator issue when enumerating playback adapters
- Fix issue with acceleration structure input geometry
- Support for compute-only devices
- Fix some 11On12 scenarios by converting to a flip-model-compatible DXGI_FORMAT
- Timing Captures:
- Hash PIX events when there’s no correlated GPU execution
- Add --disable-gpu-plugins option to pixtool
- Fix misc relative path issues when launching/taking captures
- Add debug name for internal threads
- Perf fix by using the correct template specialization for simple PIX events and markers
- Fix misc compilation errors (e.g. with clang 15.0.1 or with GDK projects)
- Misc fixes/improvements
- Enable drag and drop of captures into PIX
- Fix misleading error when Developer Mode isn’t enabled
- Fix UI hyperlinks (e.g. to websites) on ARM64 (.net6) builds
- Remove commercial use clause from EULA
- Log far more capture time errors to the PIX output window
- Fix misc HoloLens-specific issues
- Update to D3D12 Agility SDK 1.608.2
- Update PIX to .NET6