This release of PIX includes a Preview version of a new implementation of Timing Captures. This new version is labeled a Preview both because various aspects of the new captures may change based on your feedback, and because the new implementation of Timing Captures does not yet include all of the features that existing Timing Captures have.
PIX timing captures record information about when each piece of work was carried out by the CPU and GPU. This data is gathered while the game is running, and with minimal overhead, so you can see things like how work is distributed across CPU cores, the latency between graphics work being submitted by the CPU and executed by the GPU, and how GPU rendering workloads are overlapping with async compute.
Timing captures display time-oriented data from a variety of sources both from within your title and from the system itself. Data from the instrumentation you’ve added to your code using PIX events and PIX markers will always be displayed. PIX can also optionally capture callstacks for context switches and CPU samples. These additional types of data enable more profiling scenarios, but also increase the overhead of collection.
The new implementation of Timing Captures offers several advantages over existing Timing Captures, including:
- Longer capture durations. Existing Timing Captures allow you to view a maximum of 2 seconds of profiling data. Capture durations are significantly extended in the new implementation. Capturing for durations on the order of hours is now supported.
- Faster capture opening times. New Timing Captures open much more quickly than existing Timing Captures. With larger captures, it’s not practical for PIX to show all the data immediately, but the goal is to show you enough data to get started within a few seconds.
- Improved analysis tools. A Metrics View has been added to Timing Captures to help you analyze the large volumes of data present in larger captures. This new view allows you to graph metrics such as event durations and counter values. Graphing multiple metrics together allows you to see correlations between the performance characteristics of different PIX events and counters in your title.
- Integration of additional data sources. The addition of counter data (both system-defined and title-defined) to Timing Captures enables you to correlate the value of various counters with the code that is executing in your title at any point in time. For example, you can correlate the value of title-defined metrics like “NumEnemies” with the PIX events in your title.
- Numerous improvements to the Timeline. The Timeline view includes numerous usability and navigation improvements. For example, you can now see Core and Thread activity together in one view, and context switches are easier to find and analyze.
New Timing Captures currently include a subset of the features present in existing PIX Timing Captures. The missing features will be added to New Timing Captures over the next several releases. The set of features currently missing from New Timing Captures is:
- GPU data
- Fence signals and waits
- GPU memory usage
- Tracked functions
- Stack Analysis view
- Warnings view
- Visualization that shows the correlation between GPU and CPU PIX events
- Execution times (in addition to duration) in the event list
- Setting a time limit for a capture
- Ready Thread visualization in Timeline
Taking a Capture
The Device Connection tab currently includes two buttons for taking Timing Captures. One button is used to take an existing Timing Capture while the other button is used to take a New Timing Capture. The ability to take either an old or a new capture will remain in PIX until the New Timing Captures contain all of the features the current captures do.
Before starting a new capture, select the capture options you’d like using the Options pane on the button labeled Start Timing Capture (Preview). In addition to specifying whether you’d like to collect CPU samples and callstacks for context switches, you can also specify a Capture Output Directory. While a capture is running, PIX stores the capture data in a file on your PC’s hard drive. Changing the default Capture Output Directory is useful for a few reasons. The first is available disk space. The capture data can be quite large for long-running captures; it’s not unusual to see capture files in the tens of GBs or more. If the drive that holds the Capture Output Directory may be too small, change the Capture Output Directory to one on another drive. You may also want to change the storage directory based on the type of disk hardware you have. While not required, using an SSD to store capture data is desirable from a performance perspective.
After you’ve set your capture options, select Start Timing Capture (Preview) to start the capture.
While a capture is running, the graphs in System Monitor are shaded green and the Start button changes to Stop. Pressing the Stop button completes the capture and opens it.
Timing Capture Views and Panels
The default layout for a New Timing Capture contains several UI views and panels. The Timeline Tab contains the Timeline View, Range Details, Element Details, and Lane Configuration.
- The Timeline View shows a graphical representation of title activity over the duration of the capture. Core lanes and Thread lanes allow you to see how work is distributed across CPU cores, what portions of your title are running at any point in time, and so on.
- The Range Details View shows the data in the Core and Thread lanes in tabular form for a specified time range. The Range Details view is useful for sorting by metrics like duration and for quickly navigating through all instances of a particular type of data.
- The Element Details View provides details about the element that is currently selected in the Timeline or in Range Details. For example, when a CPU sample is selected, Element Details displays the full callstack for the sample. The Element Details view also allows you to navigate to the selected element in the Timeline or to graph the element in the Metrics View.
- The Lane Configuration Panel is used to customize the way Core and Thread lanes are displayed in the UI. Several aspects of the UI can be configured, including the ordering of lanes, the type of data that is displayed in each lane, and how it is presented.
The Metrics Tab contains the Metrics View.
- The Metrics View is a graphical analysis tool aimed at helping you navigate large amounts of data to quickly find and diagnose anomalies in title behavior. The duration of PIX events and the values of title-defined counters can be graphed to more quickly spot correlations between different aspects of your title.
The Timeline View provides a graphical representation of title activity over the duration of the capture. Time-oriented data is shown in two different lane types: Core lanes and Thread lanes.
A ribbon displayed at the top of the Timeline shows the time range of the capture from beginning to end. The following Timeline shows a capture that is just under 59 minutes in duration.
As you zoom in and out with Ctrl+Mouse Wheel, or scroll horizontally, the ribbon updates to show the portion of the capture you’re currently viewing.
Depending on the size of your capture, and how far zoomed out you are, PIX may not be able to show all the details for every event, context switch, sample, and so on. When a capture is opened, PIX automatically scrolls to the middle of the capture and zooms in far enough that you can often see the details of several frames of data.
Then, as you zoom out with Ctrl+Mouse Wheel, PIX may get to a point where the data is too dense to show all the detail for every element. When this happens, PIX will aggregate data and display a tooltip showing you the amount of data that has been aggregated.
PIX automatically creates one Core lane for each CPU core on your PC. Core lanes allow you to see:
- which thread is running on each core at any point in time
- when context switches occur
- CPU events (created by calling PIXBeginEvent)
- CPU markers (created by calling PIXSetMarker)
- CPU samples
- API Markers used to populate GPU command lists
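As a rough illustration of the instrumentation that produces the CPU events and markers listed above, the following sketch wraps a hypothetical per-frame function in nested PIX events and emits a marker. The function name, event names, and colors are made up for the example. The fallback stubs exist only so the sketch builds without the PIX event runtime; they are not part of PIX.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_WIN32) && defined(USE_PIX)
// Real path: USE_PIX must be defined before pix3.h is included, and the
// project must link the WinPixEventRuntime library. Some SDK versions may
// also expect d3d12.h to be included first (assumption; check your setup).
#include <windows.h>
#include <pix3.h>
#else
// Fallback stubs, NOT part of PIX: they only count calls so this sketch
// compiles and runs on a machine without the PIX event runtime.
static int g_openEvents = 0;
static int g_markerCount = 0;
inline std::uint32_t PIX_COLOR(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return 0xff000000u | (std::uint32_t(r) << 16) | (std::uint32_t(g) << 8) | b;
}
inline void PIXBeginEvent(std::uint64_t /*color*/, const char* /*name*/) { ++g_openEvents; }
inline void PIXEndEvent() { --g_openEvents; }
inline void PIXSetMarker(std::uint64_t /*color*/, const char* /*name*/) { ++g_markerCount; }
#endif

// Hypothetical per-frame update; the event and marker names are made up.
void UpdateFrame(int enemyCount) {
    assert(enemyCount >= 0);
    PIXBeginEvent(PIX_COLOR(0, 128, 255), "Frame");     // outer event
    PIXBeginEvent(PIX_COLOR(255, 128, 0), "UpdateAI");  // nested inside Frame
    if (enemyCount == 0) {
        // A marker has no duration; it flags a single point in time.
        PIXSetMarker(PIX_COLOR(255, 0, 0), "NoEnemies");
    }
    PIXEndEvent();  // closes UpdateAI
    PIXEndEvent();  // closes Frame
}
```

Every PIXBeginEvent call needs a matching PIXEndEvent on the same thread; otherwise the event hierarchy shown in the lanes will not nest as intended.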
PIX assigns colors to Core and Thread lanes, and uses those colors to help you visualize which thread is running on each core. Hovering over a colored portion of a Core lane causes PIX to display a tooltip that identifies which thread is running. The color displayed on the Core lane matches the color of the Thread lane for the thread that is running on the core.
Sections of the Core lanes that are white indicate that no title code is running at that time. In this case, the tooltip will display the name of the process that is running.
Hovering just below a colored section of a Core lane displays a tooltip that shows the stack of PIX CPU events running at that time. In the following picture, Core 4 has switched from running the thread LD1 and is now running thread WL2. The colors of the events in the tooltip match the colors that were assigned to the event when PIXBeginEvent was called.
Context switches are identified by red vertical lines on the Core lanes. The following picture shows the context switch on Core 4 from thread LD1 to thread WL2. The tooltip displays some basic information about the context switch. More detailed information is provided in the Element Details view.
Several aspects of how the Core lanes are displayed can be configured using the Lane Configuration panel. You can choose which types of data to display, whether to display events as expanded or flattened and so on. In the following picture, the Core lanes have been configured to display all PIX events, PIX CPU Markers, API Markers and CPU samples. CPU Markers are drawn as vertical blue lines at the bottom of the Core lane. API Markers are drawn as vertical black lines at the bottom of the Core lane. CPU Samples are drawn as vertical black lines at the top of the Core lane.
PIX creates one thread lane for every thread running in the title process. Threads are identified in the UI by either their ID or their name. If you name your threads using the SetThreadDescription API, PIX will display the thread’s name in the UI. Unnamed threads are identified by their ID. Assigning names to your threads makes it significantly easier to find them in the capture.
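A minimal sketch of naming a thread with SetThreadDescription might look like the following. The helper name NameCurrentThread, the worker function, and the thread name are hypothetical, and the non-Windows branch is a no-op included only so the sketch builds on other platforms.

```cpp
#include <cassert>

#if defined(_WIN32)
#include <windows.h>
// SetThreadDescription requires Windows 10 version 1607 or later.
inline bool NameCurrentThread(const wchar_t* name) {
    assert(name != nullptr);
    return SUCCEEDED(SetThreadDescription(GetCurrentThread(), name));
}
#else
// No-op fallback so the sketch builds on other platforms.
inline bool NameCurrentThread(const wchar_t* name) {
    assert(name != nullptr);
    return true;
}
#endif

// Hypothetical worker entry point: name the thread before doing any work
// so that activity on it is recorded under a readable name.
void WorkerMain() {
    NameCurrentThread(L"Worker 1");
    // ... thread work goes here ...
}
```

Calling the helper at the very start of each thread's entry point keeps the naming in one place and makes it hard to forget for newly added threads.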
By default, PIX sorts threads for display in the Timeline by the amount of activity occurring on the thread. Threads with a large number of PIX events are sorted above those with relatively few events, for example. You can change this default sorting, along with several other aspects of how the thread lanes are displayed, using the Lane Configuration panel.
Thread lanes show you:
- which core a thread is running on at any point in time
- when context switches occur
- CPU events (created by calling PIXBeginEvent)
- CPU markers (created by calling PIXSetMarker)
- CPU samples
- API Markers used to populate GPU command lists
The following picture shows how these various types of data are displayed in the lane:
An indication of which core a thread is running on is displayed in a few different ways. First, when hovering over the Core indicator on a Thread lane, a tooltip is displayed identifying which core the thread is running on. The color of the core indicator matches the color of that core in the Core lane.
Core information is also shown when you select a PIX event in the Thread lane. When an event is selected, the Core lanes are updated to show where the thread is running. This visualization makes it easy to spot threads that aren’t affinitized to a single CPU core. In the picture below, an event named Frame is selected in the lane for Thread 1316. The Core lanes show that the Frame event starts on Core 5 but also switches to Cores 6, 0, and 3 while it is executing.
Unscheduled time (time when the thread is not running) is shown in the thread lane using the color red by default. Periods of unscheduled time always begin and end with a context switch. Note also that the periods of unscheduled time align with periods where the Core lanes indicate that the thread is not running.
The visualization that shows unscheduled time is on by default. The Lane Configuration panel allows you to turn off this visualization if you like. You can also change the color of the shading from red to either white or black.
Range Details View
The Range Details view displays a tabular representation of all timing data within a selected time range. Viewing data in a table is convenient for sorting the data based on criteria like start time or duration. The Range Details view also allows you to copy the contents of the table so it can be pasted into other analysis tools like Excel.
To select a time range, left click an area of the timeline and drag to the right while holding the mouse button down. A green highlight appears to track the range you’re selecting. The ribbon across the top of the capture also updates to show you the duration of your selected range. After releasing the mouse button, the Range Details view is populated with the data from all Core and Thread lanes in the range you selected.
A time range can also be specified by entering a start time and a duration directly in the Range Details view.
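To make the idea of a range defined by a start time and a duration concrete, the sketch below filters a list of events down to those that overlap such a range. It illustrates the concept only and is not how PIX is implemented. The TimedEvent type is hypothetical, and treating partially overlapping events as part of the range is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record for one timed event; all times are in nanoseconds.
struct TimedEvent {
    std::string name;
    std::uint64_t startNs;
    std::uint64_t durationNs;
};

// Returns the events that overlap the half-open range
// [rangeStartNs, rangeStartNs + rangeDurationNs).
std::vector<TimedEvent> EventsInRange(const std::vector<TimedEvent>& events,
                                      std::uint64_t rangeStartNs,
                                      std::uint64_t rangeDurationNs) {
    assert(rangeDurationNs > 0);
    const std::uint64_t rangeEndNs = rangeStartNs + rangeDurationNs;
    std::vector<TimedEvent> out;
    for (const TimedEvent& e : events) {
        const std::uint64_t endNs = e.startNs + e.durationNs;
        // Overlap test: the event starts before the range ends and
        // ends after the range starts.
        if (e.startNs < rangeEndNs && endNs > rangeStartNs) {
            out.push_back(e);
        }
    }
    return out;
}
```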
PIX places a limit on the amount of data that Range Details can hold. If you select a time range that contains too much data, Range Details will prompt you to select a smaller time range.
By default, Range Details displays data for PIX CPU events. Other types of data, such as CPU samples, context switches and so on, can be viewed by choosing the type in the dropdown in the upper right corner of the view.
The contents of the table can be sorted by clicking on one or more column headers. To sort by more than one column, hold down the Ctrl key while selecting. When sorting by multiple columns, PIX displays a number in the column that identifies the sort order. In the following picture, the table is sorted first by Thread and then by Duration.
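If you copy the table into your own tooling, the same two-level ordering can be reproduced with a lexicographic comparison. The sketch below sorts hypothetical rows first by thread and then by duration, both ascending. The EventRow type and its fields are assumptions for illustration, not a PIX export format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical row, loosely modeled on the columns described above.
struct EventRow {
    std::string thread;
    std::string name;
    std::uint64_t durationNs;
};

// Sorts by thread first; rows on the same thread are ordered by duration.
void SortByThreadThenDuration(std::vector<EventRow>& rows) {
    const auto less = [](const EventRow& a, const EventRow& b) {
        // std::tie compares the first element, then the second on a tie.
        return std::tie(a.thread, a.durationNs) < std::tie(b.thread, b.durationNs);
    };
    std::sort(rows.begin(), rows.end(), less);
    assert(std::is_sorted(rows.begin(), rows.end(), less));
}
```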
The selector pane on the left hand side of the view can be used to restrict the data displayed in the table to particular threads or cores. The contents of the selector pane change based on the type of data you’re viewing. For example, when viewing PIX events, the selector pane lets you filter by threads. When viewing context switches, the selector pane lets you filter by cores, and so on.
The colors in the selector pane and in the table match the colors of the threads and cores in the timeline. This visualization helps you see the relationships between the data displayed in Range Details and that displayed in the Timeline.
By default, the contents of Range Details remain synchronized with your selection in the timeline. There may be cases where you’d rather keep the contents of Range Details fixed, regardless of the range you have selected. To enable this behavior, uncheck the Sync with timeline selection checkbox in Display Options.
When viewing hierarchical data, such as PIX events, Range Details maintains the event hierarchy by default. If you’d rather view the data as a flat list, uncheck the Show hierarchy checkbox in Display Options.
The Element Details view provides detailed data on the currently selected item. The data shown in the Element Details view varies based on the type of element that is currently selected. For example, Element Details displays data including the thread, start time, duration and number of stalls when a PIX event is selected. A full callstack is shown when a CPU sample is selected, and so on.
Element Details also includes buttons to graph the duration of the selected element in the Metrics view or to jump to the element in the timeline, if applicable.
The Metrics view is a graphical tool that helps you analyze large volumes of data to quickly find areas of interest as they relate to performance, such as outliers in frame time or correlations between different portions of your title.
Various types of data can be graphed in the Metrics view, including the duration of CPU PIX events and the values of counters reported through PIXReportCounter.
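As a sketch of how a title-defined counter might be reported, the following wraps PIXReportCounter in a hypothetical helper that reports an enemy count once per frame. The helper name is made up, and the fallback stub is not part of PIX; it exists only so the sketch builds without the PIX event runtime.

```cpp
#include <cassert>
#include <cstddef>

#if defined(_WIN32) && defined(USE_PIX)
// Real path: USE_PIX must be defined before pix3.h is included, and the
// project must link the WinPixEventRuntime library.
#include <windows.h>
#include <pix3.h>
#else
// Fallback stub, NOT part of PIX: it remembers the last reported value so
// this sketch compiles and runs without the PIX event runtime.
static float g_lastCounterValue = 0.0f;
inline void PIXReportCounter(const wchar_t* /*name*/, float value) {
    g_lastCounterValue = value;
}
#endif

// Hypothetical helper: report the current enemy count, for example once
// per frame, so it can be graphed alongside PIX event durations.
void ReportEnemyCount(std::size_t numEnemies) {
    // Counter values are floats, which hold integers exactly only up to 2^24.
    assert(numEnemies <= (std::size_t(1) << 24));
    PIXReportCounter(L"NumEnemies", static_cast<float>(numEnemies));
}
```

Reporting the counter at a regular point, such as the start of each frame, should make it easier to line its values up against frame events in the graph.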
The Metrics view can be accessed in a few different ways. First, various places in the Timeline allow you to graph the duration of a PIX event in the Metrics view. The context menus in the Thread lanes, Core lanes and Range Details provide this capability as does the Graph in Metric View button in Element Details. When one of these options is chosen, the focus switches to the Metrics view and the duration of the selected PIX event is graphed.
Alternatively, the Metrics view can be accessed by clicking on the Metrics tab in a Timing Capture directly. When accessing the Metrics view in this way, the Metrics to graph are typically added using the Selector panel at the right hand side of the view.
The x-axis in the Metrics view represents time. The ribbon across the top of the view marks time from the beginning of the capture to the end. The y-axis is the value of the currently selected metric. The units on the y-axis vary based on the type of data you are graphing. For example, when graphing a counter such as CPU Busy% the units are percentages. When graphing the duration of a PIX event, the units are nanoseconds.
When hovering over a metric in the graph, a tooltip is displayed that shows the name of the metric along with the values of the x and y axes. Depending on the volume of data in the capture, the Metrics view graph may aggregate data points. When this occurs, the tooltip also includes the number of points that have been aggregated.
The Metrics view lets you zoom in and out using Ctrl+Mouse Wheel in the same way that the Timeline does. A context menu also allows you to zoom the Metrics view to a selected range, or to zoom the Timeline to a selected range.
A typical use of the Metrics view is to find areas of interest using the graph, then zoom the Timeline to that range to see more detail about what is happening in your title at that point in time.
The Selector Panel is used to choose the set of events and counters to graph. The available Metrics are grouped into two categories: Counters and PIX CPU Events. Title-defined counters (created through calls to PIXReportCounter) are under a node entitled Title Custom in the Counters group.
No Metrics are graphed by default (unless you’ve selected an event to graph in the Timeline).
To graph counters, find the counter you want either by expanding the Counters tree or by typing the name (or partial name) of the counter in the filter bar at the top of the panel. Once you’ve found the counters you’d like to graph, select the checkbox next to the name of each counter with either the mouse or the spacebar to add it to the graph.
The set of PIX CPU events that is available to graph is hidden by default. Because a title can contain a very large number of PIX events, listing them all by default would make the panel impractical to navigate. The easiest way to find particular PIX events is to use the filter bar at the top of the panel. After you enter your search text and press Enter, the Selector Panel is populated with all events and counters that match the search criteria. The following picture shows the set of events that were found when searching for the string AppController.
Select the checkbox next to the name of each event you’d like to graph. Clearing the checkbox removes the event from the graph.
Selecting the Show all checkbox next to CPU events will display the full set of events available to graph.
The Selector Panel has a context menu that allows you to customize various aspects of how individual Metrics are graphed. Line styles, colors, and the aggregation mode (min, max, average) can be customized.
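The choice of aggregation mode matters most when hunting for outliers. The sketch below is an illustration of the general idea, not a description of how PIX aggregates internally. It collapses fixed-size groups of data points into one value per group using min, max, or average. The function name and the fixed bucket size are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

enum class AggregationMode { Min, Max, Average };

// Collapses each run of `pointsPerBucket` values into a single value.
// The last bucket may hold fewer points than the others.
std::vector<double> AggregatePoints(const std::vector<double>& points,
                                    std::size_t pointsPerBucket,
                                    AggregationMode mode) {
    assert(pointsPerBucket > 0);
    std::vector<double> out;
    for (std::size_t start = 0; start < points.size(); start += pointsPerBucket) {
        const std::size_t end = std::min(start + pointsPerBucket, points.size());
        const auto first = points.begin() + static_cast<std::ptrdiff_t>(start);
        const auto last = points.begin() + static_cast<std::ptrdiff_t>(end);
        double value = 0.0;
        switch (mode) {
        case AggregationMode::Min:
            value = *std::min_element(first, last);
            break;
        case AggregationMode::Max:
            value = *std::max_element(first, last);
            break;
        case AggregationMode::Average:
            value = std::accumulate(first, last, 0.0) / static_cast<double>(end - start);
            break;
        }
        out.push_back(value);
    }
    return out;
}
```

For example, with frame durations of 16, 16, 16, 48, 16, and 17 ms grouped three at a time, the 48 ms spike survives intact under max, is diluted to 27 ms under average, and disappears entirely under min.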
Lane Configuration Panel
Several aspects of how PIX displays data in Thread and Core lanes can be configured. The color assigned to the lane, the types of data that are displayed, the pinning behavior, and so on, can all be changed either on a temporary or a more permanent basis.
To change the configuration for a single lane, select the down arrow next to the name of the lane. Doing so brings up a panel you can use to change the display settings for that lane.
Note that changing lane settings in this way applies only to the current UI session and the current capture. The updated settings are not preserved if you close and re-open PIX, or switch to a different capture.
PIX supports more permanent changes to the Timeline display settings through the concept of Configuration. A Configuration is a group of settings that are applied to all lanes in the Timeline. The settings for Thread lanes and Core lanes are configured separately.
Configurations are edited using the Lane Selector. To bring up the Selector, click on the Lane Selector icon in the upper left corner of the Timeline view.
PIX includes two configurations by default: “Cores pinned and flattened” and “Cores expanded, and API Queues pinned”. These default configurations are oriented around the type of data that is your primary focus when looking at a capture.
These built-in configurations can be edited, and new configurations can be created. Configurations can also be deleted.
To edit a configuration, select its name in the listbox and choose Edit … from the drop down menu.
The settings that can be configured are displayed in a dialog box. The settings on the left hand side of the dialog control the sort order of lanes in the capture. The settings on the right hand side of the dialog control the settings for Core and Thread lanes.
To create a new configuration, select New Configuration… from the drop down next to the Apply button. The same configuration editing dialog appears, allowing you to name your configuration and to customize the various settings.
To switch between configurations, select the configuration you’d like to use and hit the Apply button.
To restore the default configurations to their original state, select the Restore Defaults option from the dropdown menu next to the Apply button.
PIX remembers the configuration that was in use when you close a capture or close PIX itself. The next time you open a capture, the previous configuration is restored.
Note: Future versions of PIX will allow you to export configurations so they can be shared with others in your studio.
The Lane Selector panel also allows you to choose which specific lanes to display, to change the order in which they are displayed, and to specify whether an individual lane should be pinned.
Extracting Portions of a Capture
Sharing a timing capture, or a portion of a capture, is a common workflow in many studios. For example, one person may narrow in on a performance issue that they want another developer to investigate further. Given the size of a long timing capture, it may not be practical to share the entire file. Sharing the entire file also requires the person receiving it to know exactly where to look in the capture to find the issue to be debugged.
To facilitate sharing within a studio, PIX allows you to extract sections of larger captures. Extracting from a capture creates a new capture that contains only the selected time range. To extract a capture, select a region of time in either the Timeline or the Metrics view, right click, and choose Save Selected Range… from the context menu.
A File Save dialog will appear prompting you for the file name of the new, extracted capture. The new capture will be automatically opened in PIX.