Announcing new DirectX 12 feature – Video Encoding!

Sil Vilerino

Introduction

Today, DirectX 12 provides APIs that support GPU acceleration for several video applications, such as video decoding, video processing, and motion estimation, as detailed in the Direct3D 12 Video Overview.

We are happy to announce that D3D12 has added a new Video Encode feature to the existing video API families, with a new set of interfaces that allow developers to perform video encoding using GPU accelerated video engines. 

This feature provides a new way for apps to implement video encoding consistently with the DirectX 12 principles and style. 

Video Encode API Remarks 

In terms of data flow, the API takes video frames represented as ID3D12Resource textures and compresses each one into an ID3D12Resource buffer that contains the slice headers and payload of the encoded frame. Currently only the DXGI_FORMAT_NV12 and DXGI_FORMAT_P010 input formats are available, depending on driver support, so the API user may need to color convert and downsample the input content beforehand.
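To illustrate the color conversion and downsampling step mentioned above, here is a minimal CPU-side sketch that turns packed 8-bit RGB into a tightly packed NV12 buffer. It is not part of the Video Encode API. The `Rgb` struct and the `RgbToNv12` function are hypothetical names, the coefficients are the common BT.601 limited-range integer approximation, and chroma is downsampled by averaging each 2x2 block. A real application would more likely do this on the GPU and would have to respect the row pitch of the destination texture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packed 8-bit RGB pixel (hypothetical input layout for this sketch).
struct Rgb { std::uint8_t r, g, b; };

// BT.601 limited-range integer approximation (Y in 16..235, U/V in 16..240).
inline std::uint8_t ToY(int r, int g, int b) {
    return static_cast<std::uint8_t>(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
}
// The +32768 bias keeps the sum non-negative before the shift and
// contributes the usual +128 chroma offset after it.
inline std::uint8_t ToU(int r, int g, int b) {
    return static_cast<std::uint8_t>((-38 * r - 74 * g + 112 * b + 128 + 32768) >> 8);
}
inline std::uint8_t ToV(int r, int g, int b) {
    return static_cast<std::uint8_t>((112 * r - 94 * g - 18 * b + 128 + 32768) >> 8);
}

// Converts a width x height RGB image to tightly packed NV12:
// a full-resolution Y plane followed by a half-resolution interleaved UV plane.
std::vector<std::uint8_t> RgbToNv12(const std::vector<Rgb>& src,
                                    std::size_t width, std::size_t height) {
    assert(width % 2 == 0 && height % 2 == 0);  // 4:2:0 needs even dimensions
    assert(src.size() == width * height);

    std::vector<std::uint8_t> nv12(width * height * 3 / 2);
    std::uint8_t* yPlane = nv12.data();
    std::uint8_t* uvPlane = nv12.data() + width * height;

    // Luma: one sample per pixel.
    for (std::size_t i = 0; i < src.size(); ++i)
        yPlane[i] = ToY(src[i].r, src[i].g, src[i].b);

    // Chroma: average each 2x2 block, then store U and V interleaved.
    for (std::size_t by = 0; by < height; by += 2) {
        for (std::size_t bx = 0; bx < width; bx += 2) {
            int r = 0, g = 0, b = 0;
            for (std::size_t dy = 0; dy < 2; ++dy) {
                for (std::size_t dx = 0; dx < 2; ++dx) {
                    const Rgb& p = src[(by + dy) * width + (bx + dx)];
                    r += p.r; g += p.g; b += p.b;
                }
            }
            r = (r + 2) / 4; g = (g + 2) / 4; b = (b + 2) / 4;
            const std::size_t uv = (by / 2) * width + bx;
            uvPlane[uv] = ToU(r, g, b);
            uvPlane[uv + 1] = ToV(r, g, b);
        }
    }
    return nv12;
}
```

The resulting buffer would still need to be uploaded into an NV12 texture before it can be used as encoder input.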

The codecs available today are H264 and HEVC. Support for each codec and its encoding tools must be queried using ID3D12VideoDevice::CheckFeatureSupport, because availability depends on the driver.

The responsibility for handling the rest of the bitstream codec headers (i.e. SEI/VUI/VPS/SPS/PPS) is completely delegated to the user, who will generate and pack them into the final bitstream along with the compressed bitstream obtained from the GPU operation for each frame. 
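As a rough sketch of the packing step described above, the code below writes user-generated parameter sets in Annex B byte-stream form (start code, NAL header, escaped payload) and then appends the frame data read back from the GPU. The names `AppendNalUnit`, `AssembleAccessUnit`, and `NalUnit` are hypothetical and not part of the API. The sketch also assumes that the compressed frame buffer can be appended unchanged, which should be verified against the API documentation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Inserts emulation-prevention bytes (0x03) so the escaped payload never
// contains 00 00 00, 00 00 01, 00 00 02 or 00 00 03.
Bytes AddEmulationPrevention(const Bytes& rbsp) {
    Bytes out;
    out.reserve(rbsp.size() + rbsp.size() / 2);
    int zeros = 0;  // number of consecutive zero bytes written so far
    for (std::uint8_t b : rbsp) {
        if (zeros >= 2 && b <= 3) {
            out.push_back(0x03);
            zeros = 0;
        }
        out.push_back(b);
        zeros = (b == 0) ? zeros + 1 : 0;
    }
    return out;
}

// A user-generated header NAL unit: header bytes (1 for H264, 2 for HEVC)
// plus the unescaped payload produced by the application's bitstream writer.
struct NalUnit {
    Bytes header;
    Bytes rbsp;
};

// Appends one NAL unit to the stream with a 4-byte Annex B start code.
void AppendNalUnit(Bytes& stream, const Bytes& header, const Bytes& rbsp) {
    const std::uint8_t startCode[] = {0x00, 0x00, 0x00, 0x01};
    stream.insert(stream.end(), startCode, startCode + 4);
    stream.insert(stream.end(), header.begin(), header.end());
    const Bytes escaped = AddEmulationPrevention(rbsp);
    stream.insert(stream.end(), escaped.begin(), escaped.end());
}

// Builds the bytes for one frame: optional headers (for example SPS/PPS on a
// key frame) followed by the frame data obtained from the GPU operation.
// Assumption: gpuFramePayload is appended as-is.
Bytes AssembleAccessUnit(const std::vector<NalUnit>& headers,
                         const Bytes& gpuFramePayload) {
    Bytes stream;
    for (const NalUnit& nal : headers)
        AppendNalUnit(stream, nal.header, nal.rbsp);
    stream.insert(stream.end(), gpuFramePayload.begin(), gpuFramePayload.end());
    return stream;
}
```

Generating the actual SPS/PPS/VPS contents requires a bit-level writer that follows the codec specification, which is outside the scope of this sketch.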

Following DirectX 12 principles and style, reference frames are managed explicitly, and their memory is completely tracked by the API user. The decoded picture buffer (DPB) can be stored either in an array of textures or in a texture array, which gives the API user explicit memory budgeting and management as well as full control of the DPB size and the reference picture selection strategy. Other aspects of reference picture management, such as I/P/B picture type selection and B-frame group-of-pictures reordering, are also under the control of the user, who tracks the display order and the reference frame dependency topology and submits GPU operations in encode order with the desired picture type.
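The reordering from display order to encode order can be sketched as follows. This is a simplified illustration rather than code from the API: `DisplayToEncodeOrder` is a hypothetical helper, and it assumes that every B-frame references only the nearest I/P frame before and after it, so each run of B-frames is deferred until the next I/P frame has been submitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Takes picture types in display order (one character per frame: 'I', 'P' or
// 'B') and returns the display-order indices in the order they would be
// submitted for encoding.
std::vector<std::size_t> DisplayToEncodeOrder(const std::string& displayTypes) {
    std::vector<std::size_t> encodeOrder;
    std::vector<std::size_t> pendingB;  // B-frames waiting for their forward reference

    for (std::size_t i = 0; i < displayTypes.size(); ++i) {
        assert(displayTypes[i] == 'I' || displayTypes[i] == 'P' || displayTypes[i] == 'B');
        if (displayTypes[i] == 'B') {
            pendingB.push_back(i);
            continue;
        }
        // An I or P frame goes first, then the B-frames that depend on it.
        encodeOrder.push_back(i);
        encodeOrder.insert(encodeOrder.end(), pendingB.begin(), pendingB.end());
        pendingB.clear();
    }

    // Trailing B-frames have no forward reference; this sketch simply emits
    // them last. A real encoder would typically retype them as P-frames.
    encodeOrder.insert(encodeOrder.end(), pendingB.begin(), pendingB.end());
    return encodeOrder;
}
```

For the display sequence `IBBPBBP`, this yields the submission order 0, 3, 1, 2, 6, 4, 5: each P-frame is encoded before the two B-frames that precede it in display order.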

This API exposes a considerable number of configurable parameters that let the user tune different aspects of the encoding process to fit their scenario, such as: custom slice partitioning schemes, active (e.g. CBR, VBR, QVBR) and passive (absolute/delta custom QP maps) rate control modes, custom codec encoding tools, custom codec block and transform sizes, motion vector precision limits, explicit intra-refresh sessions, dynamic reconfiguration of video stream resolution, rate control, and slice partitioning, and more.
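To make the idea of a delta QP map more concrete, the sketch below builds a per-block map that lowers QP (raising quality) inside a region of interest and leaves the remaining blocks at a delta of zero. The helper `BuildDeltaQpMap`, the `Rect` type, the signed 8-bit element type, and the block size used in the usage note are all assumptions for illustration. The real block granularity and value range come from the driver and need to be queried through the API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Region of interest in pixels; right and bottom are exclusive.
struct Rect { std::size_t left, top, right, bottom; };

// Builds a row-major map with one delta QP value per blockSize x blockSize
// block. Blocks overlapping the region of interest get roiDelta, others 0.
std::vector<std::int8_t> BuildDeltaQpMap(std::size_t width, std::size_t height,
                                         std::size_t blockSize,
                                         const Rect& roi, std::int8_t roiDelta) {
    assert(blockSize > 0);
    // Round up so partially covered blocks at the frame edges are included.
    const std::size_t cols = (width + blockSize - 1) / blockSize;
    const std::size_t rows = (height + blockSize - 1) / blockSize;

    std::vector<std::int8_t> map(cols * rows, 0);
    for (std::size_t by = 0; by < rows; ++by) {
        for (std::size_t bx = 0; bx < cols; ++bx) {
            const std::size_t x0 = bx * blockSize, x1 = x0 + blockSize;
            const std::size_t y0 = by * blockSize, y1 = y0 + blockSize;
            const bool overlaps = x0 < roi.right && x1 > roi.left &&
                                  y0 < roi.bottom && y1 > roi.top;
            if (overlaps)
                map[by * cols + bx] = roiDelta;
        }
    }
    return map;
}
```

With a hypothetical 16-pixel block size, a 1920x1080 frame produces a 120x68 map (8160 entries), since the last row of blocks only partially covers the frame.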

This API also reports encode statistics and can be used together with the D3D12 SetPredication and timestamp features.

Video Encode API documentation resources 

The API usage is similar to other existing Video DirectX 12 features and the high-level steps are: 

The detailed API interfaces and structures definition can be found here.

For more design details and detailed documentation of the API, please refer to this document.

Video Encode API supported platforms 

The Video Encode API is included as part of Windows 11 and can also be found in the DirectX 12 Agility SDK (version 1.700.10-preview or newer).

Please see below the list of hardware platforms that currently have support for Video Encode for both H264 and HEVC codecs and their minimum driver version requirements. 

| Vendor | Supported platforms | Minimum video driver version |
|--------|---------------------|------------------------------|
| AMD | Radeon RX 5000 series or greater; Ryzen 2xxxx series or greater | In development – ETA Q2 2022 |
| Intel | Tiger Lake; Ice Lake; Alder Lake (from early 2022) | v30.0.100.9955 |
| NVIDIA | GeForce GTX 10xx and above; GeForce RTX 20xx and above; Quadro RTX; NVIDIA RTX | v471.41 |

5 comments


  • Mystery Man 0

    The subject of this article seems to be yet another API for accessing the intrinsic encoding capabilities of NVIDIA and Intel chips. Is there anything exciting about the subject of this article that makes it stand out from the competition? Is it easier to use? Does it offer broader platform support? (Or is it Windows 11-exclusive?)

  • Chandrasekaran Sengottiyan 0

    Hi,

    Is this supported in Windows on ARM devices?

  • Tommy Vercetti 0

    How about Windows on ARM? There is practically 0 documentation on hardware encoding for Windows on ARM devices.

    • Mystery Man 0

      It doesn’t support ARM.

      You probably know that Intel CPUs (Tiger Lake, Ice Lake, and Alder Lake) use the x86 architecture, not ARM. So, strike those items from the table. The only one remaining is NVIDIA, which so far doesn’t offer any ARM bindings. I imagine NVIDIA has suspended those developments pending its acquisition of Arm Ltd.

  • Ion Sorin Torjo 0

    Do you have any examples?
    Also, are you ever planning to create a friendly API for C#?
