Announcing new DirectX 12 feature – Video Encoding!
Today DirectX 12 provides APIs to support GPU acceleration for several video applications, such as video decoding, video processing and motion estimation, as detailed in Direct3D 12 Video Overview.
We are happy to announce that D3D12 has added a new Video Encode feature to the existing video API families, with a new set of interfaces that allow developers to perform video encoding using GPU accelerated video engines.
This feature gives apps a new way to implement video encoding that is consistent with DirectX 12 principles and style.
Video Encode API Remarks
In terms of data flow, the API takes video frames represented as ID3D12Resource textures and compresses each one into an ID3D12Resource buffer that contains the slice headers and payload of the encoded frame. Currently only DXGI_FORMAT_NV12 and DXGI_FORMAT_P010 are available as input formats, depending on driver support, so the API user may need to color convert and downsample the input content beforehand.
The available codecs today are H264 and HEVC. Specific support for each codec and its encoding tools must be queried using ID3D12VideoDevice::CheckFeatureSupport, because it depends on driver support.
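To give a feel for what the two input formats imply for the conversion step, here is a minimal sketch that computes how many bytes one 4:2:0 frame occupies in a tightly packed CPU-side buffer. The `PixelLayout` enum, the `PlaneSizes` struct and the `TightlyPackedSize` function are hypothetical names made up for this illustration and are not part of the D3D12 API. The sketch also ignores the row pitch and alignment that a real ID3D12Resource texture may require.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for DXGI_FORMAT_NV12 / DXGI_FORMAT_P010.
enum class PixelLayout { NV12, P010 };

struct PlaneSizes {
    std::size_t lumaBytes;    // full-resolution Y plane
    std::size_t chromaBytes;  // interleaved U/V plane, subsampled 2x2
    std::size_t totalBytes;
};

// Size of one tightly packed 4:2:0 frame (assumption: no row padding).
PlaneSizes TightlyPackedSize(PixelLayout layout, std::size_t width, std::size_t height) {
    // 4:2:0 subsampling needs even dimensions.
    assert(width % 2 == 0 && height % 2 == 0);

    // NV12 stores 8-bit samples; P010 stores 10-bit samples in 16-bit words.
    const std::size_t bytesPerSample = (layout == PixelLayout::NV12) ? 1 : 2;

    const std::size_t luma = width * height * bytesPerSample;
    // One U and one V sample for every 2x2 block of luma samples.
    const std::size_t chroma = (width / 2) * (height / 2) * 2 * bytesPerSample;

    return {luma, chroma, luma + chroma};
}
```

For a 1920x1080 frame this gives 3,110,400 bytes for NV12 and twice that for P010, which is useful when budgeting the staging memory for converted input.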
The responsibility for handling the rest of the bitstream codec headers (i.e. SEI/VUI/VPS/SPS/PPS) is completely delegated to the user, who will generate and pack them into the final bitstream along with the compressed bitstream obtained from the GPU operation for each frame.
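As a rough illustration of this split of responsibilities, the sketch below assembles the bytes for one frame from application-written header NAL units and the slice data produced by the GPU. It rests on several assumptions that should be checked against the API documentation: the header NAL units are already fully formed (NAL header byte and emulation prevention included) by some header writer of your own, the final stream uses the Annex B byte-stream format, and the GPU output buffer already holds complete slice NAL units in that same format. `Bytes`, `WithStartCode` and `AssembleFrame` are hypothetical names, not D3D12 functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Prefix one NAL unit with the 4-byte Annex B start code.
// Assumes `nal` already contains its NAL header and emulation-prevention bytes.
Bytes WithStartCode(const Bytes& nal) {
    Bytes out = {0x00, 0x00, 0x00, 0x01};
    out.insert(out.end(), nal.begin(), nal.end());
    return out;
}

// Build one frame: app-written headers first, then the GPU slice data.
// `gpuBytesWritten` is the number of valid bytes in the output buffer; in a
// real app it would come from the resolved encoder metadata (assumption).
Bytes AssembleFrame(const std::vector<Bytes>& headerNals,
                    const std::uint8_t* gpuData,
                    std::size_t gpuBytesWritten) {
    assert(gpuData != nullptr || gpuBytesWritten == 0);

    Bytes frame;
    for (const Bytes& nal : headerNals) {
        const Bytes framed = WithStartCode(nal);
        frame.insert(frame.end(), framed.begin(), framed.end());
    }
    // Copy only the bytes the encoder actually wrote, not the whole buffer.
    if (gpuBytesWritten > 0) {
        frame.insert(frame.end(), gpuData, gpuData + gpuBytesWritten);
    }
    return frame;
}
```

Headers such as SPS and PPS are typically only needed in front of key frames or when parameters change, so for most frames `headerNals` would simply be empty.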
Following DirectX 12 principles and style, reference frames are managed explicitly, and their memory is completely tracked by the API user. The DPB (decoded picture buffer) can be stored in either an array of textures or a texture array, which gives the API user explicit memory budgeting and management and full control of the DPB size and the reference picture selection strategy. Other aspects of reference picture management, such as I/P/B picture type selection and B-frame group-of-pictures reordering, are also under the user's control: the user tracks the display order and the topology of reference frame dependencies, and submits GPU operations in encode order with the desired picture type.
A considerable number of configurable parameters are exposed by this API so the user can tune different aspects of the encoding process to fit their scenario, such as: custom slice partitioning schemes, active (i.e. CBR, VBR, QVBR) and passive (absolute/delta custom QP maps) rate control configuration modes, custom codec encoding tools, custom codec block and transform sizes, motion vector precision limits, explicit usage of intra-refresh sessions, dynamic reconfiguration of video stream resolution, rate control and slice partitioning, and more.
This API also reports encode statistics and can be used along with the SetPredication and timestamp D3D12 features.
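Because the reordering is left to the application, it may help to see one simple way of turning display order into encode order. The sketch below assumes a fixed pattern with a given number of B-frames between anchor (I or P) frames, where each B-frame needs the next anchor to be encoded first. `FrameType`, `Frame` and `EncodeOrder` are hypothetical names for this illustration only, and real applications are free to choose a different picture type or reference strategy.

```cpp
#include <cassert>
#include <vector>

enum class FrameType { I, P, B };

struct Frame {
    int displayIndex;  // position in presentation order
    FrameType type;
};

// Returns the frames in the order they should be submitted for encoding.
// Pattern: the first frame is I, every (bFramesBetweenAnchors + 1)-th frame
// is P, and the frames in between are B.
std::vector<Frame> EncodeOrder(int frameCount, int bFramesBetweenAnchors) {
    assert(frameCount >= 0 && bFramesBetweenAnchors >= 0);

    std::vector<Frame> order;
    std::vector<Frame> pendingB;  // B-frames waiting for their future anchor
    const int anchorPeriod = bFramesBetweenAnchors + 1;

    for (int i = 0; i < frameCount; ++i) {
        const bool isLast = (i == frameCount - 1);
        if (i == 0) {
            order.push_back({i, FrameType::I});
        } else if (i % anchorPeriod == 0 || isLast) {
            // The last frame is promoted to P so that no B-frame is left
            // without a future reference.
            order.push_back({i, FrameType::P});
            // The anchor is now available, so the waiting B-frames follow it.
            order.insert(order.end(), pendingB.begin(), pendingB.end());
            pendingB.clear();
        } else {
            pendingB.push_back({i, FrameType::B});
        }
    }
    return order;
}
```

With seven frames and two B-frames between anchors, the display order I0 B1 B2 P3 B4 B5 P6 is submitted as I0 P3 B1 B2 P6 B4 B5.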
Video Encode API documentation resources
The API usage is similar to other existing DirectX 12 video features, and the high-level steps are:
- Call ID3D12VideoDevice::CheckFeatureSupport to check the support details for DirectX 12 video encoding operations. Pass a value from the D3D12_FEATURE_VIDEO enumeration to specify the feature for which you are requesting support information.
- Call ID3D12VideoDevice::CreateVideoEncoder to create an instance of the ID3D12VideoEncoder interface, which holds the state of the encoding session.
- Call ID3D12VideoDevice::CreateVideoEncoderHeap to create an instance of the ID3D12VideoEncoderHeap interface. This object contains the resolution-dependent driver resources and state.
- To encode a frame:
  - Fill in the D3D12_VIDEO_ENCODER_ENCODEFRAME_INPUT_ARGUMENTS and D3D12_VIDEO_ENCODER_ENCODEFRAME_OUTPUT_ARGUMENTS structures, which hold all input and output parameters of the encode operation, and use them to record the ID3D12VideoCommandList2::EncodeFrame command.
  - After the encoding operation, also record ID3D12VideoCommandList2::ResolveEncoderOutputMetadata to resolve the hardware-dependent encoding results from D3D12_VIDEO_ENCODER_RESOLVE_METADATA_INPUT_ARGUMENTS into D3D12_VIDEO_ENCODER_RESOLVE_METADATA_OUTPUT_ARGUMENTS.
- When the command list is properly recorded, call ID3D12CommandQueue::ExecuteCommandLists on the video command queue to submit the frame encoding and metadata resolving operations to the GPU.
The detailed API interface and structure definitions can be found here.
For more design details and detailed documentation for the API, please refer to this document.
Video Encode API supported platforms
The Video Encode API is included as part of Windows 11 and can also be found in the DirectX 12 Agility SDK (version 1.700.10-preview or newer).
Please see below the list of hardware platforms that currently have support for Video Encode for both H264 and HEVC codecs and their minimum driver version requirements.
|Vendor|Supported platforms|Minimum video driver version|
| | |In development – ETA Q2 2022|
The subject of this article seems to be yet another API for accessing the intrinsic encoding capabilities of NVIDIA and Intel chips. Is there anything exciting about the subject of this article that makes it stand out from the competition? Is it easier to use? Does it offer broader platform support? (Or is it Windows 11-exclusive?)
Is this supported in Windows on ARM devices?
How about Windows on ARM? There is practically 0 documentation on hardware encoding for Windows on ARM devices.
It doesn’t support ARM.
You probably know that Intel CPUs (Tiger Lake, Ice Lake, and Alder Lake) use the x86 architecture, not ARM. So, strike those items from the table. The only one remaining is NVIDIA, which so far doesn’t offer any ARM bindings. I imagine NVIDIA has suspended those developments pending its acquisition of Arm Ltd.
Do you have any examples?
Also, are you ever planning to create a friendly API for C#?