{"id":11067,"date":"2025-01-30T10:00:20","date_gmt":"2025-01-30T18:00:20","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/directx\/?p=11067"},"modified":"2025-09-25T10:17:24","modified_gmt":"2025-09-25T17:17:24","slug":"agility-sdk-1-716-0-new-d3d12-video-encode-features","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/directx\/agility-sdk-1-716-0-new-d3d12-video-encode-features\/","title":{"rendered":"Agility SDK 1.716.0-preview: New D3D12 Video Encode Features"},"content":{"rendered":"<p><strong>UPDATE 9\/25\/25: This feature is now supported in retail as of the <a href=\"https:\/\/www.nuget.org\/packages\/Microsoft.Direct3D.D3D12\/1.618.1\">1.618 Agility SDK<\/a>.<\/strong><\/p>\n<p>Today DirectX 12 provides APIs to support GPU video encode acceleration for several applications, as detailed in <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows-hardware\/drivers\/display\/video-encoding-d3d12\">D3D12 Video Encoding &#8211; Windows drivers | Microsoft Learn<\/a> and in previous blog posts such as <a href=\"https:\/\/devblogs.microsoft.com\/directx\/announcing-new-directx-12-feature-video-encoding\/\">Announcing new DirectX 12 feature &#8211; Video Encoding! &#8211; DirectX Developer Blog<\/a>.<\/p>\n<p>In this blog post we\u2019re happy to announce a series of new features included in the <a href=\"https:\/\/www.nuget.org\/packages\/Microsoft.Direct3D.D3D12\/1.716.0-preview\">Agility SDK 1.716.0-preview<\/a> that provide more control to apps using the D3D12 Video Encode API.
These new features help reduce latency and improve quality in several scenarios.<\/p>\n<h3>New feature list<\/h3>\n<ul>\n<li>Subregion notifications: Slice\/tile partial encoding and async completion signaling<\/li>\n<li>Dirty regions: Configurable skip encoding for frame regions<\/li>\n<li>Motion vector hints provided externally to the encoder<\/li>\n<li>Enhanced frame\/block statistics per encoded frame<\/li>\n<li>HEVC 4:2:2\/4:4:4 profiles support<\/li>\n<li>Readable DPB reconstructed pictures<\/li>\n<li>Input ID3D12Resource QP Map<\/li>\n<\/ul>\n<p>Let\u2019s take a look at each of the features. Except for the HEVC additions, all of them are codec-agnostic, meaning that their interfaces are defined without using any codec-specific structures. For a detailed description of the features and their interfaces, please refer to the new video specs uploaded to <a href=\"https:\/\/microsoft.github.io\/DirectX-Specs\/\">DirectX-Specs | Engineering specs for DirectX features.<\/a><\/p>\n<h3>Subregion notifications<\/h3>\n<p>Until now, when an app encoded a frame partitioned into multiple slices or tiles, it had to wait for the commands to finish executing on the GPU before it could access the compressed bitstream buffer for the entire frame containing all slices.<\/p>\n<p>With the subregion notifications feature, it is now possible to execute the <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/api\/d3d12video\/nf-d3d12video-id3d12videoencodecommandlist2-encodeframe\"><em>EncodeFrame<\/em><\/a> command with each slice\/tile written to a different <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/api\/d3d12\/nn-d3d12-id3d12resource\">ID3D12Resource<\/a> object (or to a single suballocated buffer) containing that subregion\u2019s bitstream, and to wait for completion on independent ID3D12Fence objects, one for each of the frame subregions.
The full frame metadata is still reported at the end of the frame encoding in <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/api\/d3d12video\/nf-d3d12video-id3d12videoencodecommandlist2-resolveencoderoutputmetadata\">ResolveEncoderOutputMetadata<\/a> as usual, but the subregion offsets\/sizes are reported asynchronously during <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/api\/d3d12video\/nf-d3d12video-id3d12videoencodecommandlist2-encodeframe\"><em>EncodeFrame<\/em><\/a> as their completion is signaled, allowing apps to start consuming these compressed bitstream buffers while the remaining subregions are still being encoded. This helps reduce latency in scenarios such as streaming, where completed subregions can now be sent over the network while the remaining subregions are still being encoded.<\/p>\n<h3>Dirty regions<\/h3>\n<p>In scenarios where most regions of a video don\u2019t change between consecutive frames (e.g. screen sharing) and the app knows the regions where small changes occur (the dirty regions), it is now possible to feed this information to the D3D12 Video Encoder so that the encoder can \u201cskip\u201d the regions that didn\u2019t change, accelerating the encode operation. This improves encoding speed compared with having the encoder re-scan the entire frame.
To avoid stalling the CPU\/GPU between frames when the input dirty regions are produced on the GPU timeline, the API supports both CPU and GPU buffer inputs, although initial driver support will be for CPU input only.<\/p>\n<h3>Motion vector hints<\/h3>\n<p>When the application driving the D3D12 video encoder knows how the content being encoded moves (for example, when a document is scrolled in a shared screen, or when the app rendering the content knows the exact motion vectors), it can now feed motion vector hints to the encoder to help accelerate the motion search process and guide the encoder toward higher-quality results. As with dirty regions, the API supports both GPU and CPU inputs for these motion hints; however, initial driver support will be for CPU inputs only.<\/p>\n<h3>Enhanced frame statistics<\/h3>\n<p>Three new optional stats can be collected at the end of the <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/api\/d3d12video\/nf-d3d12video-id3d12videoencodecommandlist2-resolveencoderoutputmetadata\">ResolveEncoderOutputMetadata<\/a> execution on the GPU: the quantization parameters (QPMap) used per block, the sum of absolute transformed differences (SATD) per block, and the rate-control bit allocation per block. These stats are provided as <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/api\/d3d12\/nn-d3d12-id3d12resource\">ID3D12Resource<\/a> GPU textures with per-block information that can be accessed on the GPU timeline.<\/p>\n<p>These new stats give the application more information, which it can use to tweak future frame parameters in a closed feedback loop. For example, by analyzing the SATD and the QP map, the application can identify blocks with high or low distortion and dynamically adjust these regions of interest in future frames (e.g.
using delta QP), while using the per-block bit counts as a guide to stay within the expected bitrate. Similarly, by analyzing the bits used per block, the app can identify regions of little interest that consume a disproportionate amount of bandwidth and dynamically adjust the QP map for those regions in future frames, improving bitrate usage.<\/p>\n<h3>HEVC 4:2:2\/4:4:4 profiles support<\/h3>\n<p>The D3D12 Encoding API has been extended to support the HEVC profiles for 4:2:2 and 4:4:4 subsampling formats and different color depths.<\/p>\n<h3>Readable DPB reconstructed pictures<\/h3>\n<p>Until now, D3D12 required the <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/api\/d3d12\/nn-d3d12-id3d12resource\">ID3D12Resource<\/a> containing the reconstructed pictures to have the <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/api\/d3d12\/ne-d3d12-d3d12_resource_flags\">D3D12_RESOURCE_FLAG_VIDEO_ENCODE_REFERENCE_ONLY<\/a> flag set, restricting access to them. Starting today, this restriction is no longer mandatory, and IHV drivers can optionally support regular <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/api\/d3d12\/nn-d3d12-id3d12resource\">ID3D12Resource<\/a> textures for DPB resources. This is useful for applications that want to preview the video being encoded in real time or calculate statistics from the reconstructed pictures without having to decode the compressed bitstream again.<\/p>\n<h3>Input ID3D12Resource for QPMap<\/h3>\n<p>Until now, the QPMap input provided to the encoder needed to be passed as a CPU array. From now on, the D3D12 encoder can also accept the QPMap input as an <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/api\/d3d12\/nn-d3d12-id3d12resource\">ID3D12Resource<\/a> GPU texture when the driver supports it.
This avoids stalling the CPU\/GPU between frames when the QPMap is adjusted based on output frame statistics or other statistics coming from the GPU pipeline.<\/p>\n<h1>Supported OS &amp; Hardware<\/h1>\n<p>This section lists the IHV drivers &amp; hardware that support these features, which are included in the <a href=\"https:\/\/www.nuget.org\/packages\/Microsoft.Direct3D.D3D12\/1.716.0-preview\">Agility SDK 1.716.0-preview<\/a> release.<\/p>\n<p>Please note that the &#8220;Subregion Notifications&#8221; feature requires Windows 11, version 24H2 or later. Additionally, certain hardware requires the latest updates from Windows Update, which include essential fixes for this feature.<\/p>\n<h4>NVIDIA<\/h4>\n<table style=\"width: 64.5094%;\" width=\"0\">\n<tbody>\n<tr>\n<td style=\"width: 30.3%;\">HEVC Feature<\/td>\n<td style=\"width: 39.881%;\">Supported platforms<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 30.3%;\">Subregion notifications<\/td>\n<td style=\"width: 39.881%;\">\n<ul>\n<li>NVIDIA will fully support this SDK release; please contact your developer relations representative for specifics.<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 30.3%;\">Dirty regions<\/td>\n<td style=\"width: 39.881%;\">\n<ul>\n<li>Ampere and newer<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 30.3%;\">Motion vector hints<\/td>\n<td style=\"width: 39.881%;\">\n<ul>\n<li>Pascal and newer<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 30.3%;\">Per-block output stats: SATD, bits usage and QP<\/td>\n<td style=\"width: 39.881%;\">\n<ul>\n<li>Ada and newer<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 30.3%;\">444 input texture support (DXGI formats AYUV, YUY2, Y210, Y410)<\/td>\n<td style=\"width: 39.881%;\">\n<ul>\n<li>NVIDIA will fully support this SDK release; please contact your developer relations representative for specifics.<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 30.3%;\">Readable DPB reconpic (NV12
only)<\/td>\n<td style=\"width: 39.881%;\">\n<ul>\n<li>Pascal and newer<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 30.3%;\">Input QPMap as GPU texture<\/td>\n<td style=\"width: 39.881%;\">\n<ul>\n<li>Pascal and newer<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h4>AMD<\/h4>\n<p>AMD driver support for the new video encode features can be found in the following <a href=\"https:\/\/www.amd.com\/en\/resources\/support-articles\/release-notes\/RN-RAD-MS-AGILITY-SDK-25-10-07-01.html\">developer preview<\/a>.<\/p>\n<table style=\"width: 64.5906%;\" width=\"451\">\n<tbody>\n<tr>\n<td style=\"width: 42.9887%;\" width=\"157\"><strong>H264<\/strong> Feature<\/td>\n<td style=\"width: 249.973%;\" width=\"294\">Supported platforms<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 42.9887%;\" width=\"157\">Subregion notifications<\/td>\n<td style=\"width: 249.973%;\" width=\"294\">\n<ul>\n<li>RX 7&#215;00<\/li>\n<li>7X4xHS Series Laptop APUs<\/li>\n<li>8X40 Series Laptop APUs<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 42.9887%;\" width=\"157\">Input QPMap as CPU\/GPU texture<\/td>\n<td style=\"width: 249.973%;\" width=\"294\">\n<ul>\n<li>RX 7&#215;00<\/li>\n<li>7X4xHS Series Laptop APUs<\/li>\n<li>8X40 Series Laptop APUs<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 42.9887%;\" width=\"157\">Dirty rects (Repeat frame, CPU input)<\/td>\n<td style=\"width: 249.973%;\" width=\"294\">\n<ul>\n<li>RX 7&#215;00<\/li>\n<li>7X4xHS Series Laptop APUs<\/li>\n<li>8X40 Series Laptop APUs<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 42.9887%;\" width=\"157\">Motion vectors (Full search, CPU input)<\/td>\n<td style=\"width: 249.973%;\" width=\"294\">\n<ul>\n<li>RX 7&#215;00<\/li>\n<li>7X4xHS Series Laptop APUs<\/li>\n<li>8X40 Series Laptop APUs<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<table style=\"width: 64.69%;\" width=\"451\">\n<tbody>\n<tr>\n<td style=\"width: 42.9764%;\" width=\"157\"><strong>HEVC 
<\/strong>Feature<\/td>\n<td style=\"width: 250.147%;\" width=\"294\">Supported platforms<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 42.9764%;\" width=\"157\">Subregion notifications<\/td>\n<td style=\"width: 250.147%;\" width=\"294\">\n<ul>\n<li>RX 7&#215;00<\/li>\n<li>7X4xHS Series Laptop APUs<\/li>\n<li>8X40 Series Laptop APUs<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 42.9764%;\" width=\"157\">Dirty rects (Repeat frame, CPU input)<\/td>\n<td style=\"width: 250.147%;\" width=\"294\">\n<ul>\n<li>RX 7&#215;00<\/li>\n<li>7X4xHS Series Laptop APUs<\/li>\n<li>8X40 Series Laptop APUs<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 42.9764%;\" width=\"157\">Motion vectors (Full search, CPU input)<\/td>\n<td style=\"width: 250.147%;\" width=\"294\">\n<ul>\n<li>RX 7&#215;00<\/li>\n<li>7X4xHS Series Laptop APUs<\/li>\n<li>8X40 Series Laptop APUs<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<table style=\"width: 64.6588%;\" width=\"451\">\n<tbody>\n<tr>\n<td style=\"width: 42.9764%;\" width=\"157\"><strong>AV1<\/strong> Feature<\/td>\n<td style=\"width: 255.137%;\" width=\"294\">Supported platforms<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 42.9764%;\" width=\"157\">Subregion notifications<\/td>\n<td style=\"width: 255.137%;\" width=\"294\">\n<ul>\n<li>RX 7&#215;00<\/li>\n<li>7X4xHS Series Laptop APUs<\/li>\n<li>8X40 Series Laptop APUs<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 42.9764%;\" width=\"157\">Dirty rects (Repeat frame, CPU input)<\/td>\n<td style=\"width: 255.137%;\" width=\"294\">\n<ul>\n<li>RX 7&#215;00<\/li>\n<li>7X4xHS Series Laptop APUs<\/li>\n<li>8X40 Series Laptop APUs<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 42.9764%;\" width=\"157\">Motion vectors (Full search, CPU input)<\/td>\n<td style=\"width: 255.137%;\" width=\"294\">\n<ul>\n<li>RX 7&#215;00<\/li>\n<li>7X4xHS Series Laptop APUs<\/li>\n<li>8X40 Series Laptop 
APUs<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h4><strong>Intel<\/strong><\/h4>\n<p>For Intel drivers, please contact your developer representative.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>UPDATE 9\/25\/25: This feature is now supported in retail as of the 1.618 Agility SDK. Today DirectX 12 provides APIs to support GPU video encode acceleration for several applications, as detailed in\u00a0D3D12 Video Encoding &#8211; Windows drivers | Microsoft Learn previous blog posts such as Announcing new DirectX 12 feature &#8211; Video Encoding! &#8211; DirectX [&hellip;]<\/p>\n","protected":false},"author":70024,"featured_media":2727,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-11067","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-directx"],"acf":[],"blog_post_summary":"<p>UPDATE 9\/25\/25: This feature is now supported in retail as of the 1.618 Agility SDK. Today DirectX 12 provides APIs to support GPU video encode acceleration for several applications, as detailed in\u00a0D3D12 Video Encoding &#8211; Windows drivers | Microsoft Learn previous blog posts such as Announcing new DirectX 12 feature &#8211; Video Encoding! 
&#8211; DirectX [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/directx\/wp-json\/wp\/v2\/posts\/11067","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/directx\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/directx\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/directx\/wp-json\/wp\/v2\/users\/70024"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/directx\/wp-json\/wp\/v2\/comments?post=11067"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/directx\/wp-json\/wp\/v2\/posts\/11067\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/directx\/wp-json\/wp\/v2\/media\/2727"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/directx\/wp-json\/wp\/v2\/media?parent=11067"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/directx\/wp-json\/wp\/v2\/categories?post=11067"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/directx\/wp-json\/wp\/v2\/tags?post=11067"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}