October 13th, 2022

DirectStorage 1.1 Coming Soon

Cassie Hoef
Principal Program Manager

When we shared our first public release of DirectStorage on Windows to reduce CPU overhead and increase IO throughput, we also shared that GPU decompression was next on our roadmap. We are now in the final stretch of development and plan to release DirectStorage 1.1 with GPU Decompression to developers by the end of 2022. This is one of our most highly requested features, so in the meantime, we want to share a sneak peek at what we’ve been up to and what developers can look forward to later this year!

For Gamers:

What is asset compression and how does GPU decompression change games?

Games require massive amounts of data to build immersive worlds – every character, object, and landscape has “assets” describing characteristics like shape, lighting, and color. This adds up to hundreds of gigabytes of data. To reduce the overall package size of a game, these assets are compressed. When a game is run, the assets are transferred to system memory, where the CPU decompresses the data before it is finally copied into GPU memory to be used as needed. The transfer and decompression of these assets on gaming devices contributes heavily to load times and limits how much detail can be included in open world scenes.
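To make the compress-then-decompress cycle concrete, here is a minimal sketch using Python's standard `zlib` module. The asset bytes are a made-up stand-in, not real game data, and the ratio it prints says nothing about what a real game would see.

```python
import zlib

# Hypothetical stand-in for a game asset: repetitive bytes compress well.
asset = b"vertex:0.0,1.0,0.0;normal:0.0,0.0,1.0;" * 4096

compressed = zlib.compress(asset, level=9)  # done ahead of time, when the game is packaged
restored = zlib.decompress(compressed)      # done at load time, traditionally on the CPU

assert restored == asset  # lossless: the round trip returns the exact original bytes
ratio = len(asset) / len(compressed)
print(f"{len(asset)} bytes -> {len(compressed)} bytes (ratio {ratio:.1f}:1)")
```

The `decompress` step is the work that today runs on the CPU every time an asset is loaded.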

DirectStorage 1.0 improves the data transfer part of this process. Advances in Windows 11 combined with DirectStorage allow developers to make use of the higher bandwidth of NVMe drives. DirectStorage enabled games installed on NVMe drives should expect to see reductions in load times by up to 40%. After enhancing this part of the pipeline, developers will want to improve decompression performance next.

Typically, decompression work is done on the CPU because compression formats have historically been optimized for CPUs only. We are offering an alternative method in DirectStorage 1.1 by moving the decompression of those assets to the GPU instead – known as “GPU decompression.” Graphics cards are extremely efficient at performing repeatable tasks in parallel, and we can utilize that capability along with the bandwidth of a high-speed NVMe drive to do more work at once. As a result, the amount of time it takes for an asset to load decreases, reducing level load times and improving open world streaming.

To get a more tangible sense for the possibilities, we built a highly optimized sample (below). It shows that when DirectStorage runs with GPU decompression instead of CPU decompression, scenes load nearly 3x faster and the CPU is almost entirely freed up for other game processes. When DirectStorage 1.1 is released, it will kick off a new journey for game developers to make full use of gaming hardware and speed up load times for PC games over the next few years.

GPU with GDeflate (left) loading in 0.8 seconds vs CPU with Zlib (right) in 2.36 seconds.
This is an early preview of the sample; performance numbers vary with different workloads/hardware.
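For readers who want to check the "nearly 3x" figure, the arithmetic can be reproduced from the two load times in the caption. This is only a back-of-the-envelope calculation on those two numbers, not a benchmark.

```python
gpu_seconds = 0.8   # GPU with GDeflate, from the sample caption
cpu_seconds = 2.36  # CPU with Zlib, from the sample caption

speedup = cpu_seconds / gpu_seconds         # how many times faster the GPU path loaded
time_saved = 1 - gpu_seconds / cpu_seconds  # fraction of the load time removed

print(f"speedup: {speedup:.2f}x")                 # speedup: 2.95x
print(f"load time reduced by {time_saved:.0%}")   # load time reduced by 66%
```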

Where does GPU decompression work?

Several factors affect game performance when it comes to compression/decompression. Here’s a breakdown of what works and what’s recommended:

OS: DirectStorage games will work on both Windows 10 and Windows 11, but there are additional optimizations in the IO stack available to Windows 11 users, so that is our recommended choice for the best improvements. Games running on both Windows 10 and Windows 11 will see gains from an efficient GPU decompression implementation, as the key component to this feature is moving the workload from the CPU to the GPU rather than changes to the OS itself.

Storage Device: DirectStorage enabled games will work on all storage devices. However, you’ll need an NVMe SSD, where the bandwidth capabilities are much higher and the storage media itself is faster, to see the significant improvements of DirectStorage. We highly recommend ensuring your game files are saved to an NVMe drive to get the best gaming experience.

GPU: Any DirectX 12 capable GPU that supports Shader Model 6.0 will be able to take advantage of the new feature. We recommend a DX12 Ultimate capable card.

For Developers:

Introducing GDeflate in DirectStorage

There are a wide variety of compression formats available – developers choose among these by considering the compression ratio and runtime performance of the codec. With DirectStorage 1.1, we present a new compression format, contributed by NVIDIA, called GDeflate.

“NVIDIA and Microsoft are working together to make long load times in PC games a thing of the past,” said John Spitzer, VP of Developer and Performance Technology at NVIDIA. “Applications will benefit by applying GDeflate compression to their game assets, enabling richer content and shorter loading times without having to increase the file download size.”

GDeflate is a novel lossless data compression standard optimized for high-throughput decompression on the GPU with deflate-like compression ratios. GDeflate saves CPU cycles by offloading costly decompression operations to the GPU, while saving system interconnect bandwidth and on-disk footprint at the same time. GDeflate compression is inherently data-parallel, which enables greater scalability across a wide range of GPU architectures. It is designed to provide significant bandwidth amplification when loading from the fastest NVMe devices, supporting both bulk-loading and fine-grained streaming scenarios.
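To illustrate what "data-parallel" means here, the sketch below compresses a buffer as independent chunks so that each chunk can be decompressed without waiting on any other. It is not GDeflate and does not use its bitstream: it uses plain `zlib` on CPU threads, and the chunk size and helper names are assumptions chosen for illustration only.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 64 * 1024  # illustrative chunk size (an assumption, not a GDeflate parameter)

def compress_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Compress each chunk on its own so chunks carry no dependency on each other."""
    return [zlib.compress(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def decompress_chunks(chunks: list, workers: int = 4) -> bytes:
    """Decompress chunks concurrently; map() returns results in chunk order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, chunks))

data = bytes(range(256)) * 2048  # 512 KiB of sample data
chunks = compress_chunks(data)
assert decompress_chunks(chunks) == data
print(f"{len(chunks)} independent chunks")
```

Because no chunk depends on another, the same layout idea lets many decoders work at once, which is the property a GPU can exploit at a much larger scale.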

GDeflate provides a new GPU decompression format that all hardware vendors can support and optimize for. Microsoft is working with key partners like AMD, Intel, and NVIDIA to provide drivers tailored for this format. “Intel is excited to release drivers co-engineered with Microsoft to work seamlessly with the DirectStorage Runtime to bring optimized GPU decompression capabilities to game developers!” said Murali Ramadoss, Intel Fellow and GM of GPU Software Architecture. Like all DirectX technologies, with DirectStorage, Microsoft is working to ensure that gamers have great options for compatibility and performance for their hardware.

What’s coming with DirectStorage 1.1?

When DirectStorage 1.1 is released, we will provide an updated DirectStorage SDK with everything developers need to get started with GPU decompression. Included with the SDK will be:

  • Updated DirectStorage runtime to perform decompression
  • Tooling for GDeflate, including compressors
  • Samples (including the Bulk Loading demo above)
  • Documentation

Hardware vendors may begin releasing drivers for DirectStorage 1.1 in the weeks before our release. Drivers from our key partners ensure that developers will see the full potential of GPU decompression. “DirectStorage 1.1 with GPU decompression will enable developers to unleash their creativity, delivering more detailed and visually stunning worlds,” said Scott Herkelman, senior vice president and general manager, Graphics Business Unit at AMD. “We have worked closely with Microsoft to ensure the best possible experience on AMD devices and platforms.”

If these drivers are not present, DirectStorage will fall back to an optimized DirectCompute implementation. Developers who plan to experiment with the next version of DirectStorage should keep an eye out for new and improved drivers.

Soon after the SDK release, we will publish Apache 2.0 licensed reference implementations of GDeflate compressors and decompressors, allowing tooling to be integrated with existing asset pipelines.

With a new compression format, tooling, and drivers in hand, developers will have the key ingredients to architect their engines to take advantage of GPU decompression.

Next Steps

We will be providing more API specifics and documentation with the upcoming release. In the meantime, here are a few ways developers can start planning now:

  • Refresh your memory on how your IO stack is currently working in your game engine’s architecture. Consider how your tooling pipeline may need to be adjusted.
  • Try out DirectStorage 1.0.
  • GPU decompression only works with requests that target GPU resources, so start utilizing:
    • DSTORAGE_REQUEST_DESTINATION_BUFFER
    • DSTORAGE_REQUEST_DESTINATION_TEXTURE_REGION
    • DSTORAGE_REQUEST_DESTINATION_MULTIPLE_SUBRESOURCES
    • DSTORAGE_REQUEST_DESTINATION_TILES
  • Set aside some R&D time after the release for understanding and experimenting with GPU decompression.
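As a planning aid for the destination-type item above, here is a small sketch of how a team might audit an inventory of its current requests. The real DirectStorage API is C++; this Python helper, its function name, and the asset names are hypothetical. The four GPU destination names are copied from the list above, and `DSTORAGE_REQUEST_DESTINATION_MEMORY` is the existing DirectStorage 1.0 destination type for system memory.

```python
# Destination types that target GPU resources (names taken from the list above).
GPU_DESTINATIONS = {
    "DSTORAGE_REQUEST_DESTINATION_BUFFER",
    "DSTORAGE_REQUEST_DESTINATION_TEXTURE_REGION",
    "DSTORAGE_REQUEST_DESTINATION_MULTIPLE_SUBRESOURCES",
    "DSTORAGE_REQUEST_DESTINATION_TILES",
}

def requests_needing_migration(requests):
    """Return the asset names whose destination is not a GPU resource."""
    return [name for name, dest in requests if dest not in GPU_DESTINATIONS]

# Hypothetical inventory of (asset name, destination type) pairs.
inventory = [
    ("terrain_albedo.dds", "DSTORAGE_REQUEST_DESTINATION_TEXTURE_REGION"),
    ("level_geometry.bin", "DSTORAGE_REQUEST_DESTINATION_MEMORY"),
    ("vertex_pool.bin", "DSTORAGE_REQUEST_DESTINATION_BUFFER"),
]

print(requests_needing_migration(inventory))  # ['level_geometry.bin']
```

Anything the audit flags is a request that lands in system memory today and would need to be retargeted before it could benefit from GPU decompression.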

Stay tuned to the blog and our Discord channel for more news. We look forward to getting DirectStorage 1.1 into developers’ hands so they can start integrating it with future games!

7 comments

  • Mattias Bilger

    “but there are additional optimizations in the IO stack available to Windows 11 users”

    What additional optimizations are we talking about here? Been using Windows 11 for quite some time and the OS is really much slower/bad experience to use than Windows 10…

  • ⸻ “How Things Work”

    But when is Soon™?

    • Wouter Zelle

      By the end of 2022.

  • JWeiler

    Are there any changes regarding the status of Bitlocker/fvevol.sys in the BypassIO model?

    fsutil bypassio state C:\
    BypassIo on “C:\” is partially supported
    Volume stack bypass is disabled (fvevol.sys)
    Storage Type: NVMe
    Storage Driver: BypassIo compatible

    Having games installed on a Bitlocker-encrypted drive is fairly common.

    • Daniel Thomas

      Exactly, there’s currently system conflicts with this feature that’s gonna prevent a lot of people from being able to access this. Even something like drive cloning software can block you from using this.

  • Alexander S

    Why ancient Deflate? Wouldn’t zstd be much, much better? For basically anything at any level of compression