DirectStorage is coming to PC

Andrew Yeung - MSFT

Earlier this year, Microsoft showed the world how the Xbox Series X, with its portfolio of technology innovations, will introduce a new era of no-compromise gameplay. Alongside the actual console announcements, we unveiled the Xbox Velocity Architecture, a key part of how the Xbox Series X will deliver next generation gaming experiences.

We’re excited to bring DirectStorage, an API in the DirectX family originally designed for the Velocity Architecture, to Windows PCs! DirectStorage will bring best-in-class IO tech to both PC and console, just as DirectX 12 Ultimate does with rendering tech. With a DirectStorage-capable PC and a DirectStorage-enabled game, you can look forward to vastly reduced load times and virtual worlds that are more expansive and detailed than ever.

In this blog post, we’re going to give gaming enthusiasts more details on how it’s going to work and how it will revolutionize PC gaming.


The evolution of storage technologies and game IO patterns

Recent advancements in SSD and PCIe technologies, specifically NVMe technologies, allow gaming PCs to have storage solutions that deliver far more bandwidth than was ever possible with older hard drive technologies. Instead of tens of megabytes per second, drives like the upcoming Xbox Series X console’s custom NVMe can deliver multiple gigabytes per second.

Game workloads have also evolved. Modern games load in much more data than older ones and are smarter about how they load this data. These data loading optimizations are necessary for this larger amount of data to fit into shared memory/GPU accessible memory. Instead of loading large chunks at a time with very few IO requests, games now break assets like textures down into smaller pieces, only loading in the pieces that are needed for the current scene being rendered. This approach is much more memory efficient and can deliver better looking scenes, though it does generate many more IO requests.

Unfortunately, current storage APIs were not optimized for this high number of IO requests. This prevents them from scaling up to these higher NVMe bandwidths and creates bottlenecks that limit what games can do. Even with super-fast PC hardware and an NVMe drive, games using the existing APIs will be unable to fully saturate the IO pipeline, leaving precious bandwidth on the table.

That’s where DirectStorage for PC comes in. This API is the response to an evolving storage and IO landscape in PC gaming. DirectStorage will be supported on certain systems with NVMe drives and work to bring your gaming experience to the next level. If your system doesn’t support DirectStorage, don’t fret; games will continue to work just as well as they always have.


What exactly will DirectStorage do for my PC gaming experience and how?

There are two primary areas this new API is going to improve: reducing frustratingly long load times of the past and enabling games to be more detailed and expansive than ever.

Although seemingly different, both benefits stem from the same IO system advancements that DirectStorage brings. Whether it’s the textures of your character’s clothing or the details of the mountains off in the distance, both fundamentally involve loading data from a storage device, and that data eventually needs to get to the GPU. The former just happens while on a loading screen, whereas the latter happens as you walk through an open world game that loads distant scenery in real time as it comes into view, while dumping things that drop out of view.

In either case, previous-generation games had an asset streaming budget on the order of 50MB/s, which even at smaller 64k block sizes (i.e. one texture tile) amounts to only hundreds of IO requests per second. With NVMe drives capable of multiple gigabytes per second, taking advantage of the full bandwidth quickly pushes this to tens of thousands of IO requests per second. Taking the Series X’s 2.4GB/s capable drive and the same 64k block sizes as an example, that amounts to >35,000 IO requests per second to saturate it.
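The request-rate figures above follow from dividing bandwidth by block size. Here is a minimal sketch of that arithmetic, assuming "MB" and "GB" mean decimal units (10^6 and 10^9 bytes) and "64k" means 65,536 bytes; the function name is just an illustrative choice.

```python
def requests_per_second(bandwidth_bytes_per_s: float, block_bytes: int) -> float:
    """IO requests per second needed to move data at the given bandwidth."""
    return bandwidth_bytes_per_s / block_bytes


BLOCK_BYTES = 64 * 1024  # assumption: "64k" means 64 KiB (one texture tile)

# Previous-generation streaming budget: on the order of 50 MB/s.
prev_gen_rate = requests_per_second(50e6, BLOCK_BYTES)

# Xbox Series X drive: 2.4 GB/s.
series_x_rate = requests_per_second(2.4e9, BLOCK_BYTES)

print(f"previous gen: ~{prev_gen_rate:.0f} requests/s")
print(f"Series X:     ~{series_x_rate:.0f} requests/s")
```

Under these unit assumptions, the first figure lands in the hundreds and the second lands above 35,000, matching the numbers quoted above.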

Existing APIs require the application to manage and handle each of these requests one at a time: first submitting the request, then waiting for it to complete, and then handling its completion. The overhead of each request is not very large and wasn’t a choke point for older games running on slower hard drives. Multiplied tens of thousands of times per second, however, IO overhead can quickly become too expensive, preventing games from taking advantage of the increased NVMe drive bandwidths.
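To see why a small per-request cost starts to matter, here is a rough sketch that multiplies a fixed overhead by the request rate. The 20 microsecond per-request cost is a purely hypothetical placeholder for illustration, not a measurement of any real API, and the two request rates are rounded stand-ins for "hundreds" and "tens of thousands" of requests per second.

```python
def cpu_core_fraction(requests_per_s: float, overhead_s: float) -> float:
    """Fraction of one CPU core spent purely on per-request overhead."""
    return requests_per_s * overhead_s


# Hypothetical per-request CPU cost; chosen only to illustrate the scaling.
ASSUMED_OVERHEAD_S = 20e-6

# Rounded, illustrative request rates (assumptions, not measurements).
old_rate = 750       # hundreds of requests per second
nvme_rate = 36_000   # tens of thousands of requests per second

old_cost = cpu_core_fraction(old_rate, ASSUMED_OVERHEAD_S)
nvme_cost = cpu_core_fraction(nvme_rate, ASSUMED_OVERHEAD_S)

print(f"old workload:  {old_cost:.1%} of one core")
print(f"NVMe workload: {nvme_cost:.1%} of one core")
```

Whatever the real per-request cost is, the total scales linearly with the request rate, so a cost that was negligible at hundreds of requests per second can become a large share of a core at tens of thousands.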

On top of that, many of these assets are compressed. In order to be used by the CPU or GPU, they must first be decompressed. A game can pull as much data off the disk as it wants, but you still need an efficient way to decompress and get it to the GPU for rendering. By using DirectStorage, your games are able to leverage the best current and upcoming decompression technologies.

In a world where a game knows it needs to load and decompress thousands of blocks for the next frame, the one-at-a-time model results in loss of efficiency at various points in the data block’s journey. The DirectStorage API is architected in a way that takes all this into account and maximizes performance throughout the entire pipeline from NVMe drive all the way to the GPU.

It does this in several ways: by reducing per-request NVMe overhead, by enabling batched, many-at-a-time parallel IO requests which can be efficiently fed to the GPU, and by giving games finer-grained control over when they get notified of IO request completion instead of having to react to every tiny IO completion.
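The difference between the one-at-a-time model and the batched model can be illustrated with a toy simulation. This is not the DirectStorage API, whose details have not been published here; every name below (`Counters`, `load_one_at_a_time`, `load_batched`, the batch size of 256) is hypothetical, and the "reads" are faked in memory. The sketch only counts how many times the application has to submit work and react to completions.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    submissions: int = 0    # times the app hands work to the IO layer
    notifications: int = 0  # times the app must react to a completion


def load_one_at_a_time(blocks: list[int], counters: Counters) -> list[str]:
    """Submit each block separately and handle each completion separately."""
    results = []
    for block in blocks:
        counters.submissions += 1
        results.append(f"data:{block}")  # stand-in for the actual read
        counters.notifications += 1
    return results


def load_batched(blocks: list[int], counters: Counters, batch_size: int) -> list[str]:
    """Submit blocks in batches and get one notification per batch."""
    results = []
    for start in range(0, len(blocks), batch_size):
        batch = blocks[start:start + batch_size]
        counters.submissions += 1
        results.extend(f"data:{b}" for b in batch)  # stand-in for the reads
        counters.notifications += 1
    return results


blocks = list(range(1000))
single, batched = Counters(), Counters()
single_results = load_one_at_a_time(blocks, single)
batched_results = load_batched(blocks, batched, batch_size=256)

print(single)   # one submission and one notification per block
print(batched)  # one submission and one notification per batch
```

Both paths return the same data, but the batched path touches the application far less often, which is the kind of saving the paragraph above describes.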

In this way, developers are given an extremely efficient way to submit/handle many orders of magnitude more IO requests than ever before, ultimately minimizing the time you wait to get in game and bringing you larger, more detailed virtual worlds that load in as fast as your game character can move through them.


Why NVMe?

NVMe devices are not only extremely high bandwidth SSD-based devices, but they also have hardware data access pipes called NVMe queues, which are particularly suited to gaming workloads. To get data off the drive, an OS submits a request to the drive and data is delivered to the app via these queues. An NVMe device can have multiple queues and each queue can contain many requests at a time. This is a perfect match to the parallel and batched nature of modern gaming workloads. The DirectStorage programming model essentially gives developers direct control over that highly optimized hardware.

In addition, existing storage APIs also incur a lot of ‘extra steps’ between an application making an IO request and the request being fulfilled by the storage device, resulting in unnecessary request overhead. These extra steps can be things like data transformations needed during certain parts of normal IO operation. However, these steps aren’t required for every IO request on every NVMe drive on every gaming machine. With a supported NVMe drive and a properly configured gaming machine, DirectStorage will be able to detect up front that these extra steps are not required and skip all the unnecessary checks/operations, making every IO request cheaper to fulfill.

For these reasons, NVMe is the storage technology of choice for DirectStorage and high-performance next generation gaming IO.


When can we expect more details?

For every DirectX family feature, Microsoft brings together the best of the PC gaming industry players to standardize new gaming features, make them available to game developers, and eventually get them into your gaming machines.

This process has already begun for DirectStorage and we’re working with our industry partners right now to finish designing/building the API and its supporting components. We’re targeting getting a development preview of DirectStorage into the hands of game developers next year.