Hardware Accelerated GPU Scheduling

Steve Pronovost

Abstract

You may have noticed a mysterious new optional feature called Hardware Accelerated GPU Scheduling appear in the advanced graphics settings page with the Windows 10 May 2020 update. The purpose of this blog is to give some background on this new feature and how we are introducing it. It is intended for folks curious about Windows internals. Remaining on the cutting edge of hardware innovation has always been a critical aspect of our graphics platform. Hardware Accelerated GPU Scheduling enables more efficient GPU scheduling between applications. For most users, this transition will be transparent. It is one of those things that if we do our job right, you will never know the transition happened. As the graphics platform continues to evolve, this modernization will enable new scenarios in the future.

WDDM GPU Scheduler

It has been almost 14 years since the introduction of the Windows Display Driver Model 1.0 (WDDM) and with it the introduction of GPU scheduling in Windows. Few likely remember the pre-WDDM days, when applications could simply submit as much work to the GPU as they wanted. They submitted to a global queue where work was executed in a strict “first to submit, first to execute” fashion. This very rudimentary scheduling scheme was workable at a time when most GPU applications were full-screen games, run one at a time.
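
To make the “first to submit, first to execute” behavior concrete, here is a minimal sketch of a single global queue. It is a toy model and not the actual pre-WDDM implementation; the application names and durations are made up for illustration.

```python
from collections import deque

def fifo_completion_times(submissions):
    """Run (app, duration_ms) jobs strictly in submission order.

    Returns a list of (app, finish_time_ms) tuples.
    """
    queue = deque(submissions)
    clock = 0
    finished = []
    while queue:
        app, duration = queue.popleft()  # oldest submission always runs next
        clock += duration
        finished.append((app, clock))
    return finished

# Hypothetical workload: a game submits a long batch just before
# the desktop submits a short one.
fifo_result = fifo_completion_times([("game", 30), ("desktop", 2)])
print(fifo_result)
```

In this model the short desktop job cannot finish until the long game job ahead of it is done, which is tolerable with one full-screen application and much less so with many applications sharing the GPU.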

With the transition to a broad set of applications using the GPU for richer graphics and animations, the platform needed to better prioritize GPU work to ensure a responsive user experience. Thus, the WDDM GPU scheduler was born.

Over time we have significantly enhanced the GPU scheduler at the heart of WDDM, supporting additional features and scenarios with each new WDDM version. However, throughout its evolution, one aspect of the scheduler was unchanged. We have always had a high-priority thread running on the CPU that coordinates, prioritizes, and schedules the work submitted by various applications.
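
The role of that CPU-side thread can be sketched as a priority queue that decides which submission runs next. This is only an illustration under simplifying assumptions: the class name, the "lower number means higher priority" convention, and the workload are all hypothetical, and the real scheduler also handles preemption and much more.

```python
import heapq
from itertools import count

class PriorityScheduler:
    """Toy CPU-side scheduler: highest-priority submission runs first."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker keeps FIFO order within a priority

    def submit(self, app, priority, duration_ms):
        # Assumption of this sketch: a lower number means a higher priority.
        heapq.heappush(self._heap, (priority, next(self._order), app, duration_ms))

    def run(self):
        clock = 0
        finished = []
        while self._heap:
            _, _, app, duration = heapq.heappop(self._heap)
            clock += duration
            finished.append((app, clock))
        return finished

sched = PriorityScheduler()
sched.submit("game", priority=1, duration_ms=30)
sched.submit("desktop", priority=0, duration_ms=2)
priority_result = sched.run()
print(priority_result)
```

Even though the game submitted first, the higher-priority desktop work runs ahead of it in this model, which is the kind of responsiveness a strict global queue cannot provide.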

This approach to scheduling the GPU has some fundamental limitations in terms of submission overhead, as well as latency for the work to reach the GPU. These overheads have been mostly masked by the way applications have traditionally been written. For example, an application would typically do GPU work on frame N, and have the CPU run ahead and work on preparing GPU commands for frame N+1. This buffering of GPU commands into batches allows an application to submit just a few times per frame, minimizing the cost of scheduling and ensuring good CPU-GPU execution parallelism.
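
A rough timing model shows why this run-ahead pattern gives good CPU-GPU parallelism. The sketch below assumes fixed per-frame CPU and GPU costs and lets the CPU run at most one frame ahead of the GPU; the function names and the numbers are illustrative assumptions, not measurements.

```python
def serial_time(frames, cpu_ms, gpu_ms):
    """No overlap: the CPU prepares a frame, then waits for the GPU to finish it."""
    return frames * (cpu_ms + gpu_ms)

def pipelined_time(frames, cpu_ms, gpu_ms):
    """The CPU prepares frame N+1 while the GPU executes frame N."""
    cpu_done = []
    gpu_done = []
    for i in range(frames):
        # The CPU starts frame i after it finished frame i-1 and the GPU
        # finished frame i-2 (at most one frame of run-ahead).
        prev_cpu = cpu_done[i - 1] if i >= 1 else 0
        older_gpu = gpu_done[i - 2] if i >= 2 else 0
        cpu_done.append(max(prev_cpu, older_gpu) + cpu_ms)
        # The GPU starts frame i once its commands exist and frame i-1 is done.
        prev_gpu = gpu_done[i - 1] if i >= 1 else 0
        gpu_done.append(max(cpu_done[i], prev_gpu) + gpu_ms)
    return gpu_done[-1]

print(serial_time(4, cpu_ms=6, gpu_ms=10))
print(pipelined_time(4, cpu_ms=6, gpu_ms=10))
```

With these made-up costs, the pipelined schedule keeps the GPU busy back to back after the first frame, so the total time approaches the GPU cost alone rather than the sum of both.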

An inherent side effect of buffering between CPU and GPU is that the user experiences increased latency. User input is picked up by the CPU during “frame N+1” but is not rendered by the GPU until the following frame. There is a fundamental tension between latency reduction and submission/scheduling overhead. Applications may submit more frequently, in smaller batches, to reduce latency, or they may submit larger batches of work to reduce submission and scheduling overhead.
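
The tension can be put into a very simple back-of-the-envelope model. Assume it takes the CPU `record_ms` to record a whole frame of commands, that the frame is split into equal batches, and that every submission costs a fixed `overhead_ms`. These parameters and the numbers below are assumptions made for illustration only.

```python
def submission_tradeoff(record_ms, overhead_ms, batches):
    """Return (total_overhead_ms, first_work_delay_ms) for one frame."""
    # Every submission pays the fixed scheduling cost.
    total_overhead = batches * overhead_ms
    # The GPU sees the first work only after one batch is recorded and submitted.
    first_work_delay = record_ms / batches + overhead_ms
    return total_overhead, first_work_delay

for batches in (1, 4, 16):
    print(batches, submission_tradeoff(record_ms=8, overhead_ms=0.5, batches=batches))
```

In this model, more batches get work to the GPU sooner but multiply the per-submission cost, so lowering that cost is what makes frequent, small submissions more attractive.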

Hardware-accelerated GPU scheduling

With the Windows 10 May 2020 update, we are introducing a new GPU scheduler as an opt-in option that is off by default. With the right hardware and drivers, Windows can now offload most of GPU scheduling to a dedicated GPU-based scheduling processor.

Windows continues to control prioritization and decide which applications have priority among contexts. We offload high frequency tasks to the GPU scheduling processor, handling quanta management and context switching of various GPU engines.
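
One way to picture this division of labor is as policy versus mechanism. In the toy sketch below, the priorities passed in stand for the decisions Windows keeps making, and the loop stands for the high-frequency time slicing handed to the scheduling processor. The function name, the fixed quantum, the round-robin rule within a priority level, and the workload are all assumptions of this sketch; the real hardware and driver behavior is not described in this post.

```python
from collections import deque

def run_quanta(contexts, quantum_ms):
    """contexts: list of (name, priority, work_ms); lower priority value runs first.

    Returns the executed time slices as (name, slice_ms) in order.
    """
    # Policy (decided by the OS in this sketch): group contexts by priority.
    levels = {}
    for name, priority, work_ms in contexts:
        levels.setdefault(priority, deque()).append([name, work_ms])

    # Mechanism (the offloaded part in this sketch): time-slice each level.
    timeline = []
    for priority in sorted(levels):
        ring = levels[priority]
        while ring:
            ctx = ring.popleft()
            slice_ms = min(quantum_ms, ctx[1])
            ctx[1] -= slice_ms
            timeline.append((ctx[0], slice_ms))
            if ctx[1] > 0:
                ring.append(ctx)  # context switch: back of the ring
    return timeline

quanta_timeline = run_quanta(
    [("video", 0, 3), ("game", 1, 5), ("compute", 1, 4)], quantum_ms=2
)
print(quanta_timeline)
```

The point of the sketch is only that the frequent, repetitive part (slicing and switching) can run without a CPU thread in the loop, while the infrequent decisions about who matters most stay with the OS.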

The new GPU scheduler is a significant and fundamental change to the driver model. Changing the scheduler is akin to rebuilding the foundation of a house while still living in it. To ensure a smooth transition we are introducing the new scheduler as an early-adopter, opt-in feature. During the transition we will gather large scale performance and reliability data as well as customer feedback.

We are adding UI to the Advanced Graphics Settings page to control enabling the new GPU scheduler. The settings page can be reached through Settings -> System -> Display -> Graphics Settings. If both your GPU and driver support the new GPU scheduler, an option to turn it on will appear on that page.

Which GPUs will support Hardware Scheduling?

The new GPU scheduler will be supported on recent GPUs that have the necessary hardware, combined with a WDDMv2.7 driver that exposes this support to Windows. Please watch for announcements from our hardware vendor partners on specific GPU generations and driver versions this support will be enabled for.

Hardware accelerated GPU scheduling is a big change for drivers. While some GPUs have the necessary hardware, the associated driver exposing this support will only be released once it has gone through a significant amount of testing with our Insider population.

If you are an Insider and have chosen to install a build of Windows from our Fast or Slow distribution ring, you have been running a version of Windows with support for hardware accelerated GPU scheduling. You may have even been part of our experimentation!

As we get under-development drivers from our GPU manufacturer partners, we publish these drivers to an Insider version of Windows Update (WU) where distribution is limited to the Insider population. In the Insider Fast Ring, we can run experiments where we silently toggle hardware accelerated GPU scheduling on for some users, so that we get a mix of users running with and without the new scheduler.

Through our experimentation platform and our telemetry system we can effectively run A/B experiments and see how systems running with hardware accelerated GPU scheduling compare to systems running our old GPU scheduler. We monitor reliability telemetry such as kernel crashes (bluescreens), user mode crashes, GPU hangs, and freezes/deadlocks, as well as a limited set of performance metrics.
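
As a rough illustration of what such a comparison involves, the sketch below normalizes an event count by usage so that cohorts of different sizes can be compared. The cohort names, the metric, and every number are invented for the example; they are not real telemetry, and a real experiment would also need a statistical significance test.

```python
def events_per_1000_hours(events, device_hours):
    """Normalize an event count by how long the cohort actually ran."""
    return 1000 * events / device_hours

# Invented numbers, for illustration only.
cohorts = {
    "hw_scheduling_on": {"gpu_hangs": 12, "device_hours": 40_000},
    "hw_scheduling_off": {"gpu_hangs": 20, "device_hours": 50_000},
}

hang_rates = {
    name: events_per_1000_hours(data["gpu_hangs"], data["device_hours"])
    for name, data in cohorts.items()
}
print(hang_rates)
```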

Once a driver completes support for hardware accelerated scheduling and accumulates enough execution time in our Insider Pool to demonstrate its reliability and performance, it is allowed to be promoted to the public version of Windows Update where it becomes available to everyone running the supported hardware.

Why not have hardware accelerated GPU scheduling on by default for all users given all the care taken before a driver can expose this support? Although we do a lot of validation through our Insider population, the number of system configurations and scenarios in the Insider population does not fully cover what can happen in our ecosystem of more than a billion devices. Because hardware accelerated GPU scheduling is such a fundamental pillar of the graphics subsystem and is used in absolutely everything that you do on your PC, we decided to introduce it initially as an opt-in to avoid any possible disruption. Users can opt in through the UI, and for new systems, OEMs are encouraged to configure and validate their systems with hardware accelerated GPU scheduling turned on from the factory.

What to expect when switching to the new GPU scheduler

The transition should be transparent, and users should not notice any significant changes. Although the new scheduler reduces the overhead of GPU scheduling, most applications have been designed to hide scheduling costs through buffering.

The goal of the first phase of hardware accelerated GPU scheduling is to modernize a fundamental pillar of the graphics subsystem and to set the stage for things to come… but that’s going to be a story for another time 😊.

We do not expect customers to experience performance regressions but if you encounter any, please be sure to file feedback at: https://aka.ms/submitgameperformancefeedback.

Thanks

Please give this a try and let us know what you think!

18 comments

  • Gaganjot Singh 0

    That’s an interesting feature by Windows 10. Will it increase hardware consumption on the GPU?

  • R. K. 0

    Just so you know. With the activated GPU Scheduling you may expect that your multi gpu setup will not work anymore in blender 2.8.3 / cuda / cycles.
    Took me a long time to figure it out. Because this feature will be activated automatically without informing you. Please stop doing this Microsoft.

    My output with blender --debug-cycles:

    CUDA error: Launch failed in cuGraphicsResourceGetMappedPointer(&buffer, &bytes, pmem.cuPBOresource), line 2000

    Refer to the Cycles GPU rendering documentation for possible solutions:
    https://docs.blender.org/manual/en/latest/render/cycles/gpu_rendering.html

    CUDA error: Launch failed in cuModuleGetFunction(&cuFilmConvert, cuModule, "kernel_cuda_convert_to_half_float"), line 1865

  • Chandan Nataraj 0

    Hi Steve,

    Thanks for the article.
    Is this going to affect TCC mode? Say I have two Windows applications that use CUDA and the GPU is set to use TCC mode. Is there a way to set priority for one of the processes to use the GPU over the other when there is a high-priority task?