DirectX ❤ Linux
DirectX is coming to the Windows Subsystem for Linux
At //build 2020 we announced that GPU hardware acceleration is coming to the Windows Subsystem for Linux 2 (WSL 2).
What is WSL? WSL is an environment in which users can run their Linux applications from the comfort of their Windows PC. If you are a developer working on containerized workloads that will be deployed in the cloud inside of Linux containers, you can develop and test these workloads locally on your Windows PC using the same native Linux tools you are accustomed to. In response to popular demand, these Linux applications and tools can now benefit from GPU acceleration.
The purpose of this blog is to give you a glimpse of how this support is achieved and how the various pieces fit together.
Over the last few Windows releases, we have been busy developing client GPU virtualization technology. This technology is integrated into WDDM (Windows Display Driver Model), and all WDDMv2.5 or later drivers have native support for GPU virtualization. This technology is referred to as WDDM GPU Paravirtualization, or GPU-PV for short. GPU-PV is now a foundational part of Windows and is used in scenarios like Windows Defender Application Guard, the Windows Sandbox, or the HoloLens 2 emulator. Today this technology is limited to Windows guests, i.e. Windows running inside of a VM or container.
To bring support for GPU acceleration to WSL 2, WDDMv2.9 will expand the reach of GPU-PV to Linux guests. This is achieved through a new Linux kernel driver that leverages the GPU-PV protocol to expose a GPU to user mode Linux. The projected abstraction of the GPU follows closely the WDDM GPU abstraction model, allowing API and drivers built against that abstraction to be easily ported for use in a Linux environment.
Introducing dxgkrnl (Linux Edition)
Dxgkrnl is a brand-new kernel driver for Linux that exposes the /dev/dxg device to user mode Linux. /dev/dxg exposes a set of IOCTLs that closely mimic the native WDDM D3DKMT kernel service layer on Windows. Dxgkrnl inside of the Linux kernel connects over the VMBus to its big brother on the Windows host and uses this VMBus connection to communicate with the physical GPU.
If the host has multiple GPUs, all GPUs are projected and available to the Linux environment (assuming all of these GPUs are running WDDMv2.9 drivers).
Applications running inside of the Linux environment have the same access to the GPU as native applications on Windows. There is no partitioning of resources between Linux and Windows or limit imposed on Linux applications. The sharing is completely dynamic based on who needs what. There are basically no differences between two Windows applications sharing a GPU versus a Linux and a Windows application sharing the same GPU. If a Linux application is alone on a GPU, it can consume all its resources!
Assuming you have the right GPU driver installed on the Windows host, /dev/dxg is automatically exposed and available to any installed WSL distro without having to install any additional packages. Note that the distro needs to be running in WSL version 2 mode (wsl --set-version <Distro> 2) in order to get access to the GPU.
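One quick way to confirm that the device has been projected into a distro is to look for the device node from user mode. The snippet below is a minimal sketch rather than an official tool: the helper name gpu_device_available is hypothetical, and it assumes /dev/dxg shows up as a character device node. It only checks that the node exists; it does not open it or issue any IOCTLs.

```python
import os
import stat


def gpu_device_available(path="/dev/dxg"):
    """Return True if `path` exists and is a character device node."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Node is missing: the distro may not be running in WSL 2 mode,
        # or the host may not have a suitable GPU driver installed.
        return False
    # Assumption: the device is exposed as a character device.
    return stat.S_ISCHR(mode)


if __name__ == "__main__":
    print("/dev/dxg available:", gpu_device_available())
```

If this reports False inside a distro, the first things to check are the WSL version of the distro and the driver version on the Windows host.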
Although they share a name, the version of dxgkrnl inside of the Linux kernel is a clean room implementation of a Linux GPU driver based on our GPU-PV protocol and doesn’t share anything else in common with its similarly named Windows counterpart. Dxgkrnl Linux edition is being made open source and shared back with the community. As we work on upstreaming this new driver, source code is available in Microsoft’s official Linux kernel branch for WSL 2.
DxCore & D3D12 on Linux
Projecting a WDDM-compatible abstraction for the GPU inside of Linux allowed us to recompile and bring our premier graphics API to Linux when running in WSL.
This is the real and full D3D12 API, no imitation, pretender, or reimplementation here… this is the real deal. libd3d12.so is compiled from the same source code as d3d12.dll on Windows but for a Linux target. It offers the same level of functionality and performance (minus virtualization overhead). The only exception is Present(). There is currently no presentation integration with WSL, as WSL is a console-only experience today. The D3D12 API can be used for offscreen rendering and compute, but there is no swapchain support to copy pixels directly to the screen (yet 😊).
DxCore (libdxcore.so) is a simplified version of DXGI where legacy aspects of the API have been replaced by modern versions. DxCore is available on both Windows and Linux. DxCore is also used to host a flat version of the D3DKMT API used by a WDDM-based driver on Windows to talk with the GPU. This API abstracts the differences in how the various WDDM services make their way to the kernel (service table on Windows versus IOCTL on Linux).
libd3d12.so and libdxcore.so are closed source, pre-compiled user mode binaries that ship as part of Windows. These binaries are compatible with glibc-based distros and are automatically mounted under /usr/lib/wsl/lib and made visible to the loader. In other words, these APIs work right out of the box without the need to install additional packages or tweak the distro’s configuration. Support is currently limited to glibc-based distros such as Ubuntu, Debian, Fedora, CentOS, and SUSE.
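To see whether these binaries have been mounted in a given distro, you can simply look for them at the path mentioned above. The following is a small illustrative sketch: the helper name find_wsl_libs is hypothetical, and the directory and file names are the ones given in the text.

```python
import os

# Mount point and library names as described in the text.
WSL_LIB_DIR = "/usr/lib/wsl/lib"
EXPECTED_LIBS = ("libd3d12.so", "libdxcore.so")


def find_wsl_libs(lib_dir=WSL_LIB_DIR, names=EXPECTED_LIBS):
    """Map each expected library name to its full path, or None if absent."""
    found = {}
    for name in names:
        candidate = os.path.join(lib_dir, name)
        found[name] = candidate if os.path.isfile(candidate) else None
    return found


if __name__ == "__main__":
    for name, path in find_wsl_libs().items():
        print(f"{name}: {path or 'not found'}")
```

A file being present only shows that the mount happened; it does not prove the library can be loaded, which also depends on the distro being glibc-based.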
D3D12 wouldn’t be able to operate without a GPU-specific user mode driver (UMD) provided by our GPU manufacturer partners. The UMD is responsible for things like compiling shaders to hardware-specific byte code and translating API rendering requests into actual GPU instructions in command buffers to be executed by the GPU. Working closely with us, our partners have recompiled their D3D12 UMD for a Linux target, enabling execution of these drivers in a WSL environment. This support is being integrated in upcoming WDDMv2.9 drivers such that GPU support in WSL is seamless to the end user. WDDMv2.9 drivers will carry a version of the DX12 UMD compiled for Linux. The host driver package is mounted inside of WSL at /usr/lib/wsl/drivers and directly accessible to the D3D12 API. If you have a WDDMv2.9 driver… the GPU magically shows up in WSL and becomes fully usable.
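If you are curious which host driver packages have been mounted, you can list the contents of that directory. This is a rough sketch only: the helper name list_driver_packages is hypothetical, and the text does not describe the layout under /usr/lib/wsl/drivers, so the code merely assumes each package appears as a subdirectory and makes no attempt to interpret the names.

```python
import os


def list_driver_packages(drivers_dir="/usr/lib/wsl/drivers"):
    """Return the sorted names of subdirectories under `drivers_dir`."""
    try:
        entries = os.listdir(drivers_dir)
    except OSError:
        # Directory is missing or unreadable: nothing was mounted here.
        return []
    # Assumption: one subdirectory per mounted driver package.
    return sorted(
        e for e in entries if os.path.isdir(os.path.join(drivers_dir, e))
    )


if __name__ == "__main__":
    for name in list_driver_packages():
        print(name)
```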
DirectML and AI Training
In addition to D3D12 and DxCore, we ported our machine learning API, DirectML, to work on Linux when running in WSL. We brought DirectML’s performant machine learning inferencing capabilities to Linux and expanded its functionality in support of training workflows too! DirectML sits on top of our D3D12 API and provides a collection of compute operations and optimizations for machine learning workloads.
The DirectML team has a goal of integrating these hardware accelerated inferencing and training capabilities with popular ML tools, libraries, and frameworks. In supporting training workflows with DirectML, we’re placing an initial focus on the ML workflows of students and beginners. We want to ensure university students and in-industry engineers can leverage the breadth of Windows hardware to learn and gain new ML skills. Utilizing DirectML provides these students and beginners with a simple path to leverage hardware acceleration in their existing systems, by tapping into their DirectX 12 capable GPU from their Linux-based ML tools running inside WSL 2.
As we’ve invested in expanding DirectML’s capabilities, the awesome co-engineering with our silicon partners has been essential for ensuring the breadth of GPUs in the Windows ecosystem benefit from these ML focused investments. This is why we are excited to announce training with DirectML will enter preview starting this summer!
In order to make it even easier for our customers to get started training with DirectML, we are releasing a preview package of TensorFlow with an integrated DirectML backend. Students and beginners will be able to get started with TensorFlow tutorials and build the foundation for their future. Plus, we’re engaging with the TensorFlow community and are going through the RFC process! Once the preview is publicly available, we will continue to make investments, adding new functionality to DirectML and continuing to improve its end-to-end training capability with TensorFlow, making your training workflows even better.
If you are interested in a sneak peek of DirectML hardware acceleration for training in action, check out the //build Skilling Session titled Windows AI: hardware-accelerated ML on Windows devices.
OpenGL, OpenCL & Vulkan
We have recently announced work on mapping layers that will bring hardware acceleration for OpenCL and OpenGL on top of DX12. We will be using these layers to provide hardware accelerated OpenGL and OpenCL to WSL through the Mesa library. After our work is done, WSL distros will need to update Mesa in order to light up this acceleration. For distros picking up this Mesa update, acceleration will automatically be enabled whenever a WDDMv2.9 driver or above is installed on the Windows host.
What about Vulkan? We are still exploring how best to support Vulkan in WSL and will share more details in the future. Besides, we can’t reveal all our plans at one time 😊.
What about the most popular compute API out there today, you ask?
We are pleased to announce that NVIDIA CUDA acceleration is also coming to WSL! CUDA is a cross-platform API and can communicate with the GPU through either the WDDM GPU abstraction on Windows or the NVIDIA GPU abstraction on Linux.
We worked with NVIDIA to build a version of CUDA for Linux that directly targets the WDDM abstraction exposed by /dev/dxg. This is a fully functional version of libcuda.so which enables acceleration of CUDA-X libraries such as cuDNN, cuBLAS, and TensorRT.
Support for CUDA in WSL will be included with NVIDIA’s WDDMv2.9 driver. Similar to D3D12 support, support for the CUDA API will be automatically installed and available on any glibc-based WSL distro if you have an NVIDIA GPU. The libcuda.so library gets deployed on the host alongside libd3d12.so, mounted and added to the loader search path using the same mechanism described previously.
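Because libcuda.so is placed on the loader search path, a program can probe for it the same way it would on any other Linux system. The sketch below is illustrative, not NVIDIA’s or Microsoft’s tooling: the helper name cuda_device_count is hypothetical. It loads the library with Python’s ctypes module and calls two functions from the CUDA driver API, cuInit and cuDeviceGetCount.

```python
import ctypes


def cuda_device_count(lib_name="libcuda.so"):
    """Return the number of CUDA devices, or None if the driver is unusable."""
    try:
        cuda = ctypes.CDLL(lib_name)
    except OSError:
        # Library is not on the loader search path.
        return None

    # cuInit(0) must succeed before any other driver API call.
    # A return value of 0 is CUDA_SUCCESS.
    if cuda.cuInit(0) != 0:
        return None

    count = ctypes.c_int(0)
    if cuda.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None
    return count.value
```

Calling cuda_device_count() returns None when the library cannot be loaded or initialized, so the same script can run unchanged on machines with and without an NVIDIA GPU.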
In addition to CUDA support, we are also bringing support for NVIDIA-docker tools within WSL. The same containerized GPU workload that executes in the cloud can run as-is inside of WSL. The NVIDIA-docker tools will not be pre-installed, instead remaining a user installable package just like today, but the package will now be compatible and run in WSL with hardware acceleration.
For more details and the latest on the upcoming NVIDIA CUDA support in WSL, please visit http://developer.nvidia.com/cuda/wsl.
At //build we announced that support for Linux GUI applications is coming to WSL. While today WSL is a console only experience, soon you’ll be able to use your favorite Linux IDE or other GUI application alongside your other Windows applications on your Windows desktop.
How are the pixels going to flow between Linux applications and the Windows desktop hosting them, and how are the various windows going to be integrated into a unified and seamless experience? That’s going to be a story for another time 😊.
Coming to WDDMv2.9
If you have read all the way to this point, you’re probably excited about all this stuff and now wondering when you’ll actually be able to start playing with it!
Support for DxCore, D3D12, DirectML and NVIDIA CUDA is coming to a Windows Insider Fast build soon, once the Fast ring moves back to receiving builds from RS_PRERELEASE. The preview of TensorFlow will become available at about the same time as an installable PyPI package alongside the existing TensorFlow packages on PyPI.org. At that point you’ll be able to replicate all of those cool //build demos 😊. For preview availability, stay tuned to aka.ms/gpuinwsl, and if you’re interested in refining our future investments in ML, take a moment and complete our survey.
Support for the OpenGL/OpenCL mapping layer and GUI applications will be coming at a later time through Insider build flighting. As we get closer to having these technologies ready, we will keep you informed through the Windows Insider build release notes and blog updates.
Hope you found this interesting! Please try this out and share your feedback!