Transformer support for PyTorch with DirectML is here!

Adele Parsons

The latest release of PyTorch with DirectML is available today! This release brings support for training popular Transformer models like GPT2, BERT, and Detection Transformers. To get started with training Transformer models using PyTorch with DirectML, you can find a new sample on the DirectML GitHub. The sample covers training a PyTorch implementation of the Transformer model from the popular paper “Attention Is All You Need” (Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, arXiv, 2017).
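To give a feel for what training a Transformer in PyTorch involves, here is a minimal sketch. It is not the sample from the DirectML GitHub: the `TinySeq2Seq` class, the toy copy task, and all sizes are hypothetical, and positional encodings are left out to keep it short. It runs on the CPU as written; the assumption is that pointing `device` at the DirectML device would be the only change needed once the plugin is installed.

```python
import torch
import torch.nn as nn


class TinySeq2Seq(nn.Module):
    """Hypothetical toy model: token embeddings wrapped around torch.nn.Transformer."""

    def __init__(self, vocab_size=20, d_model=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=1, num_decoder_layers=1,
            dim_feedforward=64, dropout=0.0, batch_first=True,
        )
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src, tgt):
        # Boolean causal mask: True marks positions the decoder may not attend to.
        t = tgt.size(1)
        causal = torch.triu(torch.ones(t, t), diagonal=1).bool().to(tgt.device)
        hidden = self.transformer(self.embed(src), self.embed(tgt), tgt_mask=causal)
        return self.out(hidden)


# Assumption: swap this for the DirectML device after installing the plugin.
device = torch.device("cpu")

torch.manual_seed(0)
model = TinySeq2Seq().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy copy task on one fixed batch: the decoder should reproduce the source tokens.
src = torch.randint(1, 20, (8, 10)).to(device)
bos = torch.zeros(8, 1, dtype=torch.long, device=device)  # token 0 acts as start-of-sequence
tgt_in = torch.cat([bos, src[:, :-1]], dim=1)              # decoder input, shifted right
tgt_out = src                                              # tokens the decoder must predict

losses = []
for _ in range(30):
    optimizer.zero_grad()
    logits = model(src, tgt_in)                            # shape: (batch, seq_len, vocab)
    loss = loss_fn(logits.reshape(-1, 20), tgt_out.reshape(-1))
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

A real training run would add positional encodings, padding masks, and a proper dataset; the sketch only shows the forward pass, loss, and optimizer step fitting together.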

This release of PyTorch with DirectML also reduces memory consumption, which unlocks faster performance and makes larger batch sizes possible.

Finally, PyTorch with DirectML now follows a plugin model and supports the latest version of PyTorch (1.13). After installing PyTorch, simply run pip install torch-directml to get started. Once you’ve installed the torch-directml plugin, you can begin training AI models starting with the following lines:

import torch

import torch_directml

dml = torch_directml.device()

tensor = torch.tensor([1]).to(dml)  # Note that dml is a variable, not a string!


Please note that this release of the torch-directml plugin is mapped to the “PrivateUse1” Torch backend. The new torch_directml.device() API is a convenient wrapper for sending your tensors to the DirectML device. Now you’re ready to train your models using PyTorch with DirectML!
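As a rough illustration of the next step, the sketch below moves a small model and its data to the device and runs a short training loop. The `pick_device` and `train_toy_model` helpers, the CPU fallback, and the synthetic data are assumptions made for this example, not part of the plugin.

```python
import torch
import torch.nn as nn


def pick_device():
    """Return the DirectML device if the plugin is importable, otherwise the CPU."""
    try:
        import torch_directml  # installed with `pip install torch-directml`
        return torch_directml.device()
    except ImportError:
        # Fallback is an assumption of this sketch so it also runs without the plugin.
        return torch.device("cpu")


def train_toy_model(device, steps=50):
    """Fit y = 3x + 1 with a single linear layer and return the loss per step."""
    torch.manual_seed(0)
    x = torch.linspace(-1, 1, 64).unsqueeze(1)
    y = 3 * x + 1
    x, y = x.to(device), y.to(device)      # data must live on the same device as the model

    model = nn.Linear(1, 1).to(device)     # parameters are moved to the device too
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()

    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


losses = train_toy_model(pick_device())
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

The device object is passed to `.to()` for both the model and the tensors, in the same way as in the snippet above.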

Please leave any questions, suggestions, or issues on the DirectML GitHub. Our team is constantly engaging with the community and would love to hear your input!




  • kasule francis

    I would like to thank you for all the beautiful content that you write. However, I would also like you to look into, or write about, the need for a standalone general-purpose AI or ML program that one can install the way we install Microsoft Office, so that anyone can train it to do whatever they wish. I am from Uganda, in Africa, and we really don’t have many people who can code complex programs unless one is very rich. A general-purpose ML program from Microsoft would be ideal, and one could then upgrade it and attach modules. I am in the medical field and need it for my own use; university researchers need it, and so do people who just want to chat with one offline, since, as you are aware, the internet isn’t everywhere. If you know of such a program, kindly send a link to my email. I will attach a link to an example of a program that we are trying to modify for general-purpose use, but it isn’t easy and it can’t learn from images. Kindly tell the planning leaders at Microsoft that the world needs a simple general-purpose AI or ML program that people can install and train to their needs.
    Dr kasule.

    • kasule francis

      Link to an example of a simple program, although I am not sure whether it can learn, and it can’t accept images. If you have any volunteers who can improve it to learn from images and fix any bugs, you are welcome. Thanks a lot for your time.
