AutoRest and OpenAPI: The backbone of Azure SDK

David

Developing rich, cross-language SDKs for a cloud platform as featureful as Microsoft Azure is a tall order. Luckily, AutoRest and the OpenAPI specification enable the Azure SDK team to generate much of the code these SDKs need from API specifications authored by Azure service teams. This article gives you more insight into how we use code generation to provide a great development experience for Azure users.

What is OpenAPI?

OpenAPI is a specification language that enables one to describe a web service API in terms of its operations and the data types it understands. It was originally conceived in 2011 as a specification called Swagger to enable generation of documentation and client libraries for the REST APIs of a company called Wordnik. Over time, more companies started using Swagger to describe their REST APIs and provide those same benefits to their own users.

The innovation in tooling around Swagger gradually led to greater adoption and the formation of the OpenAPI Initiative. This organization established OpenAPI version 2.0 (equivalent to the Swagger 2.0 specification) and then began developing the next iteration, OpenAPI version 3. This specification enabled a new level of expressiveness for describing patterns that were either vague or impossible to specify in OpenAPI version 2.

OpenAPI is primarily concerned with describing web services that follow the Representational State Transfer (REST) architectural model where operations are exposed via URI paths that accept HTTP verbs like GET, PUT, POST, and DELETE. These URI paths generally refer to “resources” understood by the service where the request and response bodies of most operations contain the details of the resource at that path. For example, a POST request with the body containing the desired state of a resource will create that resource, a PUT request with changes to some properties will update the resource, and a DELETE request will cause the resource to be deleted. OpenAPI provides a schema language which enables this type of API to be described in a machine-readable form, usually encoded in JSON.

Let’s imagine that we have a service for creating and querying some arbitrary resource called a “widget.” A widget resource has an ID and a name and can be created using the POST action on the /widgets URI. All widgets that have been created can be queried using a GET request on the /widgets URI and a specific widget can be queried with a GET to the specific URI for the ID like /widgets/1.

Here is an example OpenAPI 3.0 schema that describes this service:

{
  "openapi": "3.0.0",
  "info": {
    "version": "1.0.0",
    "title": "Widget Service"
  },
  "paths": {
    "/widgets": {
      "get": {
        "summary": "List all widgets",
        "operationId": "listWidgets",
        "tags": [
          "widgets"
        ],
        "responses": {
          "200": {
            "description": "An array of widgets",
            "content": {
              "application/json": {
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/components/schemas/Widget"
                  }
                }
              }
            }
          },
          "default": {
            "description": "Unexpected error",
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/Error"
                }
              }
            }
          }
        }
      },
      "post": {
        "summary": "Create a widget",
        "operationId": "createWidget",
        "tags": [
          "widgets"
        ],
        "responses": {
          "201": {
            "description": "Widget was created"
          },
          "default": {
            ... elided error response ...
          }
        }
      }
    },
    "/widgets/{widgetId}": {
      "get": {
        "summary": "Details of a specific widget",
        "operationId": "getWidgetById",
        "tags": [
          "widgets"
        ],
        "parameters": [
          {
            "name": "widgetId",
            "in": "path",
            "required": true,
            "description": "The ID of the widget to retrieve",
            "schema": {
              "type": "string"
            }
          }
        ],
        "responses": {
          "200": {
            "description": "Returned the widget",
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/Widget"
                }
              }
            }
          },
          "default": {
            ... elided error response ...
          }
        }
      }
    }
  },
  "components": {
    "schemas": {
      "Widget": {
        "type": "object",
        "required": [
          "id",
          "name"
        ],
        "properties": {
          "id": {
            "type": "integer",
            "format": "int64"
          },
          "name": {
            "type": "string"
          }
        }
      },
      "Error": {
        "type": "object",
        "required": [
          "code",
          "message"
        ],
        "properties": {
          "code": {
            "type": "integer",
            "format": "int32"
          },
          "message": {
            "type": "string"
          }
        }
      }
    }
  }
}

What you can see is that each operation’s parameters and responses are specified in enough detail that a code generator knows how to handle each case. Each response is also specific about its content type (like application/json) so that the schema can differ depending on what type is being transmitted. OpenAPI 3.0 describes request bodies the same way, although this example does not define one. We’re also defining schemas for both the Widget resource and an Error type for returning error details when something goes wrong.
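To make the mapping from specification to code more concrete, here is a minimal hand-written sketch in Python of the kind of client the two GET operations above could translate into. It is not AutoRest output: the class names, method names, and the injected `Transport` callable are all assumptions made for illustration, and a canned fake transport stands in for a real HTTP stack so the example runs on its own. The createWidget operation is left out because the example specification does not define its request body.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A transport takes (method, path, body) and returns (status_code, response_text).
# It is injected so this sketch runs without a real Widget Service (assumption).
Transport = Callable[[str, str, Optional[str]], Tuple[int, str]]


@dataclass
class Widget:
    """Mirrors #/components/schemas/Widget: both properties are required."""
    id: int
    name: str


class WidgetServiceError(Exception):
    """Mirrors #/components/schemas/Error, used by the "default" responses."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


class WidgetClient:
    """Hypothetical client with one method per operationId in the spec."""

    def __init__(self, transport: Transport) -> None:
        self._send = transport

    def _raise_error(self, text: str) -> None:
        err = json.loads(text)
        raise WidgetServiceError(err["code"], err["message"])

    def list_widgets(self) -> List[Widget]:
        status, text = self._send("GET", "/widgets", None)
        if status != 200:  # anything else falls under the "default" response
            self._raise_error(text)
        return [Widget(**item) for item in json.loads(text)]

    def get_widget_by_id(self, widget_id: str) -> Widget:
        status, text = self._send("GET", f"/widgets/{widget_id}", None)
        if status != 200:
            self._raise_error(text)
        return Widget(**json.loads(text))


def fake_transport(method: str, path: str, body: Optional[str]) -> Tuple[int, str]:
    """Stand-in for an HTTP stack, serving two canned widgets."""
    widgets = {"1": {"id": 1, "name": "sprocket"}, "2": {"id": 2, "name": "gear"}}
    if method == "GET" and path == "/widgets":
        return 200, json.dumps(list(widgets.values()))
    widget_id = path.rsplit("/", 1)[-1]
    if method == "GET" and widget_id in widgets:
        return 200, json.dumps(widgets[widget_id])
    return 404, json.dumps({"code": 404, "message": "widget not found"})


client = WidgetClient(fake_transport)
print(client.list_widgets())
print(client.get_widget_by_id("2"))
```

Every piece of the client comes straight from the specification: the method names from each operationId, the return types from the 200 response schemas, and the exception from the Error schema. That regularity is what makes this kind of code a good candidate for generation.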

What is AutoRest?

AutoRest is a tool that provides a code generation framework for converting OpenAPI 2.0 and 3.0 specifications into client libraries for the services described by those specifications. It was developed by Microsoft around the time the OpenAPI Initiative was formed so that Azure service teams could start producing generated client libraries from new Swagger and OpenAPI 2.0 specifications.

Although other code generators already existed, Microsoft decided to write its own to address limitations in Swagger 2.0 that made it difficult or even impossible to express patterns used in Azure services. For example, AutoRest defined and implemented the extensions x-ms-discriminator-value, which distinguishes between possible schema types in requests and responses, and x-ms-pageable, which enables response collections to be paged through additional operation calls. Ultimately, these constructs were added to the OpenAPI 3.0 specification through features like type discriminators and response operation links.
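As a rough illustration of what a pageable operation means for client code, the sketch below follows a "next link" from page to page and yields the items as one continuous sequence. It is a simplified stand-in written for this article, not the code any AutoRest generator emits: the `fetch_page` callable, the `PAGES` data, and the default field names `value` and `nextLink` are assumptions chosen for the example.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iterate_items(
    fetch_page: Callable[[Optional[str]], Page],
    item_name: str = "value",
    next_link_name: str = "nextLink",
) -> Iterator[Any]:
    """Yield every item across all pages.

    fetch_page(None) returns the first page; fetch_page(link) returns the
    page that a previous page's next-link field pointed to.
    """
    next_link: Optional[str] = None
    while True:
        page = fetch_page(next_link)
        yield from page.get(item_name, [])
        next_link = page.get(next_link_name)
        if not next_link:  # a missing or empty link means this was the last page
            break


# Canned pages standing in for service responses (hypothetical data).
PAGES: Dict[Optional[str], Page] = {
    None: {"value": [1, 2], "nextLink": "page-2"},
    "page-2": {"value": [3], "nextLink": "page-3"},
    "page-3": {"value": [4, 5]},
}

all_items = list(iterate_items(lambda link: PAGES[link]))
print(all_items)
```

The caller sees a single iterator and never handles the links directly, which is the usability gain that describing paging in the specification makes possible.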

At the core of AutoRest is a flexible pipeline where a series of pre-configured phases transform and merge various OpenAPI input files to produce a “code model” that can be consumed by a language-specific code generator. These code generator extensions will interpret the code model and produce code that aligns with the design guidelines for each language. The generated code for a language will use the corresponding Azure Core implementation so that we can provide configurable behavior for how HTTP requests are made in the generated code.

Here’s a high-level diagram of the AutoRest pipeline:

AutoRest pipeline diagram
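The pipeline idea can also be sketched in a few lines of Python. This is a toy model under stated assumptions, not AutoRest's actual architecture or data format: the function names, the phase list, and the shape of the resulting "code model" dictionary are all invented for illustration.

```python
from functools import reduce
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Phase = Callable[[Document], Document]


def merge_inputs(inputs: List[Document]) -> Document:
    """Combine several OpenAPI-like documents into one.

    Later inputs win on key collisions; a real merge would need more care.
    """
    merged: Document = {"paths": {}, "components": {"schemas": {}}}
    for doc in inputs:
        merged["paths"].update(doc.get("paths", {}))
        schemas = doc.get("components", {}).get("schemas", {})
        merged["components"]["schemas"].update(schemas)
    return merged


def check_operation_ids(doc: Document) -> Document:
    """Validation phase: every operation must carry an operationId."""
    for path, methods in doc["paths"].items():
        for method, operation in methods.items():
            if "operationId" not in operation:
                raise ValueError(f"{method.upper()} {path} has no operationId")
    return doc


def build_code_model(doc: Document) -> Document:
    """Final phase: reduce the document to what a generator needs (toy shape)."""
    operations = [
        {"name": operation["operationId"], "method": method.upper(), "path": path}
        for path, methods in doc["paths"].items()
        for method, operation in methods.items()
    ]
    return {"operations": operations, "schemas": sorted(doc["components"]["schemas"])}


def run_pipeline(inputs: List[Document], phases: List[Phase]) -> Document:
    """Merge the inputs, then feed the result through each phase in order."""
    return reduce(lambda doc, phase: phase(doc), phases, merge_inputs(inputs))


widgets_doc = {
    "paths": {"/widgets": {"get": {"operationId": "listWidgets"}}},
    "components": {"schemas": {"Widget": {"type": "object"}}},
}
errors_doc = {"components": {"schemas": {"Error": {"type": "object"}}}}

code_model = run_pipeline([widgets_doc, errors_doc], [check_operation_ids, build_code_model])
print(code_model)
```

Because each phase takes a document and returns a document, phases can be added, removed, or reordered through configuration without touching the others.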

The languages that currently have first-class code generation support are C#, Python, Java, TypeScript / JavaScript, and PowerShell, with other languages like Go and Swift following close behind. One interesting aspect of the new generators written for AutoRest V3 is that many of them are written in the language they generate, which reduces context switching when writing code that generates client code in the target language. These generators run as external processes and communicate with the AutoRest host process over an RPC protocol.

The new V3 generation of AutoRest now supports most of the OpenAPI 3.0 specification. This enables Azure service teams to more closely model the real-world usage of their APIs. Since many existing service specifications are authored with OpenAPI 2.0, we have an early pipeline phase which converts the 2.0 specification to an equivalent OpenAPI 3.0 specification before it reaches the modeler phase.
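To give a feel for what such a conversion involves, the sketch below handles one small piece of it: an OpenAPI 2.0 response carries its schema directly and relies on a separate `produces` list for media types, while OpenAPI 3.0 nests the schema under `content` per media type and moves shared schemas from `#/definitions/` to `#/components/schemas/`. The helper names are hypothetical and this is not the converter AutoRest uses; a full conversion covers far more than responses.

```python
from typing import Any, Dict, List


def rewrite_refs(node: Any) -> Any:
    """Recursively repoint "#/definitions/X" references to "#/components/schemas/X"."""
    if isinstance(node, dict):
        return {
            key: (
                value.replace("#/definitions/", "#/components/schemas/")
                if key == "$ref" and isinstance(value, str)
                else rewrite_refs(value)
            )
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [rewrite_refs(item) for item in node]
    return node


def convert_response(response: Dict[str, Any], produces: List[str]) -> Dict[str, Any]:
    """Turn a 2.0-style response object into a 3.0-style one."""
    converted: Dict[str, Any] = {"description": response.get("description", "")}
    if "schema" in response:
        schema = rewrite_refs(response["schema"])
        # 3.0 keys the schema by media type instead of using a separate "produces" list.
        converted["content"] = {media_type: {"schema": schema} for media_type in produces}
    return converted


v2_response = {
    "description": "An array of widgets",
    "schema": {"type": "array", "items": {"$ref": "#/definitions/Widget"}},
}

v3_response = convert_response(v2_response, ["application/json"])
print(v3_response)
```

The result has the same shape as the 200 response of listWidgets in the Widget Service example earlier in this article.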

AutoRest V3 features a brand new modeler called Modelerfour which integrates many learnings from previous AutoRest versions to produce a richer code model that reduces the amount of work that language generators must do to produce client libraries. Modelerfour now performs the following tasks in addition to producing the code model:

  • Checks for duplicate schema definitions and reconciles them into a single definition
  • Applies many known OpenAPI extensions to the code model
  • Checks for inconsistencies in the input file to ensure that a reasonable output can be produced
  • Pre-formats many schema and enumeration value names with rules like camel or Pascal casing
  • Flattens deeply-nested parameters or payload schemas to increase client SDK usability

All of these features can be enabled conditionally for each language generator so that the resulting code model provides just the right information.
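The name pre-formatting task is the easiest of these to picture. Here is a minimal sketch of camel and Pascal casing in Python; it is an illustration only, does not reproduce Modelerfour's actual rules, and makes no attempt to treat acronyms specially.

```python
import re
from typing import List


def split_words(name: str) -> List[str]:
    """Split a name on separators and on lower-to-upper case boundaries."""
    spaced = re.sub(r"[_\-\s]+", " ", name)
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", spaced)
    return [word.lower() for word in spaced.split()]


def pascal_case(name: str) -> str:
    """widget_id -> WidgetId (the style commonly used for type names)."""
    return "".join(word.capitalize() for word in split_words(name))


def camel_case(name: str) -> str:
    """GetWidgetById -> getWidgetById (the style commonly used for members)."""
    words = split_words(name)
    if not words:
        return ""
    return words[0] + "".join(word.capitalize() for word in words[1:])


for raw_name in ["widget_id", "GetWidgetById", "list-widgets"]:
    print(raw_name, pascal_case(raw_name), camel_case(raw_name))
```

Doing this once in the modeler means each language generator only has to pick the casing its guidelines call for rather than re-implementing the word splitting.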

How We Generate Azure SDK Libraries

The Azure SDK team uses AutoRest to generate client libraries for many Azure services across a variety of languages. The majority of the OpenAPI specifications (typically in OpenAPI v2) can be found in the Azure REST API Specs repository on GitHub. We typically use two strategies for producing SDK libraries from these specifications:

Fully-Generated Libraries

A number of the Azure Resource Manager (ARM) libraries are produced completely from generated code. This is possible due to the regularity and predictability of the REST API design for all ARM services. You can find many libraries available for ARM services across all our supported languages because of how easy it is for them to be generated. The generated ARM libraries for Python also serve as the foundation of the Azure CLI tools.

Generated Core with “Convenience” Layer

Over the past year we have started to onboard more Azure data plane APIs like Azure Key Vault to produce a new set of first-class SDKs for these services. Because these services are all designed differently, due to their specific functionality and requirements, we must take a different approach when producing high-quality client libraries.

In this case, we use AutoRest to generate the core functionality of each data plane service from the Swagger specification and then we build a “convenience layer” on top which provides a more crafted design that makes it easier to use. When producing these crafted interfaces, we have them reviewed by our architecture board to ensure that the design meets our guidelines and maintains some reasonable consistency between languages.
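The split between the two layers can be sketched as follows. Both classes are hypothetical and reuse the widget example from earlier in this article rather than any real Azure service: `GeneratedWidgetOperations` stands in for a generated layer (with canned data instead of HTTP calls), and `WidgetClient` stands in for a hand-written convenience layer built on top of it.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Optional


class GeneratedWidgetOperations:
    """Stand-in for a generated layer: mirrors the REST operations one-to-one
    and returns raw JSON-like dictionaries."""

    def __init__(self) -> None:
        self._store = [{"id": 1, "name": "sprocket"}, {"id": 2, "name": "gear"}]

    def list_widgets(self) -> List[Dict[str, Any]]:
        return list(self._store)

    def get_widget_by_id(self, widget_id: str) -> Dict[str, Any]:
        for item in self._store:
            if str(item["id"]) == widget_id:
                return item
        raise KeyError(widget_id)


@dataclass(frozen=True)
class Widget:
    id: int
    name: str


class WidgetClient:
    """Stand-in for a convenience layer: typed models, friendlier signatures,
    and helpers that have no direct REST counterpart."""

    def __init__(self, generated: GeneratedWidgetOperations) -> None:
        self._generated = generated

    def get_widget(self, widget_id: int) -> Widget:
        return Widget(**self._generated.get_widget_by_id(str(widget_id)))

    def list_widgets(self) -> Iterator[Widget]:
        for raw in self._generated.list_widgets():
            yield Widget(**raw)

    def find_by_name(self, name: str) -> Optional[Widget]:
        """Convenience helper composed from the generated list operation."""
        return next((w for w in self.list_widgets() if w.name == name), None)


client = WidgetClient(GeneratedWidgetOperations())
print(client.get_widget(2))
print(client.find_by_name("sprocket"))
```

The generated layer can be regenerated whenever the specification changes, while the public surface that users program against stays under the control of the library designers.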

Conclusion

Hopefully this article gives you a better understanding of how we produce client libraries for so many Azure services. If you’d like to learn more about AutoRest, check out the GitHub repository for the project.

Azure SDK Blog Contributions

Thank you for reading this Azure SDK blog post! We hope that you learned something new and welcome you to share this post. We are open to Azure SDK blog contributions. Please contact us at azsdkblog@microsoft.com with your topic and we’ll get you set up as a guest blogger.

2 comments

  • Sławek Rosiek

    What’s the status of AutoRest right now? Is it still developed/maintained? A few days ago I tried to use it to generate a TypeScript client, but it failed. Two blocking bugs were reported around March/April, but they have not been resolved.

    • David Wilson (Microsoft employee)

      We are still actively developing AutoRest but the priority has shifted to providing code generation capabilities for Azure SDKs. Once the new pipeline and generators have stabilized, we may have more time to support community scenarios and issues. Can you tell me which blocking bugs you are referring to?