April 25th, 2024

Introducing API Manifest Plugins for Semantic Kernel

Hi all,

Today we’re featuring a guest author from another team in Microsoft on our Semantic Kernel blog. We will turn it over to Mustafa Zengin to introduce API Manifest plugins for Semantic Kernel.

Semantic Kernel allows developers to import plugins from OpenAPI documents. For large APIs, such as Microsoft Graph, importing the entire OpenAPI document can be inefficient and sometimes ineffective, because the function descriptions can exceed token limits when used with large language models. To address this issue, we developed a new plugin infrastructure for Semantic Kernel that uses API Manifest, a convenient format for describing plugins that refer to subsets of APIs and package them together.

What is an API Manifest?

An API Manifest is a document that describes an application’s API dependencies. It contains links to API descriptions, lists which requests are being made by the application, and the requests’ authorization requirements. A complete specification of API Manifest can be found here.
It also describes API-based plugins for AI chat interfaces.

API Manifest Plugins for Semantic Kernel

API Manifest plugins for Semantic Kernel are a way to generate Semantic Kernel functions from API Manifest files. Semantic Kernel functions can be called from function-calling (or tool-using) large language models (LLMs) in various Retrieval-Augmented Generation (RAG) scenarios.
API Manifest plugins have a few key features:
  • Slices large API descriptions into only the parts that are needed for the task at hand.
  • Provides a way of packaging dependencies by combining multiple API dependencies into a single plugin.
  • Defines the authorization requirements for the API calls.
  • In copilot systems where plugins are provided by third-party authors, it allows quick inspection of the API dependencies and their authorization requirements, as opposed to inspecting the entire API description.
We will demonstrate the use of API Manifest plugins with a sample scenario.

Scenario

Alice receives an email message in Turkish from a friend from her Astronomy Club who invites her to observe the solar eclipse together. The email talks about how rare the event is by referencing the previous solar eclipse as well. Alice will create a Markdown table that contains the dates of current and previous eclipses and a link to the eclipse map on timeanddate.com. She will also create an HTML table that contains the same information so that she can use them in different contexts.
Turkish email content:
8 Nisan 2024 tarihinde bir güneş tutulması olacak, beraber izlemek ister misin? Bir önceki tutulma 4 Aralık 2021 tarihinde olmuştu. Bunu kaçırmak istemezsin.
English translation of the email content:
There will be a solar eclipse on April 8, 2024, would you like to watch it together? The previous eclipse was on December 4, 2021. You don’t want to miss this one.

What are the tools available for this task?

  • Graph API
    • to access email content.
  • LLM
    • to translate from Turkish to English
    • to extract dates from the email
    • to build timeanddate.com link using few-shot prompting
    • to create Markdown table
    • to create corresponding HTML table
In the last step, Alice would like the output to be consistent. However, LLMs are subject to hallucination, which means that consistency between the Markdown and HTML output is not always guaranteed. Conveniently, GitHub has a Markdown API that outputs the HTML representation of Markdown and does so in a consistent manner. So she will add the GitHub API into the mix.
  • GitHub API
    • to get the HTML representation of the Markdown table
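
To make the GitHub step concrete, here is a minimal sketch of calling the Markdown API directly with HttpClient, outside of Semantic Kernel. The helper names (BuildMarkdownRequest, RenderMarkdownAsync), the User-Agent value, and the choice of "gfm" mode are assumptions made for illustration; they are not part of the sample in this post. GitHub rejects requests that carry no User-Agent header, so the sketch sets one.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

// Builds the POST /markdown request. The "mode" value and the User-Agent
// string are assumptions of this sketch.
static HttpRequestMessage BuildMarkdownRequest(string markdown)
{
    var request = new HttpRequestMessage(HttpMethod.Post, "https://api.github.com/markdown")
    {
        Content = JsonContent.Create(new { text = markdown, mode = "gfm" })
    };
    request.Headers.UserAgent.ParseAdd("api-manifest-sample");
    return request;
}

// Sends the request and returns the HTML that GitHub renders for the Markdown input.
static async Task<string> RenderMarkdownAsync(HttpClient client, string markdown)
{
    using HttpRequestMessage request = BuildMarkdownRequest(markdown);
    using HttpResponseMessage response = await client.SendAsync(request);
    response.EnsureSuccessStatusCode();
    return await response.Content.ReadAsStringAsync();
}

using var http = new HttpClient();
string html = await RenderMarkdownAsync(
    http,
    "| Eclipse Date | Eclipse URL |\n|---|---|\n| April 8, 2024 | [View Map](https://www.timeanddate.com/eclipse/map/2024-april-8) |");
Console.WriteLine(html);
```

No Authorization header is needed for this call; unauthenticated requests are rate-limited by GitHub.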

How can this scenario be implemented in Semantic Kernel?

The Microsoft Graph API has thousands of operations in its OpenAPI description, so we need some way to slice the API description. We will also connect to the GitHub API to get the HTML representation of the Markdown table. These calls have different authorization requirements. Per the Graph API docs, the /me/messages endpoint needs delegated authentication with the Mail.Read scope. The GitHub Markdown API works without authorization. The API Manifest plugin infrastructure in Semantic Kernel provides all this functionality with a simple interface.

We will use the function calling stepwise planner from Semantic Kernel to ask the LLM to execute the following plan:
Get my last email, extract the current and previous dates of eclipses, and build the timeanddate.com eclipse map URL. Then create a Markdown table and get the HTML representation of the Markdown table and output these two tables.
Eclipse map URL is built using the following format: https://www.timeanddate.com/eclipse/map/2024-april-8
(Note: for the sake of simplicity, we used “get my last email”; however, you can use more complicated queries, such as “get emails from a specific sender”. All Graph API filters for the corresponding API call are available in the API Manifest plugin.)
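
In the sample, the LLM infers the URL pattern from the single example in the goal. For comparison, the same pattern can be written deterministically. The sketch below assumes the format is the year, the lowercase English month name, and the day without zero padding, which is what the one example suggests; BuildEclipseMapUrl is a hypothetical helper name, not part of the sample.

```csharp
using System;
using System.Globalization;

// Formats a date into the eclipse map URL pattern inferred from the example above.
static string BuildEclipseMapUrl(DateOnly date)
{
    // Invariant culture keeps the month name in English regardless of machine locale.
    string month = date.ToString("MMMM", CultureInfo.InvariantCulture).ToLowerInvariant();
    return $"https://www.timeanddate.com/eclipse/map/{date.Year}-{month}-{date.Day}";
}

Console.WriteLine(BuildEclipseMapUrl(new DateOnly(2024, 4, 8)));
// Prints: https://www.timeanddate.com/eclipse/map/2024-april-8
```

A helper like this can serve as a check on the URLs the model produces.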

Prerequisites

Configuration

Fill in the details from the Prerequisites step in the appsettings.Development.json file below. This is the file name that the sample program loads.
{
    "AzureOpenAI": {
        "ChatModelId": "",
        "ChatDeploymentName": "",
        "Endpoint": "",
        "ApiKey": ""
    },
    "MsGraph": {
        "ClientId": "",
        "TenantId": "9188040d-6c67-4c5b-b112-36a304b66dad", // MSA/Consumer/Personal tenant,  https://learn.microsoft.com/azure/active-directory/develop/accounts-overview
        "RedirectUri": "http://localhost"
    }
}
Please note that if you are using a school or work account, you will need the tenant ID specified in your app registration.
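
For a school or work account, the MsGraph section might look like the following sketch. The angle-bracket values are placeholders, not real identifiers; replace them with the client ID and tenant ID shown in your own app registration.

```json
{
    "MsGraph": {
        "ClientId": "<client-id-from-your-app-registration>",
        "TenantId": "<tenant-id-from-your-app-registration>",
        "RedirectUri": "http://localhost"
    }
}
```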

Package dependencies

dotnet add package Microsoft.Extensions.Configuration.Json
dotnet add package Microsoft.SemanticKernel.Planners.OpenAI --prerelease 
dotnet add package Microsoft.SemanticKernel.Plugins.MSGraph --prerelease
dotnet add package Microsoft.SemanticKernel.Plugins.OpenAPI.Extensions --prerelease

API Manifest file

Currently, you can author this file manually. In the future, Kiota, our open-source API-client generation tool, will provide the functionality to generate this file from OpenAPI descriptions.

{
    "applicationName": "Message Processor Plugin",
    "description": "This plugin accesses Microsoft Graph API to read emails and GitHub API for Markdown to HTML conversion.",
    "publisher": {
        "name": "Alice",
        "contactEmail": "alice@contoso.com"
    },
    "apiDependencies": {
        "microsoft.graph": {
            "apiDescriptionUrl": "https://raw.githubusercontent.com/microsoftgraph/msgraph-metadata/master/openapi/v1.0/graphexplorer.yaml",
            "requests": [
                {
                    "method": "Get",
                    "uriTemplate": "/me/messages"
                }
            ]
        },
        "github": {
            "apiDescriptionUrl": "https://raw.githubusercontent.com/github/rest-api-description/main/descriptions/api.github.com/api.github.com.yaml",
            "requests": [
                {
                    "method": "POST",
                    "uriTemplate": "/markdown"
                }
            ]
        }
    }
}
  • The apiDependencies section contains the API dependencies.
  • microsoft.graph and github are the names of the dependencies. The sample program below uses the same names as keys when it assigns auth callbacks.
  • apiDescriptionUrl is the URL of the OpenAPI description for that dependency.
  • The requests section lists the requests that the application makes to that API.
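
Because the manifest is plain JSON, its dependencies are easy to inspect before importing it. The following is a minimal sketch using System.Text.Json; it is not how Semantic Kernel parses the file, and ListRequests is a hypothetical helper name. The embedded manifest is a trimmed copy of the one above, keeping only the fields the sketch reads.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Returns one "dependency: METHOD uriTemplate" line per request declared in the manifest.
static List<string> ListRequests(string manifestJson)
{
    var lines = new List<string>();
    using JsonDocument document = JsonDocument.Parse(manifestJson);
    foreach (JsonProperty dependency in document.RootElement.GetProperty("apiDependencies").EnumerateObject())
    {
        foreach (JsonElement request in dependency.Value.GetProperty("requests").EnumerateArray())
        {
            // Normalize the method casing, since the manifest mixes "Get" and "POST".
            string method = request.GetProperty("method").GetString()!.ToUpperInvariant();
            string uriTemplate = request.GetProperty("uriTemplate").GetString()!;
            lines.Add($"{dependency.Name}: {method} {uriTemplate}");
        }
    }
    return lines;
}

const string manifest = """
{
    "apiDependencies": {
        "microsoft.graph": {
            "requests": [ { "method": "Get", "uriTemplate": "/me/messages" } ]
        },
        "github": {
            "requests": [ { "method": "POST", "uriTemplate": "/markdown" } ]
        }
    }
}
""";

foreach (string line in ListRequests(manifest))
{
    Console.WriteLine(line);
}
// Prints:
// microsoft.graph: GET /me/messages
// github: POST /markdown
```

This is the kind of quick inspection that is impractical against a full OpenAPI description with thousands of operations.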

Sample Program

using Microsoft.Extensions.Configuration;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Planning;
using Microsoft.SemanticKernel.Plugins.MsGraph.Connectors.CredentialManagers;
using Microsoft.SemanticKernel.Plugins.OpenApi;
using Microsoft.SemanticKernel.Plugins.OpenApi.Extensions;

// read config
var configuration = new ConfigurationBuilder()
    .AddJsonFile("appsettings.Development.json", optional: true, reloadOnChange: true)
    .Build();

// initialize kernel
var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        deploymentName: configuration["AzureOpenAI:ChatDeploymentName"],
        endpoint: configuration["AzureOpenAI:Endpoint"],
        serviceId: "AzureOpenAIChat",
        apiKey: configuration["AzureOpenAI:ApiKey"],
        modelId: configuration["AzureOpenAI:ChatModelId"]).Build();

// prefetch graph token
LocalUserMSALCredentialManager credentialManager = await LocalUserMSALCredentialManager.CreateAsync().ConfigureAwait(false);

var token = await credentialManager.GetTokenAsync(
    configuration["MSGraph:ClientId"],
    configuration["MSGraph:TenantId"],
    // API Manifest file captures required scopes. In the future, we have plans to improve integrations to automate this process
    // so that you don't have to specify the scopes in your auth callback.
    ["Mail.Read"],
    new Uri(configuration["MSGraph:RedirectUri"])).ConfigureAwait(false);

// create graph auth callback to inject auth header
AuthenticateRequestAsyncCallback? graphAuthCallback = (HttpRequestMessage requestMessage, CancellationToken _) => {
    requestMessage.Headers.Add("Authorization", $"Bearer {token}");
    return Task.CompletedTask;
};

// specify auth callbacks for each API dependency
// we don't need one for GitHub as markdown API doesn't require any auth
var apiManifestPluginParameters = new ApiManifestPluginParameters
{
    FunctionExecutionParameters = new ()
    {
        { "microsoft.graph", new OpenApiFunctionExecutionParameters(authCallback: graphAuthCallback) },
        // no need for github authentication as markdown API does not need it
    }
};

// import api manifest plugin
KernelPlugin plugin = await kernel.ImportPluginFromApiManifestAsync(
    "MessageProcessorPlugin",                   // plugin name
    "MessageProcessorPlugin/apimanifest.json",  // path to api manifest file
    apiManifestPluginParameters)
    .ConfigureAwait(false);

// set goal
var goal = @"
Get my last email, extract the current and previous dates of eclipses, and build the timeanddate eclipse map URL. Then create a Markdown table and get the HTML representation of the Markdown table and output these two tables.

Eclipse map URL is built using the following format: https://www.timeanddate.com/eclipse/map/2024-april-8
";

// create planner
var planner = new FunctionCallingStepwisePlanner(
    new FunctionCallingStepwisePlannerOptions
    {
        MaxIterations = 10,
        MaxTokens = 32000
    }
);

// execute plan
var result = await planner.ExecuteAsync(kernel, goal);

// output result
Console.WriteLine(result.FinalAnswer);

Output in Markdown

| Eclipse Date | Eclipse URL |
|--------------|-------------|
| April 8, 2024 | [View Map](https://www.timeanddate.com/eclipse/map/2024-april-8) |
| December 4, 2021 | [View Map](https://www.timeanddate.com/eclipse/map/2021-december-4) |

Output in HTML (consistent with Markdown)

<table role="table">
<thead>
<tr>
<th>Eclipse Date</th>
<th>Eclipse URL</th>
</tr>
</thead>
<tbody>
<tr>
<td>April 8, 2024</td>
<td><a href="https://www.timeanddate.com/eclipse/map/2024-april-8" rel="nofollow">View Map</a></td>
</tr>
<tr>
<td>December 4, 2021</td>
<td><a href="https://www.timeanddate.com/eclipse/map/2021-december-4" rel="nofollow">View Map</a></td>
</tr>
</tbody>
</table>

Rendered Output

| Eclipse Date | Eclipse URL |
|--------------|-------------|
| April 8, 2024 | View Map |
| December 4, 2021 | View Map |

As both formats are available in a consistent style in the output, they can be used in places where Markdown or HTML is used, including this blog post.
After defining the plugin once, the same plugin can be used in other message processing scenarios:
  • Extracting different types of information from email messages will be only a matter of changing the goal in natural language.
  • Generated HTML output can be wired to an email sender with the help of another plugin or with an extension of this plugin with additional endpoints from Graph.
  • Generated Markdown output can be backed up to GitHub as a gist.
  • and on and on…

Follow-up Challenge

NASA has a public API that returns the “Astronomy Picture of the Day”. Extend the above scenario to include the “Astronomy Picture of the Day” for both eclipses. You should be able to extend the API manifest file to refer to the NASA API and add authentication with an API key. Because connecting to the NASA API is outside the responsibility of “MessageProcessorPlugin”, you may want to rename the plugin or create two different plugins with clear package boundaries in terms of their responsibilities. It really depends on how you want to package your plugins and group the API dependencies. API Manifest plugins in Semantic Kernel give you that flexibility.
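
As a starting hint for the authentication part, here is a minimal sketch of attaching an API key to a request. It assumes the key travels as an api_key query parameter, as it does for NASA's public https://api.nasa.gov/planetary/apod endpoint, and it uses NASA's shared DEMO_KEY for illustration. AddApiKey is a hypothetical helper name. Wiring it into an auth callback for a new dependency in the manifest is left as part of the challenge.

```csharp
using System;
using System.Net.Http;

// Appends an api_key query parameter to the request URI, keeping any existing query string.
static void AddApiKey(HttpRequestMessage request, string apiKey)
{
    var builder = new UriBuilder(request.RequestUri!);
    string existingQuery = builder.Query.TrimStart('?');
    string separator = existingQuery.Length == 0 ? "" : "&";
    builder.Query = existingQuery + separator + "api_key=" + Uri.EscapeDataString(apiKey);
    request.RequestUri = builder.Uri;
}

var apodRequest = new HttpRequestMessage(HttpMethod.Get, "https://api.nasa.gov/planetary/apod?date=2024-04-08");
AddApiKey(apodRequest, "DEMO_KEY");
Console.WriteLine(apodRequest.RequestUri);
// Prints: https://api.nasa.gov/planetary/apod?date=2024-04-08&api_key=DEMO_KEY
```

An auth callback with the same shape as the Graph one in the sample program could call a helper like this on the outgoing request instead of adding a bearer token. Whether the OpenAPI runner preserves a URI changed inside the callback is an assumption to verify.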

We are looking forward to what you are going to build with API Manifest plugins. Please let us know your feedback!

From the Semantic Kernel team, we want to thank Mustafa for his time. We’re always interested in hearing from you. If you have feedback or questions, or want to discuss further, feel free to reach out to us and the community on the Semantic Kernel GitHub Discussion Channel! We would also love your support: if you’ve enjoyed using Semantic Kernel, give us a star on GitHub.
