November 6th, 2023

OpenAI Assistants: the future of Semantic Kernel

During the OpenAI event earlier today, OpenAI announced the launch of GPTs and the assistants API, the new and improved ways of creating agents on top of their chat completion models. With assistants, much of the heavy lifting required to build agents has been stripped away…

  • Messages are now managed for you in threads.
  • Memory is automatically handled for you behind the scenes.
  • Multiple functions can be called at a time (instead of just one).

This ultimately means it’ll be faster and easier for you to build agents on top of OpenAI and Semantic Kernel. We’re excited to share our plans for incorporating assistants into Semantic Kernel and how they fit into our v1 proposals, so starting today, we’re kicking off a series on building agents with Semantic Kernel.

For our inaugural blog post on agents, we’ll share our overarching plans for incorporating OpenAI assistants with Semantic Kernel. Subsequent articles will demonstrate how to achieve the following with today’s APIs (with an eye towards what it’ll be like for v1): creating your first assistant, orchestrating assistants together, extending assistants with plugins, and keeping assistants safe through responsible AI and monitoring. If there are any topics you’d like us to cover, let us know on our dedicated discussion board for GPTs and assistants.

The kernel will become your gateway to assistants.

In our Road to V1 and beyond blog post, we shared that one of our goals was to provide “a compelling reason to use the kernel”. Last week we shared a few ways the kernel would improve, but we didn’t share the full story.

That changes today with the announcement of the assistants API.

With the kernel, we plan on providing an abstraction layer on top of assistants so it’s easier to build them and so you can more easily extend the new assistant APIs provided by OpenAI.

Today’s kernel just manages the runtime.

With today’s kernel, you can only define the available functions, models, and prompt template engines. This helps you create a runtime that allows your semantic and native functions to talk to each other, but many other pieces still need to be implemented by the developer.

For example, to build a complete agent with the kernel, you must manage the entire chat history on your own. This is particularly tedious when using OpenAI function calling (after a function is used, you need to add both the agent response and the function response to the history) or when you start running out of tokens for the chat history.
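To make that bookkeeping concrete, here is a minimal, self-contained sketch of what managing the history by hand can look like. The `ChatMessage` record, the `HistoryTools` helper, and the four-characters-per-token estimate are assumptions made up for illustration; they are not Semantic Kernel or OpenAI types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical message list; in a real app this is what you would send to the model.
var history = new List<ChatMessage>
{
    new("system", "You are a helpful math assistant."),
    new("user", "What is 2 + 3?")
};

// 1) The model replies with a function call instead of text; you record it.
history.Add(new ChatMessage("assistant", "call: Math.Add(2, 3)"));

// 2) You run the function yourself and record its result as a second message.
history.Add(new ChatMessage("function", (2 + 3).ToString(), Name: "Math.Add"));

// 3) Before the next request, you trim older turns to stay within a token budget.
var trimmed = HistoryTools.TrimToBudget(history, maxTokens: 16);
Console.WriteLine($"{history.Count} messages before trimming, {trimmed.Count} after");

// Illustrative message shape (an assumption, not a library type).
record ChatMessage(string Role, string Content, string Name = "");

static class HistoryTools
{
    // Very rough estimate (about 4 characters per token); use a real tokenizer in practice.
    public static int EstimateTokens(ChatMessage message) => (message.Content.Length + 3) / 4;

    // Keeps every system message plus the newest other messages that fit in the budget.
    // Note: a production version must also keep each function call and its result together.
    public static List<ChatMessage> TrimToBudget(IReadOnlyList<ChatMessage> history, int maxTokens)
    {
        var result = history.Where(m => m.Role == "system").ToList();
        int used = result.Sum(m => EstimateTokens(m));

        var kept = new List<ChatMessage>();
        for (int i = history.Count - 1; i >= 0; i--)
        {
            var message = history[i];
            if (message.Role == "system") continue;

            int cost = EstimateTokens(message);
            if (used + cost > maxTokens) break;

            used += cost;
            kept.Insert(0, message); // preserve chronological order
        }

        result.AddRange(kept);
        return result;
    }
}
```

Every one of these steps is code you have to write, test, and keep in sync with the model’s expectations.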

For the uninitiated, this can be confusing and challenging, so we plan to make this better.

Tomorrow’s kernel will help you manage everything for assistants.

To simplify things, we will update the kernel as part of v1 so it can use an OpenAI assistant behind the scenes. We’re excited about this change, because it means we can update the already simplified v1 proposal from this…

// Create a new kernel
IKernel kernel = new Kernel(
    aiServices: new () { gpt35Turbo },
    plugins: new () { intentPlugin, mathPlugin }
);

// Start the chat
ChatHistory chatHistory = gpt35Turbo.CreateNewChat();
while(true)
{
    Console.Write("User > ");
    chatHistory.AddUserMessage(Console.ReadLine()!);

    // Run the simple chat
    var result = await kernel.RunAsync(
        chatFunction,
        variables: new() {{ "messages", chatHistory }},
        streaming: true
    );

    Console.Write("Agent > ");
    await foreach(var message in result.GetStreamingValue<string>()!)
    {
        Console.Write(message);
    }

    Console.WriteLine();
    chatHistory.AddAgentMessage(await result.GetValueAsync<string>()!);
}

To this…

// Create a new kernel
AssistantKernel kernel = new AssistantKernel(
    aiServices: new () { gpt35Turbo, gpt4Agent },
    plugins: new () { intentPlugin, mathPlugin }
);

// Start the chat
kernel.StartChat(chatFunction);
while(true)
{
    Console.Write("User > ");

    // Run the simple chat
    var result = await kernel.SendUserMessage(
        Console.ReadLine()!,
        streaming: true
    );

    Console.Write("Agent > ");
    await foreach(var message in result.GetStreamingValue<string>()!)
    {
        Console.Write(message);
    }
    Console.WriteLine();
}

With this new setup, you no longer need to manage the chat history yourself. Additionally, “running” the kernel will become even easier because you just need to pass in the user’s latest input.

Behind the scenes, whenever you use the SendUserMessage method, we’ll 1) call the necessary OpenAI APIs to send the user’s message, 2) retrieve a response from the LLM, and finally 3) give the result back to you.
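As a rough illustration of those three steps, the sketch below wraps a thread-based assistant service behind a single method. Everything here is hypothetical: `IAssistantBackend`, `SketchAssistantKernel`, and `FakeAssistantBackend` are invented names, the assumption that a run is started and then polled until it completes is ours, and the fake backend simply echoes the input so the example runs offline. It is not the proposed AssistantKernel implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// The fake backend stands in for the real service so the sketch runs without a network.
var kernel = new SketchAssistantKernel(new FakeAssistantBackend());
string reply = await kernel.SendUserMessageAsync("Hello!");
Console.WriteLine($"Agent > {reply}");

// Hypothetical abstraction over a thread-based assistant service.
interface IAssistantBackend
{
    Task<string> CreateThreadAsync();
    Task AddUserMessageAsync(string threadId, string content);
    Task<string> StartRunAsync(string threadId);
    Task<bool> IsRunCompleteAsync(string threadId, string runId);
    Task<string> GetLatestAssistantMessageAsync(string threadId);
}

class SketchAssistantKernel
{
    private readonly IAssistantBackend _backend;
    private string? _threadId;

    public SketchAssistantKernel(IAssistantBackend backend) => _backend = backend;

    public async Task<string> SendUserMessageAsync(string input)
    {
        // The thread holds the history, so the caller never touches it.
        _threadId ??= await _backend.CreateThreadAsync();

        // 1) Send the user's message.
        await _backend.AddUserMessageAsync(_threadId, input);

        // 2) Ask the model to respond and wait until it has finished.
        string runId = await _backend.StartRunAsync(_threadId);
        while (!await _backend.IsRunCompleteAsync(_threadId, runId))
        {
            await Task.Delay(50);
        }

        // 3) Hand the result back to the caller.
        return await _backend.GetLatestAssistantMessageAsync(_threadId);
    }
}

// In-memory fake: reports completion on the second poll and echoes the last message.
class FakeAssistantBackend : IAssistantBackend
{
    private readonly List<string> _messages = new();
    private int _polls;

    public Task<string> CreateThreadAsync() => Task.FromResult("thread_1");

    public Task AddUserMessageAsync(string threadId, string content)
    {
        _messages.Add(content);
        return Task.CompletedTask;
    }

    public Task<string> StartRunAsync(string threadId)
    {
        _polls = 0;
        return Task.FromResult("run_1");
    }

    public Task<bool> IsRunCompleteAsync(string threadId, string runId) =>
        Task.FromResult(++_polls >= 2);

    public Task<string> GetLatestAssistantMessageAsync(string threadId) =>
        Task.FromResult($"(echo) {_messages[^1]}");
}
```

Keeping the service behind an interface like this is one way to test the chat loop without calling a live model.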

With the kernel, we’ll make it easy to extend OpenAI assistants.

As powerful as the new assistant APIs are, they don’t do everything. This is where Semantic Kernel comes in. With its support for plugins, planners, and multiple models, you can use Semantic Kernel to extend assistants and make them more powerful while also optimizing performance and cost.

  1. Simplified function calling – To make your agents even more useful, you can provide them with actions to run. We’ll simplify this process by leveraging the existing functions already registered in a kernel via plugins. As you converse with your agent, we’ll provide it with the functions you’ve added and automatically run them as we get responses from the model.
  2. Complex multi-step plans – With agents, OpenAI can start to call multiple functions at a time, but it still cannot create complex plans with conditional logic, loops, and variable passing. With Semantic Kernel planners, you can do just that. Not only does this save you tokens, but it also allows you to generate complete plans that can be reviewed by humans before they’re executed.
  3. Multi-model support – Today’s agents use GPT-3.5-turbo, GPT-4, or (soon) GPT-4-turbo for all chat completions. As a developer, however, you may want to be more discerning. You may want to use GPT-4-turbo for the final response while using GPT-3.5-turbo for some of the simpler semantic functions. With Semantic Kernel, you can make these optimizations. You can even leverage non-OpenAI models in conjunction with your OpenAI agents.
  4. More control over memory – If you want to use an advanced memory architecture to have more control over how you save and retrieve memories (like Kernel Memory or LlamaIndex), you can add these services as plugins to provide even better context to your agent.
  5. Greater visibility and monitoring – With Semantic Kernel’s pre/post hooks, you can add telemetry to your kernel once and get visibility into token usage, rendered prompts, and more across all your native and semantic functions.
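To illustrate the last item in the list above, here is a small sketch of the pre/post hook idea: telemetry is registered once and then fires around every function that runs through the kernel. `MiniKernel`, `BeforeInvoke`, and `AfterInvoke` are hypothetical names invented for this example and are not the Semantic Kernel API.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

var kernel = new MiniKernel();
var log = new List<string>();

// Register telemetry once; it applies to every function invoked through this kernel.
kernel.BeforeInvoke += name => log.Add($"start {name}");
kernel.AfterInvoke += (name, elapsed) => log.Add($"end {name}");

string result = kernel.Invoke("Math.Add", () => (2 + 3).ToString());

Console.WriteLine(result);
foreach (var line in log)
{
    Console.WriteLine(line);
}

// Hypothetical stand-in for a kernel that exposes pre/post hooks.
class MiniKernel
{
    public event Action<string>? BeforeInvoke;
    public event Action<string, TimeSpan>? AfterInvoke;

    public string Invoke(string functionName, Func<string> function)
    {
        BeforeInvoke?.Invoke(functionName);
        var timer = Stopwatch.StartNew();
        try
        {
            return function();
        }
        finally
        {
            // Runs even if the function throws, so failures are still recorded.
            AfterInvoke?.Invoke(functionName, timer.Elapsed);
        }
    }
}
```

Because the hooks live on the kernel rather than on each function, the same handlers could record token usage or rendered prompts without touching plugin code.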

We want your feedback!

As the Semantic Kernel team, we want to continue providing early previews into where we’d like to push the SDK. This helps contributors in the open-source community add PRs, and perhaps more importantly, gives us an opportunity to collect your feedback.

If you have any feedback on this proposal (either good or bad), please share it on our discussion boards. We’ve created a dedicated board for this topic where you can share your scenarios with us, so we can make sure we build the right integration.

3 comments

  • José Luis Latorre Millás · Edited

    Sounds fantastic @mabolan!
    Can’t wait to see the full picture with Agents on it for V 1.0.0!! 🙂

  • Amit Lal (Microsoft employee) · Edited

    How this SK assistant differ from AutoGen AI Agents? Trying to wrap my head around this with AutoGen

    • Matthew Bolanos (Microsoft employee, Author)

      That’s a great question! Autogen is a framework for letting multiple agents collaborate with each other to complete a request on the same thread. Semantic Kernel, on the other hand, is an SDK that helps you give a single agent a set of tools (via plugins). Because of this, we believe both projects complement each other. To make a team of agents in Autogen productive, you’ll ultimately want to give them plugins so they can complete real work. This is when you’d want to use the two projects together.

      Put more simply…
      – Need agents to collaborate with each other? Use...