August 1st, 2024

Introducing enterprise multi-agent support in Semantic Kernel

The term “agent” has become popular across the industry. There are many different definitions, but at their core, agents consist of a system prompt (i.e., a persona), plugins, and the ability to automatically reason and create plans to address user goals.

Up until today, we’ve demonstrated how you could use components of Semantic Kernel to build agents. With just a few lines of code, you can use a chat completion model to answer users’ questions and automatically invoke plugins as necessary.

What was missing, however, was a first-class agent abstraction. Not only does this simplify code by consolidating logic, but it also ensures a common contract for interacting with an agent. This may seem like a small change, but it has allowed us to build a multi-agent framework in which agents coordinate with one another while also simplifying the code you need to write.
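To make the idea of a common contract concrete, here is a loose sketch (hypothetical names, not the actual Semantic Kernel API) of what such an abstraction enables: because every agent exposes the same invoke shape, orchestration code can treat different agent types interchangeably.

```python
# Hypothetical illustration of a common agent contract -- NOT the actual
# Semantic Kernel interface. Orchestration code only depends on the base
# class, so any agent implementation can be swapped in.
import asyncio
from abc import ABC, abstractmethod

class Agent(ABC):
    def __init__(self, name: str, instructions: str):
        self.name = name
        self.instructions = instructions

    @abstractmethod
    async def invoke(self, messages: list[str]) -> str:
        """Produce the agent's next reply given the conversation so far."""

class EchoAgent(Agent):
    """Toy agent that simply echoes the latest message."""
    async def invoke(self, messages: list[str]) -> str:
        return f"{self.name}: {messages[-1]}"

async def main() -> None:
    agent: Agent = EchoAgent("echo", "Repeat the last message.")
    print(await agent.invoke(["hello"]))  # echo: hello

asyncio.run(main())
```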

Agent packages are now available for both the Python and .NET SDKs.

Introducing Semantic Kernel agents

With our latest Python (1.6.0) and .NET releases (1.18.0 RC1), Semantic Kernel now provides a first-class abstraction for agents. This reduces much of the complexity required to build a standard chat experience while also providing a standardized API to interact with them. With this release, we’re providing two out-of-the-box agents: Assistant API agents and Chat completion agents.

Note: The Agent Framework is currently marked as experimental, along with certain other Semantic Kernel features. Even though we strive to maintain stable development patterns, there may be updates that break compatibility until we are able to graduate the Agent Framework.

Assistant API agents

For most developers, the OpenAIAssistantAgent (based on the OpenAI Assistant API) provides the simplest way to get started. With OpenAI Assistants, state is managed automatically for you, and you’re provided out-of-the-box tools (i.e., code interpreter and file retrieval).

To use Assistants in Semantic Kernel, you can use the Semantic Kernel Python package or the Assistant agent .NET NuGet package. Once you’ve installed and imported the right packages, you can quickly get started with just a few lines of code.

Note: the following code samples are high-level to show the main building blocks for agents. Please follow the provided links, below the code snippets, to view the full implementation of each scenario.

C# code

IKernelBuilder builder = Kernel.CreateBuilder();

builder.Plugins.AddFromType<YourPlugin>();

Kernel kernel = builder.Build();

OpenAIAssistantAgent agent =
    await OpenAIAssistantAgent.CreateAsync(
        kernel: kernel,
        config: new(YourConfiguration.AzureOpenAI.ApiKey, YourConfiguration.AzureOpenAI.Endpoint),
        new()
        {
            Name = "<agent name>",
            Instructions = "<agent instructions>",
            ModelId = YourConfiguration.AzureOpenAI.ChatDeploymentName,
            EnableCodeInterpreter = true,
        });

string threadId = await agent.CreateThreadAsync();

await agent.AddChatMessageAsync(threadId, new ChatMessageContent(AuthorRole.User, "<input>"));

await foreach (ChatMessageContent message in agent.InvokeAsync(threadId))
{
    DisplayMessage(message);
}

See the C# Step8_OpenAIAssistant.cs code sample for the complete implementation.

Python code

kernel = Kernel()

service_id = "agent"

kernel.add_plugin(plugin=YourPlugin(), plugin_name="your_plugin")

agent = await AzureAssistantAgent.create(
    kernel=kernel, service_id=service_id, name="<agent name>", instructions="<agent instructions>"
)

thread_id = await agent.create_thread()

await agent.add_chat_message(thread_id=thread_id, message=ChatMessageContent(role=AuthorRole.USER, content="<input>"))

async for message in agent.invoke(thread_id=thread_id):
    display_message(message)

See the Python step7_assistant.py code sample for the complete implementation.

Chat completion agent

Customers who don’t want to use Assistants (either because they want more control over their chat history or because they want to use a non-OpenAI model) can use our Chat completion agent. Internally it behaves much like an OpenAI Assistant, but you need to provide your own chat history and supply your own code interpreter.

To use the chat completion agent in Semantic Kernel, you can use the Semantic Kernel Python package or the Agents Core .NET NuGet package. Once you’ve installed and imported the right packages, you can quickly get started with just a few lines of code:

C# code

IKernelBuilder builder = Kernel.CreateBuilder();

builder.AddAzureOpenAIChatCompletion(
    YourConfiguration.AzureOpenAI.ChatDeploymentName,
    YourConfiguration.AzureOpenAI.Endpoint,
    YourConfiguration.AzureOpenAI.ApiKey);

builder.Plugins.AddFromType<YourPlugin>();

Kernel kernel = builder.Build();

ChatCompletionAgent agent =
    new()
    {
        Name = "<agent name>",
        Instructions = "<agent instructions>",
        Kernel = kernel,
    };

ChatHistory chat = [new ChatMessageContent(AuthorRole.User, "<input>")];

await foreach (ChatMessageContent message in agent.InvokeAsync(chat))
{
    chat.Add(message);
    DisplayMessage(message);
}

See the C# Step1_Agent.cs code sample for the complete implementation.

Python code

kernel = Kernel()

service_id = "agent"

kernel.add_service(AzureChatCompletion(service_id=service_id))

kernel.add_plugin(plugin=YourPlugin(), plugin_name="your_plugin")

agent = ChatCompletionAgent(service_id=service_id, kernel=kernel, name="<agent name>", instructions="<agent instructions>")

chat = ChatHistory()

chat.add_user_message("<input>")

async for content in agent.invoke(chat):
    chat.add_message(content)
    display_message(content)

See the Python step1_agent.py code sample for the complete implementation.

Multi-agent orchestration: group chat vs task-based

Once you have a single agent, you can create multiple agents and have them work together. A considerable amount of research has gone into multi-agent patterns over the past several months. Very early on, projects like AutoGen identified that multiple agents, with different tools and system prompts, could outperform a single agent. This is no different from how humans work: with a diverse group of people, multiple perspectives and skillsets can be leveraged to deliver a better product or service.

What has emerged are two different ways of coordinating (or orchestrating) multiple agents: group chats and task-based business process flows.

Collaborating within a group chat

With group chats, several different participants (multiple agents and/or multiple humans) can come together and share a common chat history. This pattern is no different from how humans use apps like Microsoft Teams, WhatsApp, or SMS. It’s best at preserving semantic information across multiple turns. For example, if a user told agent A that they liked the color blue, agent B would be able to use that same information because both are in the same chat.

Image GroupChat

This pattern, however, isn’t perfect. Agents can get stuck in a loop talking back and forth with each other, and it can require many tokens (making solutions slow and costly). Because of this, another breed of orchestration was introduced by the research community…

Providing structure with business process flows (coming soon)

The other popular approach was influenced by how humans work together on complex business processes. When employees work on a large project, they don’t just use chat; the conversation would become unwieldy and overwhelming. Instead, they leverage other productivity tools, like Microsoft 365 and GitHub. At the center of the work, though, is typically some sort of task app that coordinates who is working on what and tracks the progress being made.

Image Process

Frameworks like CrewAI were the first to build this pattern as a first-class piece of their framework. Instead of just passing conversation state from one agent to another, artifacts are routed between agents so that each can progressively build on work completed by someone else. Unfortunately, because the primary means of communication is via artifacts, any information in a chat that isn’t included in an artifact is effectively lost.
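The artifact-routing idea can be sketched generically (hypothetical names, not a Semantic Kernel or CrewAI API): each step receives an artifact, enriches it, and hands it on, and only what is written into the artifact survives between steps.

```python
# Generic sketch of task-based orchestration. Each "agent" is a step that
# transforms a shared artifact; nothing outside the artifact is carried
# forward, which is exactly the trade-off described above.
from dataclasses import dataclass, field

@dataclass
class Artifact:
    content: str
    notes: list[str] = field(default_factory=list)

def outline_step(artifact: Artifact) -> Artifact:
    artifact.notes.append("outline created")
    return artifact

def draft_step(artifact: Artifact) -> Artifact:
    artifact.content = f"Draft based on: {artifact.content}"
    artifact.notes.append("draft written")
    return artifact

def run_pipeline(artifact: Artifact, steps) -> Artifact:
    # Route the artifact through each step in order.
    for step in steps:
        artifact = step(artifact)
    return artifact

result = run_pipeline(Artifact("product brief"), [outline_step, draft_step])
print(result.notes)  # ['outline created', 'draft written']
```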

The best of both worlds: combining group chats with processes

Most of today’s frameworks either do one or the other (not both) or interweave them into the same concept. What the Semantic Kernel team has identified, though, is that there is benefit in treating them as separate concepts that can be interwoven together. We can take additional inspiration from how humans work. We don’t have monolithic apps that combine chat and tasks; instead, we typically have a single enterprise chat application (e.g., Microsoft Teams) and a separate project management app (e.g., GitHub Projects) to manage work.

Image process chat

As we approach Microsoft Ignite, we’re going to deliver preview versions of both patterns and demonstrate how they can work together. Today, we’re excited to share the .NET preview version of Group Chat orchestration.

Introducing AutoGen group chats in Semantic Kernel

As a research project, AutoGen was not developed to support enterprise production scenarios, but many customers have expressed interest in using the patterns established with AutoGen in their AI applications. To support these customers, we’ve replicated most of AutoGen’s patterns in Semantic Kernel’s .NET library.

Development for group chats in Java is currently ongoing and is planned to be released by Ignite.

At the core of AutoGen and Semantic Kernel’s agent framework is an AgentChat. This object stores the shared conversation state across multiple agents and also contains the routing logic to determine “who speaks next,” and when a conversation round is over. Below, you can see how easy it is in Semantic Kernel to create a group chat that has a custom termination strategy.

C# code

ChatCompletionAgent agentReviewer = ...;
OpenAIAssistantAgent agentWriter = ...;

AgentGroupChat chat =
    new(agentWriter, agentReviewer)
    {
        ExecutionSettings =
            new()
            {
                TerminationStrategy =
                    new ApprovalTerminationStrategy()
                    {
                        Agents = [agentReviewer]
                    }
            }
    };

chat.AddChatMessage(new ChatMessageContent(AuthorRole.User, "<input>"));

await foreach (ChatMessageContent content in chat.InvokeAsync())
{
    DisplayMessage(content);
}

See the C# Step3_Chat.cs code sample for the complete implementation.

Python code

agent_reviewer = ChatCompletionAgent(...)
agent_writer = await OpenAIAssistantAgent.create(...)

chat = AgentGroupChat(
    agents=[agent_writer, agent_reviewer],
    termination_strategy=ApprovalTerminationStrategy(agents=[agent_reviewer]),
)

await chat.add_chat_message(ChatMessageContent(role=AuthorRole.USER, content="<input>"))

async for message in chat.invoke():
    display_message(message)

See the Python step3_chat.py code sample for the complete implementation.
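Both samples above reference an ApprovalTerminationStrategy without showing it. As a rough, self-contained sketch of the pattern (using a stand-in base class rather than the real semantic_kernel import), a custom termination strategy can simply inspect the last message in the shared history:

```python
# Minimal stand-in mirroring the *shape* of Semantic Kernel's
# TerminationStrategy (the real base class ships in the semantic_kernel
# package): after each agent turn, the group chat asks the strategy
# whether the conversation round is over.
import asyncio

class TerminationStrategy:
    async def should_agent_terminate(self, agent, history) -> bool:
        raise NotImplementedError

class ApprovalTerminationStrategy(TerminationStrategy):
    """End the chat once the latest message signals approval."""

    async def should_agent_terminate(self, agent, history) -> bool:
        # Inspect only the most recent message in the shared history.
        return "approved" in history[-1].lower()

strategy = ApprovalTerminationStrategy()
print(asyncio.run(strategy.should_agent_terminate("reviewer", ["Looks good, approved!"])))  # True
```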

What’s next?

We are continuing to evolve the Agent Framework and look forward to your input and suggestions. Besides delivering a system for process automation, we’re also working on providing…

  • Support for OpenAI Assistant V2 features (available in Python as of 1.4.0 and in .NET as of 1.18.0 RC1)
  • Serialization and restoration of AgentChat (coming soon!)
  • Streaming support for all agents and AgentChat
  • A history truncation strategy for ChatCompletionAgent
  • Improved chat patterns

Please reach out with any questions or feedback through our Semantic Kernel GitHub Discussion Channel. We look forward to hearing from you! We would also love your support: if you’ve enjoyed using Semantic Kernel, give us a star on GitHub.

Author

Chris Rickman
Principal Software Engineer
Evan Mattson
Senior Software Engineer
