MicroAgents: Exploring Agentic Architecture with Microservices

Throughout our time building Semantic Kernel and working with customers, we’ve introduced agents and have just started to explore the potential of autonomous AI agents. While the community is in the midst of exploring various architectures for these agents, one source we can draw inspiration from is the microservice architecture.

Consider the benefits of microservices:

Ease of maintenance: feature enhancement & validation
Reliability: fault isolation & diagnostics
Efficiency: cost & flexibility
Deployment agility: frequency & scaling

In this post, we’ll cover what we’re calling MicroAgents and talk through some of the benefits this approach could provide. Let’s dive in!

What is an Agent?

From the perspective of Semantic Kernel orchestration, an AI agent is a modular abstraction that can possess a persona, can perform actions in response to user input, and can easily communicate with other agents.

You might also view an agent from an AI-as-a-service perspective or as an autonomous worker.

The MicroAgent pattern adapts to either perspective.

MicroAgents

Take the case of an AI personal assistant with various services: location, calendar, email, banking, shopping, travel, weather, etc… If you were to build this as a monolithic agent, it would be directly bound to its tooling. Namely, all provided functions and plugins would need to fit in that agent’s model context.

In addition, the agent would have to decide which among possibly thousands of services to call, making it tedious to give the agent nuanced system instructions.

Partitioning by functional domain and utilizing agent composition introduces a MicroAgent pattern which associates each microagent with a service. Each microagent’s system instructions can be tailored for factors specific to its service. Natural language interactions define a durable, elastic interface for combining and coordinating loosely coupled microagents to accomplish complex tasks.
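As a rough illustration of this composition, here is a minimal Python sketch. It does not use the Semantic Kernel API: the MicroAgent and Manager classes, the stubbed tools, and the GetForecast name are all hypothetical, and the handle method stands in for a real model call. It only shows how a manager's context can list agents rather than every individual tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MicroAgent:
    """Hypothetical microagent: one service, with instructions tailored to it."""
    name: str
    instructions: str
    tools: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def handle(self, request: str) -> str:
        # Stand-in for a model call: a real microagent would choose among its
        # own tools based on the request. Here every tool in the domain runs.
        results = ", ".join(f"{n}={fn()}" for n, fn in self.tools.items())
        return f"[{self.name}] {request} -> {results}"


@dataclass
class Manager:
    """Hypothetical manager that coordinates microagents by natural language."""
    agents: Dict[str, MicroAgent]

    def context(self) -> List[str]:
        # The manager's context lists agents, not their individual tools.
        return sorted(self.agents)

    def delegate(self, agent_name: str, request: str) -> str:
        return self.agents[agent_name].handle(request)


# Stubbed tools with made-up return values, for illustration only.
calendar = MicroAgent(
    name="calendar",
    instructions="You manage the user's calendar.",
    tools={"GetCurrentDate": lambda: "2024-03-15", "GetEvents": lambda: "2 events"},
)
weather = MicroAgent(
    name="weather",
    instructions="You report weather forecasts.",
    tools={"GetForecast": lambda: "sunny"},
)
manager = Manager(agents={"calendar": calendar, "weather": weather})

# A monolithic agent would carry every tool in its own context.
mono_tools = [t for a in manager.agents.values() for t in a.tools]

print(manager.context())
print(mono_tools)
print(manager.delegate("weather", "What is the forecast for Hawaii?"))
```

With only two agents the difference is small, but the manager's context grows with the number of agents while the monolithic tool list grows with the number of functions.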

While this pattern may be intriguing, it certainly isn’t the only approach for defining AI agents and no single approach need be exclusive. For example, it’s easy to envision persona, role, or task-based agents existing alongside and utilizing MicroAgents.

Observations

We wanted to compare the two approaches as a basic validation.

For this experiment, eight critical APIs and twelve irrelevant APIs were defined across five microagents. This is not the hundreds or thousands of APIs you might find at production scale, but it is sufficient to surface initial qualitative differences.

The user message was a request to book a four-night vacation to Hawaii, with intentional ambiguity around the current date, travel dates, home location, airport codes, and the need for a return flight.

The first thing that stood out is just how amazing gpt-4-turbo is at coordinating complex function calling. As expected, completion rates drop off on less capable models.

Our experimental completion rates for the two approaches were effectively equivalent: 80% to 85%. Both approaches would sometimes omit adding the trip to the calendar. The monoagent approach would sometimes fail to book a return trip, and once even required additional confirmation before booking.

The first step in both approaches was to consult the calendar for dates that might be available to travel. Each approach reliably executed the same steps:

Determine today: GetCurrentDate()
Retrieve next month's calendar events: GetEvents(from, to)
Reason over calendar availability

We saw that while the monoagent orchestrated each function call itself, the microagent approach simply sent the calendar agent a request: "Show me my availability for a four-night stay in Hawaii next month." This left the calendar microagent to manage internally the steps required to process the request.
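The three calendar steps can be sketched in plain Python. This is an illustration under stated assumptions, not the code from the experiment: the event data, the fixed "today", and the helper names next_month_range and find_availability are hypothetical, and a simple scan stands in for the model's reasoning over availability.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

# Hypothetical calendar data: (first busy day, last busy day), inclusive.
EVENTS: List[Tuple[date, date]] = [
    (date(2024, 4, 1), date(2024, 4, 3)),
    (date(2024, 4, 6), date(2024, 4, 7)),
]


def get_current_date() -> date:
    # Fixed so the example is reproducible; a real tool would use date.today().
    return date(2024, 3, 15)


def get_events(start: date, end: date) -> List[Tuple[date, date]]:
    # Return every event that overlaps the range [start, end].
    return [(s, e) for s, e in EVENTS if s <= end and e >= start]


def next_month_range(today: date) -> Tuple[date, date]:
    # First and last day of the calendar month after `today`.
    first = (today.replace(day=1) + timedelta(days=32)).replace(day=1)
    last = (first + timedelta(days=32)).replace(day=1) - timedelta(days=1)
    return first, last


def find_availability(nights: int) -> Optional[Tuple[date, date]]:
    # Step 1: determine today. Step 2: fetch next month's events.
    first, last = next_month_range(get_current_date())
    busy = set()
    for s, e in get_events(first, last):
        d = s
        while d <= e:
            busy.add(d)
            d += timedelta(days=1)
    # Step 3: find the earliest stay (check-in through check-out) with no busy day.
    start = first
    while start + timedelta(days=nights) <= last:
        stay = [start + timedelta(days=i) for i in range(nights + 1)]
        if not any(day in busy for day in stay):
            return start, start + timedelta(days=nights)
        start += timedelta(days=1)
    return None


print(find_availability(4))
```

In the monoagent case the model issues each of these calls itself; in the microagent case the manager sends one natural-language request and the calendar microagent runs the equivalent of find_availability internally.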

On the other hand, we saw the travel microagent engage in a classically iterative interaction: it provided the various flight routes (to different islands) to the manager, who subsequently initiated the booking request.

With either approach (mono or micro), we found that context is absolutely key. The absence of sufficient contextual data greatly impacts the model's ability to reliably sustain task completion. While the microagent pattern provides an opportunity for higher domain proficiency (return trip booking), a monoagent (by definition) maintains context (by coordinating everything).

Join us in building agents!

Explore these ideas by checking out the Semantic Kernel Agents package. We’ll soon release the code and a video walking through the experiments listed in this blog. We’d love to hear about your experience.

UPDATE: Find the code with a demo application and data on the MicroAgent approach here.