August 22nd, 2024

Diving into Function Calling and its JSON Schema in Semantic Kernel Python

Evan Mattson
Senior Software Engineer

One of the most exciting features available in certain Large Language Models (LLMs) is function calling. In Semantic Kernel, we handle the heavy lifting so that you can bring your own code or utilize built-in plugins that cater to your use case. Our goal is to make it easy for you to incorporate function calling into your application. Today, we’ll dive into how we create the function-calling JSON schema. This schema is a core piece of functionality that the model requires to decide which function to call for a given context.

For those unfamiliar, function calling refers to executing local code, typically on a user’s machine or as part of their deployment, to satisfy a user’s question or query to an LLM. If you’ve ever asked an LLM to perform any sort of math (without using the code interpreter functionality), you may have noticed it sometimes gives incorrect results. The model predicts the next most probable token in a sequence, so how does it know how to properly add two large numbers like 102982 + 2828381? Unless it has encountered this specific operation frequently during training, it will make its best guess, which may lead to an incorrect answer. So, how do we improve interactions with the model that require a more deterministic approach to finding the correct answer? We use function calling. If you’d like to dive deeper into the underlying concepts of function calling, please visit OpenAI’s function calling documentation.

Semantic Kernel provides various abstractions over different LLMs. When a user configures their execution settings with function_choice_behavior = FunctionChoiceBehavior.Auto(), the SDK includes a special JSON payload under the tools attribute of the request it sends to the model. In this post, we’ll explore different Semantic Kernel plugin/function configurations and examine how their JSON payloads are constructed and what information is included.
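
For reference, here’s a minimal sketch of that configuration. The import paths and the OpenAI connector setup reflect the Python SDK around the time of writing, and the service_id value is just a placeholder:


from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
)

kernel = Kernel()
# The OpenAI connector reads the API key and model id from the environment.
kernel.add_service(OpenAIChatCompletion(service_id="chat"))

# Auto lets the model decide whether (and which) registered function to call;
# the SDK serializes each function's JSON schema into the request's tools payload.
settings = OpenAIChatPromptExecutionSettings(
    function_choice_behavior=FunctionChoiceBehavior.Auto()
)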

A Simple Example: Adding Two Numbers

Let’s start with a simple example. Suppose I want to add the two numbers mentioned earlier. I can write the following Semantic Kernel (SK) plugin/function:


from typing import Annotated

from semantic_kernel.functions import kernel_function


class MyMathPlugin:
    """A sample Math Plugin."""

    @kernel_function(
        name="add_numbers", 
        description="Adds two numbers together and provides the result",
    )
    def add_numbers(
        self,
        number_one: Annotated[int, "The first number to add"],
        number_two: Annotated[int, "The second number to add"],
    ) -> Annotated[int, "The result of adding the two numbers"]:
        """A sample Semantic Kernel Function to add two numbers."""
        return number_one + number_two

Although the function and class are simple, let’s walk through them in detail. As the Python script containing this code runs and the plugin is registered, the Semantic Kernel SDK looks for methods decorated with @kernel_function and parses their names, input parameters, and output parameters. This information is stored in KernelParameterMetadata, which is part of a KernelFunction. The kernel function decorator code examines each input parameter, checking its type, whether it’s annotated, whether it has a default value, and more. This is essential because, as we’ll cover shortly, not all kernel functions are defined with primitive-type inputs like the ones in this simple example. Sometimes we need to dig deeper to understand the types involved and whether those types have underlying types of their own. This allows us to accurately determine what the model must send back to the caller during a tool call (in this case, Semantic Kernel, which orchestrates the call to the model) so that the function can be invoked properly.

While we’re here, another important aspect is the description provided in the @kernel_function decorator and the annotations used for the input parameters. This information gives the model more context about the inputs and how it should respond when making a tool call.
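
Before we look at the schema, note that the model can only call this function once its plugin is registered with the kernel. Continuing with the kernel from the earlier snippet, a minimal sketch:


# The plugin name we choose here becomes the prefix of the fully qualified
# function name the model sees, e.g. "math-add_numbers" in the schema below.
kernel.add_plugin(MyMathPlugin(), plugin_name="math")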

JSON Schema for the add_numbers Function

As mentioned, the information parsed from a kernel function is stored in KernelParameterMetadata. As we build the JSON schema for the parameter, we determine how the parameter is communicated to the model — is it an integer, a string, a boolean, or an object?
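
If you’d like to see what was parsed, you can inspect the function’s metadata after registering the plugin. A rough sketch; the attribute names reflect the SDK at the time of writing:


func = kernel.get_function("math", "add_numbers")
for param in func.metadata.parameters:
    # Each entry is a KernelParameterMetadata holding the parsed name, type,
    # required flag, and description for one input parameter.
    print(param.name, param.type_, param.is_required, param.description)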

Here’s the JSON schema for the add_numbers function:


{
    "type": "function",
    "function": {
        "name": "math-add_numbers",
        "description": "Adds two numbers together and provides the result",
        "parameters": {
            "type": "object",
            "properties": {
                "number_one": {
                    "type": "integer",
                    "description": "The first number to add"
                },
                "number_two": {
                    "type": "integer",
                    "description": "The second number to add"
                }
            },
            "required": [
                "number_one", "number_two"
            ]
        }
    }
}

Just as we expected when defining our kernel function, the two input parameters (number_one and number_two) appear as required. When using this kernel plugin/function, the model knows it must supply two integer arguments when calling this specific function.
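
To see this end to end, we can let the model route the earlier arithmetic through the plugin. A sketch reusing the kernel and settings from the snippets above; the invoke_prompt signature reflects the SDK around the time of writing:


import asyncio

from semantic_kernel.functions import KernelArguments


async def main() -> None:
    # With FunctionChoiceBehavior.Auto() in the settings, the model calls
    # math-add_numbers instead of guessing at the arithmetic.
    answer = await kernel.invoke_prompt(
        "What is 102982 + 2828381?",
        arguments=KernelArguments(settings=settings),
    )
    print(answer)  # 2931363


asyncio.run(main())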

A More Complex Example: Working with Complex Types

Let’s look at a more complex plugin/function definition to see how this works. We’ll start by defining a Pydantic model with two attributes, a start_date and an end_date. The attributes use Pydantic’s Field configuration, which lets us mark an attribute as required with three dots (...), provide a description, and include examples of how the date should be formatted.


from pydantic import Field

from semantic_kernel.kernel_pydantic import KernelBaseModel


class ComplexRequest(KernelBaseModel):
    start_date: str = Field(
        ...,
        description="The start date in ISO 8601 format",
        examples=["2023-01-01", "2024-05-20"],
    )
    end_date: str = Field(
        ...,
        description="The end date in ISO-8601 format",
        examples=["2023-01-01", "2024-05-20"],
    )
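
As an aside, because KernelBaseModel is a Pydantic model, you can preview the nested schema Pydantic derives for ComplexRequest on its own. This is raw Pydantic output rather than the exact payload Semantic Kernel builds, but it’s a handy sanity check:


import json

print(json.dumps(ComplexRequest.model_json_schema(), indent=2))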

Next, we define a new plugin called ComplexTypePlugin. It contains one kernel function named answer_request. As we discussed earlier, the kernel_function decorator’s description provides high-level context to the model. The important input parameter for this kernel function is request of type ComplexRequest. When we parse the function’s input parameters, we store this object type in KernelParameterMetadata so that when building the JSON schema, we can properly recurse into the ComplexRequest class and grab the start_date and end_date attributes. These attributes could even be complex types themselves, but for this example, we don’t need to go that far.


class ComplexTypePlugin:
    # The name passed to @kernel_function ("answer_request") is what the model
    # sees; it overrides the Python method name (book_holiday).
    @kernel_function(name="answer_request", description="Answer a request")
    def book_holiday(
        self, request: Annotated[ComplexRequest, "A request to answer."]
    ) -> Annotated[
        bool,
        "The result is the boolean value True if successful, False if unsuccessful.",
    ]:
        return True
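
As before, registering the plugin under a name is what produces the prefixed function name you’ll see in the schema below:


kernel.add_plugin(ComplexTypePlugin(), plugin_name="complex")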

The great thing about using an LLM orchestrator like Semantic Kernel is that we take on the heavy lifting for you. If you interfaced directly with an LLM, you’d have to figure out how to structure the payloads, build the schema, and work with the proper content types.

With this kernel function, we build the following schema for you:


{
    "type": "function",
    "function": {
        "name": "complex-answer_request",
        "description": "Answer a request",
        "parameters": {
            "type": "object",
            "properties": {
                "request": {
                    "type": "object",
                    "properties": {
                        "start_date": {
                            "type": "string",
                            "description": "The start date in ISO 8601 format"
                        },
                        "end_date": {
                            "type": "string",
                            "description": "The end date in ISO-8601 format"
                        }
                    },
                    "required": [
                        "start_date",
                        "end_date"
                    ],
                    "description": "A request to answer."
                }
            },
            "required": [
                "request"
            ]
        }
    }
}

Similar to our add_numbers function, the JSON schema includes our request input parameter, and we dig into its type to identify that we need a start_date string and an end_date string. Both attributes are required, as shown in the schema, because we marked them with Field(...) rather than giving them default values. One of the most useful aspects of function calling is that the model knows which parameters a function requires; if it decides, based on the conversation’s context, to use a provided plugin or function, it won’t attempt the tool call until it has everything it needs.

For example, if I add this plugin/function to a Kernel object, enable function calling, and ask the model:

User: > Answer a request for me.
Assistant: > Certainly! Please provide me with the start date and end date for your request, and I shall endeavor to assist you with it!

I can then respond with:

User: > The start date is Feb 10, 2023 to Mar 10, 2024.
Assistant: > The request has been successfully answered with a resounding "True"! If there are any further inquiries or additional assistance you seek, do not hesitate to ask!

You’ll notice that I didn’t need to format my dates in ISO 8601 myself. We tell the model how to format the dates through the field descriptions and examples, and it follows along to the best of its ability. The following dates are what the model returned as part of the function call arguments:


request.start_date = '2023-02-10'
request.end_date = '2024-03-10'

Then, our plugin code that expects the ISO 8601 format can proceed without a hitch.
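
For reference, here’s a minimal sketch of the conversation loop behind the transcript above. The service configuration is an assumption on my part, and the chat-service method names reflect the Python SDK around the time of writing:


import asyncio

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory


async def main() -> None:
    kernel = Kernel()
    chat_service = OpenAIChatCompletion(service_id="chat")
    kernel.add_service(chat_service)
    kernel.add_plugin(ComplexTypePlugin(), plugin_name="complex")

    settings = OpenAIChatPromptExecutionSettings(
        function_choice_behavior=FunctionChoiceBehavior.Auto()
    )

    history = ChatHistory()
    history.add_user_message("Answer a request for me.")
    # In the real conversation the assistant asks for the dates here;
    # we add them as the next user turn.
    history.add_user_message("The start date is Feb 10, 2023 to Mar 10, 2024.")

    # Passing the kernel lets the service resolve and auto-invoke tool calls.
    response = await chat_service.get_chat_message_content(
        chat_history=history, settings=settings, kernel=kernel
    )
    print(response)


asyncio.run(main())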

Conclusion

I wanted to spend some time demystifying what goes on in the Semantic Kernel SDK related to function calling: how the JSON schema is built and how required parameters and their types are communicated to the model. To sum up:

  1. We can provide string descriptions via the kernel_function decorator description parameter and input parameter annotations to help give the model more context about our function or parameters.
  2. We can define complex objects that themselves use complex types, along with annotations or Pydantic models/Fields, to give the model further information and context, which can aid it in choosing the correct function to invoke to complete the user’s query.

For complete implementations around handling auto function calling in Python, please see our code samples here.

As we continue bringing you new features in Semantic Kernel, we want to highlight that we are tracking an issue related to adding support for OpenAI’s structured outputs. If you’re looking for a challenge on the Python side and want to help us out, please feel free to comment on the open issue we’re tracking or create a GitHub discussion to let us know you’d like to assist with the integration. We’ll be happy to work with you.

Thank you for your time, and we welcome any feedback related to this post or Semantic Kernel in general.
