{"id":3227,"date":"2024-08-22T07:03:18","date_gmt":"2024-08-22T14:03:18","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/semantic-kernel\/?p=3227"},"modified":"2024-08-22T07:03:18","modified_gmt":"2024-08-22T14:03:18","slug":"diving-into-function-calling-and-its-json-schema-in-semantic-kernel-python","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/agent-framework\/diving-into-function-calling-and-its-json-schema-in-semantic-kernel-python\/","title":{"rendered":"Diving into Function Calling and its JSON Schema in Semantic Kernel Python"},"content":{"rendered":"<p>One of the most exciting features available in certain Large Language Models (LLMs) is function-calling. In Semantic Kernel, we handle the heavy lifting so that you can bring your own code or utilize built-in plugins that cater to your use case. Our goal is to make it easy for you to incorporate function calling into your application. Today, we&#8217;ll dive into how we create the function-calling JSON schema. This schema is a core piece of functionality that the model requires to decide which function to call for a given context.<\/p>\n<p>For those unfamiliar, function calling refers to executing local code, typically on a user&#8217;s machine or as part of their deployment, to satisfy a user&#8217;s question or query to an LLM. If you&#8217;ve ever asked an LLM to perform any sort of math (without using the code interpreter functionality), you may have noticed it sometimes gives incorrect results. The model predicts the next most probable token in a sequence, so how does it know how to properly add two large numbers like 102982 + 2828381? Unless it has encountered this specific operation frequently during training, it will make its best guess, which may lead to an incorrect answer. So, how do we improve interactions with the model that require a more deterministic approach to finding the correct answer? We use function calling. 
If you&#8217;d like to dive deeper into the underlying concepts of function calling, please visit OpenAI&#8217;s <a href=\"https:\/\/platform.openai.com\/docs\/guides\/function-calling\">function calling documentation<\/a>.<\/p>\n<p>Semantic Kernel provides various abstractions over different LLMs. When a user configures their execution settings to use <code>function_choice_behavior = FunctionChoiceBehavior.Auto()<\/code>, the SDK includes a JSON payload describing the available functions in the <code>tools<\/code> attribute of the request it sends to the model. In this post, we&#8217;ll explore different Semantic Kernel plugin\/function configurations and examine how their JSON payloads are constructed and what information they include.<\/p>\n<h3>A Simple Example: Adding Two Numbers<\/h3>\n<p>Let&#8217;s start with a simple example. Suppose I want to add the two numbers mentioned earlier. I can write the following Semantic Kernel (SK) plugin\/function:<\/p>\n<pre><code>\r\nfrom typing import Annotated\r\n\r\nfrom semantic_kernel.functions import kernel_function\r\n\r\n\r\nclass MyMathPlugin:\r\n    \"\"\"A sample Math Plugin.\"\"\"\r\n\r\n    @kernel_function(\r\n        name=\"add_numbers\",\r\n        description=\"Adds two numbers together and provides the result\",\r\n    )\r\n    def add_numbers(\r\n        self,\r\n        number_one: Annotated[int, \"The first number to add\"],\r\n        number_two: Annotated[int, \"The second number to add\"],\r\n    ) -&gt; Annotated[int, \"The result of adding the two numbers\"]:\r\n        \"\"\"A sample Semantic Kernel Function to add two numbers.\"\"\"\r\n        return number_one + number_two\r\n<\/code><\/pre>\n<p>Although the function and class are simple, let&#8217;s walk through them in detail. As the Python script containing this code is interpreted and begins running, the Semantic Kernel SDK looks for functions decorated with the <code>@kernel_function<\/code> decorator and parses them for their names, input parameters, and output parameters. 
This information is stored in <code>KernelParameterMetadata<\/code>, which is part of a <code>KernelFunction<\/code>. The kernel function decorator code examines each input parameter, checking its type, whether it&#8217;s annotated, whether it has a default value, and more. This is essential because, as we&#8217;ll cover shortly, not all kernel functions are defined with primitive type inputs like the ones in the simple example. Sometimes we need to dig deeper to understand the types involved, whether those types have underlying types, and more. This allows us to accurately determine what the model must send back to the caller during a tool call (in this case, Semantic Kernel, which orchestrates the call to the model) so that the caller can invoke the function correctly.<\/p>\n<p>While we&#8217;re here, another important aspect is the description provided in the <code>@kernel_function<\/code> decorator, along with the annotations used for the input parameters. This information gives the model more context about the inputs and how it should respond when it makes a tool call.<\/p>\n<h3>JSON Schema for the <code>add_numbers<\/code> Plugin<\/h3>\n<p>As mentioned, the information parsed from a kernel function is stored in <code>KernelParameterMetadata<\/code>. 
As we build the JSON schema for the parameter, we determine how the parameter is communicated to the model &#8212; is it an integer, a string, a boolean, or an object?<\/p>\n<p>Here\u2019s the JSON schema for the <code>add_numbers<\/code> plugin:<\/p>\n<pre><code>\r\n{\r\n    \"type\": \"function\",\r\n    \"function\": {\r\n        \"name\": \"math-add_numbers\",\r\n        \"description\": \"Adds two numbers together and provides the result\",\r\n        \"parameters\": {\r\n            \"type\": \"object\",\r\n            \"properties\": {\r\n                \"number_one\": {\r\n                    \"type\": \"integer\",\r\n                    \"description\": \"The first number to add\"\r\n                },\r\n                \"number_two\": {\r\n                    \"type\": \"integer\",\r\n                    \"description\": \"The second number to add\"\r\n                }\r\n            },\r\n            \"required\": [\r\n                \"number_one\", \"number_two\"\r\n            ]\r\n        }\r\n    }\r\n}\r\n<\/code><\/pre>\n<p>Just as we expected when defining our kernel function, the two input parameters (<code>number_one<\/code> and <code>number_two<\/code>) appear as <code>required<\/code>. When using this kernel plugin\/function, the model knows it must return two integers when calling this specific function.<\/p>\n<h3>A More Complex Example: Working with Complex Types<\/h3>\n<p>Let\u2019s look at a more complex plugin\/function definition to see how this works. We&#8217;ll start by defining a Pydantic model with two attributes\u2014a <code>start_date<\/code> and an <code>end_date<\/code>. 
The attributes use Pydantic\u2019s <code>Field<\/code> configuration, allowing us to specify with three dots (<code>...<\/code>) that the attribute is required, provide a description, and include examples of how the date should be formatted.<\/p>\n<pre><code>\r\nfrom pydantic import Field\r\n\r\nfrom semantic_kernel.kernel_pydantic import KernelBaseModel\r\n\r\n\r\nclass ComplexRequest(KernelBaseModel):\r\n    start_date: str = Field(\r\n        ...,\r\n        description=\"The start date in ISO 8601 format\",\r\n        examples=[\"2023-01-01\", \"2024-05-20\"],\r\n    )\r\n    end_date: str = Field(\r\n        ...,\r\n        description=\"The end date in ISO-8601 format\",\r\n        examples=[\"2023-01-01\", \"2024-05-20\"],\r\n    )\r\n<\/code><\/pre>\n<p>Next, we define a new plugin called <code>ComplexTypePlugin<\/code>. It contains one kernel function, which the decorator registers under the name <code>answer_request<\/code> (the underlying Python method is called <code>book_holiday<\/code>, but the model only sees the decorator&#8217;s name). As we discussed earlier, the <code>kernel_function<\/code> decorator&#8217;s description provides high-level context to the model. The important input parameter for this kernel function is <code>request<\/code> of type <code>ComplexRequest<\/code>. When we parse the function&#8217;s input parameters, we store this object type in <code>KernelParameterMetadata<\/code> so that when building the JSON schema, we can properly recurse into the <code>ComplexRequest<\/code> class and grab the <code>start_date<\/code> and <code>end_date<\/code> attributes. 
These attributes could even be complex types themselves, but for this example, we don&#8217;t need to go that far.<\/p>\n<pre><code>\r\nclass ComplexTypePlugin:\r\n    @kernel_function(name=\"answer_request\", description=\"Answer a request\")\r\n    def book_holiday(\r\n        self, request: Annotated[ComplexRequest, \"A request to answer.\"]\r\n    ) -&gt; Annotated[\r\n        bool,\r\n        \"The result is the boolean value True if successful, False if unsuccessful.\",\r\n    ]:\r\n        return True\r\n<\/code><\/pre>\n<p>The great thing about using an LLM orchestrator like Semantic Kernel is that we take on the heavy lifting for you. If you interfaced directly with an LLM, you&#8217;d have to figure out how to structure the payloads, build the schema, and work with the proper content types.<\/p>\n<p>With this kernel function, we build the following schema for you:<\/p>\n<pre><code>\r\n{\r\n    \"type\": \"function\",\r\n    \"function\": {\r\n        \"name\": \"complex-answer_request\",\r\n        \"description\": \"Answer a request\",\r\n        \"parameters\": {\r\n            \"type\": \"object\",\r\n            \"properties\": {\r\n                \"request\": {\r\n                    \"type\": \"object\",\r\n                    \"properties\": {\r\n                        \"start_date\": {\r\n                            \"type\": \"string\",\r\n                            \"description\": \"The start date in ISO 8601 format\"\r\n                        },\r\n                        \"end_date\": {\r\n                            \"type\": \"string\",\r\n                            \"description\": \"The end date in ISO-8601 format\"\r\n                        }\r\n                    },\r\n                    \"required\": [\r\n                        \"start_date\",\r\n                        \"end_date\"\r\n                    ],\r\n                    \"description\": \"A request to answer.\"\r\n                }\r\n            },\r\n            
\"required\": [\r\n                \"request\"\r\n            ]\r\n        }\r\n    }\r\n}\r\n<\/code><\/pre>\n<p>Similar to our <code>add_numbers<\/code> function, the JSON schema includes our <code>request<\/code> input parameter, and we dug into its type to identify that we need a <code>start_date<\/code> string and an <code>end_date<\/code> string. Both of these attributes are required, as shown in the schema (and by the fact that we didn&#8217;t add default values in the method parameters). One of the most useful aspects of function calling is that the model knows which parameters are required for a function and won&#8217;t attempt to perform the tool call until it has everything it needs if it decides to use a provided plugin or function based on the conversation&#8217;s context.<\/p>\n<p>For example, if I add this plugin\/function to a Kernel object, enable function calling, and ask the model:<\/p>\n<pre><code>User: &gt; Answer a request for me.\r\nAssistant: &gt; Certainly! Please provide me with the start date and end date for your request, and I shall endeavor to assist you with it!\r\n<\/code><\/pre>\n<p>I can then respond with:<\/p>\n<pre><code>User: &gt; The start date is Feb 10, 2023 to Mar 10, 2024.\r\nAssistant: &gt; The request has been successfully answered with a resounding \"True\"! If there are any further inquiries or additional assistance you seek, do not hesitate to ask!\r\n<\/code><\/pre>\n<p>You\u2019ll notice that I didn\u2019t need to format my exact dates in ISO-8601 format. We can tell the model how to format the dates, and it will follow along to the best of its ability. 
The following dates are what the model returned as part of the function call arguments:<\/p>\n<pre><code>\r\nrequest.start_date = '2023-02-10'\r\nrequest.end_date = '2024-03-10'\r\n<\/code><\/pre>\n<p>Then, our plugin code that expects the ISO-8601 format can proceed without a hitch.<\/p>\n<h3>Conclusion<\/h3>\n<p>I wanted to spend some time demystifying what goes on in the Semantic Kernel SDK related to function calling, building the JSON schema, and how to communicate required parameters and their types to the model. To sum up:<\/p>\n<ol>\n<li>We can provide string descriptions via the <code>kernel_function<\/code> decorator description parameter and input parameter annotations to help give the model more context about our function or parameters.<\/li>\n<li>We can define complex objects that use underlying complex type objects, along with annotations or Pydantic models\/Fields to give further information and context to the model, which can aid it in choosing the correct function to invoke to complete the user\u2019s query.<\/li>\n<\/ol>\n<p>For complete implementations around handling auto function calling in Python, please see our code samples <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/tree\/main\/python\/samples\/concepts\/auto_function_calling\">here<\/a>.<\/p>\n<p>As we continue bringing you new features in Semantic Kernel, we want to highlight that we are tracking an <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/issues\/7946\" target=\"_blank\" rel=\"noopener\">issue<\/a> related to adding support for OpenAI\u2019s structured outputs. If you&#8217;re looking for a challenge on the Python side and want to help us out with this, please feel free to comment on the open issue we\u2019re tracking or create a <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/discussions\" target=\"_blank\" rel=\"noopener\">GitHub discussion<\/a> to let us know you&#8217;d like to assist with the integration. 
We\u2019ll be happy to work with you.<\/p>\n<p>Thank you for your time, and we welcome any feedback related to this post or Semantic Kernel in general.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the most exciting features available in certain Large Language Models (LLMs) is function-calling. In Semantic Kernel, we handle the heavy lifting so that you can bring your own code or utilize built-in plugins that cater to your use case. Our goal is to make it easy for you to incorporate function calling into [&hellip;]<\/p>\n","protected":false},"author":150043,"featured_media":2370,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[33,1],"tags":[92,93,67,53,9],"class_list":["post-3227","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python","category-semantic-kernel","tag-function-calling","tag-json-schema","tag-plugin","tag-python","tag-semantic-kernel"],"acf":[],"blog_post_summary":"<p>One of the most exciting features available in certain Large Language Models (LLMs) is function-calling. In Semantic Kernel, we handle the heavy lifting so that you can bring your own code or utilize built-in plugins that cater to your use case. 
Our goal is to make it easy for you to incorporate function calling into [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/3227","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/users\/150043"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/comments?post=3227"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/3227\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media\/2370"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media?parent=3227"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/categories?post=3227"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/tags?post=3227"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}