{"id":3435,"date":"2024-09-26T10:53:36","date_gmt":"2024-09-26T17:53:36","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/semantic-kernel\/?p=3435"},"modified":"2025-02-06T16:53:11","modified_gmt":"2025-02-07T00:53:11","slug":"using-json-schema-for-structured-output-in-python-for-openai-models","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/agent-framework\/using-json-schema-for-structured-output-in-python-for-openai-models\/","title":{"rendered":"Using JSON Schema for Structured Output in Python for OpenAI Models"},"content":{"rendered":"<p>In working with AI applications, ensuring that the output generated by a language model is structured and follows a consistent format is critical\u2014especially when handling complex tasks like solving math problems. A powerful way to achieve this is through the use of JSON Schema, which allows the AI model to produce outputs that align with a predefined structure, making them easy to parse and use in downstream applications.<\/p>\n<p>In this post, we will explore how to produce JSON Schema-based structured output with <strong>Semantic Kernel<\/strong>, which added support for this feature in version 1.10.0. Specifically, we\u2019ll look at how to guide an AI-powered math tutor to provide step-by-step solutions to math problems with structured output.<\/p>\n<p>For more information on structured outputs with OpenAI, visit their official guide: <a href=\"https:\/\/platform.openai.com\/docs\/guides\/structured-outputs\/introduction\" target=\"_blank\" rel=\"noopener\">OpenAI Structured Outputs Guide<\/a>.<\/p>\n<h2>Why JSON Schema?<\/h2>\n<p>When interacting with AI models, especially in scenarios where consistency, clarity, and accuracy are important (such as tutoring or solving complex problems), the output must be predictable. Constraining the model with a JSON Schema ensures that responses are well-structured, follow a specified format, and can be easily parsed by the system. 
This structure is key when building applications that rely on a specific format for further processing.<\/p>\n<h2>The Problem Scenario<\/h2>\n<p>Let\u2019s assume we want to create an AI math tutor that helps users solve algebraic equations step by step. The model will not only compute the final answer but will also guide the user through the entire reasoning process. To ensure that the AI returns structured output, we will define a JSON Schema, which the AI will use to deliver the result.<\/p>\n<h3>Python-Based Approach with Semantic Kernel<\/h3>\n<pre><code>import asyncio\r\nfrom semantic_kernel import Kernel\r\nfrom semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior\r\nfrom semantic_kernel.connectors.ai.open_ai.services.azure_chat_completion import AzureChatCompletion\r\nfrom semantic_kernel.connectors.ai.open_ai.services.open_ai_chat_completion import OpenAIChatCompletion\r\nfrom semantic_kernel.contents import ChatHistory\r\nfrom semantic_kernel.contents.streaming_chat_message_content import StreamingChatMessageContent\r\n<\/code><\/pre>\n<p>In this setup, we import the necessary components for the Semantic Kernel framework. The AI service will interact with the OpenAI or Azure OpenAI completion API to generate step-by-step solutions.<\/p>\n<h2>Defining the JSON Schema Using Pydantic<\/h2>\n<p>In this example, we will use the <strong>Pydantic<\/strong> library, which allows for easy definition of data models that can be used to validate and serialize the structured output.<\/p>\n<pre><code>from semantic_kernel.kernel_pydantic import KernelBaseModel\r\n\r\nclass Step(KernelBaseModel):\r\n    explanation: str\r\n    output: str\r\n\r\nclass Reasoning(KernelBaseModel):\r\n    steps: list[Step]\r\n    final_answer: str\r\n<\/code><\/pre>\n<p>This defines a <strong>Step<\/strong> model where each step contains an explanation and an output. The <strong>Reasoning<\/strong> model aggregates these steps and provides a final answer. 
The JSON Schema based on this Pydantic model will structure the response returned by the LLM.<\/p>\n<h2>Non-Pydantic Model Option<\/h2>\n<p>If Pydantic isn\u2019t a requirement for your application, you can opt to use a non-Pydantic approach to define the structured output. Here\u2019s an example:<\/p>\n<pre><code>\r\nclass Step:\r\n    explanation: str\r\n    output: str\r\n\r\nclass Reasoning:\r\n    steps: list[Step]\r\n    final_answer: str\r\n<\/code><\/pre>\n<p>This non-Pydantic version provides flexibility if you prefer not to rely on the Pydantic library for schema validation and serialization.<\/p>\n<h2>Note: Supported Models for Structured Outputs<\/h2>\n<h3>Azure OpenAI:<\/h3>\n<ul>\n<li>Access to <code>gpt-4o-2024-08-06<\/code> or later<\/li>\n<li>The <code>2024-08-01-preview<\/code> API version<\/li>\n<li>See more information <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/openai\/how-to\/structured-outputs?tabs=python-secure\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/li>\n<\/ul>\n<h3>OpenAI:<\/h3>\n<ul>\n<li>The OpenAI models supported are:\n<ul>\n<li><code>gpt-4o-mini-2024-07-18<\/code> and later<\/li>\n<li><code>gpt-4o-2024-08-06<\/code> and later<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h2>Connecting to OpenAI<\/h2>\n<p>With the model in place, the next step is to set up the connection to OpenAI or Azure OpenAI services. 
Depending on the service being used, you can flip a flag to specify whether to use Azure OpenAI or standard OpenAI services:<\/p>\n<pre><code>use_azure_openai = False\r\nkernel = Kernel()\r\n\r\nservice_id = \"structured-output\"\r\n# With no explicit arguments, the connector reads its API key and model settings from the environment.\r\nif use_azure_openai:\r\n    chat_service = AzureChatCompletion(service_id=service_id)\r\nelse:\r\n    chat_service = OpenAIChatCompletion(service_id=service_id)\r\n\r\nkernel.add_service(chat_service)\r\n<\/code><\/pre>\n<h2>Configuring the Prompt and Settings<\/h2>\n<p>We then define a system message that tells the AI how to behave, followed by settings that control the format of the output:<\/p>\n<pre><code>system_message = \"\"\"\r\nYou are a helpful math tutor. Guide the user through the solution step by step.\r\n\"\"\"\r\n\r\nreq_settings = kernel.get_prompt_execution_settings_from_service_id(service_id=service_id)\r\nreq_settings.max_tokens = 2000\r\nreq_settings.temperature = 0.7\r\nreq_settings.top_p = 0.8\r\nreq_settings.function_choice_behavior = FunctionChoiceBehavior.Auto(filters={\"excluded_plugins\": [\"chat\"]})\r\nreq_settings.response_format = Reasoning\r\n\r\n# Register the prompt as a kernel function; the run step below invokes it as chat_function.\r\nchat_function = kernel.add_function(\r\n    prompt=system_message + \"\"\"{{$chat_history}}\"\"\",\r\n    function_name=\"chat\",\r\n    plugin_name=\"chat\",\r\n    prompt_execution_settings=req_settings,\r\n)\r\n<\/code><\/pre>\n<p>Here, we tell the AI to behave as a math tutor and specify that the response must follow the <strong>Reasoning<\/strong> JSON Schema format. We also register the prompt as a kernel function named <code>chat<\/code>, which the next step invokes as <code>chat_function<\/code>.<\/p>\n<h2>Adding User Input and Running the Model<\/h2>\n<p>Finally, we simulate a user asking a math question. 
The AI will process the query and guide the user through the solution step by step:<\/p>\n<pre><code>history = ChatHistory()\r\nhistory.add_user_message(\"how can I solve 8x + 7y = -23, and 4x=12?\")\r\n\r\nasync def main():\r\n    stream = True\r\n    if stream:\r\n        # Stream the response and print each chunk as it arrives.\r\n        answer = kernel.invoke_stream(\r\n            chat_function,\r\n            chat_history=history,\r\n        )\r\n        print(\"AI Tutor:&gt; \", end=\"\")\r\n        result_content: list[StreamingChatMessageContent] = []\r\n        async for message in answer:\r\n            result_content.append(message[0])\r\n            print(str(message[0]), end=\"\", flush=True)\r\n        # Join the chunks into the full JSON string (empty if nothing was streamed).\r\n        result = \"\".join(str(content) for content in result_content)\r\n    else:\r\n        result = await kernel.invoke(\r\n            chat_function,\r\n            chat_history=history,\r\n        )\r\n        print(f\"AI Tutor:&gt; {result}\")\r\n    history.add_assistant_message(str(result))\r\n\r\nif __name__ == \"__main__\":\r\n    asyncio.run(main())\r\n<\/code><\/pre>\n<h2>Output Example<\/h2>\n<p>A typical response looks like this (the exact wording varies between runs):<\/p>\n<pre><code>\r\n{\r\n    \"steps\": [\r\n        {\r\n            \"explanation\": \"We start with the second equation because it has only one variable, which makes it simpler to solve. The equation is 4x = 12. To find the value of x, divide both sides of the equation by 4.\",\r\n            \"output\": \"x = 3\"\r\n        },\r\n        {\r\n            \"explanation\": \"Now that we know x = 3, we can substitute this value into the first equation to find the value of y. The first equation is 8x + 7y = -23. Substitute x = 3 into this equation.\",\r\n            \"output\": \"8(3) + 7y = -23\"\r\n        },\r\n        {\r\n            \"explanation\": \"Calculate 8 times 3 to simplify the equation. 
This gives us 24 + 7y = -23.\",\r\n            \"output\": \"24 + 7y = -23\"\r\n        },\r\n        {\r\n            \"explanation\": \"To isolate 7y, subtract 24 from both sides of the equation. This gives us 7y = -23 - 24.\",\r\n            \"output\": \"7y = -47\"\r\n        },\r\n        {\r\n            \"explanation\": \"Now, divide both sides of the equation by 7 to solve for y.\",\r\n            \"output\": \"y = -47\/7\"\r\n        },\r\n        {\r\n            \"explanation\": \"Simplify the fraction -47\/7 if possible. In this case, it cannot be simplified further, so we leave it as is.\",\r\n            \"output\": \"y = -47\/7\"\r\n        }\r\n    ],\r\n    \"final_answer\": \"x = 3, y = -47\/7\"\r\n}\r\n<\/code><\/pre>\n<p>For the full implementation, you can view the source code on GitHub: <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/blob\/main\/python\/samples\/concepts\/structured_outputs\/json_structured_outputs.py\" target=\"_blank\" rel=\"noopener\">GitHub Repository<\/a>.<\/p>\n<h2>Conclusion<\/h2>\n<p>Using JSON Schema and a framework like Semantic Kernel allows you to control the format of AI-generated responses, ensuring that the output is structured, predictable, and easy to use. Whether you use a Pydantic or non-Pydantic model, this approach is particularly useful for applications that require consistent and well-structured output, such as educational tools or automated systems.<\/p>\n<p>Please reach out if you have any questions or feedback through our\u00a0<a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/discussions\/categories\/general\" target=\"_blank\" rel=\"noopener\">Semantic Kernel GitHub Discussion Channel<\/a>. We look forward to hearing from you! 
We would also love your support &#8212; if you\u2019ve enjoyed using Semantic Kernel, give us a star on <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\" target=\"_blank\" rel=\"noopener\">GitHub<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In working with AI applications, ensuring that the output generated by a language model is structured and follows a consistent format is critical\u2014especially when handling complex tasks like solving math problems. A powerful way to achieve this is through the use of JSON Schema, which allows the AI model to produce outputs that align with [&hellip;]<\/p>\n","protected":false},"author":150043,"featured_media":2364,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[34,1],"tags":[93,53,9,99],"class_list":["post-3435","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python-2","category-semantic-kernel","tag-json-schema","tag-python","tag-semantic-kernel","tag-structured-outputs"],"acf":[],"blog_post_summary":"<p>In working with AI applications, ensuring that the output generated by a language model is structured and follows a consistent format is critical\u2014especially when handling complex tasks like solving math problems. 
A powerful way to achieve this is through the use of JSON Schema, which allows the AI model to produce outputs that align with [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/3435","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/users\/150043"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/comments?post=3435"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/3435\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media\/2364"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media?parent=3435"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/categories?post=3435"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/tags?post=3435"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}