September 26th, 2024

Using JSON Schema for Structured Output in Python for OpenAI Models

Evan Mattson
Senior Software Engineer

When building AI applications, it is critical that the output generated by a language model is structured and follows a consistent format, especially for complex tasks like solving math problems. JSON Schema is a powerful way to achieve this: it lets the model produce output that matches a predefined structure, which makes the output easy to parse and use in downstream applications.

In this post, we will explore how to produce JSON Schema-based structured output with Semantic Kernel. Support for this feature was introduced in version 1.10.0. Specifically, we’ll look at how to guide an AI-powered math tutor to provide step-by-step solutions to math problems with structured output.

For more information on structured outputs with OpenAI, visit their official guide: OpenAI Structured Outputs Guide.

Why JSON Schema?

When interacting with AI models, especially in scenarios where consistency, clarity, and accuracy are important (such as tutoring or solving complex problems), the output must be predictable. JSON Schema ensures that responses are well-structured, follow a specified format, and can be easily parsed by the system. This structure is key when building applications that rely on a specific format for further processing.

The Problem Scenario

Let’s assume we want to create an AI math tutor that helps users solve algebraic equations step by step. The model will not only compute the final answer but will also guide the user through the entire reasoning process. To ensure that the AI returns structured output, we will define a JSON Schema, which the AI will use to deliver the result.

Python-Based Approach with Semantic Kernel

import asyncio
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai.services.azure_chat_completion import AzureChatCompletion
from semantic_kernel.connectors.ai.open_ai.services.open_ai_chat_completion import OpenAIChatCompletion
from semantic_kernel.contents import ChatHistory
from semantic_kernel.contents.streaming_chat_message_content import StreamingChatMessageContent

In this setup, we import the necessary components from the Semantic Kernel framework. The AI service will call the OpenAI or Azure OpenAI chat completion API to generate step-by-step solutions.

Defining the JSON Schema Using Pydantic

In this example, we will use the Pydantic library, which allows for easy definition of data models that can be used to validate and serialize the structured output.

from semantic_kernel.kernel_pydantic import KernelBaseModel

class Step(KernelBaseModel):
    explanation: str
    output: str

class Reasoning(KernelBaseModel):
    steps: list[Step]
    final_answer: str

This defines a Step model where each step contains an explanation and an output. The Reasoning model aggregates these steps and provides a final answer. The JSON Schema based on this Pydantic model will structure the response returned by the LLM.

Non-Pydantic Model Option

If Pydantic isn’t a requirement for your application, you can opt to use a non-Pydantic approach to define the structured output. Here’s an example:


class Step:
    explanation: str
    output: str

class Reasoning:
    steps: list[Step]
    final_answer: str

This non-Pydantic version provides flexibility if you prefer not to rely on the Pydantic library for schema validation and serialization.
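Plain annotated classes like these still carry enough type information to describe a schema. The following is a minimal sketch of that idea using only the standard library. The to_json_schema helper is hypothetical and is not how Semantic Kernel is confirmed to build its schema; it handles only str fields, lists, and nested annotated classes.

```python
from typing import get_args, get_origin, get_type_hints


class Step:
    explanation: str
    output: str


class Reasoning:
    steps: list[Step]
    final_answer: str


# Hypothetical helper (not part of Semantic Kernel): maps a small subset
# of Python annotations onto JSON Schema fragments.
def to_json_schema(tp) -> dict:
    if tp is str:
        return {"type": "string"}
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return {"type": "array", "items": to_json_schema(item_type)}
    # Anything else is treated as a plain class with annotated fields.
    hints = get_type_hints(tp)
    return {
        "type": "object",
        "properties": {name: to_json_schema(hint) for name, hint in hints.items()},
        "required": list(hints),
        "additionalProperties": False,
    }


schema = to_json_schema(Reasoning)
```

The resulting dictionary describes an object with a steps array of Step objects and a final_answer string, which is the same shape the Pydantic models express.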

Note: Supported Models for Structured Outputs

Azure OpenAI:

  • Access to gpt-4o-2024-08-06 or later
  • The 2024-08-01-preview API version
  • See more information here.

OpenAI:

  • The OpenAI models supported are:
    • gpt-4o-mini-2024-07-18 and later
    • gpt-4o-2024-08-06 and later

Connecting to OpenAI

With the data models in place, the next step is to set up the connection to OpenAI or Azure OpenAI. A flag selects which of the two services to use:

# Set to True to use Azure OpenAI instead of OpenAI.
use_azure_openai = False
kernel = Kernel()

# When no credentials are passed explicitly, both services read their
# settings (API key, model or deployment name) from environment
# variables or a .env file.
service_id = "structured-output"
if use_azure_openai:
    chat_service = AzureChatCompletion(service_id=service_id)
else:
    chat_service = OpenAIChatCompletion(service_id=service_id)

kernel.add_service(chat_service)

Configuring the Prompt and Settings

We then define a system message that tells the AI how to behave, followed by settings that control the format of the output:

system_message = """
You are a helpful math tutor. Guide the user through the solution step by step.
"""

req_settings = kernel.get_prompt_execution_settings_from_service_id(service_id=service_id)
req_settings.max_tokens = 2000
req_settings.temperature = 0.7
req_settings.top_p = 0.8
req_settings.function_choice_behavior = FunctionChoiceBehavior.Auto(filters={"excluded_plugins": ["chat"]})
req_settings.response_format = Reasoning

Here, we tell the AI to behave as a math tutor and specify that the response must follow the Reasoning JSON Schema format.
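To make the effect of response_format more concrete, here is a rough sketch of the kind of payload OpenAI's structured outputs API accepts in its response_format field. The schema is written out by hand, and the name "Reasoning" inside the payload is an assumption for illustration; the exact request that Semantic Kernel sends may differ.

```python
import json

# Hand-written JSON Schema for a single step (illustrative).
step_schema = {
    "type": "object",
    "properties": {
        "explanation": {"type": "string"},
        "output": {"type": "string"},
    },
    "required": ["explanation", "output"],
    "additionalProperties": False,
}

# Hand-written JSON Schema for the whole response (illustrative).
reasoning_schema = {
    "type": "object",
    "properties": {
        "steps": {"type": "array", "items": step_schema},
        "final_answer": {"type": "string"},
    },
    "required": ["steps", "final_answer"],
    "additionalProperties": False,
}

# Shape of the response_format field in OpenAI's structured outputs API.
# The "name" value is an assumed label, not taken from the post.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "Reasoning",
        "strict": True,
        "schema": reasoning_schema,
    },
}

print(json.dumps(response_format, indent=2))
```

In strict mode, every property is listed as required and additionalProperties is false, which is why the model's reply can be relied on to contain exactly these fields.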

Adding User Input and Running the Model

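Finally, we register the prompt as a kernel function and simulate a user asking a math question. The AI will process the query and guide the user through the solution step by step:

# Register the prompt as a "chat" function so it can be invoked below.
# The plugin name matches the excluded_plugins filter set on req_settings.
# (Assumed definition; see the full sample on GitHub for the exact code.)
chat_function = kernel.add_function(
    prompt=system_message + """{{$chat_history}}""",
    function_name="chat",
    plugin_name="chat",
    prompt_execution_settings=req_settings,
)

history = ChatHistory()
history.add_user_message("how can I solve 8x + 7y = -23, and 4x=12?")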

async def main():
    stream = True
    if stream:
        answer = kernel.invoke_stream(
            chat_function,
            chat_history=history,
        )
        print("AI Tutor:> ", end="")
        result_content: list[StreamingChatMessageContent] = []
        async for message in answer:
            result_content.append(message[0])
            print(str(message[0]), end="", flush=True)
        print()  # end the streamed line
        # Join the chunks; an empty stream yields "" so result is always bound.
        result = "".join(str(content) for content in result_content)
    else:
        result = await kernel.invoke(
            chat_function,
            chat_history=history,
        )
        print(f"AI Tutor:> {result}")
    history.add_assistant_message(str(result))

if __name__ == "__main__":
    asyncio.run(main())

Output Example


{
    "steps": [
        {
            "explanation": "We start with the second equation because it has only one variable, which makes it simpler to solve. The equation is 4x = 12. To find the value of x, divide both sides of the equation by 4.",
            "output": "x = 3"
        },
        {
            "explanation": "Now that we know x = 3, we can substitute this value into the first equation to find the value of y. The first equation is 8x + 7y = -23. Substitute x = 3 into this equation.",
            "output": "8(3) + 7y = -23"
        },
        {
            "explanation": "Calculate 8 times 3 to simplify the equation. This gives us 24 + 7y = -23.",
            "output": "24 + 7y = -23"
        },
        {
            "explanation": "To isolate 7y, subtract 24 from both sides of the equation. This gives us 7y = -23 - 24.",
            "output": "7y = -47"
        },
        {
            "explanation": "Now, divide both sides of the equation by 7 to solve for y.",
            "output": "y = -47/7"
        },
        {
            "explanation": "Simplify the fraction -47/7 if possible. In this case, it cannot be simplified further, so we leave it as is.",
            "output": "y = -47/7"
        }
    ],
    "final_answer": "x = 3, y = -47/7"
}
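Because the response follows a known structure, downstream code can load it directly into typed objects. The following is a minimal sketch using only the standard library; parse_reasoning is a hypothetical helper, and the raw string is a shortened stand-in for the response above, not the model's verbatim output.

```python
import json
from dataclasses import dataclass


@dataclass
class Step:
    explanation: str
    output: str


@dataclass
class Reasoning:
    steps: list[Step]
    final_answer: str


def parse_reasoning(raw: str) -> Reasoning:
    """Parse a JSON string shaped like the example output (hypothetical helper)."""
    data = json.loads(raw)
    steps = [Step(**step) for step in data["steps"]]
    return Reasoning(steps=steps, final_answer=data["final_answer"])


# Shortened stand-in for the structured response shown above.
raw = """
{
    "steps": [
        {"explanation": "Divide both sides of 4x = 12 by 4.", "output": "x = 3"},
        {"explanation": "Substitute x = 3 into 8x + 7y = -23 and solve for y.", "output": "y = -47/7"}
    ],
    "final_answer": "x = 3, y = -47/7"
}
"""

reasoning = parse_reasoning(raw)
for number, step in enumerate(reasoning.steps, start=1):
    print(f"{number}. {step.output}")
print("Answer:", reasoning.final_answer)
```

With the Pydantic models defined earlier, the same parsing and validation could be done by the model class itself instead of a hand-written helper.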

For the full implementation, you can view the source code on GitHub: GitHub Repository.

Conclusion

Using JSON Schema and a framework like Semantic Kernel allows you to control the format of AI-generated responses, ensuring that the output is structured, predictable, and easy to use. Whether you use a Pydantic or non-Pydantic model, this approach is particularly useful for applications that require consistent and well-structured output, such as educational tools or automated systems.

If you have any questions or feedback, please reach out through our Semantic Kernel GitHub Discussion Channel. We look forward to hearing from you! We would also love your support: if you’ve enjoyed using Semantic Kernel, give us a star on GitHub.
