October 1st, 2024

Using JSON Schema for Structured Output in .NET for OpenAI Models

In a previous post, we demonstrated how to use JSON Schema to get structured output with OpenAI and the Python version of Semantic Kernel: Using JSON Schema for Structured Output in Python for OpenAI Models.

In this post, we will explore how to implement JSON Schema-based structured output using the .NET version of Semantic Kernel.

For more information on structured outputs with OpenAI, visit their official guide: OpenAI Structured Outputs Guide.

Why JSON Schema?

When interacting with AI models, especially in scenarios where consistency, clarity, and accuracy are important (such as tutoring or solving complex problems), the output must be predictable. JSON Schema ensures that responses are well-structured, follow a specified format, and can be easily deserialized by the system. This structure is key when building applications that rely on a specific format for further processing.

Supported Models for Structured Outputs

Azure OpenAI:

  • Access to gpt-4o-2024-08-06
  • The 2024-08-01-preview API version
  • See more information here.

OpenAI:

  • gpt-4o-mini-2024-07-18 and later
  • gpt-4o-2024-08-06 and later
  • See more information here.

In this example, we will use the OpenAI gpt-4o-2024-08-06 model.

Structured Outputs with OpenAI.Chat.ChatResponseFormat

One way to provide a JSON Schema to an OpenAI model with the .NET version of Semantic Kernel is to create an OpenAI.Chat.ChatResponseFormat object and assign it to the OpenAIPromptExecutionSettings.ResponseFormat property.

Let’s ask the model for the top 10 movies of all time and specify the JSON Schema of the desired response:

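using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using OpenAI.Chat;

// Initialize kernel.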
Kernel kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "gpt-4o-2024-08-06",
        apiKey: Environment.GetEnvironmentVariable("OpenAI__ApiKey"))
    .Build();

// Initialize ChatResponseFormat object with JSON schema of desired response format.
ChatResponseFormat chatResponseFormat = ChatResponseFormat.CreateJsonSchemaFormat(
    jsonSchemaFormatName: "movie_result",
    jsonSchema: BinaryData.FromString("""
        {
            "type": "object",
            "properties": {
                "Movies": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "Title": { "type": "string" },
                            "Director": { "type": "string" },
                            "ReleaseYear": { "type": "integer" },
                            "Rating": { "type": "number" },
                            "IsAvailableOnStreaming": { "type": "boolean" },
                            "Tags": { "type": "array", "items": { "type": "string" } }
                        },
                        "required": ["Title", "Director", "ReleaseYear", "Rating", "IsAvailableOnStreaming", "Tags"],
                        "additionalProperties": false
                    }
                }
            },
            "required": ["Movies"],
            "additionalProperties": false
        }
        """),
    jsonSchemaIsStrict: true);

// Specify response format by setting ChatResponseFormat object in prompt execution settings.
var executionSettings = new OpenAIPromptExecutionSettings
{
    ResponseFormat = chatResponseFormat
};

// Send a request and pass prompt execution settings with desired response format.
var result = await kernel.InvokePromptAsync("What are the top 10 movies of all time?", new(executionSettings));

Console.WriteLine(result);

In this approach, we define the JSON Schema as a string and pass it to the ChatResponseFormat.CreateJsonSchemaFormat method. We then assign the resulting ChatResponseFormat object to the OpenAIPromptExecutionSettings.ResponseFormat property. The result is a JSON string, which we can write to the console:

Output:

{
    "Movies": [
        {
            "Title": "The Shawshank Redemption",
            "Director": "Frank Darabont",
            "ReleaseYear": 1994,
            "Rating": 9.3,
            "IsAvailableOnStreaming": true,
            "Tags": [
                "Drama"
            ]
        }
        // and more...
    ]
}

Because we provided a JSON Schema for the desired response format with strict mode enabled, every response the model completes follows that format. This means we can define classes that mirror the movie properties and deserialize the response to access each property in code:

using System.Text.Json;

// Define response models
public class MovieResult
{
    public List<Movie> Movies { get; set; }
}

public class Movie
{
    public string Title { get; set; }

    public string Director { get; set; }

    public int ReleaseYear { get; set; }

    public double Rating { get; set; }

    public bool IsAvailableOnStreaming { get; set; }

    public List<string> Tags { get; set; }
}

// Send a request and pass prompt execution settings with desired response format.
var result = await kernel.InvokePromptAsync("What are the top 10 movies of all time?", new(executionSettings));

// Deserialize string response to a strong type to access type properties.
// At this point, the deserialization logic won't fail, because MovieResult type was specified as desired response format.
// This ensures that response string is a serialized version of MovieResult type.
var movieResult = JsonSerializer.Deserialize<MovieResult>(result.ToString());

This approach is flexible: each property and its type are specified in a string, so JSON Schemas can be stored in separate files and reused. The drawback is that the response format has to be described in two places: in the ChatResponseFormat object and in the class used for deserialization. The next section shows how to avoid this duplication.
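As a rough sketch of the separate-file idea mentioned above, the snippet below reads a schema from disk and checks that it is well-formed JSON before it is used. LoadSchema and the file name movie_result.schema.json are hypothetical names chosen for illustration, and the tiny schema written to the temp folder is a stand-in for the full movie schema.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical helper: reads a schema file and fails fast if it is not valid JSON.
static string LoadSchema(string path)
{
    string json = File.ReadAllText(path);
    using JsonDocument doc = JsonDocument.Parse(json); // throws JsonException on malformed JSON
    return json;
}

// Demo setup: write a small placeholder schema to a temp file, then load it back.
string schemaPath = Path.Combine(Path.GetTempPath(), "movie_result.schema.json");
File.WriteAllText(schemaPath, """{ "type": "object", "properties": {}, "additionalProperties": false }""");

string schemaJson = LoadSchema(schemaPath);
Console.WriteLine(schemaJson.Contains("additionalProperties"));

// In the earlier example, the loaded text would replace the inline literal:
//     jsonSchema: BinaryData.FromString(schemaJson)
```

Parsing the file once at load time is a design choice: a malformed schema then surfaces as a local JsonException instead of a rejected request.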

Structured Outputs with System.Type

This approach lets you define the desired response format as a C# class or struct and assign its type directly to OpenAIPromptExecutionSettings.ResponseFormat. Semantic Kernel then generates the JSON Schema automatically. Let’s reuse the MovieResult and Movie classes defined in the previous example:

// Initialize kernel.
Kernel kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "gpt-4o-2024-08-06",
        apiKey: Environment.GetEnvironmentVariable("OpenAI__ApiKey"))
    .Build();

// Specify response format by setting Type object in prompt execution settings.
var executionSettings = new OpenAIPromptExecutionSettings
{
    ResponseFormat = typeof(MovieResult)
};

// Send a request and pass prompt execution settings with desired response format.
var result = await kernel.InvokePromptAsync("What are the top 10 movies of all time?", new(executionSettings));

// Deserialize string response to a strong type to access type properties.
// At this point, the deserialization logic won't fail, because MovieResult type was specified as desired response format.
// This ensures that response string is a serialized version of MovieResult type.
var movieResult = JsonSerializer.Deserialize<MovieResult>(result.ToString());

// Output the result
for (var i = 0; i < movieResult.Movies.Count; i++)
{
    var movie = movieResult.Movies[i];

    Console.WriteLine($"Movie #{i + 1}");
    Console.WriteLine($"Title: {movie.Title}");
    Console.WriteLine($"Director: {movie.Director}");
    Console.WriteLine($"Release year: {movie.ReleaseYear}");
    Console.WriteLine($"Rating: {movie.Rating}");
    Console.WriteLine($"Is available on streaming: {movie.IsAvailableOnStreaming}");
    Console.WriteLine($"Tags: {string.Join(",", movie.Tags)}");
}

Output:

Movie #1
Title: The Shawshank Redemption
Director: Frank Darabont
Release year: 1994
Rating: 9.3
Is available on streaming: True
Tags: Drama
// and more...

With this approach, the desired response format is defined once, as a C# class or struct, and the same type is used to deserialize the model response and work with its properties. Semantic Kernel generates the JSON Schema automatically from the provided type.
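The deserialization step above assumes the model returned a complete response. As a hedged sketch, the snippet below wraps that step so that text which is not valid JSON for MovieResult (for example, a response cut off by a token limit) yields null instead of an exception. TryParseMovies is a hypothetical helper name, not part of Semantic Kernel, and the two JSON strings are hand-written samples rather than real model output.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: returns null instead of throwing when the text is not usable.
static MovieResult? TryParseMovies(string json)
{
    try
    {
        MovieResult? parsed = JsonSerializer.Deserialize<MovieResult>(json);
        return parsed?.Movies is null ? null : parsed;
    }
    catch (JsonException)
    {
        // Malformed or truncated JSON ends up here.
        return null;
    }
}

// Hand-written samples: one complete response and one cut off mid-string.
string complete = """{"Movies":[{"Title":"The Shawshank Redemption","Director":"Frank Darabont","ReleaseYear":1994,"Rating":9.3,"IsAvailableOnStreaming":true,"Tags":["Drama"]}]}""";
string truncated = """{"Movies":[{"Title":"The Shawsh""";

Console.WriteLine(TryParseMovies(complete)?.Movies?.Count); // prints 1
Console.WriteLine(TryParseMovies(truncated) is null);       // prints True

public class MovieResult
{
    public List<Movie>? Movies { get; set; }
}

public class Movie
{
    public string? Title { get; set; }
    public string? Director { get; set; }
    public int ReleaseYear { get; set; }
    public double Rating { get; set; }
    public bool IsAvailableOnStreaming { get; set; }
    public List<string>? Tags { get; set; }
}
```

Whether to return null, throw, or retry the request is an application decision; the sketch only shows where such a check could sit.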

Limitations

Some JSON Schema keywords are not yet supported in OpenAI Structured Outputs. For example, the format keyword for strings is not supported. As a result, the JSON Schema is rejected when the response format type contains properties of type DateTime, DateTimeOffset, DateOnly, TimeSpan, TimeOnly, or Uri. One workaround is to define these properties as strings.
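A minimal sketch of the string workaround follows. MovieRelease, its ReleaseDate property, and the ParseDate helper are hypothetical and do not appear in the examples above. The sketch also assumes the prompt asks the model for dates in yyyy-MM-dd form, since the schema alone no longer constrains the format.

```csharp
#nullable enable
using System;
using System.Globalization;
using System.Text.Json;

// Hypothetical helper: converts the string property to DateOnly after deserialization.
static DateOnly? ParseDate(string text)
{
    if (DateOnly.TryParseExact(text, "yyyy-MM-dd", CultureInfo.InvariantCulture,
            DateTimeStyles.None, out DateOnly date))
    {
        return date;
    }

    return null; // The model used a different format, or the text is not a date.
}

// Hand-written sample of what a response for this type could look like.
string json = """{"Title":"The Shawshank Redemption","ReleaseDate":"1994-09-23"}""";
MovieRelease release = JsonSerializer.Deserialize<MovieRelease>(json)!;

Console.WriteLine(ParseDate(release.ReleaseDate)?.Year);  // prints 1994
Console.WriteLine(ParseDate("sometime in 1994") is null); // prints True

public class MovieRelease
{
    public string Title { get; set; } = "";

    // Declared as string so the generated schema needs no "format" keyword.
    public string ReleaseDate { get; set; } = "";
}
```

Keeping the conversion outside the response type means the type passed as the response format contains only string properties.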

Structured Outputs with Function Calling

The desired response format can also be applied to the final response of a function calling loop. Let’s define a plugin that returns a collection of email bodies:

// Define plugin
public sealed class EmailPlugin
{
    [KernelFunction]
    public List<string> GetEmails()
    {
        return
        [
            "Hey, just checking in to see how you're doing!",
            "Can you pick up some groceries on your way back home? We need milk and bread.",
            "Happy Birthday! Wishing you a fantastic day filled with love and joy.",
            "Let's catch up over coffee this Saturday. It's been too long!",
            "Please review the attached document and provide your feedback by EOD.",
        ];
    }
}

We also need to define the types for the desired response format:

// Define response format types
public sealed class EmailResult
{
    public List<Email> Emails { get; set; }
}

public sealed class Email
{
    public string Body { get; set; }

    public string Category { get; set; }
}

Now we can specify the response format, enable automatic function calling, and send a request:

// Initialize kernel.
Kernel kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "gpt-4o-2024-08-06",
        apiKey: Environment.GetEnvironmentVariable("OpenAI__ApiKey"))
    .Build();

// Import plugin
kernel.ImportPluginFromType<EmailPlugin>();

var executionSettings = new OpenAIPromptExecutionSettings
{
    ResponseFormat = typeof(EmailResult), // Specify response format
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto() // Enable automatic function calling
};

// Send a request and pass prompt execution settings with desired response format.
var result = await kernel.InvokePromptAsync("Process the emails.", new(executionSettings));

// Deserialize string response to a strong type to access type properties.
// At this point, the deserialization logic won't fail, because EmailResult type was specified as desired response format.
// This ensures that response string is a serialized version of EmailResult type.
var emailResult = JsonSerializer.Deserialize<EmailResult>(result.ToString());

for (var i = 0; i < emailResult.Emails.Count; i++)
{
    var email = emailResult.Emails[i];

    Console.WriteLine($"Email #{i + 1}");
    Console.WriteLine($"Body: {email.Body}");
    Console.WriteLine($"Category: {email.Category}");
}

Note that the prompt is as simple as "Process the emails." We don’t specify how exactly the model should process the emails, but the response format asks for two properties on the Email type: Body and Category. In this way, the response format, rather than the user or system prompt, steers the model’s behavior.

Output:

Email #1
Body: Hey, just checking in to see how you're doing!
Category: Personal
Email #2
Body: Can you pick up some groceries on your way back home? We need milk and bread.
Category: Reminder
Email #3
Body: Happy Birthday! Wishing you a fantastic day filled with love and joy.
Category: Personal
Email #4
Body: Let's catch up over coffee this Saturday. It's been too long!
Category: Meeting
Email #5
Body: Please review the attached document and provide your feedback by EOD.
Category: Work
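Once the response is deserialized, it is ordinary C# data. As an illustrative sketch, the snippet below counts emails per category with LINQ. The JSON string is a hand-written sample based on the output above, not a captured model response. The category names are whatever the model chooses, because the Category property is a free-form string.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hand-written sample in the shape of EmailResult.
string json = """
    {"Emails":[
        {"Body":"Hey, just checking in to see how you're doing!","Category":"Personal"},
        {"Body":"Happy Birthday! Wishing you a fantastic day filled with love and joy.","Category":"Personal"},
        {"Body":"Please review the attached document and provide your feedback by EOD.","Category":"Work"}
    ]}
    """;

EmailResult result = JsonSerializer.Deserialize<EmailResult>(json)!;

// Group by the model-assigned category and print a count for each group.
foreach (var group in result.Emails.GroupBy(email => email.Category))
{
    Console.WriteLine($"{group.Key}: {group.Count()}");
}
// prints:
// Personal: 2
// Work: 1

public sealed class EmailResult
{
    public List<Email> Emails { get; set; } = new();
}

public sealed class Email
{
    public string Body { get; set; } = "";
    public string Category { get; set; } = "";
}
```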

Conclusion

Using JSON Schema and a framework like Semantic Kernel allows you to control the format of AI-generated responses, ensuring that the output is structured, predictable, and easy to use. This .NET approach is particularly useful for applications that require consistent and well-structured output, such as educational tools or automated systems.

Please reach out if you have any questions or feedback through our Semantic Kernel GitHub Discussion Channel. We look forward to hearing from you! We would also love your support — if you’ve enjoyed using Semantic Kernel, give us a star on GitHub.

Author

Dmytro Struk
Senior Software Engineer

2 comments


  • Sarosh Wadia

    The example works ok except for the following line:

    Console.WriteLine($"Tags: {string.Join(",", movie.Tags)}");
    The above line simply repeats the word Tags instead of the Tags value (Name)

    this is one of the proposed applicable change:
    Console.WriteLine($"Tags: {string.Join(",", movie.Tags.Select(p => p.Name))}");

    • Dmytro Struk (Microsoft employee) · Edited

      Thanks for your feedback! There was a character escaping issue, instead of “List Tags” the property should be “List<string> Tags”. In this case, the output should work correctly. I’ve updated examples with correct version. Thanks again!