Enhanced Automation in Python: Auto Tool Calling for OpenAI Models in the Semantic Kernel SDK

Evan Mattson

Eduard van Valkenburg

Greetings, Semantic Kernel Python developers and enthusiasts! We’re happy to share a significant update to the Semantic Kernel Python SDK now available in 0.9.1b1 — a leap towards more efficient and streamlined OpenAI model integration. Your feedback, the need to align with the .NET Semantic Kernel SDK, and our commitment to enhancing developer experiences have guided this feature. Let’s dive into what’s new and how it can transform your workflow.

The Initial Challenge: Manual Tool Integration

Before this feature, integrating OpenAI models and using tool calls in your projects required a hands-on approach: developers had to manage tool calls themselves, process the results, and send them back to the model in the right shape. Effective but time-consuming, this workflow often diverted attention from more strategic work. The logic also lived in a separate utility method that was easy to miss, so there was a lot of hand-holding involved in getting from the prompt function to its final function result.

Our Solution: Streamlined Auto Tool Calling

Aiming to refine this process, we’ve introduced automatic tool calling. This feature is born from our aspiration to simplify interactions with OpenAI models, making them more intuitive and less labor-intensive. By integrating the kernel directly with our OpenAIChatCompletionBase, we’ve enabled the SDK to automatically manage tool calls initiated by OpenAI models. This integration reduces manual overhead and also adds flexibility, catering to both automated and manual workflows. Note that the OpenAI API requires tool call responses to be formed in a particular order: you must respond to every tool call ID returned by the model with the tool_call_id, the function name, and a result. Additionally, if auto invoke tool calling is enabled and the number of responses is configured to be greater than one, we log a warning and reset the number of responses to one.
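For context, here is a minimal sketch of the message ordering the raw OpenAI chat completions API expects when tool calls are returned (illustrative IDs, names, and arguments only; with auto invocation enabled, the SDK builds these messages for you):

# Sketch of the raw OpenAI message ordering when the model requests a tool call.
# The IDs, names, and arguments below are illustrative.
messages = [
    {"role": "user", "content": "What is 3 + 3?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_abc123",  # hypothetical ID returned by the model
                "type": "function",
                "function": {"name": "math-Add", "arguments": '{"input": 3, "amount": 3}'},
            }
        ],
    },
    {
        # Every tool_call ID above must be answered with a tool-role message that
        # echoes the ID and carries the function result.
        "role": "tool",
        "tool_call_id": "call_abc123",
        "name": "math-Add",
        "content": "6",
    },
]

With that protocol in mind, the end-to-end flow with the SDK looks like this: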

  1. Define the AzureOpenAI/OpenAI Chat Service.
  2. While configuring the prompt execution settings, specify that tools are available by utilizing the utility function get_tool_call_object that is part of semantic_kernel.connectors.ai.open_ai.utils. You can also specify tool_choice on the settings to be auto or to be a specific tool only.
  3. To enable automatic invocation of tool calls, configure the prompt execution settings by setting auto_invoke_kernel_functions=True and specifying your desired number of attempts in max_auto_invoke_attempts.
  4. Add any required plugins to the kernel that the LLM may utilize to complete the prompt/query.
  5. Invoke the prompt function with kernel.invoke.
  6. The LLM responds with a tool_calls finish reason, which means the chat completion contains tool calls that need to be handled. We construct a ToolCall object from the required data: the tool call ID, the function name, and its arguments.
  7. With auto tool calling enabled, we loop through the requested tool calls and use the kernel to invoke each function with the supplied arguments. The result of each function call is added to the chat history as a tool message.
  8. If you disable auto tool calling, you must handle the tool calls the model returns yourself and send the responses back in the correct order, as the referenced kernel example (and the sketch following this list) shows.
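Here is a minimal sketch of step 8 (manual handling), assuming auto_invoke_kernel_functions=False. The attribute and helper names below (tool_calls, function.name, function.arguments, add_tool_message, plugin indexing) mirror the OpenAI response shape and common SDK patterns, but treat them as assumptions for your installed version rather than the SDK’s exact API:

import json

# Hedged sketch: manually service the tool calls the model returned when auto
# invocation is disabled. `response_message` stands in for the OpenAIChatMessageContent
# returned by the prompt function; attribute names are assumptions for this SDK version.
async def handle_tool_calls_manually(kernel, history, response_message):
    for tool_call in response_message.tool_calls:
        # Semantic Kernel advertises functions to the model as "<plugin>-<function>".
        plugin_name, function_name = tool_call.function.name.split("-", maxsplit=1)
        arguments = json.loads(tool_call.function.arguments or "{}")

        # Run the requested kernel function with the model-supplied arguments.
        result = await kernel.invoke(kernel.plugins[plugin_name][function_name], **arguments)

        # Every tool call ID must be answered, in order, with a tool-role message that
        # echoes the ID and carries the result, before the model is invoked again.
        history.add_tool_message(content=str(result), metadata={"tool_call_id": tool_call.id})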

Example Usage

We’ve included a kernel example that showcases this new functionality. It handles both streaming and non-streaming auto tool calls, and it includes helper code showing how to handle tool calls when auto invocation is disabled (via the prompt execution settings). By default, the prompt execution settings have auto invoke tool calls disabled, so keep that in mind when configuring your OpenAIChatPromptExecutionSettings/AzureOpenAIChatPromptExecutionSettings. The code below is a slimmed-down version of that kernel example.

# Imports (module paths as of the 0.9.x Python SDK; adjust if your installed version differs).
import asyncio

import semantic_kernel as sk
import semantic_kernel.connectors.ai.open_ai as sk_oai
from semantic_kernel.connectors.ai.open_ai.contents import OpenAIChatMessageContent
from semantic_kernel.connectors.ai.open_ai.utils import get_tool_call_object
from semantic_kernel.contents import ChatHistory
from semantic_kernel.core_plugins import MathPlugin, TimePlugin
from semantic_kernel.functions import KernelArguments
from semantic_kernel.prompt_template.input_variable import InputVariable

kernel = sk.Kernel()

# Note: the underlying gpt-35/gpt-4 model version needs to be at least version 0613 to support tools.
api_key, org_id = sk.openai_settings_from_dot_env()
kernel.add_service(
    sk_oai.OpenAIChatCompletion(
        service_id="chat",
        ai_model_id="gpt-3.5-turbo-1106",
        api_key=api_key,
    ),
)

kernel.import_plugin_from_object(MathPlugin(), plugin_name="math")
kernel.import_plugin_from_object(TimePlugin(), plugin_name="time")

# Note: the number of responses for auto invoking tool calls is limited to 1.
# If configured to be greater than one, this value will be overridden to 1.
execution_settings = sk_oai.OpenAIChatPromptExecutionSettings(
    service_id="chat",
    max_tokens=2000,
    temperature=0.7,
    top_p=0.8,
    tool_choice="auto",
    tools=get_tool_call_object(kernel, {"exclude_plugin": ["ChatBot"]}),
    auto_invoke_kernel_functions=True,
    max_auto_invoke_attempts=3,
)

prompt_template_config = sk.PromptTemplateConfig(
    template="{{$chat_history}}{{$user_input}}",
    name="chat",
    template_format="semantic-kernel",
    input_variables=[
        InputVariable(name="user_input", description="The user input", is_required=True),
        InputVariable(name="chat_history", description="The history of the conversation", is_required=True),
    ],
    execution_settings={"chat": execution_settings},
)

# A short system prompt for the bot (the full kernel example uses a longer persona).
system_message = "You are a chat bot named Mosscap. You help people figure out what they need."

history = ChatHistory()

history.add_system_message(system_message)
history.add_user_message("Hi there, who are you?")
history.add_assistant_message("I am Mosscap, a chat bot. I'm trying to figure out what people need.")

chat_function = kernel.create_function_from_prompt(
    prompt_template_config=prompt_template_config,
    plugin_name="ChatBot",
    function_name="Chat",
)


async def chat() -> bool:
    try:
        user_input = input("User:> ")
    except (KeyboardInterrupt, EOFError):
        print("\n\nExiting chat...")
        return False

    if user_input == "exit":
        print("\n\nExiting chat...")
        return False

    result = await kernel.invoke(chat_function, user_input=user_input, chat_history=history)

    # If tools are used, and auto invoke tool calls is False, the response will be of type
    # OpenAIChatMessageContent with information about the tool calls, which need to be sent
    # back to the model to get the final response.
    if not execution_settings.auto_invoke_kernel_functions and isinstance(
        result.value[0], OpenAIChatMessageContent
    ):
        # print_tool_calls is a helper defined in the full kernel example; it prints the
        # tool calls the model requested so you can service them manually.
        print_tool_calls(result.value[0])
        return True

    print(f"Mosscap:> {result}")
    return True


async def main() -> None:
    chatting = True
    print(
        "Welcome to the chat bot!\n"
        "  Type 'exit' to exit.\n"
        "  Try a math question to see function calling in action (e.g. what is 3+3?)."
    )
    while chatting:
        chatting = await chat()


if __name__ == "__main__":
    asyncio.run(main())

Why This Matters

  1. Enhanced Productivity: Automating routine tasks allows developers to concentrate on innovation and complex problem-solving.
  2. Customizable Interactions: Whether you prefer the SDK to handle tool calls automatically or you opt for manual input, the choice is yours.
  3. Universal Compatibility: Designed to support both streaming and non-streaming chat completions, this feature is versatile and adaptable to various use cases (a minimal streaming sketch follows this list).
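As a hedged illustration of the streaming path, the following sketch reuses the kernel, chat_function, history, and execution settings from the example above, and assumes kernel.invoke_stream is available in your SDK version:

# Hedged sketch: streaming the response while auto tool calling is enabled.
async def chat_streaming(user_input: str) -> None:
    print("Mosscap:> ", end="")
    async for chunk in kernel.invoke_stream(chat_function, user_input=user_input, chat_history=history):
        # Each chunk is a list of streaming content items; print the text as it arrives.
        print(str(chunk[0]), end="", flush=True)
    print()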

Moving Forward

We’re excited to introduce this update, as it opens the door to using OpenAI models in more advanced ways. It also lets us move forward with our plans to support the FunctionCallingStepwisePlanner, which already exists in the .NET SDK.

We value your input and encourage you to share your experiences, suggestions, and stories of how you’re using the Semantic Kernel SDK. Your insights are crucial as we continue to evolve and enhance our offerings. As always, you can find our Python public backlog here, as well as our Python roadmap blog here.
