Now in Beta: Explore the Enhanced Python SDK for Semantic Kernel

Evan Mattson

Eduard van Valkenburg

The Semantic Kernel team is excited to announce that we are on the brink of releasing v1.0.0 of our Python SDK. On the journey toward this milestone, the latest beta release (0.9.0b1) introduces essential breaking changes that align the Python SDK’s capabilities with those of our .NET SDK. Our commitment to Pythonic best practices shines through in this update, with enhancements designed to improve readability and simplicity. This release not only brings the Python SDK in sync with its .NET counterpart but also sets the stage for the targeted enhancements detailed below, all aimed at elevating the Python SDK experience for developers.

In the upcoming months, our focus will be on enriching the Python SDK, ensuring it receives the same caliber of feature enhancements and usability improvements introduced to the .NET SDK during our recent push towards v1.0.0. For a closer look at what we have planned for Python, we encourage you to explore our original Python roadmap blog and our active Python backlog.

Key Updates in 0.9.0b1

0.9.0b1 brings several changes. Below are the most important ones, along with links to the relevant PRs. As outlined in our Python roadmap blog, we prioritized the main breaking changes for this release so that internal and external contributors have a stable SDK to build on.

  1. Method Renaming and Alignment: Methods have been renamed for consistency with the .NET SDK, such as the kernel’s run becoming invoke, and _async suffixes have been removed to align with Azure Python SDK practices. PR
  2. Kernel Arguments Overhaul: The introduction of a flexible kernel arguments system, replacing SKContext, allows for more dynamic configurations. PR
  3. Kernel Function Decorator Improvements: The kernel function decorator was improved and function_context_parameter was removed, making way for Python’s annotations (typing.Annotated). Kernel function arguments are now handled via Kernel Arguments. PR
  4. Enhanced Function Results Handling: A new class for handling function results, offering straightforward methods for accessing values and metadata. PR
  5. AI Service Selector: this feature allows for the dynamic resolution of prompt templates to the correct model by providing a default implementation and enabling user-defined custom selectors. PR
  6. Kernel Function Methodology Revamp: Simplification and restructuring of how kernel functions are handled, improving usability. PR
  7. Plugin System Refinement: Transition from “skills” to “plugins” for enhanced clarity and consistency. PR
  8. Config and Template Updates: Major updates to prompt template configuration to match the .NET SDK, supporting more versatile execution settings. PR
  9. Introduction of Chat History Class: This update deprecates older chat management approaches in favor of a more robust and flexible system. PR
  10. Memory Connector Handling Improvements: By removing the kernel’s memory attribute, the SDK now supports multiple memory connectors through plugins. PR
  11. Enhanced Argument and Template Handling: Introduces named arguments for function calls within prompt templates and improves template parsing for better validation (see the short template sketch after this list). PR
  12. Pythonic Exception Handling: Refactoring of exception handling to provide clearer error messages and full stack traces. PR
  13. Updated Python Samples: Migration and update of Python samples to reflect these latest changes, now available in the main Semantic Kernel repository. Jupyter Notebooks and Kernel Syntax Examples were also updated. We will be adding more examples as we move towards v1.0.0. PR
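
To illustrate item 11, here is a minimal sketch of what a named argument looks like when calling a function inside a prompt template; the EmailPlugin.GetEmailAddress function and the $name and $topic variables are purely illustrative.

# A hypothetical prompt template that calls a plugin function with named arguments
prompt = """
Write a short note to {{EmailPlugin.GetEmailAddress input=$name}}
about the following topic: {{$topic}}.
"""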

Upcoming Features

As we progress towards the v1.0.0 release, we will focus on the following features next. We’re excited about this next round of enhancements because they greatly improve the productivity of developers who use Semantic Kernel and will allow Python developers to start using the same prompt resources as their .NET counterparts.

  • Automatic function calling via kernel functions.
  • A new stepwise planner for function calling.
  • Enhancements to function hooks/filters for alignment with .NET SDK.
  • Introduction of YAML template parsing.
  • Support for Handlebars and Jinja2 templating languages.*
  • Implementation of the Handlebars planner.

Our full Python public backlog is available here.

* We are still investigating and planning this work, and don’t yet know which templating language we will introduce first, or whether we will end up supporting one or both.

Migration Overview for Upgrading

For users upgrading from older versions to 0.9.0b1, it’s essential to note the breaking changes, especially around method renaming, argument handling, and configuration updates. We recommend reviewing the detailed PR links provided to understand the changes fully. Additionally, ensure your codebase is updated to reflect these new patterns, particularly in areas related to kernel methods, plugin management, and template configurations.

To help with migration, we’ve identified several common scenarios that developers implemented with the previous SDK. The following section provides before-and-after code snippets to make it easier to identify the breaking changes as you upgrade to the Beta.

01. Configuring the kernel

Previously, when you created a service, the name of the service lived outside of the actual connection, which made it difficult to reuse.

kernel.add_chat_service(
    "chat_completion",
    AzureChatCompletion(
        deployment,
        endpoint,
        api_key,
    ),
)

In the Beta update, we’ve moved the service ID into the connection so that the same connection can be reused, under the same name, by multiple kernels. Additionally, we’ve introduced kernel.add_service to add any AI service.

kernel.add_service(
    AzureChatCompletion(
        service_id="chat_completion",
        deployment_name=deployment,
        endpoint=endpoint,
        api_key=api_key,
    ),
)

All constructor arguments are now named (keyword) arguments, offering enhanced flexibility. The service_id also plays a more important role: it is part of the execution settings, is used to distinguish between execution settings for different types of models, and is what the base AI service selector uses to resolve the right service.
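
For instance, here’s a minimal sketch of that reuse (the kernel variable names are illustrative): the same AzureChatCompletion instance can be added to more than one kernel, and execution settings refer back to it via its service_id.

chat_service = AzureChatCompletion(
    service_id="chat_completion",
    deployment_name=deployment,
    endpoint=endpoint,
    api_key=api_key,
)

# The same connection can be reused by multiple kernels
kernel_a = Kernel()
kernel_b = Kernel()
kernel_a.add_service(chat_service)
kernel_b.add_service(chat_service)

# Execution settings reference the service through its service_id, which is
# what the default AI service selector uses to resolve the right model
settings = PromptExecutionSettings(service_id="chat_completion", temperature=0.2)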

02. Running a function via the kernel

Previously, you would use the run_async method to call a function using a kernel.

poemResult = await kernel.run_async(writer_plugin["ShortPoem"], input_str=str(currentTime))

This has been updated to invoke so that the same verb is used throughout Semantic Kernel. We’ve also removed the _async suffix to make the API more idiomatic Python.

poemResult = await kernel.invoke(writer_plugin["ShortPoem"], input=currentTime)

Please note that poemResult is of type FunctionResult.
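
As a quick, illustrative sketch of working with a FunctionResult (assuming the value and metadata attributes described in the function results update above):

# The result renders as a string for display purposes
print(str(poemResult))

# The underlying value and metadata can also be inspected directly
print(poemResult.value)
print(poemResult.metadata)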

03. Creating a function from a prompt

Previously, creating a semantic function was tightly coupled with OpenAI request settings like max_tokens and temperature.

semantic_function = kernel.create_semantic_function(
    prompt,
    function_name="Chat",
    description="Chat with the assistant",
    max_tokens=4000,
    temperature=0.3,
)

We’ve updated it so that the execution settings can differ per service type. In the following example, we demonstrate how the execution settings type can be pulled from the service within the kernel. As we complete YAML prompt serialization, this code will become even simpler so that you as a developer don’t have to explicitly wire this up. You’ll simply provide the YAML file with the prompt and execution settings and Semantic Kernel will automatically create the function for you.

chat_function = kernel.create_function_from_prompt(
    prompt=prompt,
    plugin_name="Summarize_Conversation",
    function_name="Chat",
    execution_settings=kernel.get_service(service_id).instantiate_prompt_execution_settings(
        service_id=service_id,
        max_tokens=4000,
        temperature=0.3,
    ),
)

The prompt_execution_settings can be one of the following: a single settings object, a list of settings objects, or a dictionary of PromptExecutionSettings, so that multiple models can be handled.
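
As a rough sketch, those three shapes look like the following (the service IDs, settings classes, and the use of service_id as the dictionary key are illustrative):

# A single settings object
execution_settings = OpenAIPromptExecutionSettings(service_id="gpt", temperature=0.3)

# A list of settings objects, one per service
execution_settings = [
    OpenAIPromptExecutionSettings(service_id="gpt", temperature=0.3),
    GooglePalmPromptExecutionSettings(service_id="palm", temperature=0.7),
]

# A dictionary of settings, keyed by service_id
execution_settings = {
    "gpt": OpenAIPromptExecutionSettings(service_id="gpt", temperature=0.3),
    "palm": GooglePalmPromptExecutionSettings(service_id="palm", temperature=0.7),
}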

04. Configuring Memory as a Plugin

Previously, the kernel could only have a single memory provider. The TextMemoryPlugin could then automagically find this single memory provider when being added to a kernel. Unfortunately, this was limiting because many customers required multiple search indexes to be usable within a kernel.

# An embeddings generator is created as a part of register_memory 
# and the kernel has only one memory store attribute
kernel.register_memory_store(VolatileMemoryStore())
kernel.import_plugin(TextMemoryPlugin())

The register_memory_store in turn called the following code in the kernel:

def use_memory(
    self,
    storage: MemoryStoreBase,
    embeddings_generator: Optional[EmbeddingGeneratorBase] = None,
) -> None:
    if embeddings_generator is None:
        service_id = self.get_text_embedding_generation_service_id()
        if not service_id:
            raise ValueError("The embedding service id cannot be `None` or empty")
        embeddings_service = self.get_ai_service(EmbeddingGeneratorBase, service_id)
        if not embeddings_service:
            raise ValueError(f"AI configuration is missing for: {service_id}")
        embeddings_generator = embeddings_service(self)
    if storage is None:
        raise ValueError("The storage instance provided cannot be `None`")
    if embeddings_generator is None:
        raise ValueError("The embedding generator cannot be `None`")
    self.register_memory(SemanticTextMemory(storage, embeddings_generator))

def register_memory(self, memory: SemanticTextMemoryBase) -> None:
    self.memory = memory

Now, we’ve changed the relationship so that a single kernel can have multiple TextMemoryPlugin objects added, each with its own memory store. This is powerful because it allows you as a developer to have multiple memories for things like documents, short-term memory, and long-term memory.

embedding_gen = OpenAITextEmbedding(
    service_id="ada", ai_model_id="text-embedding-ada-002", api_key=api_key, org_id=org_id
)
kernel.add_service(embedding_gen)

memory = SemanticTextMemory(storage=VolatileMemoryStore(), embeddings_generator=embedding_gen)
kernel.import_plugin_from_object(TextMemoryPlugin(memory), "TextMemoryPlugin")
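
From there, a rough sketch of storing and recalling memories might look like this; the method names follow the new convention without _async suffixes, and the collection, id, and query values are illustrative.

# Save a piece of text into the "budget" collection
await memory.save_information(collection="budget", id="info1", text="My budget for 2024 is $100,000")

# Perform a semantic search over the same collection
results = await memory.search(collection="budget", query="What is my budget for 2024?")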

05. Utilizing Chat History as part of a prompt

Previously, if you wanted to include a chat history within a prompt, you’d have to do string manipulation to join it together. Additionally, this pattern didn’t take advantage of the roles in the Chat Completion APIs provided by OpenAI and other model providers.

prompt = """{{$history}}
User: {{$request}}
Assistant:  """

history = []
variables = ContextVariables()
variables["request"] = request
variables["history"] = "\n".join(history)
semantic_function = kernel.create_semantic_function(
    prompt,
    function_name="Chat",
    description="Chat with the assistant",
    max_tokens=4000,
    temperature=0.3,
)
result = await kernel.run_async(
    semantic_function,
    input_vars=variables,
)

# Add the request to the history
history.append("User: " + request)
history.append("Assistant" + result.result)

With the Beta update, integrating a chat history object into a prompt is significantly simpler. Semantic Kernel incorporates it as a complex object, ensuring seamless integration into the prompt. During this operation, it also leverages the roles defined in the Chat Completion service to produce better outcomes.

history = ChatHistory()

prompt = "{{$history}}"

chat_function = kernel.create_function_from_prompt(
    prompt=prompt,
    plugin_name="chatGPT",
    function_name="Chat",
    prompt_template_config=chat_prompt_template_config,  # a PromptTemplateConfig with the chat execution settings, defined elsewhere
)

history.add_user_message(request)

result = await kernel.invoke(
    chat_function,
    request=request,
    history=history,
)

history.add_assistant_message(str(result))

06. Creating a Plugin with Kernel Functions

Previously, annotating a plugin with semantic meaning was very verbose.

class MathPlugin:
    @sk_function(
        description="Adds two numbers together",
        name="Add",
    )
    @sk_function_context_parameter(
        name="input",
        description="The first number to add",
    )
    @sk_function_context_parameter(
        name="number2",
        description="The second number to add",
    )
    def add(self, context: SKContext) -> str:
        return str(float(context["input"]) + float(context["number2"]))

    @sk_function(
        description="Subtract two numbers",
        name="Subtract",
    )
    @sk_function_context_parameter(
        name="input",
        description="The first number to subtract from",
    )
    @sk_function_context_parameter(
        name="number2",
        description="The second number to subtract away",
    )
    def subtract(self, context: SKContext) -> str:
        return str(float(context["input"]) - float(context["number2"]))

math_plugin = kernel.import_plugin(
    MathPlugin(), 
    plugin_name="MathPlugin"
)

variables = ContextVariables()
variables["number2"] = "5"
result = await kernel.run_async(
    math_plugin["Add"],
    input_str="5",
    input_vars=variables,
)

We now allow annotations to be added directly to the input parameters so that functions are easier to read and use.

class MathPlugin:
    @kernel_function(name="Add")
    def add(
        self,
        number1: Annotated[float, "the first number to add"],
        number2: Annotated[float, "the second number to add"],
    ) -> Annotated[float, "the output is a float"]:
        return float(number1) + float(number2)

    @kernel_function(
        description="Subtracts value to a value",
        name="Subtract",
    )
    def subtract(
        self,
        number1: Annotated[float, "the first number"],
        number2: Annotated[float, "the number to subtract"],
    ) -> Annotated[float, "the output is a float"]:
        return float(number1) - float(number2)

math_plugin = kernel.import_plugin_from_object(
    plugin_instance=MathPlugin(), 
    plugin_name="MathPlugin"
)

result = await kernel.invoke(
    math_plugin["Add"],
    number1=5,
    number2=5,
)

We now support types other than strings, including custom objects (preferably based on Pydantic BaseModels).
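
For example, here’s a hedged sketch of a kernel function that accepts a custom Pydantic model; the EmailRequest model and EmailPlugin class are hypothetical, and the exact import path for kernel_function may vary slightly by SDK version.

from typing import Annotated

from pydantic import BaseModel
from semantic_kernel.functions import kernel_function  # import path may differ by version

class EmailRequest(BaseModel):
    to: str
    subject: str
    body: str

class EmailPlugin:
    @kernel_function(name="FormatEmail", description="Formats an email for sending")
    def format_email(
        self,
        request: Annotated[EmailRequest, "the email to format"],
    ) -> Annotated[str, "the formatted email text"]:
        return f"To: {request.to}\nSubject: {request.subject}\n\n{request.body}"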

07. Using Kernel Arguments

Previously, developers had to juggle SKContext and ContextVariables. Many times, it was confusing when to use one or the other.

chat_prompt="{{$request}}"
history = []
variables = ContextVariables()
request = "some input"
variables["request"] = request
variables["history"] = "\n".join(history)

result = await kernel.run_async(
    chat_func,
    input_vars=variables,
)

# Add the request to the history
history.append("User: " + request)
history.append("Assistant" + result.result)

We’ve simplified this use of variables by introducing KernelArguments. You can create them much more easily, as keyword arguments passed to the KernelArguments constructor, and then pass the KernelArguments object into the invoke function.

chat_prompt = "{{$request}}"
chat_history = ChatHistory()
chat_history.add_system_message(
    "You are a helpful, funny chatbot."
)

# Kernel Arguments can be specified as a concrete object
args = KernelArguments(arg1=val1, arg2=val2)
result = await kernel.invoke(func, args)

Additionally, if you don’t want to create a KernelArguments object, you can simply pass inputs and values as kwargs to further simplify your code.

result = await kernel.invoke(
    functions=chat_func,
    request="some input",
    chat_history=chat_history,
)
chat_history.add_user_message("some input")
chat_history.add_assistant_message(str(result))

08. Configure Multiple Prompt Execution Settings

Previously, each prompt could only have a single set of execution settings attached to it. This was limiting for developers who wanted to use the same prompt across multiple models with different settings.

prompt_template_config=PromptTemplateConfig(
    execution_settings=PromptExecutionSettings(service_id="id1", temperature=0.0),
),

Now, you can author multiple execution settings and assign them to a prompt. This is helpful if you want to test different settings in production.

prompt_template_config=PromptTemplateConfig(
    template="{{$request}}",
    execution_settings=[
        OpenAIPromptExecutionSettings(service_id="id1", temperature=0.0, num_of_responses=2),
        GooglePalmPromptExecutionSettings(service_id="id2", temperature=1.0, candidate_count=2),
    ],
),

09. Then vs. Now: A Comprehensive Example

To see how the different capabilities layer onto each other, we’ve provided the following example for building a simple chat bot. The first block of code is how you’d previously author the sample.

system_message = """
You are a chat bot. Your name is Mosscap and
you have one goal: figure out what people need.
Your full name, should you need to know it, is
Splendid Speckled Mosscap. You communicate
effectively, but you tend to answer with long
flowery prose.
"""
# Create a kernel
kernel = Kernel()

# Create a chat service
chat_service = AzureChatCompletion(
    **azure_openai_settings_from_dot_env_as_dict(include_api_version=True)
)

# Add the chat service to the kernel
kernel.add_chat_service("chat-gpt", chat_service)

# Define the chat request settings
req_settings = kernel.get_prompt_execution_settings_from_service(
    ChatCompletionClientBase, 
    "chat-gpt",
)
req_settings.max_tokens = 2000
req_settings.temperature = 0.7
req_settings.top_p = 0.8

# Configure the prompt config, template, and function config
prompt_config = PromptTemplateConfig(execution_settings=req_settings)
prompt_template = ChatPromptTemplate(
    "{{$user_input}}", 
    kernel.prompt_template_engine, 
    prompt_config
)
prompt_template.add_system_message(system_message)
prompt_template.add_user_message("Hi there, who are you?")
prompt_template.add_assistant_message("I am Mosscap, a chat bot. I'm trying to figure out what people need.")
function_config = SemanticFunctionConfig(prompt_config, prompt_template)

# Register the semantic function with the kernel
chat_function = kernel.register_semantic_function("ChatBot", "Chat", function_config)

# Define an SK Context variables object to handle user input
context_vars = ContextVariables()
user_input = input("User:> ")
context_vars["user_input"] = user_input

# Run the chat function with the user's input
answer = await kernel.run_async(chat_function, input_vars=context_vars)

# Display the result
print(f"Mosscap:> {answer}")

After the changes introduced in the Beta release, you’d now write something like the following:

system_message = """
You are a chat bot. Your name is Mosscap and
you have one goal: figure out what people need.
Your full name, should you need to know it, is
Splendid Speckled Mosscap. You communicate
effectively, but you tend to answer with long
flowery prose.
"""

# Create a Kernel
kernel = Kernel()

# Define a service ID that ties to the AI service
service_id = "chat-gpt"
chat_service = AzureChatCompletion(
    service_id=service_id, **azure_openai_settings_from_dot_env_as_dict(include_api_version=True)
)

# Add the AI service to the kernel
kernel.add_service(chat_service)

# Get the Prompt Execution Settings
req_settings = kernel.get_service(service_id).instantiate_prompt_execution_settings(
    service_id=service_id, 
    max_tokens=2000,
    temperature=0.7,
    top_p=0.8,
)

# Create the prompt template config and specify any required input variables
prompt_template_config = PromptTemplateConfig(
    template="{{$chat_history}}{{$input}}",
    name="chat",
    input_variables=[
        InputVariable(name="input", description="The user input", is_required=True),
        InputVariable(name="chat_history", description="The history of the conversation", is_required=True),
    ],
    execution_settings=req_settings, # The execution settings will be tied to the configured service_id
)

# Create the chat function from the prompt template config
chat_function = kernel.create_function_from_prompt(
    plugin_name="chat_bot",
    function_name="chat",
    prompt_template_config=prompt_template_config
)

# Define a chat history object
history = ChatHistory(system_message=system_message)

# Add the desired messages
history.add_user_message("Hi there, who are you?")
history.add_assistant_message("I am Mosscap, a chat bot. I'm trying to figure out what people need.")

# Gather the user's input
user_input = input("User:> ")

# Invoke the chat function, passing the kernel arguments as kwargs
response = await kernel.invoke(
    chat_function,
    input=user_input,
    chat_history=history,
)

# View the response
print(f"Mosscap:> {response}")

# Add the user's input and the assistant's messages to the ongoing chat history
history.add_user_message(user_input)
history.add_assistant_message(str(response))

Note: there are several ways to set the system message now. The first is to explicitly add the system message to the chat history, either via history.add_system_message or as a keyword argument to the ChatHistory constructor, as shown above. The other is to provide a system message through the template configured as part of the PromptTemplateConfig. Everything the template parser sees before the ChatHistory placeholder becomes the system_message, while everything between or after a ChatHistory placeholder becomes user messages.
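
For instance, a minimal sketch of the template-based approach looks like the following; everything before the {{$chat_history}} placeholder is parsed as the system message.

# The text before {{$chat_history}} becomes the system message
prompt = """
You are a chat bot named Mosscap; your goal is to figure out what people need.
{{$chat_history}}
{{$input}}
"""

prompt_template_config = PromptTemplateConfig(
    template=prompt,
    name="chat",
    input_variables=[
        InputVariable(name="input", description="The user input", is_required=True),
        InputVariable(name="chat_history", description="The history of the conversation", is_required=True),
    ],
    execution_settings=req_settings,
)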
