{"id":4340,"date":"2025-03-06T13:17:32","date_gmt":"2025-03-06T21:17:32","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/semantic-kernel\/?p=4340"},"modified":"2025-03-07T07:29:44","modified_gmt":"2025-03-07T15:29:44","slug":"talk-to-your-agents-introducing-the-realtime-apis-in-semantic-kernel","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/agent-framework\/talk-to-your-agents-introducing-the-realtime-apis-in-semantic-kernel\/","title":{"rendered":"Talk to your agents! Introducing the Realtime API&#8217;s in Semantic Kernel"},"content":{"rendered":"<h2>Introducing Realtime Agents in Semantic Kernel for Python<\/h2>\n<p>With release <strong>1.23.0<\/strong> of the Python version of Semantic Kernel, we are introducing a new set of clients for interacting with the realtime multi-modal APIs of OpenAI and Azure OpenAI. They provide an abstracted approach to connecting to those services, adding your tools, and running apps that leverage these powerful agents.<\/p>\n<p><div class=\"alert alert-info\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Info\"><\/i><strong>Experimental<\/strong><\/p>These connectors are experimental while we learn what is needed to support these kinds of models from different providers. 
The underlying APIs are also in preview, so breaking changes may also come from the services.<\/div><\/p>\n<p>The key addition that Semantic Kernel brings when you connect to these models is that it makes function calling very easy. Create a Kernel and add your plugins as you are used to doing with Semantic Kernel; you can even pass in just your plugins and we will create the kernel for you. Next, add a <code>FunctionChoiceBehavior<\/code> to the settings and pass both to the Realtime Client, which handles serializing the function definitions to the API. When you use <code>FunctionChoiceBehavior.Auto<\/code> with <code>auto_invoke<\/code> turned on (the default), we execute the function, pass the result to the API, and ask it to create a response.<\/p>\n<p>These clients also abstract away the underlying protocols as much as possible, so that you can easily switch models and providers while maintaining the same codebase.<\/p>\n<h3>Get started<\/h3>\n<p>First, install Semantic Kernel with the realtime extra:<\/p>\n<pre class=\"prettyprint language-default\"><code class=\"language-default\">pip install semantic-kernel[realtime]<\/code><\/pre>\n<p>Next, create your functions and a Kernel, and add the functions to it. You can also wrap these functions in a class and pass that class as one item in a list of plugins:<\/p>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">\r\nimport logging\r\nfrom datetime import datetime\r\nfrom random import randint\r\n\r\nfrom semantic_kernel import Kernel\r\nfrom semantic_kernel.functions import kernel_function\r\n\r\nlogger = logging.getLogger(__name__)\r\n\r\n@kernel_function\r\ndef get_weather(location: str) -&gt; str:\r\n    \"\"\"Get the weather for a location.\"\"\"\r\n    weather_conditions = (\"sunny\", \"hot\", \"cloudy\", \"raining\", \"freezing\", \"snowing\")\r\n    weather = weather_conditions[randint(0, len(weather_conditions) - 1)]  # nosec\r\n    logger.info(f\"@ Getting weather for {location}: 
{weather}\")\r\n    return f\"The weather in {location} is {weather}.\"\r\n\r\n\r\n@kernel_function\r\ndef get_date_time() -&gt; str:\r\n    \"\"\"Get the current date and time.\"\"\"\r\n    logger.info(\"@ Getting current datetime\")\r\n    return f\"The current date and time is {datetime.now().isoformat()}.\"\r\n\r\n\r\n@kernel_function\r\ndef goodbye():\r\n    \"\"\"When the user is done, say goodbye and then call this function.\"\"\"\r\n    logger.info(\"@ Goodbye has been called\")\r\n    raise KeyboardInterrupt\r\n\r\nkernel = Kernel()\r\nkernel.add_functions(plugin_name=\"helpers\", functions=[goodbye, get_weather, get_date_time])\r\n<\/code><\/pre>\n<p>Next, create a Realtime Client. Three types of clients are currently available: <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/blob\/main\/python\/semantic_kernel\/connectors\/ai\/open_ai\/services\/azure_realtime.py\"><code>AzureRealtimeWebsocket<\/code><\/a>, <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/blob\/main\/python\/semantic_kernel\/connectors\/ai\/open_ai\/services\/open_ai_realtime.py\"><code>OpenAIRealtimeWebsocket<\/code> and <code>OpenAIRealtimeWebRTC<\/code><\/a> (all available from the <code>semantic_kernel.connectors.ai.open_ai<\/code> namespace):<\/p>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">from semantic_kernel.connectors.ai import FunctionChoiceBehavior\r\nfrom semantic_kernel.connectors.ai.open_ai import (\r\n    AzureRealtimeExecutionSettings,\r\n    AzureRealtimeWebsocket,\r\n    TurnDetection,\r\n)\r\n\r\nrealtime_agent = AzureRealtimeWebsocket()\r\nsettings = AzureRealtimeExecutionSettings(\r\n        instructions=\"\"\"\r\n    You are a chat bot. Your name is Mosscap and\r\n    you have one goal: figure out what people need.\r\n    Your full name, should you need to know it, is\r\n    Splendid Speckled Mosscap. 
You communicate\r\n    effectively, but you tend to answer with long\r\n    flowery prose.\r\n    \"\"\",\r\n        voice=\"alloy\",\r\n        turn_detection=TurnDetection(type=\"server_vad\", create_response=True, silence_duration_ms=800, threshold=0.8),\r\n        function_choice_behavior=FunctionChoiceBehavior.Auto(),\r\n    )<\/code><\/pre>\n<p>Then we can start the session. The <code>settings<\/code>, <code>chat_history<\/code> and <code>kernel<\/code> or <code>plugins<\/code> can be passed here, or in the constructor above.<\/p>\n<p>The client then starts receiving events from the service. These include audio events (<code>RealtimeAudioEvent<\/code>) and text events (<code>RealtimeTextEvent<\/code>), both subclasses of <code>RealtimeEvent<\/code>, as well as plain <code>RealtimeEvent<\/code> instances that denote other activities of the API, such as responses being created, items being added, and updates to the session itself:<\/p>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">async with realtime_agent(\r\n    settings=settings,\r\n    chat_history=chat_history,\r\n    kernel=kernel,\r\n    create_response=True,\r\n):\r\n    async for event in realtime_agent.receive():\r\n        # event handling code\r\n        pass<\/code><\/pre>\n<p>At the same time, you can send events to the service: audio and text inputs, as well as updates to the way the session runs, such as which functions are available.<\/p>\n<p>For instance, if you want the service to create a response, you can do this:<\/p>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">await realtime_agent.send(RealtimeEvent(service_type=SendEvents.RESPONSE_CREATE))<\/code><\/pre>\n<h3>Learn more<\/h3>\n<p>To learn more about these new features, see our <a href=\"https:\/\/learn.microsoft.com\/en-us\/semantic-kernel\/concepts\/ai-services\/realtime\">documentation<\/a> and <a 
href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/tree\/main\/python\/samples\/concepts\/realtime\">samples<\/a>. Finally, we have a more complete <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/tree\/main\/python\/samples\/demos\/call_automation\">demo app<\/a> that uses <a href=\"https:\/\/learn.microsoft.com\/azure\/communication-services\/\">Azure Communication Services<\/a> to allow you to have calls with your data and other tools.<\/p>\n<h4>Happy talking!<\/h4>\n","protected":false},"excerpt":{"rendered":"<p>Introducing Realtime Agents in Semantic Kernel for Python With release 1.23.0 of the Python version of Semantic Kernel we are introducing a new set of clients for interacting with the realtime multi-modal APIs of OpenAI and Azure OpenAI. They provide an abstracted approach to connecting to those services, adding your tools and running apps that [&hellip;]<\/p>\n","protected":false},"author":150044,"featured_media":2364,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[47,17,34],"tags":[48,63,53,119,9],"class_list":["post-4340","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-announcement","category-announcements","category-python-2","tag-ai","tag-microsoft-semantic-kernel","tag-python","tag-realtime-api","tag-semantic-kernel"],"acf":[],"blog_post_summary":"<p>Introducing Realtime Agents in Semantic Kernel for Python With release 1.23.0 of the Python version of Semantic Kernel we are introducing a new set of clients for interacting with the realtime multi-modal APIs of OpenAI and Azure OpenAI. 
They provide an abstracted approach to connecting to those services, adding your tools and running apps that [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/4340","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/users\/150044"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/comments?post=4340"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/4340\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media\/2364"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media?parent=4340"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/categories?post=4340"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/tags?post=4340"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}