{"id":1466,"date":"2023-11-06T10:38:37","date_gmt":"2023-11-06T18:38:37","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/semantic-kernel\/?p=1466"},"modified":"2023-11-10T05:32:49","modified_gmt":"2023-11-10T13:32:49","slug":"assistants-the-future-of-semantic-kernel","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/agent-framework\/assistants-the-future-of-semantic-kernel\/","title":{"rendered":"OpenAI Assistants: the future of Semantic Kernel"},"content":{"rendered":"<p>During the OpenAI event earlier today, <a href=\"https:\/\/openai.com\/blog\/introducing-gpts\">OpenAI announced the launch of GPTs<\/a> and <a href=\"https:\/\/platform.openai.com\/docs\/assistants\/overview\">the assistants API<\/a>, the new and improved ways of creating agents on top of their chat completion models. With assistants, much of the heavy lifting required to build agents has been stripped away\u2026<\/p>\n<ul>\n<li>Messages are now managed for you in threads.<\/li>\n<li>Memory is automatically handled for you behind the scenes.<\/li>\n<li>And multiple functions can be called (instead of just one).<\/li>\n<\/ul>\n<p>This ultimately means it\u2019ll be faster, and easier, for you to build agents on top of OpenAI <em>and<\/em> Semantic Kernel. We\u2019re excited to share our plans on incorporating assistants into Semantic Kernel and how they fit into our v1 proposals, so beginning today, we\u2019re going to start a series on building agents with Semantic Kernel.<\/p>\n<p>For our inaugural blog post on agents, we\u2019ll share our overarching plans for incorporating OpenAI assistants with Semantic Kernel. Subsequent articles will demonstrate how to achieve the following with today\u2019s APIs (with an eye towards what it\u2019ll be like for v1): creating your first assistant, orchestrating assistants together, extending assistants with plugins, and keeping assistants safe through responsible AI and monitoring. 
If there are any topics you\u2019d like us to cover, let us know on our <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/discussions\/3393\">dedicated discussion board for GPTs and assistants<\/a>.<\/p>\n<h2>The kernel will become your gateway to assistants.<\/h2>\n<p>In our <a href=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/what-to-expect-from-v1-and-beyond-for-semantic-kernel\/\">Road to V1 and beyond blog post<\/a>, we shared that one of our goals was to provide \u201ca compelling reason to use the kernel\u201d. Last week we shared a few ways the kernel would improve, but we didn\u2019t share the <em>full<\/em> story.<\/p>\n<p>That changes today with the announcement of the assistants API.<\/p>\n<p>With the kernel, we plan on providing an abstraction layer on top of assistants so it\u2019s easier to build assistants <em>and<\/em> so you can more easily extend the new assistant APIs provided by OpenAI.<\/p>\n<h2>Today\u2019s kernel <em>just<\/em> manages the runtime.<\/h2>\n<p>With today\u2019s kernel, you can only define the available functions, models, and prompt template engines. This helps you create a runtime that allows your semantic and native functions to talk to each other, but many other pieces still need to be implemented by the developer.<\/p>\n<p>For example, to build a <em>complete<\/em> agent with the kernel, you must manage the entire chat history on your own. This proves particularly annoying when using OpenAI function calling \u2013 after a function is used, you need to add both the agent response <em>and<\/em> the function response \u2013 or when you start running out of tokens for the chat history.<\/p>\n<p>For the uninitiated, this can be confusing and challenging, so we plan to make this better.<\/p>\n<h2>Tomorrow\u2019s kernel will help you manage everything for assistants.<\/h2>\n<p>To simplify things, we will update the kernel so it can use an OpenAI assistant behind the scenes. 
We\u2019re excited about this change, because it means we can update the already simplified v1 proposal from this\u2026<\/p>\n<pre>\/\/ Create a new kernel\r\nIKernel kernel = new Kernel(\r\n    aiServices: new () { gpt35Turbo },\r\n    plugins: new () { intentPlugin, mathPlugin }\r\n);\r\n\r\n\/\/ Start the chat\r\nChatHistory chatHistory = gpt35Turbo.CreateNewChat();\r\nwhile (true)\r\n{\r\n    Console.Write(\"User &gt; \");\r\n    chatHistory.AddUserMessage(Console.ReadLine()!);\r\n\r\n    \/\/ Run the simple chat\r\n    var result = await kernel.RunAsync(\r\n        chatFunction,\r\n        variables: new() {{ \"messages\", chatHistory }},\r\n        streaming: true\r\n    );\r\n\r\n    Console.Write(\"Agent &gt; \");\r\n    await foreach (var message in result.GetStreamingValue&lt;string&gt;()!)\r\n    {\r\n        Console.Write(message);\r\n    }\r\n\r\n    Console.WriteLine();\r\n    chatHistory.AddAgentMessage(await result.GetValueAsync&lt;string&gt;()!);\r\n}<\/pre>\n<p>To this\u2026<\/p>\n<pre>\/\/ Create a new kernel\r\nAssistantKernel kernel = new AssistantKernel(\r\n    aiServices: new () { gpt35Turbo, gpt4Agent },\r\n    plugins: new () { intentPlugin, mathPlugin }\r\n);\r\n\r\n\/\/ Start the chat\r\nkernel.StartChat(chatFunction);\r\nwhile (true)\r\n{\r\n    Console.Write(\"User &gt; \");\r\n\r\n    \/\/ Run the simple chat\r\n    var result = await kernel.SendUserMessage(\r\n        Console.ReadLine()!,\r\n        streaming: true\r\n    );\r\n\r\n    Console.Write(\"Agent &gt; \");\r\n    await foreach (var message in result.GetStreamingValue&lt;string&gt;()!)\r\n    {\r\n        Console.Write(message);\r\n    }\r\n    Console.WriteLine();\r\n}<\/pre>\n<p>With this new setup, you no longer need to manage the chat history yourself. Additionally, \u201crunning\u201d the kernel will become even easier because you <em>just<\/em> need to pass in the user\u2019s last input.<\/p>\n<p>Behind the scenes, whenever you use the SendUserMessage method, we\u2019ll 1) call the necessary OpenAI GPT APIs to send the user\u2019s message, 2) retrieve a response from the LLM, and finally 3) give the result back to you.<\/p>\n<h2>With the kernel, we\u2019ll make it easy to extend OpenAI assistants.<\/h2>\n<p>As powerful as the new assistant APIs are, they don\u2019t do <em>everything<\/em>. This is where Semantic Kernel comes in. With its support for plugins, planners, and multiple models, you can use Semantic Kernel to extend assistants to make them more powerful while <em>also<\/em> optimizing performance and cost.<\/p>\n<ol>\n<li><strong style=\"font-family: 'Segoe UI Bold','Segoe UI',Tahoma,Geneva,Verdana,sans-serif;\">Simplified function calling<\/strong> \u2013 To make your agents even more useful, you can provide them with actions to run. We\u2019ll simplify this process by leveraging the existing functions already registered in a kernel via plugins. As you converse with your agent, we\u2019ll provide it with the functions you\u2019ve added and automatically run them as we get responses from the model.<\/li>\n<li><strong style=\"font-family: 'Segoe UI Bold','Segoe UI',Tahoma,Geneva,Verdana,sans-serif;\">Complex multi-step plans<\/strong> \u2013 With agents, OpenAI can start to call multiple functions at a time, but it still cannot create complex plans with conditional logic, loops, and variable passing. With Semantic Kernel planners, you can do just that. 
Not only does this save you tokens, but it also allows you to generate complete plans that can be reviewed by humans before they\u2019re executed.<\/li>\n<li><strong style=\"font-family: 'Segoe UI Bold','Segoe UI',Tahoma,Geneva,Verdana,sans-serif;\">Multi-model support<\/strong> \u2013 Today\u2019s agents use either GPT-3.5-turbo or GPT-4 (and soon GPT-4-turbo) for all chat completions. As a developer, however, you may want to be more discerning. You may want to use GPT-4-turbo for the final response while using GPT-3.5-turbo for some of the simpler semantic functions. With Semantic Kernel, you can make these optimizations. You can even leverage non-OpenAI models in conjunction with your OpenAI agents.<\/li>\n<li><strong style=\"font-family: 'Segoe UI Bold','Segoe UI',Tahoma,Geneva,Verdana,sans-serif;\">More control over memory<\/strong> \u2013 If you want to use an advanced memory architecture to have more control over how you save and retrieve memories (like <a href=\"https:\/\/github.com\/microsoft\/kernel-memory\">Kernel Memory<\/a> or <a href=\"https:\/\/www.llamaindex.ai\/\">LlamaIndex<\/a>), you can add these services as plugins to provide even better context to your agent.<\/li>\n<li><strong style=\"font-family: 'Segoe UI Bold','Segoe UI',Tahoma,Geneva,Verdana,sans-serif;\">Greater visibility and monitoring<\/strong> \u2013 With Semantic Kernel\u2019s pre\/post hooks, you can add telemetry to your kernel once to get visibility into token usage, rendered prompts, and more across all your native and semantic functions.<\/li>\n<\/ol>\n<h2>We want your feedback!<\/h2>\n<p>As the Semantic Kernel team, we want to continue providing early previews of where we\u2019d like to push the SDK. This helps contributors in the open-source community contribute PRs, and perhaps more importantly, gives us an opportunity to collect your feedback.<\/p>\n<p>If you have any feedback on this proposal (either good or bad), please share it on our discussion boards. 
We\u2019ve created a dedicated board for this topic where you can share your scenarios, so we can make sure we build the right integration.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>During the OpenAI event earlier today, OpenAI announced the launch of GPTs and the assistants API, the new and improved ways of creating agents on top of their chat completion models. With assistants, much of the heavy lifting required to build agents has been stripped away\u2026 Messages are now managed for you in threads. Memory [&hellip;]<\/p>\n","protected":false},"author":121401,"featured_media":1475,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1466","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-semantic-kernel"],"acf":[],"blog_post_summary":"<p>During the OpenAI event earlier today, OpenAI announced the launch of GPTs and the assistants API, the new and improved ways of creating agents on top of their chat completion models. With assistants, much of the heavy lifting required to build agents has been stripped away\u2026 Messages are now managed for you in threads. 
Memory [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/1466","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/users\/121401"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/comments?post=1466"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/1466\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media\/1475"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media?parent=1466"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/categories?post=1466"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/tags?post=1466"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}