{"id":3032,"date":"2024-07-23T11:39:53","date_gmt":"2024-07-23T18:39:53","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/semantic-kernel\/?p=3032"},"modified":"2024-07-23T12:26:47","modified_gmt":"2024-07-23T19:26:47","slug":"the-future-of-planners-in-semantic-kernel","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/agent-framework\/the-future-of-planners-in-semantic-kernel\/","title":{"rendered":"The future of Planners in Semantic Kernel"},"content":{"rendered":"<p>Since the very early days of Semantic Kernel, we have shipped experimental \u201c<a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/tree\/main\/dotnet\/src\/Planners\">planners<\/a>\u201d that use prompts to generate multi-step plans. This was extremely powerful because it allowed developers to use LLMs (which were created to merely generate text) to begin automating business processes.<\/p>\n<p>Since then, the Semantic Kernel team has evolved its experimental planners over time so it could adopt the latest research from both inside and outside of Microsoft. Most notably, we began leveraging function calling. 
With function calling, the <a href=\"https:\/\/github.com\/dmytrostruk\/semantic-kernel\/blob\/79328308b6cb649b49d187c8748766f1dc24356c\/dotnet\/src\/Extensions\/Planning.ActionPlanner\/IActionPlanner.cs\">action planner<\/a> could be replaced with a single function call request and the <a href=\"https:\/\/github.com\/dmytrostruk\/semantic-kernel\/blob\/afb9f29ac90b9ff23d2063619133561eec555c1d\/dotnet\/src\/Extensions\/Planning.StepwisePlanner\/Skills\/StepwiseStep\/skprompt.txt\">ReAct-based planning<\/a> of the <a href=\"https:\/\/github.com\/dmytrostruk\/semantic-kernel\/blob\/afb9f29ac90b9ff23d2063619133561eec555c1d\/dotnet\/src\/Extensions\/Planning.ActionPlanner\/IActionPlanner.cs\">stepwise planner<\/a> could be replicated with multiple function calling steps.<\/p>\n<p>As function calling has become more accurate and efficient, however, additional \u201cplanning\u201d logic on top of the model has become less necessary and, in some cases, can make a plan slower, more expensive, and less accurate.<\/p>\n<p>In this blog post, we\u2019ll cover our current experimental planners and what comes next for the function calling stepwise planner and the Handlebars planner. We&#8217;ll answer: How do they work? What are they good at? And how do we plan to make them even better?<\/p>\n<blockquote><p>Keep an eye out for additional blog posts that provide a deep dive into how to use the future versions of the function calling stepwise planner and the Handlebars planner.<\/p><\/blockquote>\n<h2>Function calling stepwise planner<\/h2>\n<p>As we evolved the original stepwise planner, we took advantage of <a href=\"https:\/\/platform.openai.com\/docs\/guides\/function-calling\">function calling<\/a> from OpenAI. This allowed us to reduce the size of the original prompt while still increasing accuracy. When function calling was first introduced, however, it wasn\u2019t perfect. 
The LLMs had difficulty stringing together multiple function calls to complete a multi-step process (or plan) using the ReAct methodology.<\/p>\n<p>To address this shortcoming, we <a href=\"https:\/\/github.com\/dmytrostruk\/semantic-kernel\/blob\/main\/dotnet\/src\/Planners\/Planners.OpenAI\/Stepwise\/GeneratePlan.yaml\">introduced a prompt<\/a> at the very beginning asking the AI to enumerate the list of steps it needed to complete a customer\u2019s goal. This prompt was relatively lightweight and provided enough structure for function calling to work on more complex tasks.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.19.09\u202fAM.png\"><img decoding=\"async\" class=\"aligncenter wp-image-3036 size-full\" src=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.19.09\u202fAM.png\" alt=\"Image Screenshot 2024 07 23 at 10 19 09 AM\" width=\"1890\" height=\"726\" srcset=\"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.19.09\u202fAM.png 1890w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.19.09\u202fAM-300x115.png 300w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.19.09\u202fAM-1024x393.png 1024w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.19.09\u202fAM-768x295.png 768w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.19.09\u202fAM-1536x590.png 1536w\" sizes=\"(max-width: 1890px) 100vw, 1890px\" \/><\/a><span style=\"font-size: 10pt;\"><em>The prompt in the function calling stepwise 
planner<\/em><\/span><\/p>\n<h3>The \u201cbetter\u201d way to ReAct<\/h3>\n<p>As function calling has gotten better, however, this additional step has become less necessary and has instead gotten in the way of functionality needed in enterprise applications&#8230;<\/p>\n<ol>\n<li>Streaming was difficult to implement<\/li>\n<li>The model had a harder time making parallel function calls<\/li>\n<li>Reusing the entire context from a chat history object became difficult<\/li>\n<li>And customization was difficult<\/li>\n<\/ol>\n<p>So we thought to ourselves&#8230; what if we got out of the way entirely and <em>just<\/em> encouraged customers to use \u201cvanilla\u201d function calling? Over the last few months, the results have been astounding. Customers can achieve everything the function calling stepwise planner could, with fewer tokens, more control, and a significantly lower time to first token.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2024\/07\/turn-on-lights.gif\"><img decoding=\"async\" class=\"size-full wp-image-3040 aligncenter\" src=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2024\/07\/turn-on-lights.gif\" alt=\"Image turn on lights\" width=\"780\" height=\"132\" \/><\/a><em><span style=\"font-size: 10pt;\">Streaming with function calling<\/span><\/em><\/p>\n<p>Because of this, we\u2019ll be sunsetting the function calling stepwise planner in favor of \u201cvanilla\u201d function calling. 
<a href=\"https:\/\/learn.microsoft.com\/en-us\/semantic-kernel\/concepts\/planning?pivots=programming-language-csharp#what-about-the-function-calling-stepwise-and-handlebars-planners\">Our docs have already been updated<\/a> to reflect this new recommendation, and shortly, we\u2019ll be publishing a blog post detailing how to migrate your code if you already use the function calling stepwise planner.<\/p>\n<h2>Handlebars planner<\/h2>\n<p>The <a href=\"https:\/\/github.com\/dmytrostruk\/semantic-kernel\/tree\/main\/dotnet\/src\/Planners\/Planners.Handlebars\">Handlebars planner<\/a> was the natural successor to the sequential planner. Both were powerful because they allowed the LLM to generate an entire plan in a single LLM call. This had several benefits: a user could approve an entire \u201cplan\u201d before execution began, and it theoretically used fewer tokens.<\/p>\n<p>The challenge, however, is telling the LLM how to generate a plan using as few tokens as possible. With the original sequential planner, we had to \u201cteach\u201d the LLM how to generate custom XML <a href=\"https:\/\/github.com\/dmytrostruk\/semantic-kernel\/blob\/afb9f29ac90b9ff23d2063619133561eec555c1d\/dotnet\/src\/Extensions\/Planning.SequentialPlanner\/skprompt.txt\">within a single prompt<\/a>. 
This was relatively expensive and yielded poor results because the LLM couldn\u2019t draw on its existing knowledge to generate this novel XML structure.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.18.43\u202fAM.png\"><img decoding=\"async\" class=\"wp-image-3035 size-full aligncenter\" src=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.18.43\u202fAM.png\" alt=\"Image Screenshot 2024 07 23 at 10 18 43 AM\" width=\"1996\" height=\"1223\" srcset=\"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.18.43\u202fAM.png 1996w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.18.43\u202fAM-300x184.png 300w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.18.43\u202fAM-1024x627.png 1024w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.18.43\u202fAM-768x471.png 768w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.18.43\u202fAM-1536x941.png 1536w\" sizes=\"(max-width: 1996px) 100vw, 1996px\" \/><\/a><em>The prompt for the original sequential planner<\/em><\/p>\n<h3>The aha moment: code-based planners are more accurate<\/h3>\n<p>But what if we could use a language that the LLM <em>did<\/em> know how to natively write? 
<a href=\"https:\/\/github.com\/microsoft\/TaskWeaver\">TaskWeaver<\/a> and <a href=\"https:\/\/github.com\/microsoft\/autogen\">AutoGen<\/a> both had success using this approach, so the Semantic Kernel team had LLMs generate plans in several different programming languages (C#, Python, JavaScript) to see if they had more success. In our tests, <em>any<\/em> language performed remarkably better than the custom XML. The new challenge? How do you \u201csafely\u201d run code generated by an LLM?<\/p>\n<p>Without extremely secure and limited runtime containers, we determined that it was not possible to run LLM-generated code in an enterprise deployment, so we searched for a language that was purposefully <em>very<\/em> limited so the LLM couldn\u2019t do anything it shouldn\u2019t. We landed on the templating language <a href=\"https:\/\/handlebarsjs.com\/\">Handlebars<\/a> because it could do nothing more than invoke helpers. It also had the benefit of having implementations in almost all languages (meaning we could drive parity across all our SDKs).<\/p>\n<h3>The choice of language matters<\/h3>\n<p>As more customers used the Handlebars planner, we began to realize its limitations. Because the LLMs had less training data on Handlebars templates, we had to make our prompts increasingly detailed. What originally started off as a cheaper way to generate a plan became just as token-intensive as the original sequential planner.<\/p>\n<p>It was around this time that <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/container-apps\/sessions-tutorial-semantic-kernel\">Azure Container Apps released dynamic sessions<\/a>, a feature that allows you to create locked-down Python containers for the explicit purpose of running LLM-generated code. The same technology powers the Code interpreter in the Azure Assistants API. 
We finally had an enterprise-ready way to run a proper programming language that the LLMs were adept at writing!<\/p>\n<h3>Asking LLMs to generate Python instead<\/h3>\n<p>LLMs are particularly good at writing Python since it\u2019s the language of AI researchers. From the very early days of machine learning, models were trained on Python code, and because OpenAI has chosen Python as their language for Code interpreter, the models\u2019 ability to write Python code should only keep improving.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.31.26\u202fAM.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-3041 aligncenter\" src=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.31.26\u202fAM.png\" alt=\"Image Screenshot 2024 07 23 at 10 31 26 AM\" width=\"569\" height=\"473\" srcset=\"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.31.26\u202fAM.png 569w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2024\/07\/Screenshot-2024-07-23-at-10.31.26\u202fAM-300x249.png 300w\" sizes=\"(max-width: 569px) 100vw, 569px\" \/><\/a><span style=\"font-size: 10pt;\"><em>Code interpreter in ChatGPT<\/em><\/span><\/p>\n<p>Because of this, later this fall we will be replacing our Handlebars planner with a Python version that behaves more like OpenAI\u2019s current Code interpreter. With <em>our<\/em> Code interpreter, however, we will also allow the LLM to invoke local plugins and functions. 
While we develop this new planner, we recommend that new customers <em>not<\/em> use the existing Handlebars planner because it will be marked obsolete.<\/p>\n<blockquote><p>It&#8217;s important to note that\u00a0<em>only<\/em> the Handlebars\u00a0<span style=\"text-decoration: underline;\">planner<\/span> will be deprecated. The Handlebars templating engine will remain for prompt templates.<\/p><\/blockquote>\n<h3>Will a Python-based planner work with C# and Java?<\/h3>\n<p>The greatest concern we\u2019ve heard about moving to a Python-based planner is whether it will work with the C# and Java SDKs for Semantic Kernel, and for that, we have good news! Just as we used Handlebars template generation in all three SDKs, we\u2019ll be able to use Python code generation in all three SDKs as well.<\/p>\n<p>To support local development, we\u2019ll provide an out-of-the-box container for local testing. For production deployments, we&#8217;ll provide a connection to Azure Container Apps dynamic sessions.<\/p>\n<p>We have also heard concerns from C# and Java developers who think this will require them to author Python code. Just as with the Handlebars planner, we do not expect developers to author these plans. 
This is a language that <em>only<\/em> the LLM needs to know to create plans for the user at runtime, and Python appears to be the best language LLMs can generate today.<\/p>\n<blockquote><p>Keep an eye out for an announcement in the next few months for when we release our new Python-based planning solution.<\/p><\/blockquote>\n<h2>Give us feedback on our transition<\/h2>\n<p>If you are a current user of the existing planners and have unique scenarios you want to make sure we support with the updated planners, please reach out to us on our <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/discussions\/categories\/general\" target=\"_blank\" rel=\"noopener\">Semantic Kernel GitHub Discussion Channel<\/a>!\u00a0We want to ensure that all scenarios of the previous planners continue to be supported.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Since the very early days of Semantic Kernel, we have shipped experimental \u201cplanners\u201d that use prompts to generate multi-step plans. This was extremely powerful because it allowed developers to use LLMs (which were created to merely generate text) to begin automating business processes. Since then, the Semantic Kernel team has evolved its experimental planners over [&hellip;]<\/p>\n","protected":false},"author":121401,"featured_media":2364,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[47,1],"tags":[48,63,9],"class_list":["post-3032","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-announcement","category-semantic-kernel","tag-ai","tag-microsoft-semantic-kernel","tag-semantic-kernel"],"acf":[],"blog_post_summary":"<p>Since the very early days of Semantic Kernel, we have shipped experimental \u201cplanners\u201d that use prompts to generate multi-step plans. 
This was extremely powerful because it allowed developers to use LLMs (which were created to merely generate text) to begin automating business processes. Since then, the Semantic Kernel team has evolved its experimental planners over [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/3032","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/users\/121401"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/comments?post=3032"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/3032\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media\/2364"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media?parent=3032"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/categories?post=3032"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/tags?post=3032"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}