{"id":1563,"date":"2023-11-14T20:57:19","date_gmt":"2023-11-15T04:57:19","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/semantic-kernel\/?p=1563"},"modified":"2023-11-15T16:31:04","modified_gmt":"2023-11-16T00:31:04","slug":"openai-assistants-the-power-of-templated-assistant-instructions","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/agent-framework\/openai-assistants-the-power-of-templated-assistant-instructions\/","title":{"rendered":"OpenAI Assistants: The power of templated assistant instructions"},"content":{"rendered":"<p>Another day, another update from the Semantic Kernel team on OpenAI Assistants! For this article, we wanted to dive into assistant instructions, the key element in the Assistants API that allows you to give assistants their own <span style=\"text-decoration: underline;\">persona<\/span>.<\/p>\n<p>With the existing OpenAI API, instructions are typically static. You define them once for an assistant, and then they\u2019re reused whenever the assistant answers a thread.
You can, however, override the instructions every time you create a new run of a thread.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2023\/11\/Screenshot-2023-11-14-at-8.17.11\u202fPM.png\"><img decoding=\"async\" class=\"aligncenter size-large wp-image-1568\" src=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2023\/11\/Screenshot-2023-11-14-at-8.17.11\u202fPM-1024x537.png\" alt=\"openai run with instruction overrides\" width=\"640\" height=\"336\" srcset=\"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/Screenshot-2023-11-14-at-8.17.11\u202fPM-1024x537.png 1024w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/Screenshot-2023-11-14-at-8.17.11\u202fPM-300x157.png 300w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/Screenshot-2023-11-14-at-8.17.11\u202fPM-768x403.png 768w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/Screenshot-2023-11-14-at-8.17.11\u202fPM-1536x805.png 1536w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/Screenshot-2023-11-14-at-8.17.11\u202fPM-2048x1074.png 2048w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/a>\n<span style=\"font-size: 10pt;\"><em>A screenshot of the OpenAI API docs for creating a new run of a thread.<\/em><\/span><\/p>\n<p>We\u2019ve taken advantage of this request parameter to make it possible for you as a developer to <em>templatize<\/em> your instructions. This lets you predictably inject additional information into your assistant whenever it runs a thread.
For example, you may want to develop a Product Expert assistant that can answer questions about a piece of inventory using injected product data.<\/p>\n<h2>Creating a product expert assistant: before and after<\/h2>\n<p>With the existing Assistants API, you could 1) create an assistant per product, 2) upload the relevant files, and 3) use the retrieval tool to get product information. This would be very time-consuming, so you could instead define a function that lets the expert retrieve details about a product via function calling. In both cases, however, you need to spend tokens (and time) so the LLM can retrieve data <em>first<\/em>\u00a0before generating messages.<\/p>\n<p><strong>We call this Dynamic RAG.<\/strong><\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-function-calling.png\"><img decoding=\"async\" class=\"aligncenter size-large wp-image-1565\" src=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-function-calling-1024x269.png\" alt=\"Diagram of RAG with function calling\" width=\"640\" height=\"168\" srcset=\"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-function-calling-1024x269.png 1024w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-function-calling-300x79.png 300w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-function-calling-768x202.png 768w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-function-calling-1536x404.png 1536w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-function-calling-2048x538.png 2048w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>Dynamic RAG has its time and place, especially when you have a
<em>lot<\/em> of data and it\u2019s unclear what should be included in the context window.<\/p>\n<p>If, however, you already know what data you need, you could templatize this request and require it for <em>every<\/em> use of the assistant. This lets you save tokens, speed up responses, <em>and<\/em> make your assistants more deterministic.<\/p>\n<p>For example, if you know the ID of a product before chatting with an assistant, you could use that information in a template to automatically retrieve all product information, so you don&#8217;t need to rely on the model to retrieve it for you. Below is an example of what the Product Expert template could look like.<\/p>\n<pre class=\"prettyprint language-html\"><code class=\"language-html\">You are a helpful assistant that provides answers about {{getProductName productId}}.\r\nIf you receive any questions that are about a different product, politely decline to answer them.\r\n\r\nThese are the full details of the product.\r\n{{json (getProductDetails productId)}}<\/code><\/pre>\n<p><strong>We call this templated RAG, and it\u2019s where Semantic Kernel can <u>uniquely<\/u> help.<\/strong><\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-templates.png\"><img decoding=\"async\" class=\"aligncenter size-large wp-image-1566\" src=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-templates-1024x358.png\" alt=\"Diagram of RAG with templates\" width=\"640\" height=\"224\" srcset=\"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-templates-1024x358.png 1024w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-templates-300x105.png 300w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-templates-768x268.png 768w,
https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-templates-1536x536.png 1536w, https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2023\/11\/rag-with-templates-2048x715.png 2048w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>Notice how with a template, we can <u>instantly<\/u> call the API without waiting for OpenAI to tell us to do so. The assistant can then use this information to immediately create messages (or have better context to call the next required functions).<\/p>\n<p>Semantic Kernel makes this possible because it has template support out of the box. Today, we offer <a href=\"https:\/\/devblogs.microsoft.com\/semantic-kernel\/introducing-v1-0-0-beta6-for-the-net-semantic-kernel-sdk\/#initial-handlebars-support-with-extensibility-for-more\">Handlebars<\/a>, and if you want to add your own (e.g., Liquid or Jinja2), you can author your own prompt template factory. Not only can you templatize API calls, but you can also add requests to other models (including local and non-OpenAI models), requests to vector databases, and more!<\/p>\n<h2>Watch us use templated prompts with assistants<\/h2>\n<p>To see how templates and assistants intersect, we recommend watching the latest demo video we created.
In the video, we show the initial value of assistant instructions before demonstrating how you&#8217;ll soon be able to templatize them with Semantic Kernel.<\/p>\n<p><video style=\"max-width: 100%; width: 600px; aspect-ratio: auto 600\/300; height: auto;\" src=\"https:\/\/learn.microsoft.com\/video\/media\/a6403ffb-9d8f-4619-8641-98effa829bcb\/Assistants%20and%20instructions_1700_1920x1080_AACAudio_1043.mp4\" crossorigin=\"anonymous\" poster=\"https:\/\/videoencodingpublicwus.blob.core.windows.net\/docs-video-encoding\/30d52486-451a-4f8c-a252-6df7319b03d8\/draft\/Thumbnail\/Screenshot%202023-11-14%20at%208.28.46%E2%80%AFPM.png?skoid=1603f197-ec42-4fdf-bfea-0574bc283e7a&amp;sktid=975f013f-7f24-47e8-a7d3-abc4752bf346&amp;skt=2023-11-15T02%3A06%3A03Z&amp;ske=2023-11-21T02%3A11%3A03Z&amp;sks=b&amp;skv=2021-08-06&amp;sv=2021-08-06&amp;st=2023-11-15T04%3A30%3A49Z&amp;se=2023-11-15T08%3A30%3A49Z&amp;sr=b&amp;sp=r&amp;sig=4x5W1Imv4VZfgVcO4GLOvi0ba5K2bfy1DWNcGhupX%2BQ%3D\" controls=\"controls\" width=\"300\" height=\"150\"><span data-mce-type=\"bookmark\" style=\"display: inline-block; width: 0px; overflow: hidden; line-height: 0;\" class=\"mce_SELRES_start\">\ufeff<\/span><track kind=\"captions\" src=\"https:\/\/videoencodingpublicwus.blob.core.windows.net\/docs-video-encoding\/30d52486-451a-4f8c-a252-6df7319b03d8\/draft\/Caption\/Assistants%20and%20instructions_en-us.vtt?skoid=1603f197-ec42-4fdf-bfea-0574bc283e7a&amp;sktid=975f013f-7f24-47e8-a7d3-abc4752bf346&amp;skt=2023-11-15T02%3A06%3A03Z&amp;ske=2023-11-21T02%3A11%3A03Z&amp;sks=b&amp;skv=2021-08-06&amp;sv=2021-08-06&amp;st=2023-11-15T04%3A30%3A49Z&amp;se=2023-11-15T08%3A30%3A49Z&amp;sr=b&amp;sp=r&amp;sig=rZeVP%2BjM5%2Fh%2F55y96EIzwzBgDzr67Q406gEjjYsdkJE%3D\" label=\"English (United States)\" srclang=\"en-us\" \/><!--fast-x2ggwg:2-->\nYour browser does not support the video tag.<\/video><\/p>\n<h2>Take a look at our prototype<\/h2>\n<p>If you\u2019re excited to see the sample code, you can look at our <em>very<\/em> 
early prototype in our <a href=\"https:\/\/github.com\/matthewbolanos\/sk-v1-proposal\">v1 proposal repo<\/a>. It still has some polishing to go through, so don\u2019t be surprised if it\u2019s a bit buggy at first. The team is now refining the implementation so we can bring it into the main branch of Semantic Kernel.<\/p>\n<p>Once it\u2019s in, I\u2019ll share another blog post that walks through how to use the SDK in more detail.<\/p>\n<h2>Have feedback?<\/h2>\n<p>We\u2019ve already gotten great feedback on our proposal to integrate assistants into our SDK. If you have additional thoughts, <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/discussions\/3393\">consider sharing them with us there<\/a>!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Another day, another update from the Semantic Kernel team on OpenAI Assistants! For this article, we wanted to dive into assistant instructions, the key element in the Assistant API that allows you to give assistants their own persona. With the existing OpenAI API, instructions are typically static. You define them once for an assistant, and [&hellip;]<\/p>\n","protected":false},"author":121401,"featured_media":1577,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1563","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-semantic-kernel"],"acf":[],"blog_post_summary":"<p>Another day, another update from the Semantic Kernel team on OpenAI Assistants! For this article, we wanted to dive into assistant instructions, the key element in the Assistant API that allows you to give assistants their own persona. With the existing OpenAI API, instructions are typically static. 
You define them once for an assistant, and [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/1563","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/users\/121401"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/comments?post=1563"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/1563\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media\/1577"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media?parent=1563"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/categories?post=1563"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/tags?post=1563"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}