{"id":3263,"date":"2024-09-04T11:32:18","date_gmt":"2024-09-04T18:32:18","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/semantic-kernel\/?p=3263"},"modified":"2025-02-10T10:19:03","modified_gmt":"2025-02-10T18:19:03","slug":"guest-blog-bring-your-ai-copilots-to-the-edge-with-phi-3-and-semantic-kernel","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/agent-framework\/guest-blog-bring-your-ai-copilots-to-the-edge-with-phi-3-and-semantic-kernel\/","title":{"rendered":"Guest Blog: Bring your AI Copilots to the edge with Phi-3 and Semantic Kernel"},"content":{"rendered":"<p><span style=\"font-size: 18pt;\"><strong>Bring your AI Copilots to the edge with Phi-3 and Semantic Kernel<\/strong><\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/arafattehsin.com\/wp-content\/uploads\/2024\/08\/poster-localrag.png\" alt=\"Local RAG with Semantic Kernel\" \/><\/p>\n<p>Today we&#8217;re featuring a guest author, Arafat Tehsin, who&#8217;s a Microsoft Most Valuable Professional (MVP) for AI. He&#8217;s written an article we&#8217;re sharing below, focused on how to <a href=\"https:\/\/arafattehsin.com\/ai-copilot-offline-phi3-semantic-kernel\/\">Bring your AI Copilots to the edge with Phi-3 and Semantic Kernel<\/a>. We&#8217;ll turn it over to Arafat to share more!<\/p>\n<p>It\u2019s true that not every challenge can be distilled into a straightforward solution. However, as someone who has always believed in the power of simplicity, I think a deeper understanding of the problem often paves the way for more elegant and effective solutions. In an era where data sovereignty and privacy are paramount, every client is concerned about their private information being processed in the cloud. 
I decided to explore the avenues of 100% local Retrieval-Augmented Generation (RAG) with Microsoft\u2019s Phi-3 and my favorite AI-first framework, Semantic Kernel.<\/p>\n<p>Last Sunday, when I boarded a flight with my colleagues for our EY Microsoft APAC Summit, I thought I\u2019d finish some of my pending tasks on the 8.5-hour flight. Unfortunately, this did not happen, as our flight came with no in-flight WiFi. This gave me the idea of launching Visual Studio, LM Studio and the rest of the dev tools that work without the internet. Then, I decided to build something that had been on my list for a long time. Something that may give you a new direction or an idea to help your customers meet their productivity needs through an\u00a0<strong><em>all-local \/ on-prem RAG-powered AI agent.<\/em><\/strong><\/p>\n<p>In my blog posts, I try to address the pain points that are not commonly covered by others (<em>otherwise, I feel I am just adding redundancy<\/em>). So far, I have seen a lot of folks working with local models ranging from Phi-2 to Phi-3 to Llama 3 and so on. However, those solutions either do not cover the\u00a0<a href=\"https:\/\/survey.stackoverflow.co\/2024\/technology#most-popular-technologies-misc-tech-prof\">most robust enterprise<\/a>\u00a0framework, .NET, or they are not descriptive enough for newbie AI developers. Therefore, I decided to take on this challenge and address it in a few upcoming posts.<\/p>\n<h2>RAG with on-device Phi-3 model<\/h2>\n<p>In this post, we\u2019re going to build a basic Console App to showcase how you can build a local Copilot solution using Phi-3 and Semantic Kernel without relying on any online service such as Azure OpenAI or OpenAI. We will then see how you can add RAG capabilities to it using just a temporary memory and textual content. 
In our next post, we will cover a complete RAG solution that handles documents such as Word, PDF, Markdown and JSON, backed by a persistent memory store. Below is a preview of what you will achieve in this post. Keep reading to see how to get there smoothly. \ud83c\udfce\ufe0f<\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-32016 size-full\" src=\"https:\/\/arafattehsin.com\/wp-content\/uploads\/2024\/08\/LocalRAG.gif\" alt=\"Local RAG with Semantic Kernel\" width=\"1114\" height=\"626\" \/><\/p>\n<h2>Background<\/h2>\n<p>Before we move to the prerequisites, let me share how I got here and my motivation going forward. If you\u2019ve been keeping up with the latest AI-first frameworks like LangChain and Semantic Kernel, and Small Language Models (SLMs) like Phi-3, MobileBERT, or T5-Small, you\u2019ll know that developers and businesses are creating fascinating use-cases and techniques in the Generative AI spectrum. One of the most common approaches is to use\u00a0<a href=\"https:\/\/ollama.com\/\">Ollama<\/a>\u00a0or\u00a0<a href=\"https:\/\/lmstudio.ai\/\">LMStudio<\/a>, host your model on\u00a0<code>localhost<\/code>\u00a0and communicate with it locally from your apps and libraries. However, as per my\u00a0<a href=\"https:\/\/arafattehsin.com\/beyond-sentiment-analysis-object-detection-with-ml-net\/\">previous work<\/a>\u00a0with ML.NET, I\u2019ve always loved file-based access to machine learning models. 
A few weeks ago, when I saw that Phi-3 was available in\u00a0<a href=\"https:\/\/huggingface.co\/microsoft\/Phi-3-mini-4k-instruct-onnx\">ONNX format<\/a>, I thought I should build a simple RAG solution with it to avoid any\u00a0<code>localhost<\/code>\u00a0connectivity.<\/p>\n<p>After hours of research, I figured out that the only person who has done this work with .NET is\u00a0<a href=\"https:\/\/x.com\/elbruno\">Bruno Capuano<\/a>\u00a0<em>(massive thanks to you, Bruno) and<\/em>\u00a0he has written a great\u00a0<a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/using-phi3-csharp-with-onnx-for-text-and-vision-samples-md\/?WT.mc_id=AI-MVP-5003464\">blog post<\/a>\u00a0about it, which also includes\u00a0<a href=\"https:\/\/github.com\/microsoft\/Phi-3CookBook\">Phi-3 Cookbook<\/a>\u00a0samples for your learning. I noticed that all the samples were using the\u00a0<a href=\"https:\/\/github.com\/feiyun0112\/SemanticKernel.Connectors.OnnxRuntimeGenAI\">ONNX Gen AI<\/a>\u00a0library, which was created by Microsoft MVP\u00a0<a href=\"https:\/\/github.com\/feiyun0112\"><code>feiyun<\/code><\/a>\u00a0<em>(unfortunately, I don\u2019t know the real name) and\u00a0<\/em>had since been archived. This made me nervous, as I did not want to suggest something that is not maintained.<\/p>\n<p>After digging a little, I found that\u00a0<code>feiyun<\/code>\u00a0had\u00a0<a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/pull\/6518\">submitted a PR<\/a>\u00a0to the Semantic Kernel repo for it to become part of the official package. 
This made me super excited, and as soon as the new version was published (last week), I decided to create my demo using the same package.<\/p>\n<h2>Pre-requisites<\/h2>\n<p>As discussed above, to create this very simple .NET app, we need a few essential packages as well as a Phi-3 model.<\/p>\n<h3>Download Phi-3<\/h3>\n<p>We will be using the smallest Phi-3 variant,\u00a0<a class=\"break-words font-mono font-semibold hover:text-blue-600 \" href=\"https:\/\/huggingface.co\/microsoft\/Phi-3-mini-4k-instruct-onnx\">phi-3-mini-4k-instruct-onnx<\/a>, which can\u00a0<del>easily<\/del>\u00a0be downloaded from Hugging Face using either their\u00a0<code>huggingface-cli<\/code>\u00a0or\u00a0<code>git<\/code>. You can save this model to a folder of your choice; I downloaded it to my\u00a0<code>D:\\models<\/code>\u00a0folder.<\/p>\n<p>Note: Just be very patient when downloading using git, as it might take\u00a0<em>forever.\u00a0<\/em>For me, it took more than an hour on 75 Mbps internet.\u00a0<em>It\u2019s that bad!\n<\/em><\/p>\n<h2>Semantic Kernel<\/h2>\n<p>My\u00a0<a href=\"https:\/\/arafattehsin.com\/custom-copilot-semantic-kernel-azure-openai-service\/\">previous post<\/a>\u00a0talks in detail about Copilot and the step-by-step process of building one with Semantic Kernel. In this post, we\u2019ll go straight to the point. Let\u2019s create a simple .NET 8 Console App and add the latest\u00a0<code>Microsoft.SemanticKernel<\/code>\u00a0package to it with a version\u00a0<code>1.16.2<\/code>. 
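<\/p>\n<p>For reference, this is roughly how you could create the project and add the two NuGet packages used in this post from the terminal (<em>the project name here is just an example; note that the explicit pre-release version is required for the second package<\/em>):<\/p>\n<pre>dotnet new console -n local-rag-sk\r\ncd local-rag-sk\r\ndotnet add package Microsoft.SemanticKernel --version 1.16.2\r\ndotnet add package Microsoft.SemanticKernel.Connectors.Onnx --version 1.16.2-alpha<\/pre>\n<p>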
In addition to this package, we will also add\u00a0<code>Microsoft.SemanticKernel.Connectors.Onnx<\/code>\u00a0with version\u00a0<code>1.16.2-alpha<\/code>, which will allow us to use the ONNX model you downloaded a few minutes ago.<\/p>\n<p>Just replace your\u00a0<code>Program.cs<\/code>\u00a0with this code and change the path to your model to make it work.<\/p>\n<pre>#pragma warning disable SKEXP0070\r\n#pragma warning disable SKEXP0050\r\n#pragma warning disable SKEXP0001\r\n#pragma warning disable SKEXP0010\r\n\r\n\/\/ Create a chat completion service\r\nusing Microsoft.SemanticKernel;\r\nusing Microsoft.SemanticKernel.ChatCompletion;\r\nusing Microsoft.SemanticKernel.Connectors.OpenAI;\r\nusing Microsoft.SemanticKernel.Embeddings;\r\nusing Microsoft.SemanticKernel.Memory;\r\nusing Microsoft.SemanticKernel.Plugins.Memory;\r\n\r\n\/\/ Your PHI-3 model location \r\nvar modelPath = @\"D:\\models\\Phi-3-mini-4k-instruct-onnx\\cpu_and_mobile\\cpu-int4-rtn-block-32\";\r\n\r\n\/\/ Load the model and services\r\nvar builder = Kernel.CreateBuilder();\r\nbuilder.AddOnnxRuntimeGenAIChatCompletion(\"phi-3\", modelPath);\r\n\r\n\/\/ Build Kernel\r\nvar kernel = builder.Build();\r\n\r\n\/\/ Create services such as chatCompletionService and embeddingGeneration\r\nvar chatCompletionService = kernel.GetRequiredService&lt;IChatCompletionService&gt;();\r\n\r\nConsole.ForegroundColor = ConsoleColor.Cyan;\r\nConsole.WriteLine(\"\"\"\r\n  _                     _   _____            _____ \r\n | |                   | | |  __ \\     \/\\   \/ ____|\r\n | |     ___   ___ __ _| | | |__) |   \/  \\ | |  __ \r\n | |    \/ _ \\ \/ __\/ _` | | |  _  \/   \/ \/\\ \\| | |_ |\r\n | |___| (_) | (_| (_| | | | | \\ \\  \/ ____ \\ |__| |\r\n |______\\___\/ \\___\\__,_|_| |_|  \\_\\\/_\/    \\_\\_____|         \r\n                                   by Arafat Tehsin              \r\n\"\"\");\r\n\r\n\r\n\/\/ Start the conversation\r\nwhile (true)\r\n{\r\n    \/\/ Get user input\r\n    
Console.ForegroundColor = ConsoleColor.White;\r\n    Console.Write(\"User &gt; \");\r\n    var question = Console.ReadLine()!;\r\n\r\n    \/\/ Allow kernel functions to be invoked as tools\r\n    OpenAIPromptExecutionSettings openAIPromptExecutionSettings = new()\r\n    {\r\n        ToolCallBehavior = ToolCallBehavior.EnableKernelFunctions,\r\n        MaxTokens = 200\r\n    };\r\n\r\n    var response = kernel.InvokePromptStreamingAsync(\r\n        promptTemplate: @\"{{$input}}\",\r\n        arguments: new KernelArguments(openAIPromptExecutionSettings)         \r\n        {\r\n            { \"input\", question }\r\n        });\r\n\r\n    Console.ForegroundColor = ConsoleColor.Green;\r\n    Console.Write(\"\\nAssistant &gt; \");\r\n\r\n    string combinedResponse = string.Empty;\r\n    await foreach (var message in response)\r\n    {\r\n        \/\/ Write the streamed response to the console\r\n        Console.Write(message);\r\n        combinedResponse += message;\r\n    }\r\n\r\n    Console.WriteLine();\r\n}<\/pre>\n<p>Now, if you run this code, you will see this:<\/p>\n<p><img decoding=\"async\" class=\"aligncenter size-large wp-image-32008\" src=\"https:\/\/arafattehsin.com\/wp-content\/uploads\/2024\/08\/local-rag-sk-1-1024x337.png\" sizes=\"(max-width: 1020px) 100vw, 1020px\" srcset=\"https:\/\/arafattehsin.com\/wp-content\/uploads\/2024\/08\/local-rag-sk-1-1024x337.png 1024w, https:\/\/arafattehsin.com\/wp-content\/uploads\/2024\/08\/local-rag-sk-1-300x99.png 300w, https:\/\/arafattehsin.com\/wp-content\/uploads\/2024\/08\/local-rag-sk-1-768x253.png 768w, https:\/\/arafattehsin.com\/wp-content\/uploads\/2024\/08\/local-rag-sk-1.png 1107w\" alt=\"Local RAG with Semantic Kernel\" width=\"1020\" height=\"336\" \/><\/p>\n<p>Although we\u2019re not there yet with RAG, you can still chat with the Phi-3 model in a similar way to any other Large Language Model (LLM).<\/p>\n<h3>Embeddings<\/h3>\n<p>Now, as I mentioned at the start, the aim of this post is to go all local 
(not even localhost). This brings another challenge: how can we add embedding capability without using Azure OpenAI, OpenAI, Llama or similar models?<\/p>\n<h4><strong>Option 1<\/strong><\/h4>\n<p>Well, we won\u2019t go with any of those. Rather, there\u2019s another cool addition to our ecosystem called .NET Smart Components, which has brought us the capability of\u00a0<a href=\"https:\/\/github.com\/dotnet-smartcomponents\/smartcomponents\/blob\/main\/docs\/local-embeddings.md\">Local Embeddings<\/a>. Whilst there is another option for local embeddings, using\u00a0<code>BertOnnxTextEmbeddingGeneration<\/code>\u00a0as part of the ONNX package, I haven\u2019t seen any working examples of that yet.<\/p>\n<p>Now, in order to achieve this, we first need the\u00a0<code>LocalEmbeddings<\/code>\u00a0package.\u00a0<em>I will come to the code part later, with everything together, to make your life easier.\u00a0<\/em>However, this is how the project structure will look.<\/p>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-32011\" src=\"https:\/\/arafattehsin.com\/wp-content\/uploads\/2024\/08\/local-rag-sk-2.png\" sizes=\"(max-width: 605px) 100vw, 605px\" srcset=\"https:\/\/arafattehsin.com\/wp-content\/uploads\/2024\/08\/local-rag-sk-2.png 605w, https:\/\/arafattehsin.com\/wp-content\/uploads\/2024\/08\/local-rag-sk-2-300x118.png 300w\" alt=\"Project structure of SK Local RAG\" width=\"605\" height=\"237\" \/><\/p>\n<p><strong><em>UPDATE: 17 August, 2024<\/em><\/strong><\/p>\n<p>When I published my post, I got a few messages saying that a lot of folks were not able to run this code because they were encountering issues with Smart Components\u2019\u00a0<code>LocalEmbeddings<\/code>\u00a0and<code>\u00a0Microsoft.SemanticKernel.Connectors.Onnx<\/code>\u00a0working together. 
Whilst I raised\u00a0<a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/issues\/8060\">this bug<\/a>\u00a0myself in the Semantic Kernel repo, my good friend and Microsoft AI MVP\u00a0<a href=\"https:\/\/github.com\/joslat\/\">Jose Luis Latorre<\/a>\u00a0and\u00a0<a href=\"https:\/\/github.com\/davidpuplava\">David Puplava<\/a>\u00a0suggested a fantastic solution to overcome this problem.<\/p>\n<h4><strong>Option 2<\/strong><\/h4>\n<p>Let\u2019s download bge-micro-v2 from Hugging Face in a similar way as described above. bge-micro-v2 is a lightweight model suitable for smaller datasets and devices with limited resources. It\u2019s designed to provide fast inference times at the cost of slightly lower accuracy compared to larger models. We\u2019ll be using it for embeddings. Once you have downloaded it, all you have to do is replace\u00a0<code>LocalEmbeddings<\/code>\u00a0with the following code:<\/p>\n<div class=\"enlighter-default enlighter-v-standard enlighter-t-enlighter enlighter-l-csharp enlighter-hover enlighter-linenumbers \">\n<div class=\"enlighter-code\">\n<div class=\"enlighter\">\n<div class=\"\"><\/div>\n<div class=\"\">\n<div>\n<pre class=\"enlighter-clipboard\">\/\/ Your PHI-3 model location\r\nvar phi3modelPath = @\"D:\\models\\Phi-3-mini-4k-instruct-onnx\\cpu_and_mobile\\cpu-int4-rtn-block-32\";\r\nvar bgeModelPath = @\"D:\\models\\bge-micro-v2\\onnx\\model.onnx\";\r\nvar vocabPath = @\"D:\\models\\bge-micro-v2\\vocab.txt\";\r\n\r\n\/\/ Load the model and services\r\nvar builder = Kernel.CreateBuilder();\r\nbuilder.AddOnnxRuntimeGenAIChatCompletion(\"phi-3\", phi3modelPath);\r\nbuilder.AddBertOnnxTextEmbeddingGeneration(bgeModelPath, vocabPath);<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<h3>Memory<\/h3>\n<p>As I mentioned earlier in the post, the focus is on showing how you can bring offline SLMs into your apps. Therefore, for storage, we will 
not be going with anything fancy, just a simple text-based dictionary. In this case, we\u2019ll go with a\u00a0<code>VolatileMemory<\/code>\u00a0store for now, together with\u00a0<code>SemanticTextMemory<\/code>. We will also make use of\u00a0<code>TextMemoryPlugin<\/code>\u00a0for its out-of-the-box\u00a0<code>Recall<\/code>\u00a0function, so we can reuse it to find answers from the memory.<\/p>\n<p>Let\u2019s also create a folder called\u00a0<code>Helpers<\/code>\u00a0inside our project and, within that, create a class called\u00a0<code>MemoryHelper.cs<\/code>\u00a0(refer to the picture in the earlier section).\u00a0This will contain the sample data I created for our organisation collection, which I named\u00a0<code>TheLevelOrg<\/code>\u00a0(<em>inspired by\u00a0<a href=\"https:\/\/blog.brakmic.com\/intro-to-semantic-kernel-part-four\/\">Harris Brakmic<\/a><\/em>). You can simply replace everything in your\u00a0<code>MemoryHelper.cs<\/code>\u00a0with the below:<\/p>\n<div id=\"gist131843531\" class=\"gist\">\n<div class=\"gist-file\" translate=\"no\" data-color-mode=\"light\" data-light-theme=\"light\">\n<div class=\"gist-meta\">\n<pre>using Microsoft.SemanticKernel.Memory;\r\n\r\n#pragma warning disable SKEXP0001\r\n\r\nnamespace local_rag_sk.Helpers\r\n{\r\n\r\n    internal static class MemoryHelper\r\n    {\r\n        \/\/ Blocks until every fact is stored, so the memory is fully\r\n        \/\/ populated before the chat loop asks its first question\r\n        public static void PopulateInterestingFacts(SemanticTextMemory memory, string collectionName)\r\n        {\r\n            var facts = OrgFact.GetFacts();\r\n            foreach (OrgFact fact in facts)\r\n            {\r\n                memory.SaveInformationAsync(collection: collectionName, \r\n                    id: fact.Id, \r\n                    text: fact.Text).GetAwaiter().GetResult();\r\n            }\r\n        }\r\n    }\r\n\r\n    public class OrgFact\r\n    {\r\n        public string Text { get; }\r\n        public string Id { get; } = Guid.NewGuid().ToString();\r\n        public string Description { get; }\r\n        public string AdditionalMetadata { 
get; }\r\n\r\n        public OrgFact(string text, string description, string additionalMetadata)\r\n        {\r\n            Text = text;\r\n            Description = description;\r\n            AdditionalMetadata = additionalMetadata;\r\n        }\r\n\r\n        public static IEnumerable&lt;OrgFact&gt; GetFacts()\r\n        {\r\n            var facts = new OrgFact[]\r\n                {\r\n                    new(\"Our headquarters is located in Sydney, Australia.\", \"Headquarters\", \"City: Sydney\"),\r\n                    new(\"We have been in business for 25 years.\", \"Years in Operation\", \"Years: 25\"),\r\n                    new(\"Our corporate sponsor is the Melbourne Football Club.\", \"Corporate Sponsorship\", \"Team: Melbourne Football Club\"),\r\n                    new(\"We have 2 major departments.\", \"Departments\", \"Number: 2\"),\r\n                    new(\"Our team includes developers among other professionals.\", \"Occupation\", \"Job Title: Developer\"),\r\n                    new(\"Our team enjoys outdoor activities such as bushwalking.\", \"Team Activities\", \"Activity: Bushwalking\"),\r\n                    new(\"We have a company pet policy that allows dogs.\", \"Company Pet Policy\", \"Type: Dog\"),\r\n                    new(\"We prefer catering options featuring Australian cuisine.\", \"Catering Preferences\", \"Cuisine: Australian\"),\r\n                    new(\"We have expanded our operations to 5 countries.\", \"International Presence\", \"Countries: 5\"),\r\n                    new(\"Our staff includes graduates from the University of Sydney.\", \"Education\", \"University: Sydney\"),\r\n                    new(\"Our team is multilingual, speaking 3 languages.\", \"Languages Spoken\", \"Number: 3\"),\r\n                    new(\"We have a strict allergen policy, including precautions for peanuts.\", \"Allergen Policy\", \"Allergen: Peanuts\"),\r\n                    new(\"We support athletic achievements, such as participating 
in marathons.\", \"Athletic Support\", \"Event: Marathon\"),\r\n                    new(\"We have a company-wide collection of Australian art.\", \"Company Initiatives\", \"Item: Australian Art\"),\r\n                    new(\"Our team enjoys the Australian spring season for company events.\", \"Seasonal Preferences\", \"Season: Spring\"),\r\n                    new(\"Our corporate book club's favorite book is 'The Book Thief'.\", \"Corporate Book Club\", \"Book: The Book Thief\"),\r\n                    new(\"We offer vegetarian, vegan, gluten free and halal options in our corporate diet policy.\", \"Dietary Policies\", \"Diet: Vegetarian\"),\r\n                    new(\"We actively support volunteering in local community projects.\", \"Community Engagement\", \"Place: Local Community Projects\"),\r\n                    new(\"We aim to expand our presence to every continent.\", \"Expansion Goals\", \"Goal: Every Continent\"),\r\n                    new(\"Many of our staff members hold advanced degrees, including in Computer Science.\", \"Advanced Education\", \"Degree: Master's in Computer Science\")\r\n                };\r\n            return facts;\r\n        }\r\n    }\r\n}<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<p>Now you can go and replace your previous\u00a0<code>Program.cs<\/code>\u00a0file so it will have all the memory capabilities. 
I have included the code comments for you to understand it better.<\/p>\n<div id=\"gist131844143\" class=\"gist\">\n<div class=\"gist-file\" translate=\"no\" data-color-mode=\"light\" data-light-theme=\"light\">\n<div class=\"gist-meta\">\n<pre>#pragma warning disable SKEXP0070\r\n#pragma warning disable SKEXP0050\r\n#pragma warning disable SKEXP0001\r\n#pragma warning disable SKEXP0010\r\n\r\n\/\/ Create a chat completion service\r\nusing local_rag_sk.Helpers;\r\nusing Microsoft.SemanticKernel;\r\nusing Microsoft.SemanticKernel.ChatCompletion;\r\nusing Microsoft.SemanticKernel.Connectors.OpenAI;\r\nusing Microsoft.SemanticKernel.Embeddings;\r\nusing Microsoft.SemanticKernel.Memory;\r\nusing Microsoft.SemanticKernel.Plugins.Memory;\r\n\r\n\/\/ Your PHI-3 model location \r\nvar phi3modelPath = @\"D:\\models\\Phi-3-mini-4k-instruct-onnx\\cpu_and_mobile\\cpu-int4-rtn-block-32\";\r\nvar bgeModelPath = @\"D:\\models\\bge-micro-v2\\onnx\\model.onnx\";\r\nvar vocabPath = @\"D:\\models\\bge-micro-v2\\vocab.txt\";\r\n\r\n\/\/ Load the model and services\r\nvar builder = Kernel.CreateBuilder();\r\nbuilder.AddOnnxRuntimeGenAIChatCompletion(\"phi-3\", phi3modelPath);\r\nbuilder.AddBertOnnxTextEmbeddingGeneration(bgeModelPath, vocabPath);\r\n\r\n\/\/ Build Kernel\r\nvar kernel = builder.Build();\r\n\r\n\/\/ Create services such as chatCompletionService and embeddingGeneration\r\nvar chatCompletionService = kernel.GetRequiredService&lt;IChatCompletionService&gt;();\r\nvar embeddingGenerator = kernel.GetRequiredService&lt;ITextEmbeddingGenerationService&gt;();\r\n\r\n\/\/ Setup a memory store and create a memory out of it\r\nvar memoryStore = new VolatileMemoryStore();\r\nvar memory = new SemanticTextMemory(memoryStore, embeddingGenerator);\r\n\r\n\/\/ Loading it for Save, Recall and other methods\r\nkernel.ImportPluginFromObject(new TextMemoryPlugin(memory));\r\n\r\n\/\/ Populate the memory with some interesting facts\r\nstring collectionName = 
\"TheLevelOrg\";\r\nMemoryHelper.PopulateInterestingFacts(memory, collectionName);\r\n\r\nConsole.ForegroundColor = ConsoleColor.Cyan;\r\nConsole.WriteLine(\"\"\"\r\n  _                     _   _____            _____ \r\n | |                   | | |  __ \\     \/\\   \/ ____|\r\n | |     ___   ___ __ _| | | |__) |   \/  \\ | |  __ \r\n | |    \/ _ \\ \/ __\/ _` | | |  _  \/   \/ \/\\ \\| | |_ |\r\n | |___| (_) | (_| (_| | | | | \\ \\  \/ ____ \\ |__| |\r\n |______\\___\/ \\___\\__,_|_| |_|  \\_\\\/_\/    \\_\\_____|         \r\n                                   by Arafat Tehsin              \r\n\"\"\");\r\n\r\n\r\n\/\/ Start the conversation\r\nwhile (true)\r\n{\r\n    \/\/ Get user input\r\n    Console.ForegroundColor = ConsoleColor.White;\r\n    Console.Write(\"User &gt; \");\r\n    var question = Console.ReadLine()!;\r\n\r\n    \/\/ Settings for the Phi-3 execution\r\n    OpenAIPromptExecutionSettings executionSettings = new()\r\n    {\r\n        ToolCallBehavior = ToolCallBehavior.EnableKernelFunctions,\r\n        MaxTokens = 200\r\n    };\r\n\r\n    \/\/ Invoke the kernel with the user input\r\n    var response = kernel.InvokePromptStreamingAsync(\r\n        promptTemplate: @\"Question: {{$input}}\r\n        Answer the question using the memory content: {{Recall}}\",\r\n        arguments: new KernelArguments(executionSettings)         \r\n        {\r\n            { \"input\", question },\r\n            { \"collection\", collectionName }\r\n        }\r\n        );\r\n\r\n    Console.ForegroundColor = ConsoleColor.Green;\r\n    Console.Write(\"\\nAssistant &gt; \");\r\n\r\n    string combinedResponse = string.Empty;\r\n    await foreach (var message in response)\r\n    {\r\n        \/\/Write the response to the console\r\n        Console.Write(message);\r\n        combinedResponse += message;\r\n    }\r\n\r\n    Console.WriteLine();\r\n}<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<p>As we continue to explore this new world of SLMs, we find new ways to solve problems 
and innovate. This post is a step towards a future where you won\u2019t have to rely on internet connectivity at all for high compute and better results.\u00a0Next, we\u2019ll take on the challenge of integrating different types of documents into our RAG solution. Stay tuned as we dive deeper into memory and document processing to enhance our AI capabilities.<\/p>\n<h2>Conclusion<\/h2>\n<div class=\"blog-share text-center\">\n<div class=\"ab cb\">\n<div class=\"ci bh fz ga gb gc\">\n<div class=\"qe qf ab iv\">\n<div class=\"qg ab\">We&#8217;d like to thank Arafat for his time and all of his great work. \u00a0Please reach out if you have any questions or feedback through our <a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\/discussions\/categories\/general\" target=\"_blank\" rel=\"noopener\">Semantic Kernel GitHub Discussion Channel<\/a>. 
We look forward to hearing from you!\u00a0We would also love your support, if you\u2019ve enjoyed using Semantic Kernel, give us a star on\u00a0<a href=\"https:\/\/github.com\/microsoft\/semantic-kernel\" target=\"_blank\" rel=\"noopener\">GitHub<\/a>.<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Bring your AI Copilots to the edge with Phi-3 and Semantic Kernel Today we&#8217;re featuring a guest author, Arafat Tehsin, who&#8217;s a Microsoft Most Valuable Professional (MVP) for AI. He&#8217;s written an article we&#8217;re sharing below, focused on how to Bring your AI Copilots to the edge with Phi-3 and Semantic Kernel. We&#8217;ll turn it [&hellip;]<\/p>\n","protected":false},"author":149071,"featured_media":2302,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[117],"tags":[48,63,9],"class_list":["post-3263","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-guest-blog","tag-ai","tag-microsoft-semantic-kernel","tag-semantic-kernel"],"acf":[],"blog_post_summary":"<p>Bring your AI Copilots to the edge with Phi-3 and Semantic Kernel Today we&#8217;re featuring a guest author, Arafat Tehsin, who&#8217;s a Microsoft Most Valuable Professional (MVP) for AI. He&#8217;s written an article we&#8217;re sharing below, focused on how to Bring your AI Copilots to the edge with Phi-3 and Semantic Kernel. 
We&#8217;ll turn it [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/3263","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/users\/149071"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/comments?post=3263"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/3263\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media\/2302"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media?parent=3263"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/categories?post=3263"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/tags?post=3263"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}