How to use Hugging Face Models with Semantic Kernel

Nilesh Acharya


We are thrilled to announce the integration of Semantic Kernel with Hugging Face models!

With this integration, you can leverage the power of Semantic Kernel combined with the accessibility of more than 190,000 models from Hugging Face. It puts this vast catalog of models at your fingertips, together with the latest advancements in Semantic Kernel's orchestration, skills, planner and contextual memory support.

What is Hugging Face?

Hugging Face is a leading provider of open-source models. The models are pre-trained on large datasets and can be used to quickly perform a variety of tasks, such as sentiment analysis, text classification, and text summarization. Using Hugging Face model services can be highly efficient: models are pre-trained, easy to swap out, and cost-effective, with many free models available.

How to use Semantic Kernel with Hugging Face?

This video gives you a walk-through of how to get started, or you can dive right into the Python sample here. For the remainder of this blog post we will use the Hugging Face sample with skills as a reference.

In the first two cells we install the relevant packages with a pip install and import the Semantic Kernel dependencies.

!python -m pip install -r requirements.txt

import semantic_kernel as sk
import semantic_kernel.connectors.ai.hugging_face as sk_hf

Next, we create a kernel instance and configure the Hugging Face services we want to use. In this example we will use gpt2 for text completion and sentence-transformers/all-MiniLM-L6-v2 for text embeddings.

kernel = sk.Kernel()

# Configure LLM service
kernel.config.add_text_completion_service(
    "gpt2", sk_hf.HuggingFaceTextCompletion("gpt2", task="text-generation")
)
kernel.config.add_text_embedding_generation_service(
    "sentence-transformers/all-MiniLM-L6-v2",
    sk_hf.HuggingFaceTextEmbedding("sentence-transformers/all-MiniLM-L6-v2"),
)
kernel.register_memory_store(memory_store=sk.memory.VolatileMemoryStore())
kernel.import_skill(sk.core_skills.TextMemorySkill())

We have chosen to use volatile memory, which stores memories in local RAM. We also register the TextMemorySkill, which this example uses to recall facts from memory.

Now that the kernel is set up, in the next cell we define the fact memories we want the model to reference as it generates responses. In this example the facts are about animals; feel free to edit them and get creative as you test this out for yourself. Lastly, we create a prompt template that tells the model how to respond to our query. That is it! Now we are all set to send our query.
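That cell looks roughly like the following sketch. It continues from the cells above, so `kernel` already has the gpt2 completion service, the embedding service, volatile memory, and the TextMemorySkill registered; the fact texts, ids, prompt wording, and parameter values here are illustrative and may differ from the sample:

```python
# Illustrative prompt template: {{recall $query}} invokes the
# TextMemorySkill to pull the most relevant fact out of memory.
SK_PROMPT = """
I know these animal facts:
- {{recall $query1}}
- {{recall $query2}}
- {{recall $query3}}

Now, tell me something interesting about one of these animals.
"""

async def populate_memory(kernel):
    # Store each fact in the "animal-facts" collection; embeddings are
    # computed with the sentence-transformers model configured earlier.
    facts = {
        "fact1": "Sharks are fish that swim in the ocean.",
        "fact2": "Eagles are birds that fly high in the sky.",
        "fact3": "Penguins are birds that cannot fly but swim very well.",
    }
    for fact_id, text in facts.items():
        await kernel.memory.save_information_async(
            "animal-facts", id=fact_id, text=text
        )

def build_function(kernel):
    # Turn the prompt template into a callable semantic function.
    return kernel.create_semantic_function(
        SK_PROMPT, max_tokens=45, temperature=0.5, top_p=0.5
    )
```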

The last cell in the notebook defines the queries, sets the relevance threshold, and returns our output.

context = kernel.create_new_context()
context[sk.core_skills.TextMemorySkill.COLLECTION_PARAM] = "animal-facts"
context[sk.core_skills.TextMemorySkill.RELEVANCE_PARAM] = 0.3

context["query1"] = "animal that swims"
context["query2"] = "animal that flies"
context["query3"] = "penguins are?"
# my_function is the semantic function created from the prompt template above
output = await kernel.run_async(my_function, input_vars=context.variables)

output = str(output).strip()

query_result1 = await kernel.memory.search_async(
    "animal-facts", context["query1"], limit=1, min_relevance_score=0.3
)
query_result2 = await kernel.memory.search_async(
    "animal-facts", context["query2"], limit=1, min_relevance_score=0.3
)
query_result3 = await kernel.memory.search_async(
    "animal-facts", context["query3"], limit=1, min_relevance_score=0.3
)

print(f"gpt2 completed prompt with: '{output}'")
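Under the hood, `min_relevance_score` acts as a cosine-similarity cutoff between the query's embedding and each stored fact's embedding: only matches scoring above the threshold are returned. Here is a minimal pure-Python sketch of that filtering, with made-up 3-dimensional vectors standing in for the real sentence-transformers embeddings:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: dot product over norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(store, query_vec, limit=1, min_relevance_score=0.3):
    # Score every stored item, drop those below the threshold,
    # and return the top `limit` matches, best first.
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for text, vec in store.items()
    ]
    scored = [s for s in scored if s[0] >= min_relevance_score]
    return sorted(scored, reverse=True)[:limit]

# Toy 3-d vectors standing in for real embeddings.
store = {
    "Sharks are fish.": [0.9, 0.1, 0.0],
    "Eagles are birds.": [0.1, 0.9, 0.0],
    "Penguins are birds.": [0.0, 0.5, 0.8],
}
print(search(store, [1.0, 0.0, 0.1]))  # best match: "Sharks are fish."
```

The real VolatileMemoryStore does the same kind of scoring over the embeddings produced by the configured sentence-transformers model.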

Feel free to play with the token size to vary response length, and with other parameters to see how the responses change.
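For example, the completion settings passed when creating the semantic function control response length and sampling. The parameter names below come from Semantic Kernel's Python `create_semantic_function`; the values are illustrative:

```python
def build_longer_function(kernel):
    # Larger max_tokens -> longer completions; higher temperature and
    # top_p -> more varied, less deterministic text.
    return kernel.create_semantic_function(
        "Tell me about {{$input}}",
        max_tokens=100,   # allow a longer response
        temperature=0.8,  # more randomness in sampling
        top_p=0.9,        # nucleus-sampling cutoff
    )
```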

Happy testing!

Next Steps:

Explore the sample in GitHub

Learn more about Semantic Kernel

Join the community and let us know what you think: https://aka.ms/sk/discord
