May 4th, 2023

How to use Hugging Face Models with Semantic Kernel

Nilesh Acharya
Principal Product Manager

We are thrilled to announce the integration of Semantic Kernel with Hugging Face models!

With this integration, you can combine the power of Semantic Kernel with access to more than 190,000 models from Hugging Face. You can use this large catalog of models together with the latest advancements in Semantic Kernel’s orchestration, skills, planner, and contextual memory support.

What is Hugging Face?

Hugging Face is a leading provider of open-source models. These models are pre-trained on large datasets and can quickly be applied to a variety of tasks, such as sentiment analysis, text classification, and text summarization. Using Hugging Face models can be efficient because they are pre-trained, easy to swap out, and cost-effective, with many free models available.

How to use Semantic Kernel with Hugging Face?

This video gives you a walk-through of how to get started, or you can dive right into the Python sample here. For the remainder of this blog post, we will use the Hugging Face Sample with Skills as a reference.

In the first two cells, we install the relevant packages with pip and import the Semantic Kernel dependencies.

!python -m pip install -r requirements.txt

import semantic_kernel as sk
import semantic_kernel.connectors.ai.hugging_face as sk_hf

Next, we create a kernel instance and configure the Hugging Face services we want to use. In this example we will use gpt2 for text completion and sentence-transformers/all-MiniLM-L6-v2 for text embeddings.

kernel = sk.Kernel()

# Configure the text completion service (gpt2 through the Hugging Face connector)
kernel.config.add_text_completion_service(
    "gpt2", sk_hf.HuggingFaceTextCompletion("gpt2", task="text-generation")
)
# Configure the text embedding service
kernel.config.add_text_embedding_generation_service(
    "sentence-transformers/all-MiniLM-L6-v2",
    sk_hf.HuggingFaceTextEmbedding("sentence-transformers/all-MiniLM-L6-v2"),
)
# Register a volatile memory store and import the text memory skill
kernel.register_memory_store(memory_store=sk.memory.VolatileMemoryStore())
kernel.import_skill(sk.core_skills.TextMemorySkill())

We have chosen volatile memory, which keeps memories in the machine's working memory rather than persisting them. We also import the text memory skill, which we use in this example.
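To make the idea of a volatile store more concrete, here is a minimal sketch of an in-process memory with a relevance threshold. It is not Semantic Kernel's implementation: the `ToyVolatileStore` class, the word-count `embed` function, and the sample facts are all hypothetical stand-ins, and a real setup would use the embedding model configured above.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    cleaned = text.lower().replace(".", "").replace("?", "")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVolatileStore:
    """Hypothetical in-process store: everything lives in a dict and is lost on exit."""

    def __init__(self):
        self._collections = {}

    def save(self, collection, id, text):
        self._collections.setdefault(collection, {})[id] = (text, embed(text))

    def search(self, collection, query, limit=1, min_relevance_score=0.3):
        query_vec = embed(query)
        records = self._collections.get(collection, {}).values()
        scored = [(cosine(query_vec, vec), text) for text, vec in records]
        # Drop weak matches, then return the best `limit` hits
        hits = [pair for pair in scored if pair[0] >= min_relevance_score]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return hits[:limit]


store = ToyVolatileStore()
store.save("animal-facts", "info1", "Sharks are fish.")      # illustrative fact
store.save("animal-facts", "info2", "Penguins are birds.")   # illustrative fact

results = store.search("animal-facts", "penguins are?", limit=1, min_relevance_score=0.3)
no_match = store.search("animal-facts", "rocket engines", limit=1, min_relevance_score=0.3)
print(results)
print(no_match)
```

The two knobs in `search` mirror the `limit` and `min_relevance_score` arguments used later in this post: the threshold filters out weak matches, and the limit caps how many facts come back.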

Now that the kernel is set up, in the next cell we define the fact memories we want the model to reference as it responds. In this example we have facts about animals. Feel free to edit them and get creative as you test this out for yourself. Lastly, we create a prompt template that describes how to respond to our query. That is it! Now we are all set to send our query.
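As a rough illustration of how a prompt template can pull recalled facts into the text sent to the model, here is a small sketch. It assumes placeholders of the form `{{recall $query1}}`, and the `render` and `recall` helpers and the canned facts are hypothetical. In the notebook, the kernel and the text memory skill do this work.

```python
import re

# Hypothetical canned lookups standing in for a real memory search
CANNED_FACTS = {
    "animal that swims": "Sharks are fish.",
    "animal that flies": "Flies are insects.",
}


def recall(query):
    """Return the stored fact for a query, or an empty string if none is known."""
    return CANNED_FACTS.get(query, "")


def render(template, variables):
    """Replace each {{recall $name}} placeholder with the fact recalled for variables[name]."""

    def substitute(match):
        return recall(variables[match.group(1)])

    return re.sub(r"\{\{recall \$(\w+)\}\}", substitute, template)


variables = {"query1": "animal that swims", "query2": "animal that flies"}
template = "I know these animal facts: {{recall $query1}} {{recall $query2}} and "
prompt = render(template, variables)
print(prompt)
```

The rendered prompt ends mid-sentence on purpose, so a text completion model such as gpt2 continues from the recalled facts.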

The last cell in the notebook defines the query parameters and the relevance threshold, and returns our output.

# Create a context and point the memory skill at our collection
context = kernel.create_new_context()
context[sk.core_skills.TextMemorySkill.COLLECTION_PARAM] = "animal-facts"
context[sk.core_skills.TextMemorySkill.RELEVANCE_PARAM] = 0.3

# Queries used to look up facts in memory
context["query1"] = "animal that swims"
context["query2"] = "animal that flies"
context["query3"] = "penguins are?"
# Run the function defined in the previous cell
output = await kernel.run_async(my_function, input_vars=context.variables)

output = str(output).strip()

query_result1 = await kernel.memory.search_async(
    "animal-facts", context["query1"], limit=1, min_relevance_score=0.3
)
query_result2 = await kernel.memory.search_async(
    "animal-facts", context["query2"], limit=1, min_relevance_score=0.3
)
query_result3 = await kernel.memory.search_async(
    "animal-facts", context["query3"], limit=1, min_relevance_score=0.3
)

print(f"gpt2 completed prompt with: '{output}'")

Feel free to adjust the token limit to vary your response lengths, and change other parameters to see how the responses differ.
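If the parameters you experiment with include sampling settings such as temperature and top_p (an assumption here, since this post does not name them), the sketch below shows what they generally do to a model's next-token distribution. It is a plain Python illustration with made-up logits, not code from the sample or from any library.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.5):
    """Keep the smallest set of most likely tokens whose total probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalize so the kept tokens' probabilities sum to 1
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for four candidate tokens
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
nucleus = top_p_filter(sharp, top_p=0.5)

print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
print(nucleus)
```

Lower temperature and lower top_p both concentrate probability on the most likely tokens, which tends to make completions more repetitive but more predictable. Higher values give more varied output.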

Happy testing!

Next Steps:

Explore the sample in GitHub

Learn more about Semantic Kernel

Join the community and let us know what you think: https://aka.ms/sk/discord

6 comments

  • manoj kadam

    Hugging Face is a popular open-source platform for building and sharing state-of-the-art models in natural language processing. The Semantic Kernel API, on the other hand, is a powerful tool that allows developers to perform various NLP tasks, such as text classification and entity recognition, using pre-trained models. By using Hugging Face models with the Semantic Kernel API, developers can leverage the strengths of both tools to build more accurate and efficient NLP applications.

    To use Hugging...

  • Keith Tobin

    Do we have a C# Hugging Face connector yet? If not, could you give me a pointer to which class needs to be overridden or which interface needs to be implemented?

    Thanks

    Keith.
