April 1st, 2025

Building AI Applications with Memory: Mem0 and Azure AI Integration

Farzad Sunavala
Principal Product Manager

TL;DR

Learn how to integrate Mem0 with Azure AI Search and Azure OpenAI to create AI applications with persistent memory. This tutorial provides code examples for setting up a memory layer using Azure services and demonstrates how to build a travel planning assistant that remembers user preferences across conversations.

Introduction

One of the key limitations of most AI systems is their inability to maintain context beyond a single session. This lack of memory significantly impacts the quality of user interactions, often requiring users to repeat information they’ve already provided. Enter Mem0, a powerful memory layer designed specifically for AI applications.

In this guide, we’ll explore how to integrate Mem0 with Azure AI services to create AI applications with persistent memory. We’ll cover:

  1. Setting up Mem0 with Azure AI Search and Azure OpenAI
  2. Basic memory operations (storing, retrieving, and searching memories)
  3. Building a practical travel planning assistant that remembers user preferences

Prerequisites

  • Azure OpenAI resource with chat completion and embedding model deployments
  • Azure AI Search service
  • Python environment with required packages

First, let’s install the necessary packages:

pip install mem0ai python-dotenv
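
If you're using python-dotenv, keep your credentials in a local .env file. Here's a minimal example using the variable names assumed throughout this tutorial (the values and deployment names are placeholders; substitute your own):

AZURE_OPENAI_ENDPOINT=https://your-openai-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-azure-openai-key
AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME=gpt-4o
AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME=text-embedding-3-small
AZURE_SEARCH_SERVICE_ENDPOINT=https://your-search-service.search.windows.net
AZURE_SEARCH_ADMIN_KEY=your-search-admin-key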

Setting Up Your Azure Environment

To get started, you’ll need to configure your Azure environment variables:

import os

from dotenv import load_dotenv
from mem0 import Memory
from openai import AzureOpenAI

# Load environment variables from your .env file
load_dotenv()

# Load Azure OpenAI configuration
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME = os.getenv("AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME")
AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME")

# Load Azure AI Search configuration
SEARCH_SERVICE_ENDPOINT = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
SEARCH_SERVICE_API_KEY = os.getenv("AZURE_SEARCH_ADMIN_KEY")
SEARCH_SERVICE_NAME = "your-search-service-name"  # Replace with your Azure AI Search service name

# Create Azure OpenAI client
azure_openai_client = AzureOpenAI(
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
    api_key=AZURE_OPENAI_API_KEY,
    api_version="2024-10-21"
)

Let’s start with a simple example demonstrating how to store and retrieve memories:

# Configure Mem0 with Azure AI Search
memory_config = {
    "vector_store": {
        "provider": "azure_ai_search",
        "config": {
            "service_name": SEARCH_SERVICE_NAME,
            "api_key": SEARCH_SERVICE_API_KEY,
            "collection_name": "memories",
            "embedding_model_dims": 1536,
        },
    },
    "embedder": {
        "provider": "azure_openai",
        "config": {
            "model": AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME,
            "embedding_dims": 1536,
            "azure_kwargs": {
                "api_version": "2024-10-21",
                "azure_deployment": AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME,
                "azure_endpoint": AZURE_OPENAI_ENDPOINT,
                "api_key": AZURE_OPENAI_API_KEY,
            },
        },
    },
    "llm": {
        "provider": "azure_openai",
        "config": {
            "model": AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
            "temperature": 0.1,
            "max_tokens": 2000,
            "azure_kwargs": {
                "azure_deployment": AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
                "api_version": "2024-10-21",
                "azure_endpoint": AZURE_OPENAI_ENDPOINT,
                "api_key": AZURE_OPENAI_API_KEY,
            },
        },
    },
    "version": "v1.1",
}

# Initialize memory
memory = Memory.from_config(memory_config)
print("Mem0 initialized with Azure AI Search")

The configuration above sets up:

  1. Vector Store: Using Azure AI Search to store and retrieve vectors
  2. Embedder: Using Azure OpenAI to generate embeddings for semantic search
  3. LLM: Using Azure OpenAI for language model capabilities

Storing Memories

With Mem0, you can store three types of memories:

  1. Simple statements:
memory.add(
    "I enjoy hiking in national parks and taking landscape photos.",
    user_id="demo_user"
)
  2. Conversations:
conversation = [
    {"role": "user", "content": "I'm planning a trip to Japan in the fall."},
    {"role": "assistant", "content": "That's a great time to visit Japan!"},
    {"role": "user", "content": "I'd like to visit Tokyo and Kyoto, and see Mount Fuji."}
]
memory.add(conversation, user_id="demo_user")
  3. Memories with metadata:
memory.add(
    "I prefer window seats on long flights and usually bring my own headphones.",
    user_id="demo_user",
    metadata={"category": "travel_preferences", "importance": "medium"}
)

Searching Memories

When you need to retrieve relevant memories, you can use the search method:

search_results = memory.search(
    "What are this user's travel plans?",
    user_id="demo_user",
    limit=3
)

for i, result in enumerate(search_results['results'], 1):
    print(f"{i}. {result['memory']} (Score: {result['score']:.4f})")

This will return memories sorted by relevance to the query, along with their similarity scores.

Retrieving All Memories

You can also retrieve all memories for a user:

all_memories = memory.get_all(user_id="demo_user")
print(f"Total memories: {len(all_memories['results'])}")

Building a Travel Planning Assistant with Memory

Now, let’s create a more practical example: a London travel planning assistant that remembers user preferences across conversations.

class TravelAssistant:
    def __init__(self, user_id):
        """Initialize travel assistant with memory for a specific user"""
        self.user_id = user_id

        # Configure memory for travel planning
        memory_config = {
            "vector_store": {
                "provider": "azure_ai_search",
                "config": {
                    "service_name": SEARCH_SERVICE_NAME,
                    "api_key": SEARCH_SERVICE_API_KEY,
                    "collection_name": "travel_memories",
                    "embedding_model_dims": 1536,
                    "compression_type": "binary",
                },
            },
            "llm": {
                "provider": "azure_openai",
                "config": {
                    "model": AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
                    "temperature": 0.7,
                    "max_tokens": 800,
                    "azure_kwargs": {
                        "azure_deployment": AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
                        "api_version": "2024-10-21",
                        "azure_endpoint": AZURE_OPENAI_ENDPOINT,
                        "api_key": AZURE_OPENAI_API_KEY,
                    },
                },
            },
            "embedder": {
                "provider": "azure_openai",
                "config": {
                    "model": AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME,
                    "embedding_dims": 1536,
                    "azure_kwargs": {
                        "api_version": "2024-10-21",
                        "azure_deployment": AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME,
                        "azure_endpoint": AZURE_OPENAI_ENDPOINT,
                        "api_key": AZURE_OPENAI_API_KEY,
                    },
                },
            },
            "version": "v1.1",
        }

        self.memory = Memory.from_config(memory_config)
        self.azure_client = azure_openai_client
        print(f"Travel Assistant initialized for user {user_id}")

    def get_response(self, query, memory_context=True):
        """Get response from Azure OpenAI with memory context"""
        # Retrieve relevant memories if enabled
        memory_text = ""
        if memory_context:
            memories = self.memory.search(query, user_id=self.user_id)
            if 'results' in memories and memories['results']:
                memory_text = "\n\nRelevant information from previous conversations:\n"
                for i, mem in enumerate(memories['results'][:5], 1):
                    memory_text += f"{i}. {mem['memory']}\n"
                print(f"Including {len(memories['results'][:5])} memories in context")
            else:
                print("No relevant memories found")

        # Construct messages with system prompt and memory context
        messages = [
            {
                "role": "system",
                "content": "You are a helpful travel assistant for London travel planning. "
                           "Be concise, specific, and helpful. Refer to the user by name when appropriate. "
                           "Recommend specific places when relevant."
            },
            {
                "role": "user",
                "content": f"{query}\n{memory_text}"
            }
        ]

        # Get response from Azure OpenAI
        response = self.azure_client.chat.completions.create(
            model=AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
            messages=messages,
            temperature=0.7,
            max_tokens=800,
        )

        # Extract response content
        response_content = response.choices[0].message.content

        # Store the conversation in memory
        conversation = [
            {"role": "user", "content": query},
            {"role": "assistant", "content": response_content}
        ]
        self.memory.add(conversation, user_id=self.user_id)

        return response_content

    def verify_memories(self):
        """Verify what memories have been stored"""
        all_memories = self.memory.get_all(user_id=self.user_id)
        print(f"\n===== STORED MEMORIES ({len(all_memories['results'])}) =====")
        for i, memory in enumerate(all_memories['results'], 1):
            print(f"{i}. {memory['memory']}")
        print("==============================\n")
        return all_memories

Using the Travel Assistant

Now, let’s put our travel assistant to work:

# Create travel assistant for a user
assistant = TravelAssistant(user_id="farzad_london_2025")

# First interaction - Initial inquiry
query1 = "Hi, my name is Farzad. I'm planning a business trip to London next month for about 5 days."
print(f"User: {query1}")
response1 = assistant.get_response(query1, memory_context=False)  # No memories yet
print(f"Assistant: {response1}\n")

# Second interaction - Specific question about fish and chips
query2 = "I need recommendations for fish and chips restaurants near London Bridge cause I love the taste!"
print(f"User: {query2}")
response2 = assistant.get_response(query2)  # Should use memory context
print(f"Assistant: {response2}\n")

# Verify what memories have been stored
assistant.verify_memories()

This demonstration shows how the assistant:

  1. Stores Farzad’s name and travel plans from the first interaction
  2. Remembers these details in the second interaction
  3. Uses the memory context to provide a personalized response

Searching for Specific Preferences

You can also directly search for specific user preferences:

search_query = "What are Farzad's preferences for food in London?"
search_results = assistant.memory.search(search_query, user_id="farzad_london_2025")
print(f"Found {len(search_results['results'])} relevant memories:")
for i, result in enumerate(search_results['results'][:3], 1):
    print(f"{i}. {result['memory']} (Score: {result['score']:.4f})")

This might return results like:

Found 3 relevant memories:
1. Name is Farzad (Score: 0.6696)
2. Needs recommendations for fish and chips restaurants near London Bridge (Score: 0.6564)
3. Loves the taste of fish and chips (Score: 0.6324)

The Power of Persistent Memory

The key advantage of this approach is that the assistant maintains context across multiple interactions. By leveraging Mem0 with Azure AI services, we’ve created a system that:

  1. Remembers user details: The assistant stores information like names, preferences, and travel plans
  2. Personalizes responses: By retrieving relevant memories, the assistant can refer to the user by name and tailor recommendations
  3. Improves over time: As more interactions occur, the system builds a more comprehensive understanding of the user

This persistent memory dramatically improves the user experience by eliminating the need to repeat information in every conversation.
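
Because these memories live in Azure AI Search rather than in the Python process, they also survive restarts. As a quick sketch (reusing the TravelAssistant class and user_id from above, with a hypothetical follow-up question), a brand-new assistant instance picks up where the last session left off:

# A later session: construct a fresh assistant for the same user
returning_assistant = TravelAssistant(user_id="farzad_london_2025")

# Memory context is retrieved from Azure AI Search, so the assistant
# already knows the user's name and food preferences
print(returning_assistant.get_response("Any dinner ideas for my last night in London?"))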

Conclusion

Integrating Mem0 with Azure AI services opens up a world of possibilities for creating more personalized and context-aware AI applications. By maintaining user memories across interactions, we can build assistants that feel more intelligent and responsive to user needs.

This tutorial has shown you how to:

  • Configure Mem0 with Azure AI Search and Azure OpenAI
  • Store and retrieve different types of memories
  • Build a practical travel assistant that remembers user preferences

As you implement this in your own applications, consider the different types of memories you might want to store and how they can be used to enhance user experiences. With Mem0’s flexible memory system, you can create AI applications that truly understand and adapt to individual users over time.

Next Steps

  • Explore using different types of metadata to categorize memories
  • Implement memory expiration or importance scoring (see the sketch after this list)
  • Combine memories with other Azure AI services for more advanced scenarios
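
As a starting point, here is a minimal client-side sketch of importance scoring. It assumes memories were stored with an "importance" metadata field (as in the earlier example) and filters after retrieval; this is not a built-in Mem0 feature:

# Rank memories by the "importance" metadata field attached at add() time
IMPORTANCE_RANK = {"high": 3, "medium": 2, "low": 1}

def memories_at_least(memory, user_id, minimum="medium"):
    """Return stored memories at or above a minimum importance level."""
    threshold = IMPORTANCE_RANK[minimum]
    results = memory.get_all(user_id=user_id)['results']
    return [
        m for m in results
        if IMPORTANCE_RANK.get((m.get('metadata') or {}).get('importance', 'low'), 1) >= threshold
    ]

for m in memories_at_least(assistant.memory, "farzad_london_2025"):
    print(m['memory'])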

For more information on Mem0 and its capabilities, visit the Mem0 documentation.

Author

Farzad Sunavala
Principal Product Manager

Building knowledge retrieval capabilities for AI Agents.
