December 4th, 2024

How to develop AI Apps and Agents in Azure – A Visual Guide

As organizations explore new AI-powered experiences and automated workflows, there’s a growing need to move beyond experiments and proofs-of-concept to production-ready applications. This guide walks you through the essential steps and decisions for building robust AI applications in Azure, focusing on reliability, security, and enterprise-grade quality.

Why Choose Azure’s Managed Services?

It’s easy to experiment with generative AI models and create proof-of-concept demos, but building production-ready applications that scale reliably is a different challenge entirely. When deploying AI-powered applications for real business use, you need infrastructure that provides consistent performance, robust security, and reliable operations. Did you know that OpenAI’s ChatGPT, GitHub Copilot, and Microsoft’s Copilots are all deployed on Azure’s managed services? Managed services reduce uncertainty when deploying AI agents with specific goals and guardrails, making them accessible to organizations of all sizes.

In this article, I’ll provide a visual map to help you decide which Azure AI service best fits your use case. Let’s get started:

Image: AI Apps decision tree

Need a Quick Start?

The quickest way to get started is by going to Azure AI Foundry at https://ai.azure.com, which serves as your central hub for AI development, offering:

Image: Azure AI Foundry

  • Playgrounds: Add your data and ground models in your content through managed RAG (Retrieval-Augmented Generation) with just a few clicks. Deploy production-ready chat experiences quickly without complex setup.
  • Prompt Flow: Enables evaluation-driven development, prompt tuning, and tool integration, and provides built-in observability and troubleshooting.
  • Agent Service: Enables secure, scalable single-purpose agents with managed RAG, managed function calling, and bring-your-own customization options. Integrates seamlessly with enterprise systems.
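Once you’ve deployed a model from the playground, calling it from code takes only a few lines. Here is a minimal sketch using the `openai` Python package’s Azure client; the endpoint, key, deployment name, and API version below are placeholders, so check your own resource for the real values. The network call itself is wrapped in a function and never executed here:

```python
def build_messages(system_prompt: str, user_question: str) -> list:
    """Assemble the chat-completions message list (system + user turns)."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_question},
    ]

def ask(endpoint: str, api_key: str, deployment: str, question: str) -> str:
    # Imported lazily so the sketch runs even without `pip install openai`.
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint=endpoint,   # e.g. https://<resource>.openai.azure.com
        api_key=api_key,
        api_version="2024-06-01",  # API version strings evolve; check the docs
    )
    resp = client.chat.completions.create(
        model=deployment,  # your *deployment* name, not the model family
        messages=build_messages("You are a helpful assistant.", question),
    )
    return resp.choices[0].message.content

messages = build_messages("You are a helpful assistant.", "What is Azure AI Foundry?")
```

The same message list works whether you call the model directly, through Prompt Flow, or behind the Agent Service.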

Pro Tip: Start Your AI Journey with Azure AI Foundry

Azure AI Foundry provides everything you need to kickstart your AI application development journey. It offers an intuitive platform with built-in development tools, essential AI capabilities, and ready-to-use models (1800+!). As your needs grow, you can seamlessly integrate additional Azure managed services to enhance your AI solutions further. This makes it the ideal starting point for both beginners and experienced developers.

Need More Control? Let’s Build Your Stack

What Kind of Model Do You Need?

Selecting the right AI model is a critical decision that impacts your application’s capabilities, performance, and cost-effectiveness. Azure offers a comprehensive range of models to address different requirements, from multimodal reasoning to specialized tasks. Here’s a guide to help you choose the most suitable model for your specific needs:

Image: decision tree, model reasoning section

Multimodal reasoning (text + images):
  • Azure OpenAI GPT-4o: Latest multimodal model for understanding both text and images
  • Llama models: Open-source multimodal foundation models for a wide range of natural language processing tasks, offering flexibility and customization for developers
Sensitive to latency and cost:
  • Smaller LLMs like 4o-mini: Optimized for lower latency and cost, making them suitable for applications where quick responses and resource efficiency are critical
  • Azure AI Model Catalog: Choose from 1800+ models, including specialized models for specific tasks
Embeddings for search or classification:
  • Azure OpenAI’s text-embedding-3 family: Provides embeddings (vector representations) that capture the semantic meaning of text, enabling effective search, classification, and clustering
  • Cohere embeddings: Alternative text embedding models with strong multilingual support
Working with images:
  • Azure OpenAI’s CLIP model + AI Search: Enables vector-based image search by associating images with textual descriptions, enhancing image retrieval
Advanced reasoning (System 2):
  • o1-preview: Designed for complex problem-solving with built-in reflection mechanisms, enabling advanced reasoning and decision-making
  • o1-mini: Offers efficient reasoning capabilities, balancing performance and resource utilization for applications requiring streamlined decision-making
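To make the embeddings row concrete: semantic search reduces to comparing vectors, typically with cosine similarity. Here is a toy sketch with hard-coded 4-dimensional vectors standing in for real text-embedding-3 output (actual vectors have 1536+ dimensions and come from the API):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for text-embedding-3 output.
docs = {
    "refund policy": [0.9, 0.1, 0.0, 0.1],
    "shipping times": [0.1, 0.9, 0.1, 0.0],
}
query = [0.8, 0.2, 0.1, 0.0]  # embedding of "how do I get my money back?"

# The best match is the document whose vector points the same way as the query.
best = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
print(best)  # → refund policy
```

Vector databases and Azure AI Search run this same comparison at scale, using approximate-nearest-neighbor indexes instead of a linear scan.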

Pro Tip: Choosing the Right Model

Start with Azure AI Foundry’s model catalog to explore and experiment with different models. For most enterprise applications, consider using GPT-4o for complex multimodal tasks, while leveraging specialized models like 4o-mini for latency-sensitive operations. When building RAG applications, pair embedding models with your LLM – Azure OpenAI’s text-embedding-3 family works seamlessly with GPT models. Remember that you can always switch or combine models as your needs evolve, so focus on finding the right balance between capability and efficiency for your specific use case.
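The retrieve-then-generate loop behind RAG can be sketched in a few lines. In this sketch the retriever is a naive keyword match over an in-memory list; in practice it would be a vector query against Azure AI Search, and the assembled prompt would be sent to your chat model:

```python
def retrieve(query, index):
    # Stand-in for a vector search against Azure AI Search:
    # here, a naive keyword match over an in-memory "index".
    return [doc for doc in index
            if any(w in doc.lower() for w in query.lower().split())]

def build_rag_prompt(query, passages):
    # Ground the model in retrieved passages -- the core RAG pattern.
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using ONLY the sources below. "
        "If the answer is not in the sources, say you don't know.\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )

index = [
    "Refunds are issued within 14 days of a return.",
    "Standard shipping takes 3-5 business days.",
]
passages = retrieve("refunds", index)
prompt = build_rag_prompt("How long do refunds take?", passages)
# `prompt` would then be sent as the user message in a chat-completions call.
```

Grounding the prompt in retrieved sources is what lets a general-purpose model answer from your private data without fine-tuning.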

How Will Your Agent Remember Things?

When building AI applications, choosing the right storage solution is crucial for managing different types of data effectively. Here’s a guide to help you select the appropriate memory solution for your needs:

Image: decision tree, memory section

Search Capabilities:
  • Azure AI Search: Enterprise-grade search service with built-in AI capabilities, including:
    • Multimodal semantic search
    • OCR and image analysis
    • Translation services
    • Rich integrations with Azure AI services
Frequently Accessed Knowledge:
  • Cosmos DB: Globally distributed database with multi-model support
  • Azure Cache for Redis: In-memory data store for high-performance, low-latency scenarios; also integrated with API Management’s GenAI Gateway for semantic caching
  • Azure AI Search: Combines search capabilities with knowledge storage
Episodic Memory (interaction history) and Knowledge Graphs:
  • Cosmos DB or PostgreSQL with the GraphRAG solution accelerators (see the Pro Tip below)
Operational Data with Semantic Retrieval:
  • NoSQL preference → Cosmos DB with DiskANN: Ideal for applications needing global distribution and vector search
  • SQL preference → PostgreSQL with pgvector: Best for applications requiring both traditional SQL capabilities and vector operations
  • MongoDB preference → MongoDB vCore: Fully managed MongoDB service with vector search capabilities
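Semantic caching, mentioned above for Redis and API Management’s GenAI Gateway, reuses a previous answer when a new question lands close enough in embedding space. Here is a toy in-memory sketch; real implementations use a vector store and carefully tuned thresholds:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

class SemanticCache:
    """Toy semantic cache: reuse an answer when a new query's embedding
    is close enough to one we've already answered."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def get(self, embedding):
        for cached_emb, answer in self.entries:
            if cosine(embedding, cached_emb) >= self.threshold:
                return answer  # cache hit: skip the (expensive) model call
        return None  # cache miss: call the model, then put() the result

    def put(self, embedding, answer):
        self.entries.append((embedding, answer))

cache = SemanticCache()
cache.put([1.0, 0.0], "Refunds take 14 days.")
print(cache.get([0.99, 0.05]))  # near-duplicate question → cached answer
print(cache.get([0.0, 1.0]))    # unrelated question → None, call the model
```

Because lookups match on meaning rather than exact strings, paraphrased questions ("how long for a refund?" vs. "when do I get my money back?") can share one cached model response.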

Pro Tip: Choosing the Right Memory Solution

Start by evaluating your search needs – Azure AI Search provides comprehensive multi-modal search capabilities with built-in AI services integration. For frequent data access, consider combining Azure Redis Cache for performance-critical operations with a persistent storage solution like Cosmos DB. When building knowledge graphs, leverage the GraphRAG solution accelerators available for both Cosmos DB and PostgreSQL to simplify implementation.

Where Will You Run Your Application?

Choosing the right runtime environment and frontend infrastructure is crucial for your AI application’s performance, scalability, and maintainability. Azure offers various options to match your specific deployment needs, from simple web apps to complex containerized solutions. Here’s a guide to help you select the most appropriate runtime configuration:

Image: decision tree, orchestration section

Web Applications:
  • Azure App Service: Fully managed platform for building, deploying, and scaling web apps
    • Built-in CI/CD integration
    • Automatic scaling and load balancing
    • Enterprise-grade security and compliance
Serverless and Event-Driven:
  • Azure Container Apps: Fully managed serverless container service for AI workloads
    • Serverless GPUs with scale-to-zero
    • Dynamic Sessions for secure code interpretation with Hyper-V isolation
    • Built-in data governance (data never leaves container boundaries)
    • Enterprise features like private endpoints and planned maintenance
  • Azure Functions: Serverless compute with Azure OpenAI integration
    • OpenAI triggers and bindings for chat assistants and RAG patterns
    • Pay-per-execution pricing with the Flex Consumption plan
    • Support for vector stores (AI Search, Cosmos DB for MongoDB, ADX)
    • Managed identity support for secure service access
Container Orchestration:
  • Azure Kubernetes Service (AKS): Managed Kubernetes service for complex container orchestration
    • Full container orchestration control
    • Multi-container deployments
    • Enterprise-grade security features
    • Advanced networking and scaling options
Communication Features:
  • Azure Communication Services: Comprehensive platform for adding communication capabilities
    • Voice and video calling
    • SMS and chat functionality
    • Easy integration with existing applications

Pro Tip: Choosing the Right Runtime Environment

Consider starting with Azure App Service for straightforward web applications. For event-driven workloads, both Azure Container Apps and Functions offer serverless capabilities with automatic scaling – choose Container Apps when you need container flexibility or GPU support, and Functions for lightweight compute with AI bindings. If you need full container orchestration control, AKS provides enterprise-grade Kubernetes management.

How Will Your AI Agent Take Action?

When building AI applications that need to interact with the real world, you’ll need tools that enable your agents to take actions, process information, and integrate with enterprise systems. Azure provides a comprehensive set of tools that let your AI agents create real-world impact while maintaining security and control. With the AI Agent Service in Azure AI Foundry, integrating these tools has become even more streamlined. Here’s a guide to help you choose the right tools for your AI application:

Image: decision tree, tools section

Plugins and Workflows (Function Calling):
  • Logic Apps: The primary tool for enabling AI agents to take actions
    • Native integration with AI Agent Service for seamless function calling
    • Support for On-Behalf-Of (OBO) flows
    • 200+ pre-built connectors for enterprise systems
    • Visual workflow designer for complex orchestrations
  • Azure Functions: Serverless compute for custom tool implementations
    • Custom function calling implementations
    • Integration with AI services via bindings
AI Services (via APIs):
  • Content Understanding: Process and structure any content type
    • Unified processing of documents, images, videos, and audio
    • Field extraction with configurable schema
    • Built-in confidence scoring and source grounding
    • Ideal for automation, RAG, and analytics workflows
  • Document Intelligence: Extract and analyze information from documents
  • Vision: Process and analyze images and videos
  • Language: Natural language processing and understanding
  • Speech & Avatar: Voice interaction and digital human experiences
Code Interpreter:
  • Azure Container Apps Dynamic Sessions: Secure environment for running AI-generated code
    • Isolated execution environment
    • Support for multiple programming languages
    • Integration with AI Agent Service
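Function calling boils down to the model emitting a structured tool call that your code, or a managed service such as the AI Agent Service or Logic Apps, then executes. Here is a minimal local sketch with a hypothetical `get_order_status` tool standing in for a real connector or database call:

```python
import json

def get_order_status(order_id: str) -> str:
    # Stand-in for a real action, e.g. a Logic Apps connector or database query.
    orders = {"A-1001": "shipped"}
    return orders.get(order_id, "unknown")

# Tool registry: maps the names the model may call to Python callables.
TOOLS = {"get_order_status": get_order_status}

def dispatch(tool_call_json: str) -> str:
    """Execute a tool call in the shape a model emits it:
    {"name": "...", "arguments": {...}}."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# When the model decides to use a tool, it produces structured JSON like this:
result = dispatch('{"name": "get_order_status", "arguments": {"order_id": "A-1001"}}')
print(result)  # → shipped
```

The tool's result is then appended to the conversation so the model can compose its final answer; the managed services above handle that round trip, plus auth and guardrails, for you.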

Pro Tip: Leveraging AI Agent Service for Tool Integration

The AI Agent Service in Azure AI Foundry significantly simplifies tool integration for your AI applications. It provides managed function calling capabilities and seamless integration with Logic Apps, making it easier to implement complex workflows and system interactions. When building AI agents that need to take actions, start with Logic Apps for orchestration and leverage the AI Agent Service’s built-in support for OBO flows and enterprise system integration.

How Will You Ensure Quality and Safety?

Enterprise AI applications require comprehensive quality controls across safety, evaluation, security, and reliability dimensions. Azure provides integrated services to help you build AI applications that meet the highest quality standards. Here’s a guide to help you implement the right quality attributes:

Image: decision tree, quality section

Quality Attributes & Reliability:
  • Azure API Management GenAI Gateway: Production-grade reliability for model endpoints, including semantic caching (see the Pro Tip below)
AI Safety:
  • Azure AI Content Safety: Comprehensive content safety service
    • Detect harmful content in text and images
    • Built-in Prompt Shields for LLM attack protection
    • Support for custom safety categories
    • Integration with Microsoft Defender for Cloud for threat protection
Evaluation & LLMOps:
  • Azure AI Foundry Evaluations: Integrated evaluation platform
    • Built-in metrics for quality, safety, and performance
    • AI-assisted and NLP-based evaluation methods
    • Support for custom evaluation flows
    • Comprehensive evaluation metrics library
  • Prompt Flow (integrated with AI Foundry): Evaluation-driven development
    • Flow-based evaluation orchestration
    • Built-in observability and troubleshooting
Security:
  • Microsoft Defender for Cloud: Threat protection and security posture management across your AI workloads
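Evaluation-driven development starts with a scored test set. Here is a minimal sketch using exact-match scoring over a hypothetical answering function; Azure AI Foundry Evaluations builds on this same idea with AI-assisted metrics such as groundedness and relevance:

```python
def exact_match(expected: str, actual: str) -> bool:
    # Simplest possible metric: case- and whitespace-insensitive equality.
    return expected.strip().lower() == actual.strip().lower()

def run_eval(cases, answer_fn):
    """Score an answering function over labeled test cases.
    Returns the fraction of exact matches (one of many possible metrics)."""
    passed = sum(exact_match(c["expected"], answer_fn(c["question"])) for c in cases)
    return passed / len(cases)

# Hypothetical stand-in for a deployed model; a real evaluation run would
# call your endpoint (or Prompt Flow flow) for each case instead.
def toy_answer(question):
    return "Paris" if "France" in question else "I don't know"

cases = [
    {"question": "Capital of France?", "expected": "Paris"},
    {"question": "Capital of Atlantis?", "expected": "Unknown"},
]
print(run_eval(cases, toy_answer))  # → 0.5 (one right, one wrong)
```

Running a set like this on every prompt or model change turns "the bot seems better" into a number you can track in CI.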

Pro Tip: Building Enterprise-Grade AI Applications

Start with Azure AI Foundry’s evaluation capabilities to assess your application’s quality and safety. Use Content Safety service to protect against harmful content and integrate with API Management’s GenAI Gateway for production-grade reliability. Implement Microsoft Defender for Cloud to ensure comprehensive security coverage for your AI workloads. This layered approach helps create AI applications that meet enterprise requirements for quality, safety, and reliability.

Need Additional Development Support?

When building AI applications, you can accelerate your development by leveraging battle-tested frameworks that provide abstracted design patterns and pre-built integrations with the above managed services:

  • Semantic Kernel: Microsoft’s open-source SDK that integrates LLMs with conventional programming languages (C#, Python, Java). Ideal for enterprise applications requiring tight integration with existing code.
  • AutoGen: Framework for building multi-agent applications, enabling sophisticated agent-to-agent interactions and complex task completion.
  • LangChain: Popular framework for building LLM applications with ready-to-use components for common patterns like RAG, agents, and chains.
  • LlamaIndex: Data framework specialized in connecting custom data with LLMs, offering advanced RAG capabilities and data connectors.

Ready to Start Building?

Choose your path based on your needs:

  1. Want the quickest start? Head to Azure AI Foundry for a guided experience with built-in best practices and patterns.
  2. Need more control? Start with AI App templates for common patterns, or build your stack from scratch by selecting your models, memory solutions, and deployment options from the choices above.
  3. Looking for development frameworks? Use battle-tested frameworks like Semantic Kernel, AutoGen, or LangChain that provide abstracted design patterns and pre-built integrations for rapid development.

Remember: Major players like ChatGPT, GitHub Copilot, and Microsoft’s Copilots all run on these same services – you’re building on proven infrastructure. To accelerate your development:

  • Landing Zone Reference Architectures: Ready-to-deploy infrastructure templates that follow best practices for security, scaling, and governance
  • AI App Templates: Quickly customize existing AI applications for your specific business needs using production-tested patterns

Prerequisites to Get Started

Before you begin, ensure you have:

  • An Azure subscription
  • User role with Azure AI Developer permissions
  • Azure AI Inference Deployment Operator permissions (if models aren’t already deployed)

This guide will continue to evolve as Azure’s AI capabilities expand. Start building today and transform your AI experiments into production-ready applications!
