Introduction
Recently, my team collaborated with a customer eager to explore the use of Azure AI Studio as an easy and robust way to develop a RAG-based “chat with your data” solution.
Our goal was to create a demo that showcased AI Studio’s capabilities and also provide Infrastructure as Code (IaC) templates adhering to the customer’s stringent security standards.
As is common with large enterprises, a critical requirement was ensuring that all resources remained as private as possible. Hence, for each required resource, utilizing a managed virtual network with private endpoints was necessary.
This blog focuses on the challenges we faced when testing our secure instance of AI Studio and how we overcame them. It does not explore how we created the IaC templates for Azure AI Studio.
The Challenge
We decided to build the demo in our own Azure subscription, which follows similar security standards to those of most enterprises. This meant that accessing Azure resources had to be done from a company-owned or managed device, such as one enrolled in Intune.
Traditional approaches, like setting up a jumpbox and accessing it through Azure Bastion, weren’t feasible due to specific security policies. For example, accessing AI Studio required Azure AD login, which would reject any device not enrolled in Intune, showing a non-compliance message and blocking access until the device was Intune-compliant. Additionally, a jumpbox is usually shared between several users, which scales poorly and conflicts with our security requirements.
While a jumpbox works well for accessing resources that rely on keys rather than Azure AD, such as a storage account, ACR, or an API, it was unsuitable for our scenario involving Azure AD-protected resources.
This presented us with a dual challenge:
- Ensuring secure, compliant access to AI Studio while meeting stringent security requirements.
- Testing the solution without options like Bastion or a jumpbox, which, while commonly used, were not viable given our restrictions.
Decision Flow
To illustrate the flow of our decision-making process, the following decision tree highlights each step, showing how we evaluated options and addressed the DNS resolution challenge after selecting the VPN:
Understanding the Limitations
As part of our user journey and in collaboration with the customer, we began by identifying the limitations before solutioning. Our AI Studio instance was deployed in a managed virtual network with private endpoints to the required resources. Because of this setup:
- Resources are only accessible through private endpoints.
- Resources are not accessible from the public internet.
Attempting to access these resources from a different network results in an error message similar to:
Optional Solutions
From the Azure documentation, three approaches are suggested for accessing such resources:
- ExpressRoute (learn more)
- Azure Bastion with a Jumpbox VM (learn more)
- VPN Connection (in Azure: Azure VPN Gateway) (learn more)
Let’s evaluate these options in our testing scenario.
Evaluating Access Options
We began by exploring different tools and strategies to securely connect to our Azure resources, each with unique requirements. Starting with a focus on access restrictions, we tested different methods and evaluated how well each approach aligned with our strict security policies. Here’s a quick overview of our process:
- ExpressRoute: Our initial thought was to leverage ExpressRoute due to its direct connection benefits. However, we quickly realized it wasn’t a fit for a demo environment. It’s a robust solution but requires significant setup, making it more suitable for production environments than testing.
- Azure Bastion with a Jumpbox VM: We then considered Azure Bastion, which initially seemed promising for quick access. But as we delved into the details, we found that it didn’t fully meet our security policies. Accessing resources like storage accounts or APIs might work with Bastion, but its limitations for secure web interfaces meant it wasn’t ideal for AI Studio.
- VPN Connection: Finally, we tested a VPN connection setup. This approach aligned well with our security policies, allowing us to use company-managed devices and providing a secure and compliant way to access the AI Studio instance. The VPN connection also required manageable setup and configuration, making it the preferred solution.
This journey of exploration led us to the following summary:
Solution | Pros | Cons |
---|---|---|
ExpressRoute | Reliable, high-speed connection suited for production | Significant cost and setup time for demo/testing |
Azure Bastion with Jumpbox | Quick access option for certain resources | Incompatible with strict access requirements |
VPN Connection | Secure, managed, and scalable access for private resources | Configuration effort for networking and DNS resolution |
While opting for a VPN connection addressed the network access issue, it introduced a new challenge: DNS resolution.
DNS Challenge of VPN Connection
In Azure, when you link a Private DNS Zone with a Virtual Network (VNet), DNS resolution for that zone is only available within the VNet. Azure’s default DNS server (168.63.129.16) forwards requests to the Private DNS Zone, but this DNS server isn’t accessible from outside Azure. Learn more
When accessing resources via VPN, clients need to resolve the private DNS zones associated with the VNet. However, because Azure’s DNS server cannot be reached from on-premises networks, VPN clients can’t resolve these addresses by default.
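One way to observe this symptom from a client machine is to check what a private endpoint hostname resolves to. The following is a minimal sketch using only the Python standard library, not the tooling we used; the hostname is a hypothetical placeholder. When private DNS resolution works, the name should map to a private IP address from the VNet; otherwise it typically fails to resolve or maps to a public address.

```python
import ipaddress
import socket


def resolve_addresses(hostname: str) -> list[str]:
    """Return the unique IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    # Drop any IPv6 scope suffix such as "%eth0" before returning.
    return sorted({info[4][0].split("%")[0] for info in infos})


def all_private(addresses: list[str]) -> bool:
    """True when there is at least one address and every one is in a private range."""
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_private for address in addresses
    )


def check(hostname: str) -> str:
    """Describe how hostname resolves from this machine."""
    try:
        addresses = resolve_addresses(hostname)
    except socket.gaierror as err:
        return f"{hostname}: resolution failed ({err})"
    verdict = "private" if all_private(addresses) else "NOT private"
    return f"{hostname}: {addresses} -> {verdict}"


if __name__ == "__main__":
    # Hypothetical FQDN; replace it with a resource behind one of your private endpoints.
    print(check("mystorageaccount.blob.core.windows.net"))
```

Running the script once before and once after connecting the VPN gives a quick signal of whether the client is picking up the private DNS zones.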
Potential Solutions
Given the need for a scalable solution that could accommodate a larger team securely, we began evaluating options for DNS resolution. While a quick test might be achievable with minimal setup, we required a more robust, manageable approach that aligned with enterprise standards. Two potential solutions emerged:
- Update the Hosts File
  Manually updating the hosts file on each client machine to map the private IP addresses. This approach is quick and easy, making it suitable for a “quick and dirty” test for a single user, but it’s cumbersome and not scalable for broader usage.
- Use Azure DNS Private Resolver
  Leverage the Azure DNS Private Resolver, a fully managed service that enables DNS resolution between Azure and on-premises environments without the need to deploy and manage your own DNS servers.
  - Advantages: Scalability, manageability
  - Disadvantages: Adds complexity and requires additional configuration work
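To illustrate the hosts file option, here is a minimal sketch that turns a hostname-to-IP mapping into hosts file lines. The hostnames and IP addresses are hypothetical; the real values would come from the private endpoints in your own deployment. The output would then be appended by hand to `/etc/hosts` on Linux or macOS, or to `C:\Windows\System32\drivers\etc\hosts` on Windows.

```python
import ipaddress


def hosts_entries(mapping: dict[str, str]) -> str:
    """Render hostname -> private IP pairs as hosts file lines, sorted by hostname."""
    lines = []
    for hostname, ip in sorted(mapping.items()):
        # Raises ValueError early if an address is malformed.
        ipaddress.ip_address(ip)
        lines.append(f"{ip}\t{hostname}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical private endpoint addresses, for illustration only.
    example = {
        "mystorageaccount.blob.core.windows.net": "10.0.1.5",
        "mykeyvault.vault.azure.net": "10.0.1.6",
    }
    print(hosts_entries(example), end="")
```

Every client needs its own copy of these entries, and they go stale whenever a private endpoint is recreated with a different IP address, which is why this option does not scale beyond a quick test.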
Choosing the Right VPN Solution
We explored two options for setting up the VPN connection:
Solution | Pros | Cons |
---|---|---|
Hosting Our Own VPN Server | Cost-effective; full control over configuration; flexible and customizable | Requires ongoing maintenance; complex initial setup; dependent on VM uptime |
Using Azure VPN Gateway | Managed service with high availability; seamless integration with Azure services; quick to set up | Higher cost; less customizable; relies on Azure infrastructure |
Given our time constraints and the need for a reliable, managed solution, we opted for the Azure VPN Gateway.
Our Approach
Architecture Overview
We implemented the following VPN architecture to ensure secure, private access to Azure resources from our managed devices. This setup addresses both the connectivity and DNS resolution challenges by integrating the VPN Gateway and Private DNS Resolver within the Virtual Network.
Main Components
- Virtual Network Gateway: An Azure-managed VPN Gateway facilitating secure connections between our development machines and the Azure VNet. It connects to a public IP address for external access and resides in a dedicated GatewaySubnet within the VNet. Learn more.
- Virtual Network (VNet): The Azure VNet where VPN clients connect and where the DNS Private Resolver is hosted. This VNet contains both the GatewaySubnet and the subnet for the DNS Private Resolver.
- Azure DNS Private Resolver: A fully managed service that enables DNS resolution between Azure and on-premises environments without deploying and managing your own DNS servers. This is crucial for resolving private DNS zones within the VNet. Learn more.
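Before deploying these components, it can help to sanity-check the address plan. The sketch below is an illustration under assumed values, not part of our templates: the CIDR ranges and the resolver subnet name `snet-dns-inbound` are hypothetical. It checks that a subnet named GatewaySubnet exists, that every subnet sits inside the VNet address space, and that no two subnets overlap.

```python
import ipaddress


def validate_plan(vnet_cidr: str, subnets: dict[str, str]) -> list[str]:
    """Return a list of problems found in a VNet/subnet address plan (empty if none)."""
    vnet = ipaddress.ip_network(vnet_cidr)
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    problems = []

    # The VPN Gateway resides in a subnet with this exact name.
    if "GatewaySubnet" not in networks:
        problems.append("missing subnet named GatewaySubnet")

    # Every subnet must be carved out of the VNet address space.
    for name, network in networks.items():
        if not network.subnet_of(vnet):
            problems.append(f"{name} ({network}) is outside {vnet}")

    # No two subnets may share addresses.
    names = sorted(networks)
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            if networks[first].overlaps(networks[second]):
                problems.append(f"{first} overlaps {second}")

    return problems


if __name__ == "__main__":
    # Hypothetical address plan, for illustration only.
    plan = {"GatewaySubnet": "10.10.0.0/27", "snet-dns-inbound": "10.10.1.0/28"}
    print(validate_plan("10.10.0.0/16", plan) or "address plan looks consistent")
```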
Configuration Steps
1. Configure the VNet’s DNS Settings
We changed the VNet’s DNS settings to use the IP address of the DNS Private Resolver’s inbound endpoint. By default, the VNet’s DNS is set to Azure’s well-known IP address (168.63.129.16), which isn’t accessible from on-premises networks.
Updating the DNS settings ensures that VPN clients use the DNS Private Resolver, enabling them to resolve private DNS zones within the VNet.
Note: Another option is to modify the DNS settings directly on the VPN client to use the DNS Private Resolver IP address. However, we found this approach difficult to configure and not scalable for larger teams. For more details on custom DNS configurations, refer to Azure VPN Client Optional Configurations.
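For readers who script this step, the sketch below builds the Azure CLI call that sets a custom DNS server on a VNet. It assumes the Azure CLI is installed and that you are logged in; the resource group, VNet name, and IP address are hypothetical placeholders, and this is one possible way to apply the setting rather than a description of how we did it.

```python
import shlex


def vnet_dns_update_command(
    resource_group: str, vnet_name: str, resolver_ip: str
) -> list[str]:
    """Build the Azure CLI arguments that point a VNet at a custom DNS server."""
    return [
        "az", "network", "vnet", "update",
        "--resource-group", resource_group,
        "--name", vnet_name,
        "--dns-servers", resolver_ip,
    ]


if __name__ == "__main__":
    # Hypothetical names and inbound endpoint IP; substitute your own values.
    command = vnet_dns_update_command("rg-vpn-demo", "vnet-vpn", "10.10.1.4")
    # Printed rather than executed so the command can be reviewed first.
    print(shlex.join(command))
```

Building the argument list separately from running it keeps the values easy to review and makes the step simple to fold into a larger deployment script.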
2. Set Up the VPN Gateway
To allow users to authenticate using their Azure AD credentials and to enhance security and simplify management, we configured the VPN Gateway with the following settings:
- Connection Type: Point-to-Site
- Tunnel Type: OpenVPN
- Authentication Type: Azure Active Directory
Integrating with AI Studio VNet
To ensure a secure, flexible, and scalable integration of VPN access with the AI Studio VNet, we evaluated two possible approaches. Our goal was not only to allow private access to AI Studio but also to maintain the option for future connectivity with other resources or architectures if needed.
- Direct Connection
  Connecting VPN clients directly to the Spoke VNet from the AI Studio architecture. This approach provides straightforward access but limits connectivity options with other resources and architectures, potentially reducing flexibility. In the context of Microsoft’s documentation, the term ‘Spoke VNet’ specifically refers to a VNet peered with the hub to enable controlled access to managed resources within AI Studio’s architecture.
- VNet Peering
  Peering the Spoke VNet from the AI Studio architecture with the VNet hosting the VPN Gateway. This setup keeps AI Studio private while allowing VPN clients secure access to its resources. Although VNet peering typically offers low latency, there may be a slight delay compared to direct connections, depending on network traffic and configuration. Learn more about VNet Peering.
Given the importance of future-proofing for additional resources, we opted for VNet Peering due to its balance of security, accessibility, and scalability, along with the flexibility to expand access if required. For scenarios that don’t require access to other resources, however, the Direct Connection could be a simpler alternative.
Integration Steps
- Create VNet Peering
  We established a VNet peering connection between the Spoke AI Studio VNet and the VPN Gateway VNet. This allows resources in both VNets to communicate securely.
- Link Private DNS Zones
  We linked the Private DNS Zones from the Spoke AI Studio VNet to the VPN VNet. This enables VPN clients to resolve private IP addresses associated with the AI Studio resources.
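A missed zone link is easy to overlook, because the affected resource simply fails to resolve from the VPN client. As a minimal sketch, assuming you have already collected the zone names linked to each VNet (for example, from the portal), the helper below lists the zones still missing a link to the VPN VNet. The zone names in the example are illustrative and will differ per deployment.

```python
def missing_zone_links(spoke_zones: set[str], vpn_vnet_zones: set[str]) -> list[str]:
    """Return zones linked to the Spoke VNet but not yet linked to the VPN VNet."""
    return sorted(spoke_zones - vpn_vnet_zones)


if __name__ == "__main__":
    # Illustrative zone names; collect the real lists from your own environment.
    spoke = {
        "privatelink.api.azureml.ms",
        "privatelink.notebooks.azure.net",
        "privatelink.blob.core.windows.net",
    }
    vpn = {"privatelink.blob.core.windows.net"}
    for zone in missing_zone_links(spoke, vpn):
        print(f"not linked to the VPN VNet: {zone}")
```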
Benefits of Our Approach
- Security: The AI Studio VNet remains private and secure, with controlled access via VPN.
- Scalability: Additional VNets can be connected using VNet peering and DNS links, making the architecture adaptable to various scenarios.
- Manageability: Using Azure managed services reduces maintenance overhead and ensures high availability.
Summary of the Flow
- Deploy Secure Resources
- Deploy VPN infra
- Configure VPN
  - DNS Private Resolver (Inbound endpoint)
  - VPN VNet DNS Settings target Inbound endpoint
  - Point-to-Site VPN Configuration
- Integration
  - VNet Peerings
  - Example of one Private DNS Zone linked to VPN VNet
- Access Resources
  - Azure VPN Client connected
  - Access of secure resources
Conclusion
By leveraging Azure VPN Gateway and Azure DNS Private Resolver, we established a secure and manageable way to access private Azure resources from our company-managed devices. This approach allowed us to meet stringent security requirements while successfully demonstrating AI Studio’s capabilities to our customer.
Our solution is adaptable and can be applied to other scenarios where secure access to private Azure resources is needed, providing a robust framework for similar challenges.
This solution not only met our security needs but also provided valuable lessons for future projects.
Acknowledgements
All testing and iterations for this solution were conducted with the invaluable help of Julian Lee and Mohammad Oloomi. Special thanks to Yogi Srivastava for her assistance with the security assessment, and to Gabriel Monteiro for suggesting the use of the Private DNS Resolver when we encountered the DNS resolution challenge.
References:
The feature image was generated using Bing Image Creator. Terms can be found here.