Microsoft Azure IaaS Architecture Best Practices for ARM
How to design and build an enterprise infrastructure in Azure using the Azure Resource Manager portal
Getting started in Azure is easy, and you can have production workloads running in the cloud in very little time. However, some essential aspects of the Azure platform require forethought and planning. Without that planning, you may have to rebuild workloads later because the bigger picture, from an enterprise perspective, was not fully considered. Let’s avoid having to redeploy or redesign your Azure architecture by considering upfront the things that may become an issue later.
This post describes and demonstrates best practices for implementing a consistent naming convention, a Resource Group management strategy, and architectural designs for your Azure IaaS deployments. Your actual conventions and strategies will differ depending on your existing methodology, but this sample describes some of the key concepts you need to properly plan your cloud assets. A video walkthrough of these principles in practice is also available for a deeper understanding of the concepts presented here.
This article builds upon the following previously released blog post, which describes similar concepts using Azure Service Management (ASM or Classic) resources: Essential Considerations for Azure Architectural Planning
The top-level container within an Azure enrollment is the subscription. An enrollment can contain many subscriptions, each with its own administrative boundary. This works well for separating departments and agencies, as well as for separating specific workloads such as production, staging, testing, and development. While this is great for establishing clean administrative boundaries, centrally managing many subscriptions can create additional overhead. For instance, a virtual network (VNET) cannot cross a subscription boundary, so if you are using Site-to-Site VPNs for your hybrid connectivity, you will need to create multiple VPNs, one for each VNET. ExpressRoute (E/R) makes it much easier to connect multiple VNETs to your on-premises network. It is also possible to use VNET peering to share a single VPN connection, as long as your VPN edge device supports route-based (or dynamic) routing.
Azure in Education has posted a great article about enterprise and subscription management. For more information, check out this article: Introduction to Azure Enterprise and Subscription Management.
Consistent naming conventions are critical to any government agency or commercial enterprise with numerous departments, services, networks, and applications. If consistent naming is not applied from the very beginning, resources quickly become hard to find and identify. As such, it is important to establish a standard convention that will be used throughout these services. [Note: these are typical examples and will likely differ from your established naming convention. Whatever convention you choose, include some mechanism to distinguish Azure-based assets from on-premises assets.]
Additional guidance on naming convention best practices is located here: Naming Conventions
Within a subscription, the Resource Group (RG) is the top-level container to keep similar workloads or items grouped together. Typically, these RGs are utilized to separate things like virtual machine workloads, network components, storage accounts, and other such items. That makes it easy to go directly to the desired area or workload to find or manage components within it.
A typical resource group naming convention is like the following: RG-Region-Type-SubType/Workload
Example: RG-West-VM-Identity, where RG indicates a Resource Group; West indicates the WestUS region; VM indicates that it contains virtual machines; and Identity indicates the “identity” workload.
Example: RG-West-Network, where RG indicates a Resource Group; West indicates the WestUS region; and Network indicates that it contains the Vnet components.
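The resource group convention above can be sketched as a small helper. This is only an illustration, not an Azure API: the function names `resource_group_name` and `parse_resource_group_name` are hypothetical and assume the dash-separated RG-Region-Type-Workload layout shown in the examples.

```python
from typing import Optional


def resource_group_name(region: str, rtype: str, workload: Optional[str] = None) -> str:
    """Build a name of the form RG-Region-Type[-Workload]."""
    parts = ["RG", region, rtype]
    if workload:  # the workload segment is optional (e.g. RG-West-Network)
        parts.append(workload)
    return "-".join(parts)


def parse_resource_group_name(name: str) -> dict:
    """Split a resource group name back into its labelled segments."""
    prefix, region, rtype, *rest = name.split("-")
    if prefix != "RG":
        raise ValueError(f"not a resource group name: {name!r}")
    return {"region": region, "type": rtype, "workload": rest[0] if rest else None}


print(resource_group_name("West", "VM", "Identity"))
print(resource_group_name("West", "Network"))
```

Generating names from one function, rather than typing them by hand, is one way to keep the convention consistent across deployment scripts.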
A typical virtual network naming convention is like the following: Vnet-Region-Type-SubType/Workload
Example: VNET-West, where VNET indicates a virtual network component and West indicates the WestUS region.
Example: VNET-West-GW, where VNET indicates a virtual network component; West indicates the WestUS region; and GW indicates that it is the gateway component.
Example: VNET-West-GW-IP, where VNET indicates a virtual network component; West indicates the WestUS region; GW indicates that it is the gateway component; and IP indicates that it is the IP address of the gateway.
Storage accounts use publicly accessible URLs, so they require a globally unique DNS name.
A typical storage account naming convention is like the following: [Entity][Region][Type][Workload].*.core.windows.net (or for Azure Government [Entity][Region][Type][Workload].*.core.usgovcloudapi.net)
Example: spnwwusvmid (https://spnwwusvmid.blob.core.windows.net), where spnw indicates the enterprise name; wus indicates it is located in the WestUS region; vm indicates it is for virtual machine disks; and id indicates that it is for the identity workload.
Example: spnweussql (https://spnweussql.blob.core.windows.net), where spnw indicates the enterprise name; eus indicates it is located in the EastUS region; and sql indicates it is for SQL data storage.
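Because storage account names become part of a public DNS name, it can help to validate them before deployment. The sketch below composes a name from the [Entity][Region][Type][Workload] segments and checks it against the Azure rule that storage account names are 3 to 24 characters of lowercase letters and digits. The helper names `storage_account_name` and `blob_endpoint` are hypothetical, and the check does not verify global uniqueness, which only Azure can confirm.

```python
import re

# Azure storage account names: 3-24 characters, lowercase letters and digits only.
_VALID_NAME = re.compile(r"[a-z0-9]{3,24}")


def storage_account_name(entity: str, region: str, stype: str, workload: str = "") -> str:
    """Concatenate [Entity][Region][Type][Workload] and validate the result."""
    name = f"{entity}{region}{stype}{workload}".lower()
    if not _VALID_NAME.fullmatch(name):
        raise ValueError(f"invalid storage account name: {name!r}")
    return name


def blob_endpoint(name: str, government: bool = False) -> str:
    """Return the blob URL for the commercial or Azure Government cloud."""
    suffix = "core.usgovcloudapi.net" if government else "core.windows.net"
    return f"https://{name}.blob.{suffix}"


name = storage_account_name("spnw", "wus", "vm", "id")
print(name, blob_endpoint(name))
```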
A typical virtual machine naming convention is like the following: [Region][Role][Number]
Example: wusdc01, where wus indicates the WestUS region; dc indicates it is a domain controller; and 01 indicates it is the first domain controller.
Example: wusadfs02, where wus indicates the WestUS region; adfs indicates it is an ADFS server; and 02 indicates it is the second server for this workload.
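The virtual machine convention lends itself to generating a numbered series of names. The following sketch assumes a two-digit sequence number, as in the examples, and enforces the 15-character limit on Windows computer names. The function names `vm_name` and `vm_names` are hypothetical.

```python
from typing import List

MAX_COMPUTER_NAME_LENGTH = 15  # Windows (NetBIOS) computer name limit


def vm_name(region: str, role: str, number: int) -> str:
    """Build a [Region][Role][Number] name with a zero-padded two-digit number."""
    name = f"{region}{role}{number:02d}".lower()
    if len(name) > MAX_COMPUTER_NAME_LENGTH:
        raise ValueError(f"{name!r} exceeds {MAX_COMPUTER_NAME_LENGTH} characters")
    return name


def vm_names(region: str, role: str, count: int) -> List[str]:
    """Return the names for the first `count` servers of a role."""
    return [vm_name(region, role, n) for n in range(1, count + 1)]


print(vm_names("wus", "dc", 2))
print(vm_names("wus", "adfs", 2))
```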
Let’s build a sample scenario of an enterprise SharePoint farm in Azure. This scenario includes a highly available SharePoint farm deployed in the WestUS region, with a disaster recovery farm deployed in the EastUS region. The regions are connected to the on-premises network via two Site-to-Site (S2S) VPNs (one to each region), and to each other via a VNET-to-VNET VPN that connects WestUS to EastUS. This last link is used for Domain Controller and SQL Always On replication.
Identity Workload (6 servers in West Region): (2) Domain Controllers; (2) load balanced ADFS servers; (2) load balanced Web Proxy servers
SharePoint Workload (6 servers in West Region): (2) load balanced SharePoint WFE servers; (2) SharePoint APP servers; (2) load balanced SQL servers w/Always On
Disaster Recovery Workload (4 servers in East Region): (1) Domain Controller; (1) SharePoint WFE server; (1) SharePoint APP server; (1) SQL Server w/Always On
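Capturing the scenario as data makes it easy to check the server counts before building anything. The sketch below records the three workloads listed above and totals them by workload and by region. The structure and the helper name `totals_by` are illustrative choices, not part of any Azure tooling.

```python
from collections import Counter

# (workload, region, role, server count), taken from the scenario description
SCENARIO = [
    ("Identity", "West", "Domain Controller", 2),
    ("Identity", "West", "ADFS", 2),
    ("Identity", "West", "Web Proxy", 2),
    ("SharePoint", "West", "SharePoint WFE", 2),
    ("SharePoint", "West", "SharePoint APP", 2),
    ("SharePoint", "West", "SQL Server", 2),
    ("Disaster Recovery", "East", "Domain Controller", 1),
    ("Disaster Recovery", "East", "SharePoint WFE", 1),
    ("Disaster Recovery", "East", "SharePoint APP", 1),
    ("Disaster Recovery", "East", "SQL Server", 1),
]


def totals_by(column: int) -> dict:
    """Sum server counts grouped by the given column (0 = workload, 1 = region)."""
    totals = Counter()
    for row in SCENARIO:
        totals[row[column]] += row[3]
    return dict(totals)


by_workload = totals_by(0)
by_region = totals_by(1)
print(by_workload)
print(by_region)
```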
The RGs that we have defined for this scenario are as follows:
| Resource Group | Description |
|---|---|
| RG-West-VM-Identity | Contains identity VMs and their storage (DCs, ADFS, Proxy) |
| RG-West-VM-SharePoint | Contains SharePoint VMs (WFE, APP) |
| RG-West-VM-Database | Contains SQL database VMs (SQL) |
| RG-West-Network | Contains network-related components (Vnet, S2S VPNs, public IPs, load balancers) |
| RG-East-VM-Identity | Contains identity VMs and their storage (DC) |
| RG-East-VM-SharePoint | Contains SharePoint VMs (WFE, APP) |
| RG-East-VM-Database | Contains SQL database VMs (SQL) |
| RG-East-Network | Contains network-related components (Vnet, public IP, etc.) |
The networking components defined for this scenario are as follows:
| Component | Description |
|---|---|
| Vnet-West | The virtual network configuration (IP ranges, subnets, etc.) |
| Vnet-West-GW | The virtual network gateway |
| Vnet-West-GW-IP | The public IP address of the gateway |
| Vnet-West-GW-Local | The local (on-premises) gateway configuration (IP address, connection type, etc.) |
| Vnet-West-Vnet-East-Connection | The S2S VPN connecting WestUS to EastUS |
| Vnet-West-Local-Connection | The S2S VPN connecting WestUS to on-premises |
| PLB-West-ADFSProxy | The public load balancer for the ADFS proxy servers |
| PLB-West-ADFSProxy-IP | The public IP address of the load balancer |
| ILB-West-ADFS | The internal load balancer for the ADFS servers |
| PLB-West-SP | The public load balancer for the SharePoint WFEs |
| PLB-West-SP-IP | The public IP address of the load balancer |
| Vnet-East | The virtual network configuration (IP ranges, subnets, etc.) |
| Vnet-East-GW | The virtual network gateway |
| Vnet-East-GW-IP | The public IP address of the gateway |
| Vnet-East-GW-Local | The local (on-premises) gateway configuration (IP address, connection type, etc.) |
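The network component names in this table follow a regular pattern, so they can be derived from the region and workload rather than typed individually. The sketch below is one way to do that. All function names are hypothetical, and it assumes, as in the table, that only public load balancers (PLB) get a companion public IP resource.

```python
from typing import List


def vnet_name(region: str) -> str:
    return f"Vnet-{region}"


def gateway_components(region: str) -> List[str]:
    """Gateway, its public IP, and the local (on-premises) gateway definition."""
    base = f"{vnet_name(region)}-GW"
    return [base, f"{base}-IP", f"{base}-Local"]


def vnet_to_vnet_connection(source: str, target: str) -> str:
    return f"{vnet_name(source)}-{vnet_name(target)}-Connection"


def local_connection(region: str) -> str:
    return f"{vnet_name(region)}-Local-Connection"


def load_balancer_components(kind: str, region: str, workload: str) -> List[str]:
    """kind is 'PLB' (public) or 'ILB' (internal); only a PLB has a public IP."""
    if kind not in ("PLB", "ILB"):
        raise ValueError(f"unknown load balancer kind: {kind!r}")
    name = f"{kind}-{region}-{workload}"
    return [name, f"{name}-IP"] if kind == "PLB" else [name]


print(gateway_components("East"))
print(vnet_to_vnet_connection("West", "East"))
print(load_balancer_components("PLB", "West", "SP"))
```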
Storage accounts (S/A) are publicly addressable locations where your virtual hard disks (VHDs) and other data types are stored. Their IOPS limits depend on the type of storage. A standard S/A is limited to 20K IOPS and uses HDDs with a maximum of 500 IOPS per disk. A premium S/A is limited to 100K IOPS and uses SSDs with a maximum of 5,000 IOPS per disk. As such, it is recommended to spread your VHDs across several S/As so that your VMs can reach their maximum data transfer speeds. The storage accounts defined for this scenario are as follows:
| Storage Account | Description |
|---|---|
| spnwwusvmid | West Identity VM S/A |
| spnwwusvmsp | West SharePoint VM S/A |
| spnwwusvmdb | West SQL data VM S/A |
| spnwwusvmdiag | West VM diagnostics S/A |
| spnweusvmid | East Identity VM S/A |
| spnweusvmsp | East SharePoint VM S/A |
| spnweusvmdb | East SQL data VM S/A |
| spnweusvmdiag | East VM diagnostics S/A |
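The reason for splitting VHDs across several storage accounts follows from simple arithmetic on the IOPS limits quoted above. A rough sketch, using only those figures (current Azure limits may differ, so treat the constants as assumptions taken from this article):

```python
import math

# IOPS figures as quoted in this article; verify against current Azure limits.
LIMITS = {
    "standard": {"account_iops": 20_000, "disk_iops": 500},
    "premium": {"account_iops": 100_000, "disk_iops": 5_000},
}


def max_disks_per_account(tier: str) -> int:
    """How many disks can run at full speed before the account limit is hit."""
    limit = LIMITS[tier]
    return limit["account_iops"] // limit["disk_iops"]


def accounts_needed(disk_count: int, tier: str) -> int:
    """Minimum number of storage accounts so no disk is throttled by the account."""
    return math.ceil(disk_count / max_disks_per_account(tier))


print(max_disks_per_account("standard"))  # 20,000 / 500
print(max_disks_per_account("premium"))   # 100,000 / 5,000
print(accounts_needed(50, "standard"))
```

With these figures, a standard account saturates at 40 fully loaded disks and a premium account at 20, so a deployment with 50 busy standard disks would need at least two accounts.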
Availability sets group identical server workloads together to provide high availability in Azure. To qualify for a Service Level Agreement (SLA), each virtual machine workload must contain at least two servers in an availability set, or single-instance machines must use premium storage for their virtual hard disks. As such, the best practice is to run two servers for each critical workload. In addition, we will add load balancers to these workloads where required.
For this scenario, the WestUS region is our primary location. The EastUS region will only be used if a disaster occurs in the WestUS region, so a few single-instance VMs in the East are sufficient. If desired, a fully redundant and high-performing infrastructure could be built in the East as well, including the full resilient ADFS identity workload.
The availability sets defined for this scenario are as follows:
| Availability Set | Description |
|---|---|
| AS-DC | The A/S for the domain controllers |
| AS-ADFS | The A/S for the ADFS servers |
| AS-ADFSPXY | The A/S for the ADFS proxy servers |
| AS-SPWFE | The A/S for the SharePoint web front end servers |
| AS-SPAPP | The A/S for the SharePoint app servers |
| AS-SPSQL | The A/S for the SharePoint SQL servers |
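The SLA rule described above (at least two servers in an availability set, or a single instance on premium storage) can be expressed as a small check over the planned availability sets. This is a sketch of the rule as stated in this article. The function names are hypothetical, and the West server counts come from the scenario description.

```python
from typing import Dict, List


def meets_sla(vm_count: int, premium_storage: bool = False) -> bool:
    """Two or more VMs in the set qualify; a single VM qualifies only on premium storage."""
    if vm_count >= 2:
        return True
    return vm_count == 1 and premium_storage


def sets_below_sla(sets: Dict[str, int], premium_storage: bool = False) -> List[str]:
    """Return the names of availability sets that would not qualify for the SLA."""
    return [name for name, count in sets.items() if not meets_sla(count, premium_storage)]


# Server counts per availability set in the WestUS region of this scenario.
WEST_AVAILABILITY_SETS = {
    "AS-DC": 2, "AS-ADFS": 2, "AS-ADFSPXY": 2,
    "AS-SPWFE": 2, "AS-SPAPP": 2, "AS-SPSQL": 2,
}

print(sets_below_sla(WEST_AVAILABILITY_SETS))
print(sets_below_sla({"AS-DC": 1}))
print(sets_below_sla({"AS-DC": 1}, premium_storage=True))
```

The single-server case mirrors the East region design choice: one VM per role only qualifies if its disks sit on premium storage.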
All of the virtual machine components are listed in the following table:
| VM | IP | Subnet | L/B | Avail Set | VM Size | Resource Group | Storage Account |
The build out of this scenario workload is fully recorded for your review. These videos highlight all the key components of this document so that you can understand how it all comes together in Azure.
2) Creating Virtual Network and VPN Connections in ARM (30 minutes)
3) Creating Basic Virtual Machines in the Azure Portal (23 minutes)
4) Creating Advanced Virtual Machines in the Azure Portal (16 minutes)