Have AzCopy, Will Travel
[Update: Azure uses sparse storage, so only the bits that represent user data are transferred. When transferring disk VHDs that are less than full, this makes the throughput look much better than it otherwise would.]
As some of the Azure Government blog posts have alluded to, AzCopy is the preferred way to move files around. Many of us are using Azure credits from our MSDN development accounts, public Azure Pay-As-You-Go accounts, and public Azure EA accounts. This means that most of us will need to move our data files, disk images, DB backups or VM snapshots into our Azure Government accounts (or vice versa). In the case of data disks, or VHDs, these files might be hundreds of GBs in size. A slow undertaking, right? Happily, it’s not so bad with a cloud-to-cloud AzCopy.
AzCopy can move files around in almost any scenario (e.g., on-prem to cloud, cloud to on-prem, across storage accounts, etc.). Full details of the complete tool are located here. In this post, I’d like to cover the cross-cloud scenario (i.e., Azure Public/Azure Government).
You can download the GA version of AzCopy here. This will install the app as an add-on tool for the Azure SDK. AzCopy is a command line tool, so you will need to find its install folder and run it from there. I recommend that you add that folder to your PATH environment variable. This will allow you to use it no matter how deep you are buried in your system’s folder structure. On my system it worked out to be: %ProgramFiles(x86)%\Microsoft SDKs\Azure\AzCopy. It will vary depending on the install that you use (e.g., 32-bit, 64-bit, preview, etc.).
At its most basic, the command looks like this:
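If you want to check whether AzCopy is already reachable before touching your PATH, a small script can do the lookup for you. This is only a sketch: the function name `locate_azcopy` is my own, and the fallback folder is simply the one from my system mentioned above, so yours may differ.

```python
import os
import shutil


def locate_azcopy(exe_name="AzCopy"):
    """Return the full path to the AzCopy executable, or None if not found."""
    # First, let the OS resolve the name through the PATH variable.
    found = shutil.which(exe_name)
    if found:
        return found

    # Fall back to the install folder from this post (an assumption;
    # it varies with the installer you used).
    program_files = os.environ.get("ProgramFiles(x86)")
    if program_files:
        candidate = os.path.join(
            program_files, "Microsoft SDKs", "Azure", "AzCopy", exe_name + ".exe"
        )
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(locate_azcopy() or "AzCopy not found; add its folder to PATH")
```

If the script prints a path from the fallback folder rather than from PATH, that is the folder to add to your PATH variable.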
AzCopy /Source:<source> /Dest:<destination> /Pattern:<filepattern> [Options]
For the example in this post, I will perform a simple copy from my public Azure storage account to my Azure Government storage account. I will move a complete VM image of my Visual Studio 2015 RC VM (a full 127GB). Here is the command I used:
Azcopy /source:https://[my public storage account].blob.core.windows.net/vhds/ /dest:https://[my government storage account].blob.core.usgovcloudapi.net/vhds /sourcekey:[my public storage key] /destkey:[my government storage key] /pattern:skdev2015RC-20150530-764249-os-2015-05-30.vhd
Here is the layout for the parameters:
- source: this is the URI to the storage account holding the files you want to copy
- dest: this is the URI for the storage account where you want the files to be copied to
- pattern: this is the file match pattern that will identify the files that will be copied. In my example, I just used the entire filename of my Virtual Hard Drive (VHD), but there are various wildcard and pattern matching techniques that can be used.
- sourcekey/destkey: this is the primary or secondary key for the respective storage accounts. The keys are available in the storage account section within the Azure Portal.
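If you script your copies, the parameters above can be assembled into a command line programmatically. Here is a minimal sketch, assuming only the switches used in this post; the function name `build_azcopy_command` and the placeholder values are hypothetical, and nothing is actually executed.

```python
def build_azcopy_command(source, dest, source_key, dest_key, pattern):
    """Assemble the AzCopy argument list from the parameters described above."""
    return [
        "AzCopy",
        f"/Source:{source}",
        f"/Dest:{dest}",
        f"/SourceKey:{source_key}",
        f"/DestKey:{dest_key}",
        f"/Pattern:{pattern}",
    ]


if __name__ == "__main__":
    # Placeholder values only; substitute your own accounts and keys.
    cmd = build_azcopy_command(
        source="https://<public-account>.blob.core.windows.net/vhds/",
        dest="https://<gov-account>.blob.core.usgovcloudapi.net/vhds",
        source_key="<public-storage-key>",
        dest_key="<gov-storage-key>",
        pattern="mydisk.vhd",
    )
    print(" ".join(cmd))
```

To actually run the copy you could hand the list to `subprocess.run(cmd, check=True)`. Keeping the keys out of the script itself (for example, reading them from environment variables) is a good idea, since anyone holding a storage key has full access to the account.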
Once executed, the console will display a transfer speed number; however, this number is a sample of the transfer on individual blocks and is typically well below the actual overall throughput (read further for the real numbers).
When the transfer is complete you will see the following:
This is the key point I’d like to draw your attention to—the copy throughput between clouds. I’ll use a simple calculator to demonstrate:
My 127GB VHD (w/sparse storage) blazed through at 327 Mbps. Not bad! Your mileage may vary, but this is going to be the most cost-effective technique for most.
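The throughput arithmetic is easy to reproduce for your own transfers: bytes times eight, divided by the elapsed seconds, gives bits per second. A rough sketch follows; the function names are mine, the example figures are made up for illustration (they are not the numbers from my transfer), and it assumes decimal megabits (1 Mbps = 1,000,000 bits per second).

```python
def throughput_mbps(bytes_transferred, elapsed_seconds):
    """Overall throughput in megabits per second (1 Mbps = 1,000,000 bits/s)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    bits = bytes_transferred * 8
    return bits / elapsed_seconds / 1_000_000


def transfer_seconds(size_bytes, mbps):
    """Estimated time to move size_bytes at a sustained rate of mbps."""
    if mbps <= 0:
        raise ValueError("mbps must be positive")
    return size_bytes * 8 / (mbps * 1_000_000)


if __name__ == "__main__":
    # Hypothetical example: 10 GB (decimal) moved in 400 seconds.
    print(throughput_mbps(10_000_000_000, 400))  # 200.0
```

Keep in mind that if you divide the nominal disk size by the elapsed time, sparse storage will flatter the result, because only the bytes that hold user data actually cross the wire.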
Items to note:
- Azure uses sparse storage, so only the bits that represent user data are transferred. When transferring disk VHDs that are less than full, this makes the throughput look much better than it otherwise would.
- The actual throughput numbers may be more than double the number the console shows.
- The internals of AzCopy use the Azure Copy Blob REST API. If your endpoints are HTTPS (the default), your transport will be secured over SSL.
- There isn’t an SLA for the AzCopy tool, and depending on network conditions, throughput could vary considerably. For most, though, this is a great tool for copying files between Azure Commercial/Gov.
- Egress traffic from the source location is billable.
- When copying disks, be sure they are detached and/or not online (this will ensure the data is in a consistent state).