If you’re coming from on-premises, you know that before you make any changes to a virtual machine (updates, upgrades, configuration changes, etc.), you take a snapshot of the disk just in case something fails. This is something we’ve all ignored at some point, until it bit us badly and we started taking snapshots for every change.
The problem in the cloud is that you do not have the same easy way to snapshot and restore a VM when something goes wrong. This changed in Azure with the introduction of the OS Disk swap feature, which lets administrators snapshot a virtual machine's OS disk, do their thing, and if anything goes wrong, simply swap the known-good disk back in.
As I mentioned above, if anything goes wrong with a VM that you’re maintaining, you have a few options available depending on the situation. Before OS Disk swap existed, your only ways of fixing a broken VM were to restore it from Azure Backup, download the VHD and hope to fix it from a Hyper-V machine, or, worst case, redeploy it. Now you have the option of just taking a snapshot of the VM, doing your stuff, and if something happens, swapping in the good disk 🙂
How do I do it?
Swapping the OS disk is a simple operation that can be done via PowerShell or the CLI (no portal support yet). You need the latest AzureRM PowerShell module installed on your computer, or you can just use the Azure Cloud Shell – shell.azure.com
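If you go the local route, getting set up looks roughly like this (a quick sketch):

#Install (or update) the AzureRM module and sign in to your subscription
Install-Module -Name AzureRM -Scope CurrentUser
Login-AzureRmAccount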
For this example, I will use my WorkStation VM in Azure located in the Workstation RG
Let’s start by setting the VM object in a variable:
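A minimal sketch of that step, assuming the VM already uses a managed OS disk and naming the snapshot SNAP so it matches the code that follows:

#Grab the VM object and store it in a variable
$VM = Get-AzureRmVM -ResourceGroupName Workstation -Name WorkStation

#Snapshot the OS disk before touching anything
$snapshotConfig = New-AzureRmSnapshotConfig -SourceUri $VM.StorageProfile.OsDisk.ManagedDisk.Id -Location $VM.Location -CreateOption Copy
New-AzureRmSnapshot -Snapshot $snapshotConfig -SnapshotName SNAP -ResourceGroupName $VM.ResourceGroupName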
Let’s assume that you utterly broke the VM, beyond repair, so now you have to start the OS disk swap procedure. This part requires stopping the VM.
#Stop the VM and create a new managed disk from the snapshot
Stop-AzureRmVM -Name $VM.Name -ResourceGroupName $VM.ResourceGroupName
$genMSDiskFromSnapshot = Get-AzureRmSnapshot -SnapshotName SNAP -ResourceGroupName $VM.ResourceGroupName
$newDiskConfig = New-AzureRmDiskConfig -AccountType Premium_LRS -Location $VM.Location -CreateOption Copy -SourceResourceId $genMSDiskFromSnapshot.Id
$newMSDisk = New-AzureRmDisk -DiskName newOSDisk -Disk $newDiskConfig -ResourceGroupName $VM.ResourceGroupName
#Swap the new OS disk into the VM configuration and push the change
Set-AzureRmVMOSDisk -VM $VM -ManagedDiskId $newMSDisk.Id -Name $newMSDisk.Name
Update-AzureRmVM -VM $VM -ResourceGroupName $VM.ResourceGroupName
The Update command will start up the VM and you will have reverted a disaster 🙂
The managed disk was created from the snapshot and the swap completed successfully.
After you RDP / SSH to the VM and validate that everything is working as before, you can take the old disk, play around with it in a Hyper-V instance to see what happened, or just delete it.
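If you want to pull the old disk down for a Hyper-V post-mortem, or get rid of it, a rough sketch could look like this (oldOSDisk is a placeholder for whatever your original OS disk is called):

#Generate a temporary read-only SAS URL so you can download the old OS disk as a VHD
Grant-AzureRmDiskAccess -ResourceGroupName Workstation -DiskName oldOSDisk -Access Read -DurationInSecond 3600

#...or simply delete it once you're sure you no longer need it
Remove-AzureRmDisk -ResourceGroupName Workstation -DiskName oldOSDisk -Force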
In an earlier blog post, I talked about what a managed disk is and why you should use it, and in this post, we will cover how easy it is to move your existing virtual machines from the storage account model to the managed disk model.
The first thing that you need to do is to plan for failure. Yes, you heard that right. You need to be prepared for things to go wrong so you can have a plan to recover.
Planning and taking action, in this case, is simple. You need to have a recent backup of the VM and to plan for downtime. The conversion process from regular storage disks to managed disks is not an online operation so your VMs need a reboot. If you have VMs in availability sets then this is quite simple as you’re going to take them down one at a time.
Another thing you need to check is your VM extensions: if you have any, all of them should be in a success state, otherwise the conversion will fail.
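A quick way to check that is a sketch along these lines (VM and resource group names are placeholders):

#List each extension and its provisioning status - everything should report as succeeded
$vmStatus = Get-AzureRmVM -ResourceGroupName myResourceGroup -Name myVM -Status
foreach ($ext in $vmStatus.Extensions) {
    $ext.Name
    $ext.Statuses | Select-Object Code, DisplayStatus
}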
After you have a plan to recover in case of a failure, it’s time to convert the VMs.
The VM I have created for this example is pretty simple. It’s a single VM which has an OS Disk and a Data Disk in a Storage Account.
The conversion process.
In order to convert the VM, you will need to turn to PowerShell so you can run some Azure cmdlets.
The first thing you need to make sure of is that you have the latest version of the AzureRM PowerShell cmdlets, otherwise this will not work.
Select-AzureRmSubscription – Selects the subscription in which you’re going to perform the conversion.
Stop-AzureRmVM – As the name says, this stops (deallocates) the VM (remember the planning phase).
ConvertTo-AzureRmVMManagedDisk – This cmdlet converts the VM from regular storage to Managed Disks and starts it back up once the conversion finishes (see the sketch after this list).
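Put together, a minimal sketch for a single VM looks like this (subscription, VM, and resource group names are placeholders):

#Select the subscription that holds the VM
Select-AzureRmSubscription -SubscriptionName "My Subscription"

#Deallocate the VM - the conversion is an offline operation
Stop-AzureRmVM -ResourceGroupName myResourceGroup -Name myVM -Force

#Convert all of the VM's disks to managed disks; the VM is started once the conversion finishes
ConvertTo-AzureRmVMManagedDisk -ResourceGroupName myResourceGroup -VMName myVM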
If you have multiple VMs in an availability set then the first thing you need to do is to convert the Availability Set to support Managed Disks. You can do that with the code down below.
Update-AzureRmAvailabilitySet – This command converts the availability set to support managed disks. It will not disrupt the running VMs in that availability set, so you can run it without issues.
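A minimal sketch of that scenario, with placeholder names, could look like this:

#Convert the availability set to an "aligned" (managed) availability set
$avSet = Get-AzureRmAvailabilitySet -ResourceGroupName myResourceGroup -Name myAvSet
Update-AzureRmAvailabilitySet -AvailabilitySet $avSet -Sku Aligned

#Then convert the VMs in the set, one at a time
foreach ($vmInfo in $avSet.VirtualMachinesReferences) {
    $vmName = $vmInfo.Id.Split("/")[-1]
    Stop-AzureRmVM -ResourceGroupName myResourceGroup -Name $vmName -Force
    ConvertTo-AzureRmVMManagedDisk -ResourceGroupName myResourceGroup -VMName $vmName
}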
Once the conversion process is done, the VMs boot back up and you’re golden. Just verify that everything is OK and then delete the old storage account.
That’s the whole process. If you encounter any errors during the conversion phase, just run the cmdlet again to unblock the process. This can happen due to a transient error on Azure’s side and all it needs is a re-run.
In my last blog post, I talked about why we should stop using regular storage accounts for our IaaS VMs and why we should use Managed Disks instead. In today’s blog post, I will talk about how you can modify your existing ARM templates that deploy your VMs to use Managed Disks from now on.
Let’s take a look at a regular storage account based ARM Template:
We have a resources block where we specify a storage account, and we use that resource to create an OS Disk and a Data Disk for the particular VM.
If we want to add more disks, we copy-paste what’s inside the dataDisks array a couple of times, modify the LUN and the name, and we’re happy.
Converting the template to the managed disk format is pretty easy. You first need to reference Compute API version 2016-04-30-preview or a later version in the template (do note that we’re not modifying the Storage API), and never use -preview versions in your production templates!
You then change the storage profile to reference managed disks, as shown in the code snippet below:
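A rough sketch of what a managed disk storageProfile can look like (disk names, sizes, and the image reference are placeholders):

"storageProfile": {
  "imageReference": {
    "publisher": "MicrosoftWindowsServer",
    "offer": "WindowsServer",
    "sku": "2016-Datacenter",
    "version": "latest"
  },
  "osDisk": {
    "name": "[concat(parameters('vmName'), '-osdisk')]",
    "createOption": "FromImage",
    "managedDisk": { "storageAccountType": "Standard_LRS" }
  },
  "dataDisks": [
    {
      "name": "[concat(parameters('vmName'), '-datadisk1')]",
      "lun": 0,
      "diskSizeGB": 128,
      "createOption": "Empty",
      "managedDisk": { "storageAccountType": "Standard_LRS" }
    }
  ]
}

Compared to the storage account version, the vhd/uri references are gone, a managedDisk block with a storageAccountType takes their place, and the storage account resource can be dropped from the resources block entirely.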
Every service that you use in Azure uses storage. We want everything that we create in Azure to be persistent, because if it were temporary, we would have a problem. When you create a virtual machine, you need to create one or two storage accounts where the VM’s disks and diagnostic data will sit. This worked well for a while, but at scale it becomes an issue for high availability, disaster recovery, and even disk maintenance.
To solve the problems from above and all other VM storage-related problems, Microsoft announced a new type of offering called “Managed Disks.”
Why were storage accounts a problem for Virtual Machines?
For starters: using the ARM model, you would create VMs in a resource group and select one single storage account where all those VM disks would sit. The problem is that when you provision a storage account, you’re basically tying it to a storage stamp (cluster), so you get limited performance and scalability. Storage in Azure has the biggest SLA when compared to other services, but it’s not 100%, and mistakes/issues do happen. When you save all your OS and data disks in one storage account, you have all your eggs in one basket, which you should know by now is a bad thing 🙂
To make matters worse, each storage account has a limited amount of IOPS and can hold only a limited number of disks, e.g., 30000 IOPS for standard storage and 50000 IOPS for premium storage. With ten premium storage disks, you would hit the storage account limits.
Usually, people created 5, 10, or 20 VMs in a single storage account and then had performance issues because, at peak, those VMs consumed all the IOPS of that storage account. Performance wasn’t the only problem, though; the other was availability. You would create a couple of VMs, put them in an availability set for that 99.95% SLA, and then a storage issue would happen and all of them would go down. Why? Because your storage account was created on a single storage cluster, which had a problem.
The solution to most of the performance/availability problems was to architect your VM deployments in such a way as to benefit from multiple storage accounts. You would create, for example, five storage accounts and use them in a mesh so that if one of them went down you were still online, but that added more work for the IT admin, who had to manage more than one storage account.
Another problem was security. You would have all your eggs in one basket and no easy way to grant access to that storage account: if you allowed access to it, anybody would be able to download what’s in it, and you had no way to assign permissions in the granular way that ARM allows. That’s one aspect of the problem. The other is that storage accounts have access keys, and anybody who has those keys has access to all the contents inside the account.
How does Managed Disks solve all of those problems?
The idea of Managed Disks was to simplify IaaS VM management and security by removing storage accounts from the equation. When you want to create a VM with a managed disk, you specify the type (Standard or Premium) and Azure does all the work for you. No more architecting and managing storage accounts, no more scalability issues, and best of all, each managed disk is a proper ARM resource in the Azure portal, granting you the ability to apply RBAC rules to it.
By using Managed Disks, you would get some excellent benefits like:
1. Independent resource management from a security and operation perspective
2. No single point of failure
3. Copy disks instantly inside the same region.
4. Share images without storage account copy operations.
This is just the tip of the iceberg.
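To make the independent-resource part concrete, here is a minimal sketch (placeholder names and region) of creating a standalone managed data disk and attaching it to an existing VM:

#Create a standalone managed data disk - its own ARM resource, no storage account involved
$diskConfig = New-AzureRmDiskConfig -Location "westeurope" -DiskSizeGB 128 -AccountType Standard_LRS -CreateOption Empty
$dataDisk = New-AzureRmDisk -ResourceGroupName myResourceGroup -DiskName "myVM-data1" -Disk $diskConfig

#Attach it to an existing VM in the same region
$vm = Get-AzureRmVM -ResourceGroupName myResourceGroup -Name myVM
$vm = Add-AzureRmVMDataDisk -VM $vm -Name "myVM-data1" -ManagedDiskId $dataDisk.Id -Lun 0 -CreateOption Attach
Update-AzureRmVM -VM $vm -ResourceGroupName myResourceGroup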
Now, from a price standpoint, things changed a bit. With regular standard storage accounts (not premium), you were billed on a pay-as-you-go model, meaning that if I used 1 GB, I paid for 1 GB even though I had provisioned a 1 TB disk for my VM. Managed disks don’t have the same billing model. With them, you pay a fixed amount for each disk you provision: if you provision ten 1 TB disks, you pay for ten 1 TB disks at today’s rate, regardless of how much data is on them. If you’ve used premium disks, you already know this pricing model. Standard managed disks won’t cost the same as premium ones, but they are likewise a fixed price depending on the size of the disk.
From a business standpoint, this makes much more sense because you can price VMs much more easily. You have one VM and N disks that cost X. No more storage transactions or any other “hidden” costs.
I won’t reference any price because these change and this would become outdated information faster than I press submit 🙂
The size model is almost identical to the premium disk offering: you have S4, S6, S10, S20, and S30 standard disks, and for reference, I have attached a screenshot with standard and premium disk performance and limits.
As you can see, HDD storage is marked with an S (Standard) and SSD storage is marked as before with P (Premium). So if I want a 1 TB HDD storage disk, then I would go for an S30 disk. That simple.
Talking about simplicity, creating a VM with a managed disk is as easy as 1, 2, 3. When you create a VM from the portal, there is a step where you configure networking, and in that step, you will be asked if you want to use managed disks. If you select Yes, then that’s it 🙂
From a high availability standpoint, you are no longer obligated to put all your eggs in one basket or to do storage account design to get high availability in case of an underlying storage issue. When you create a managed disk, you provision that much space on a storage stamp, and when you create another one, it will go to a different stamp, thus giving you multiple fault domains.
This can be controlled by using availability sets. When you specify an availability set for your deployment, the fault domains for your VMs will also be applied to your Managed Disks.
There is one catch though; Managed Disks are LRS (Locally Redundant Storage) which means that if you need GRS (Geo-Redundant Storage), then you’re still stuck with storage accounts.
I will end this right here, as I don’t want to add too much of a wall of text to a single blog post. I suggest that you give Managed Disks a try and see what you get 🙂