PowerShell Scripts to add / remove data disks on Azure VMs

First things first. Happy New Year!

So after I finished a long waking-up cycle, I remembered working on a project that involved migrating some SQL workloads to Azure, and those workloads required a high amount of IOPS in order to perform optimally. After testing the storage system of the on-premises servers, I found that they were capable of delivering about 2000 IOPS, which is not much, and that was with 15K RPM spindles configured in RAID 5. Achieving 2000 IOPS in Azure is very easy from a hardware perspective. On a Standard-tier VM, one data disk can offer 500 IOPS, so you would need four data disks (4 x 500 = 2000) and a software RAID that stripes the data across all of them. This can be done with Storage Spaces on Windows or with MDADM on Linux.
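The disk-count arithmetic above can be sketched as a tiny shell helper. This is only an illustration: the function name `disks_needed` is made up for this example, and it assumes IOPS scale linearly with the number of striped disks, which is the simplification the paragraph above uses.

```shell
# disks_needed TARGET_IOPS IOPS_PER_DISK
# Prints the smallest number of striped disks whose combined IOPS
# reaches the target (integer ceiling division).
disks_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

disks_needed 2000 500   # prints 4
```

Rounding up matters: a target of 2001 IOPS would already need a fifth 500 IOPS disk.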

Now here’s the fun part with Azure. When you’re using a Standard Storage Account (HDD storage), you pay on a usage basis. That means that if you have a VM with 10 TB of storage and you’re using only 10 GB of it, you pay only for those 10 GB. Not more, not less. So if the VM can have 16 data disks, there is no reason not to use that to your advantage: 16 data disks times 500 IOPS each equals 8000 IOPS, and you’re still only paying for what you use.

If you want more than 500 IOPS per disk, you can deploy your solution on SSD storage using Premium Storage Accounts and get up to 5000 IOPS per data disk, or up to 80000 IOPS on a GS5 VM, but the pricing model is different. Premium storage is priced per GB provisioned, used or not. There are three disk types, P10, P20 and P30, where P10 offers 500 IOPS, P20 offers 2300 IOPS and P30 offers 5000 IOPS. So if you attach ten P30 disks to a VM, you’re going to pay for all of them even if you’re using only 10 GB.
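To make the difference between the two pricing models concrete, here is a rough shell sketch. The function name `billed_gb` is hypothetical, and it deliberately models only the "used vs. provisioned" rule described above, not actual prices.

```shell
# billed_gb MODEL PROVISIONED_GB USED_GB
# Prints the number of GB you are charged for under each pricing model.
billed_gb() {
    case "$1" in
        standard) echo "$3" ;;   # Standard: pay only for what is used
        premium)  echo "$2" ;;   # Premium: pay for everything provisioned
        *) echo "unknown model: $1" >&2; return 1 ;;
    esac
}

billed_gb standard 10240 10   # prints 10
billed_gb premium  10240 10   # prints 10240
echo $(( 16 * 500 ))          # 16 Standard disks striped together: prints 8000
```

With 10 TB (10240 GB) attached and 10 GB used, Standard bills 10 GB while Premium bills the full 10240 GB.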

If you want more information about Azure Standard or Premium Storage, you can read about it here: https://azure.microsoft.com/en-us/pricing/details/storage/

The scripts below handle both ARM (Azure Resource Manager) and ASM (Azure Service Management) deployments, and you can use them to add either a specific number of data disks or the maximum number the VM size supports. The scripts that remove the data disks are meant for proof-of-concept deployments, where you add a bunch of disks, remove them, change the sizes and so on, so I don't recommend using them on production deployments.

Scripts for adding and removing data disks using Azure Service Management (Classic):

#requires -Version 2 -Modules Azure
function Add-AzureVMDataDisks
{
    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 0)]
        [String]
        $CloudService,
        
        [Parameter(Mandatory = $true, Position = 1)]
        [String]
        $VMname,
        
        [Parameter(Mandatory = $true, ParameterSetName = 'MaxDisks', Position = 2)]
        [Switch]
        $MaxDisks,
        
        [Parameter(Mandatory = $true, ParameterSetName = 'NrOfDisks', Position = 2)]
        [Int]
        $NoOfDisks,
        
        [Parameter(Mandatory = $false, Position = 3)]
        [Int]
        $DiskSize = 1023,
        
        [Parameter(Mandatory = $false, Position = 4)]
        [String]
        [ValidateSet('None','ReadOnly', 'ReadWrite')]
        $HostCaching = 'None'
        
    )
    
    
    $VM = Get-AzureVM -ServiceName $CloudService -Name $VMname 
    $GetMaxDataDiskCount = Get-AzureRoleSize -InstanceSize $VM.InstanceSize
    
    if ($MaxDisks)
    {
        $NoOfDisks = $GetMaxDataDiskCount.MaxDataDiskCount
    }
        
    if ($NoOfDisks -gt $GetMaxDataDiskCount.MaxDataDiskCount)
    {
        Write-Error -Message "The VM does not support $NoOfDisks data disks. Please reduce the number to $($GetMaxDataDiskCount.MaxDataDiskCount) or fewer."
        # 'return' exits only this function; 'break' outside a loop can also stop the calling script
        return
    }
    
    $GetStorageURI = (Get-AzureOSDisk -VM $VM).MediaLink.Host

    for ($i = 0; $i -lt $NoOfDisks; $i++)
    {
        $DiskName = "$VMname-datadisk" + $i.ToString()
        # Pass the -HostCaching parameter through instead of hardcoding 'None'
        Add-AzureDataDisk -VM $VM -CreateNew -DiskSizeInGB $DiskSize -DiskLabel $DiskName -LUN $i -MediaLocation "https://$GetStorageURI/vhds/$DiskName.vhd" -HostCaching $HostCaching
    }
    $VM | Update-AzureVM
}
function Remove-AzureVMDataDisks
{
    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 0)]
        [String]
        $CloudService,
        
        [Parameter(Mandatory = $true, Position = 1)]
        [String]
        $VMname
    )
    
    $VM = Get-AzureVM -ServiceName $CloudService -Name $VMname
    
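    # Wrap in @() so a single disk is still treated as a collection, and use each
    # disk's own LUN rather than assuming the LUNs are numbered 0..Count-1
    $GetDataDisks = @($VM | Get-AzureDataDisk)
    foreach ($DataDisk in $GetDataDisks)
    {
        Remove-AzureDataDisk -VM $VM -LUN $DataDisk.Lun -DeleteVHD
    }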
    
    $VM | Update-AzureVM
}

Scripts for adding and removing data disks using Azure Resource Manager (IaaS V2):

#requires -Version 3 -Modules AzureRM.Compute
function Add-AzureRMVMDataDisks
{
    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 0)]
        [String]
        $ResourceGroup,
        
        [Parameter(Mandatory = $true, Position = 1)]
        [String]
        $VMname,
        
        [Parameter(Mandatory = $true, ParameterSetName = 'MaxDisks', Position = 2)]
        [Switch]
        $MaxDisks,
        
        [Parameter(Mandatory = $true, ParameterSetName = 'NrOfDisks', Position = 2)]
        [Int]
        $NoOfDisks,
        
        [Parameter(Mandatory = $false, Position = 3)]
        [Int]
        $DiskSize = 1023,
        
        [Parameter(Mandatory = $false, Position = 4)]
        [String]
        [ValidateSet('None','ReadOnly', 'ReadWrite')]
        $HostCaching = 'None'
        
    )
    
    $VM = Get-AzureRmVM -Name $VMname -ResourceGroupName $ResourceGroup
    $GetMaxDataDiskCount = Get-AzureRmVMSize -Location $VM.Location | Where-Object -Property Name -EQ -Value $VM.HardwareProfile.VmSize
    
    if ($MaxDisks)
    {
        $NoOfDisks = $GetMaxDataDiskCount.MaxDataDiskCount
    }
        
    if ($NoOfDisks -gt $GetMaxDataDiskCount.MaxDataDiskCount)
    {
        Write-Error -Message "The VM does not support $NoOfDisks data disks. Please reduce the number to $($GetMaxDataDiskCount.MaxDataDiskCount) or fewer."
        # 'return' exits only this function; 'break' outside a loop can also stop the calling script
        return
    }
    
    $GetStorageURI = ($VM.StorageProfile.OSDisk.Vhd.Uri).Split('/')[2]

    for($i = 0; $i -le $NoOfDisks-1; $i++) 
    {
        $DiskName = "$VMname-datadisk" + $i.ToString() 
        Add-AzureRmVMDataDisk -VM $VM -Name $DiskName -VhdUri "https://$GetStorageURI/vhds/$DiskName.vhd" -Lun $i -Caching $HostCaching -DiskSizeInGB $DiskSize -CreateOption empty
    }        
    Update-AzureRmVM -VM $VM -ResourceGroupName $ResourceGroup
}
#requires -Version 2 -Modules AzureRM.Compute
function Remove-AzureRMVMDataDisks
{
    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 0)]
        [String]
        $ResourceGroup,
        
        [Parameter(Mandatory = $true, Position = 1)]
        [String]
        $VMname
    )
    
    $VM = Get-AzureRmVM -Name $VMname -ResourceGroupName $ResourceGroup
    
    
    foreach ($VMDataDisk in $VM.DataDiskNames)
    {
        Remove-AzureRMVMDataDisk -VM $VM -DataDiskNames $VMDataDisk -Verbose 
    }
    
    
    $VM | Update-AzureRmVM
}

Once the data disks are attached, the only thing that remains is to create either the Simple Storage Space or the MDADM software RAID. You don’t need to worry about redundancy because that’s handled by Azure’s storage backend, so there is no need to create a Mirrored Storage Space or a RAID 1 inside the VM. Each data disk has three copies stored in the datacenter. If required, you can have an additional three copies in a second datacenter located several hundred kilometers away by creating the Storage Account as Geo-Redundant Storage (GRS), or converting it to GRS, which takes a click in the portal. For the best performance, however, you will need to use Locally Redundant Storage (LRS), because the asynchronous jobs that copy the data to the second datacenter may slow things down.

For Windows Server 2012/R2 deployments, you can use Storage Spaces to stripe the data across the disks and get the maximum performance for general workloads. The following “one-liner” makes it so:

$PhysicalDisks = Get-PhysicalDisk -CanPool $True
New-StoragePool -FriendlyName 'DataPool' -StorageSubSystemFriendlyName 'Storage Spaces*' -PhysicalDisks $PhysicalDisks |
New-VirtualDisk -FriendlyName 'DataDisk' -UseMaximumSize -NumberOfColumns $PhysicalDisks.Count -ResiliencySettingName 'Simple' -ProvisioningType Fixed -Interleave 65536 |
Initialize-Disk -Confirm:$false -PassThru |
New-Partition -AssignDriveLetter -UseMaximumSize |
Format-Volume -FileSystem NTFS -NewFileSystemLabel 'DATA' -AllocationUnitSize 64KB -Confirm:$false 

It basically does one simple thing. It gets the disks that can be pooled, creates a storage pool from them, and then creates a virtual disk, setting the resiliency type, the interleave and the number of columns (equal to the number of disks). The default interleave is 256 KB, which is not a very good number for maximum performance; 64 KB is the best for general workloads like SQL. The name doesn’t say much about what interleave is or does, but in short, when data is written to disk it’s broken up into interleaves, or slabs, of a specific size: 256 KB by default and 64 KB in our example. Setting the number of columns equal to the number of disks means that data is written to and read from all the disks at the same time. The rest is simple: initialize, create the partition, format, and done.
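Here is a simplified model of how interleave and columns interact, written as a shell helper. It is my own illustration and not how Storage Spaces is implemented internally: the function name `column_for_offset` is hypothetical, and the model simply assumes consecutive interleave-sized pieces go to consecutive columns in round-robin order.

```shell
# column_for_offset OFFSET_BYTES INTERLEAVE_BYTES COLUMNS
# Prints the zero-based column (disk) that holds the given byte offset,
# assuming plain round-robin striping.
column_for_offset() {
    echo $(( ($1 / $2) % $3 ))
}

# Four columns with a 64 KB (65536 byte) interleave, walking in 64 KB steps
for offset in 0 65536 131072 196608 262144; do
    echo "offset $offset -> column $(column_for_offset "$offset" 65536 4)"
done
```

In this model a 256 KB sequential write touches all four disks once, 64 KB each, whereas with a 256 KB interleave the same write would land on a single disk.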

Later edit:
I just realized that I forgot a very important aspect of automation. In order to run the one-liner above, we need to log in to the server, open up PowerShell with admin rights, paste the one-liner and run it. That’s too many steps, so let’s get rid of them, shall we?
The best candidate for this job is PowerShell Desired State Configuration, or DSC for short. Azure has a nifty extension called Azure PowerShell DSC, which can be used to inject DSC configuration files into a VM and run them, and all of this can be done from the ISE without logging on to any server.

While creating the configuration, I figured that if I’m still hardcoding the drive letters, I may as well change the virtual CD-ROM’s letter to Z:\ because why not?

Without further ado, here’s the configuration script:

#requires -Modules PSDesiredStateConfiguration
#requires -Version 4
Configuration CreateStorageSpace
{
    Node localhost
    {
        Script ChangeDVDDriveLetter {

            Getscript = 
            {
                $DVDDrivePath = 'E:\'
                return @{
                    Result     = Test-Path $DVDDrivePath
                    GetScript  = $GetScript
                    TestScript = $TestScript
                    SetScript  = $SetScript
                    }
            }
            SetScript = {
                (Get-CimInstance -Class Win32_CDROMDrive).drive | ForEach-Object -Process {
                    $DVDDrive = mountvol.exe $_ /l
                    mountvol.exe $_ /d
                    $DVDDrive = $DVDDrive.Trim()
                    mountvol.exe z: $DVDDrive
                }
                                
            }
            TestScript = {
                $Path = 'Z:\'
                if (Test-Path $Path)
                {
                    Write-Verbose -Message "The drive with the letter $path exists, no action required"
                    $True
                }
                Else
                {
                    Write-Verbose -Message "The drive with the letter $path is missing. Changing the DVD Drive letter"
                    $False
                }
            }
        }

        Script CreateStoragePool { 
            DependsOn = '[Script]ChangeDVDDriveLetter'
            GetScript = {
                $Path = 'E:\'
                return @{
                    Result     = Test-Path $Path
                    GetScript  = $GetScript
                    TestScript = $TestScript
                    SetScript  = $SetScript
                }
            }


            TestScript = {
                $Path = 'E:\'
                
                if (Test-Path $Path)
                {
                    Write-Verbose -Message "The drive with the letter $path exists, no need to create the Simple Storage Space"
                    $True
                }
                Else
                {
                    Write-Verbose -Message "The drive with the letter $path is missing. Creating the Storage Space"
                    $False
                }
            } 
           
            SetScript = { 
                $PhysicalDisks = Get-PhysicalDisk -CanPool $True
                New-StoragePool -FriendlyName 'DataPool' -StorageSubSystemFriendlyName 'Storage Spaces*' -PhysicalDisks $PhysicalDisks |
                New-VirtualDisk -FriendlyName 'DataDisk' -UseMaximumSize -NumberOfColumns $PhysicalDisks.Count -ResiliencySettingName 'Simple' -ProvisioningType Fixed -Interleave 65536 |
                Initialize-Disk -Confirm:$False -PassThru |
                New-Partition -AssignDriveLetter -UseMaximumSize |
                Format-Volume -FileSystem NTFS -NewFileSystemLabel 'DATA' -AllocationUnitSize 64KB -Confirm:$False
            } 

        }   
    }
} 

Now in order to leverage the Azure DSC Extension, we need to publish the file to a storage account, where the agent will later pick it up and inject it into the VM. So copy the script above, save it on your hard drive and then run the Publish-AzureVMDscConfiguration or Publish-AzureRmVMDscConfiguration cmdlet.
For this example I used an ARM VM and the cmdlet syntax looks like this:

Publish-AzureRmVMDscConfiguration -ResourceGroupName 'Tests' -StorageAccountName 'tests6785' -ConfigurationPath 'D:\CreateStorageSpace.ps1'

After the cmdlet is run, a new container named windows-powershell-dsc will be created in the storage account you supplied. Then you need to inject the DSC configuration into the VM by running either the Set-AzureVMDscExtension or the Set-AzureRmVMDscExtension cmdlet:

Set-AzureRmVMDscExtension -ResourceGroupName 'Tests' -VMName 'ARMVM' -ArchiveBlobName 'CreateStorageSpace.ps1.zip' -ArchiveStorageAccountName 'tests6785' -ConfigurationName 'CreateStorageSpace' -Location 'West Europe' -Version 1.5 -WmfVersion 4.0 -Verbose

Once you run the cmdlet, it will take a while to make it so, and you can check the progress in the portal. In the end you will get a ‘Succeeded’ status in the ISE, which tells you that everything has been set up, and the portal will confirm it.

Status : Succeeded
StatusCode : OK
RequestId : 5d42ecd9-72d7-48d6-91b4-ad6e74613b12
Output :
Error :
StartTime : 1/2/2016 2:06:40 AM +02:00
EndTime : 1/2/2016 2:16:46 AM +02:00
TrackingOperationId : ebbfd0fe-4d77-42ea-b7be-726867126216

[Image: AzureDSC_Portal]

[Image: DSC_StorageSpaces]

Cool, is it not?

Now for Linux, the best way I found to do this is to install MDADM and create a level 0 software RAID. After initializing all the disks using fdisk, the commands would look something like this:

# Replace n with the number of member partitions listed below
sudo mdadm --create /dev/md127 --level 0 --raid-devices n \
  /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 ... /dev/sdx1
sudo mkfs -t ext4 /dev/md127
sudo mkdir /data
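Since the value passed to --raid-devices has to match the number of member partitions, one way to avoid counting by hand is to build the command from the device list. The helper below is a hypothetical sketch (the name `build_mdadm_cmd` is mine); it only prints the command so you can review it before running it yourself with sudo.

```shell
# build_mdadm_cmd DEVICE...
# Prints an mdadm RAID 0 create command whose --raid-devices value
# is taken from the number of devices passed in.
build_mdadm_cmd() {
    echo "mdadm --create /dev/md127 --level 0 --raid-devices $# $*"
}

build_mdadm_cmd /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
# prints: mdadm --create /dev/md127 --level 0 --raid-devices 4 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
```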

Please keep one very important thing in mind when you’re adding the newly created array to fstab: you need to add the array’s UUID and not its device name like /dev/md127. It’s very important that you use the UUID in fstab, because device names are not guaranteed to stay the same across reboots. If you reboot the VM, something happens on the iSCSI connection and one of the LUNs gets remounted under a different name, very bad things will happen.

You can automate this procedure by doing something like this:

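# -o value prints just the UUID, so no awk parsing is needed
var=$(sudo blkid -s UUID -o value /dev/md127)

# A plain 'sudo echo ... >> /etc/fstab' would fail because the redirection runs as the current user
echo "UUID=$var /data ext4 defaults 0 2" | sudo tee -a /etc/fstab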
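If the automation may run more than once, it helps to append the fstab line only when it is not there yet. The following is a minimal sketch under my own assumptions: the function name `add_fstab_entry` and the placeholder UUID are made up, and the demo writes to a scratch file rather than the real /etc/fstab.

```shell
# add_fstab_entry UUID MOUNTPOINT [FSTAB_FILE]
# Appends an ext4 entry for the UUID unless one is already present.
# FSTAB_FILE defaults to /etc/fstab (writing there requires root).
add_fstab_entry() {
    if grep -q "^UUID=$1[[:space:]]" "${3:-/etc/fstab}" 2>/dev/null; then
        echo "UUID $1 is already listed, nothing to do"
    else
        printf 'UUID=%s %s ext4 defaults 0 2\n' "$1" "$2" >> "${3:-/etc/fstab}"
    fi
}

# Dry run against a scratch file; the UUID below is a made-up placeholder
scratch=$(mktemp)
add_fstab_entry 1111-2222 /data "$scratch"
add_fstab_entry 1111-2222 /data "$scratch"   # second call appends nothing
cat "$scratch"
```

On a real VM you would pass the UUID reported by blkid and leave out the third argument, running the function as root.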

In the end, please test everything I provided in a dev/test environment, because I’m not assuming any responsibility for any of the scripts.

That’s it. Have a great one!