Moving away from ARM to Terraform

I remember when ARM first came out, and I was skeptical about how the implementation was done. Going from single-resource management in Azure Service Manager to multi-resource management in Azure Resource Manager was a bit much. While looking for explanations on why this was a better model than the old one, I got good answers, one being that it moved from XML to JSON and supported parallel deployments with dependencies, among much more.

This was the winning answer, as I had done automation on ASM: a PowerShell script to create 10 VMs was 900 lines long, it took 1.3 hours to deploy them, and if the process failed, you had to rerun it from scratch.

Translating that to ARM after dissecting the ARM schema allowed me to deploy 300 VMs in 30 minutes, with the ability to continue from where the deployment stopped if something failed. A considerable improvement, wouldn't you say?

As the years passed and many ARM templates later, I reached the point where they started showing their limitations. One limitation was that debugging a problem was complicated, as the error system spewed out garbled JSON, generic errors, or circular references. The solution to the problem was to migrate to something else, and the framework of choice at that time was Terraform.

We went with Terraform because it let us upskill in a single framework that worked almost everywhere and offered enough extensibility to build on from there.

How to get started?

First, I don't believe I have to tell you that you must take it one resource at a time, starting with low-complexity deployments and working your way up to larger ones; otherwise, you will get overwhelmed.

Second, this will not be a comprehensive guide on how to do it, as that would be a highly complex topic to digest.

The first thing I want to mention is that there is a tool that can help you import existing resources into a state file and give you something to start from.

Azure/aztfexport: A tool to bring existing Azure resources under Terraform's management (github.com/Azure/aztfexport)

This tool is a lifesaver when you want to start importing resources into a state file; however, it will not create comprehensive Terraform configuration files, which remains a job for you.
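
For example, you can point it at an existing resource group and let it generate a starting configuration plus the matching state. A minimal sketch (the resource group name below is just a placeholder):

# Export everything in an existing resource group into the current (empty) directory;
# aztfexport writes .tf files and a state file that you can then iterate on.
aztfexport resource-group my-existing-rg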

The first approach to get going with Terraform is to start using modules from the very beginning. My recommendation is to create or reuse modules that target specific resources. For starters, you write a module for creating VMs, and you need to keep in mind that the module has to be reusable, just like a PowerShell module.

Example VM Module

resource "random_string" "random" {
  length           = 5
  number = true
  special          = false
  override_special = "/@£$"
}
  
resource "azurerm_network_interface" "main" {
  name                = "${var.prefix}-nic-${random_string.random.result}"
  location            = var.location
  resource_group_name = var.rg_name

  ip_configuration {
    name                          = "${var.prefix}-ipconfiguration-${random_string.random.result}"
    subnet_id                     = var.subnet_id
    private_ip_address_allocation = "Dynamic"
  }
}

resource "azurerm_virtual_machine" "main" {
  name                  = "${var.vm_name}-${random_string.random.result}"
  location            = var.location
  resource_group_name = var.rg_name
  network_interface_ids = [azurerm_network_interface.main.id]
  vm_size               = var.vm_size

  storage_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "20.04-LTS"
    version   = "latest"
  }
  storage_os_disk {
    name              = "${var.prefix}-osdisk-${random_string.random.result}"
    create_option     = "FromImage"
  }
  os_profile {
    computer_name  = "localhost"
    admin_username = "adminuser"
    admin_password = ${random_string.random.result}
  }
  os_profile_linux_config {
    disable_password_authentication = false
  }
}
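
For the module to be reusable, its inputs also need to be declared. A minimal variables.tf sketch for the module above (the names match the var. references in the code; the descriptions are my own) could look like this:

variable "prefix" {
  type        = string
  description = "Prefix used for naming resources."
}

variable "vm_name" {
  type        = string
  description = "Base name of the virtual machine."
}

variable "vm_size" {
  type        = string
  description = "Azure VM size, e.g. Standard_DS1_v2."
}

variable "location" {
  type        = string
  description = "Azure region to deploy into."
}

variable "rg_name" {
  type        = string
  description = "Name of the resource group for the VM and its NIC."
}

variable "subnet_id" {
  type        = string
  description = "ID of the subnet to attach the NIC to."
}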

The main.tf above is a simple module that creates a VM (and its NIC) with a random suffix. It can be called easily from the root Terraform configuration with a for_each loop, as shown below.


resource "azurerm_resource_group" "group" {
  name     = var.group
  location = var.location
}

resource "azurerm_virtual_network" "vnet" {
  name                = "${var.prefix}-network"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.group.location
  resource_group_name = azurerm_resource_group.group.name
}

resource "azurerm_subnet" "subnet" {
  name                 = "internal"
  resource_group_name  = azurerm_resource_group.group.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = ["10.0.2.0/24"]
}

module "virtual_machines" {
  source   = "./modules/virtual_machine"
  for_each = var.virtual_machines

  vm_name   = each.key
  location  = each.value.location
  vm_size   = each.value.size
  rg_name   = each.value.resource_group
  subnet_id = each.value.subnet_id
  prefix    = var.prefix
}

The variable values look something like this (simplified):

prefix    = "demo"
location  = "westeurope"
vm_size   = "Standard_DS1_v2"
group     = "${var.prefix}-${var.location}-resources"
subnet_id = azurerm_subnet.subnet.id
virtual_machines = {
  "vm1"  = { size = var.vm_size, location = var.location, resource_group = var.group, subnet_id = var.subnet_id },
  "vm2"  = { size = var.vm_size, location = var.location, resource_group = var.group, subnet_id = var.subnet_id },
  "vm3"  = { size = var.vm_size, location = var.location, resource_group = var.group, subnet_id = var.subnet_id },
  "vm4"  = { size = var.vm_size, location = var.location, resource_group = var.group, subnet_id = var.subnet_id },
  "vm5"  = { size = var.vm_size, location = var.location, resource_group = var.group, subnet_id = var.subnet_id },
  "vm6"  = { size = var.vm_size, location = var.location, resource_group = var.group, subnet_id = var.subnet_id },
  "vm7"  = { size = var.vm_size, location = var.location, resource_group = var.group, subnet_id = var.subnet_id },
  "vm8"  = { size = var.vm_size, location = var.location, resource_group = var.group, subnet_id = var.subnet_id },
  "vm9"  = { size = var.vm_size, location = var.location, resource_group = var.group, subnet_id = var.subnet_id },
  "vm10" = { size = var.vm_size, location = var.location, resource_group = var.group, subnet_id = var.subnet_id }
}
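
Keep in mind that a real .tfvars file can only hold literal values; expressions such as var.prefix or references like azurerm_subnet.subnet.id are not valid there, so treat the block above as illustrative. The root configuration would declare its inputs separately; a minimal sketch, assuming the names used in the example, might be:

variable "prefix" {
  type    = string
  default = "demo"
}

variable "location" {
  type    = string
  default = "westeurope"
}

variable "virtual_machines" {
  description = "Map of VMs to create; the map key becomes the VM name."
  type = map(object({
    size           = string
    location       = string
    resource_group = string
    subnet_id      = string
  }))
  default = {}
}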

This is a partial working example, but you get the idea. The ARM template that this example replaces is the following:

{
  "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",

  "parameters": {
    "environmentPrefixName": {
      "type": "string",
      "maxLength": 7,
      "metadata": {
        "description": "Prefix of the environment."
      },
      "defaultValue": "easc"
    },
    "virtualNetworkResourceGroup": {
      "type": "string",
      "metadata": {
        "description": "Name of resource group with VNET."
      },
      "defaultValue": "VirtualNetworks"
    },
    "adminUserName": {
      "type": "string",
      "metadata": {
        "description": "User name for local administrator account."
      },
      "defaultValue": "ecadmin"
    },
    "adminPassword": {
      "type": "securestring",
      "metadata": {
        "description": "Password for local adminstrator account."
      },
      "defaultValue": "PASS@word123"
    },
    "vmWebCount": {
      "type": "int",
      "minValue": 2,
      "maxValue": 9,
      "metadata": {
        "description": "Number of frontend virtual machines behind a loadbalancer."
      },
      "defaultValue": 2
    }
  },
  "variables": {
    "computeApiVersion": "2016-03-30",
    "networkApiVersion": "2016-06-01",
    "storageApiVersion": "2015-06-15",

    "imagePublisher": "Canonical",
    "imageVersion": "latest",
    "imageSKU": "20.04-LTS",
    "imageOffer": "UbuntuServer",

    "location": "[resourceGroup().location]",

    "saVhd01Name": "[concat(parameters('environmentPrefixName'), uniqueString(resourceGroup().id))]",
    "storageAccountType": "Standard_LRS",
    "vNetName": "VMNetworks",
    "vNetSN1Name": "General",
    "vNetID": "[resourceId(parameters('virtualNetworkResourceGroup'), 'Microsoft.Network/virtualNetworks', variables('vNetName'))]",
    "vNetSN1Ref": "[concat(variables('vNetID'), '/subnets/', variables('vNetSN1Name'))]",
    "vmWeb0xComputerName": "[concat(parameters('environmentPrefixName'), 'web0')]",
    "asWebName": "[concat(parameters('environmentPrefixName'), '.ASWEB')]",
    "nicWeb0xName": "[concat(parameters('environmentPrefixName'), '.NICWEB0')]",
    "vmWeb0xName": "[concat(parameters('environmentPrefixName'), '.VMWEB0')]",
    "vmWebSize": "Standard_F2"
  },

  "resources": [
    {
      "apiVersion": "[variables('storageApiVersion')]",
      "type": "Microsoft.Storage/storageAccounts",
      "name": "[variables('saVhd01Name')]",
      "location": "[variables('location')]",
      "properties": {
        "accountType": "[variables('storageAccountType')]"
      }
    },
    {
      "apiVersion": "[variables('computeApiVersion')]",
      "type": "Microsoft.Compute/availabilitySets",
      "name": "[variables('asWebName')]",
      "location": "[variables('location')]",
      "dependsOn": [],
      "properties": {
      }
    },
    {
      "apiVersion": "[variables('networkApiVersion')]",
      "type": "Microsoft.Network/networkInterfaces",
      "name": "[concat(variables('nicWeb0xName'), copyIndex())]",
      "location": "[variables('location')]",
      "properties": {
        "ipConfigurations": [
          {
            "name": "ipconfig1",
            "properties": {
              "privateIPAllocationMethod": "Dynamic",
              "subnet": {
                "id": "[variables('vNetSN1Ref')]"
              }
            }
          }
        ],
        "dnsSettings": {
          "dnsServers": [
          ]
        }
      },
      "copy": {
        "name": "vmCopy",
        "count": "[parameters('vmWebCount')]"
      }
    },
    {
      "apiVersion": "[variables('computeApiVersion')]",
      "type": "Microsoft.Compute/virtualMachines",
      "name": "[concat(variables('vmWeb0xName'), copyIndex())]",
      "location": "[variables('location')]",
      "dependsOn": [
        "[concat('Microsoft.Storage/storageAccounts/', variables('saVhd01Name'))]",
        "[concat('Microsoft.Compute/availabilitySets/', variables('asWebName'))]",
        "[concat('Microsoft.Network/networkInterfaces/', concat(variables('nicWeb0xName'), copyIndex()))]"
      ],
      "properties": {
        "availabilitySet": {
          "id": "[resourceId('Microsoft.Compute/availabilitySets', variables('asWebName'))]"
        },
        "hardwareProfile": {
          "vmSize": "[variables('vmWebSize')]"
        },
        "storageProfile": {
          "imageReference": {
            "publisher": "[variables('imagePublisher')]",
            "offer": "[variables('imageOffer')]",
            "sku": "[variables('imageSKU')]",
            "version": "[variables('imageVersion')]"
          },
          "osDisk": {
            "name": "[concat(concat( variables('vmWeb0xName'), copyIndex() ), '-osdisk')]",
            "vhd": {
              "uri": "[concat('http://', variables('saVhd01Name'), '.blob.core.windows.net/vhds/', concat(concat( variables('vmWeb0xName'), copyIndex() ), '-osdisk.vhd'))]"
            },
            "caching": "ReadWrite",
            "createOption": "FromImage"
          }
        },
        "osProfile": {
          "computerName": "[concat(variables('vmWeb0xComputerName'), copyIndex())]",
          "adminUsername": "[parameters('adminUserName')]",
          "adminPassword": "[parameters('adminPassword')]"
        },
        "networkProfile": {
          "networkInterfaces": [
            {
              "id": "[resourceId('Microsoft.Network/networkInterfaces', concat(variables('nicWeb0xName'), copyIndex()))]",
              "properties": { "primary": true }
            }
          ]
        },
        "diagnosticsProfile": {
          "bootDiagnostics": {
            "enabled": "true",
            "storageUri": "[concat('http://', variables('saVhd01Name'), '.blob.core.windows.net')]"
          }
        }
      },
      "resources": [
      ],
      "copy": {
        "name": "vmCopy",
        "count": "[parameters('vmWebCount')]"
      }
    }
  ]
}

As you can see when comparing it with the ARM template above, the Terraform version brings simplification and reusability, and you can expand from there. Another great thing is that with multiple .tfvars files, you can control from your CI/CD pipelines which one should be used.
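
For example, each environment can get its own .tfvars file, and the pipeline simply picks the right one at plan and apply time (the file names below are placeholders):

# Plan and apply the same configuration with environment-specific values
terraform plan -var-file="dev.tfvars" -out=dev.tfplan
terraform apply dev.tfplan

terraform plan -var-file="prod.tfvars" -out=prod.tfplan
terraform apply prod.tfplan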

While the examples don't match 100%, the key takeaway is how I create the 10 VMs:

  source   = "./modules/virtual_machine"
  for_each = var.virtual_machines

  vm_name   = each.key
  location  = each.value.location
  vm_size   = each.value.size
  rg_name   = each.value.resource_group
  subnet_id = each.value.subnet_id
  prefix    = var.prefix

In Terraform, iteration is done with for_each over a map (or a set of strings), and each parameter is populated from each.key and each.value, while in ARM you have something like this:

"copy": {
  "name": "vmCopy",
  "count": "[parameters('vmWebCount')]",
}

Not a highly complex difference, but you have to repeat the copy block on every resource in the ARM template.

Another benefit of Terraform is state management, which works similarly to PowerShell Desired State Configuration or Azure Guest Configuration in that it prevents configuration drift. ARM doesn't do that; it is a fire-and-forget mechanism that works well if you've written the template correctly. To be fair, the snippet of the template I showed above is over six years old and still works today, while Terraform has had many breaking changes over the years, and some configurations broke after an update.
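
In practice, drift detection boils down to running a plan against the existing state; a minimal sketch:

# After someone changes a managed resource outside Terraform (e.g. in the portal),
# a plan against the existing state shows the drift so you can reconcile it.
terraform plan

# Or review the drift without proposing any configuration changes:
terraform plan -refresh-only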

What's next?

I, for one, went through the hurdles of migrating from ARM to Terraform; honestly, I don't regret it, and it was worth every moment. However, I only converted some of my ARM templates to Terraform because, in some cases, the fire-and-forget mechanism is much better for POCs, demos, or one-off things that shouldn't be part of a CI/CD process.

I did it with a combination of writing from scratch, aztfexport, and a lot of trial & error figuring out what I wanted to do in Terraform. The main takeaway from the article is that you can do it quickly enough if you plan it correctly, set your expectations right, and bring a lot of patience. Going this route will save a lot of time in the future, and you will not need to reinvent the wheel for new scenarios.

That being said, have a good one!