Migrating all your shell automations to AKS

In a previous post (Modernizing Functions in Kubernetes), I talked about modernizing Azure Functions that are deployed as containers. Since then, the machine has kept evolving, and the number of things the AKS cluster orchestrates keeps growing.

Today we'll continue that journey by bringing all the shell scripts we've written under one roof.

PowerShell is my scripting language of choice, and I've been using it since it came out, but it doesn't always get the job done as quickly as Python. Depending on the use case, you might pivot from one to the other to get the result you're looking for as fast as possible.

The problem with multiple scripting languages is that each one requires setup to work correctly. Automation Accounts have handled that quite well for a while, but once you have a well-oiled machine that's eager to be used, a migration should be in the works.

Automation Accounts can be triggered on a schedule or via a webhook, and you get a nice dashboard with all the information you need. The UI is quite impressive; however, the Automation Account service in Azure has fallen quite far behind in features, and in today's world, where everything has to be fast, it doesn't cut it anymore.
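
For reference, triggering a runbook from outside the portal boils down to an HTTP POST against its webhook URL; a minimal sketch, where WEBHOOK_URL stands for the secret URL Azure generates when you create the webhook:

# An Automation Account webhook is just an HTTP endpoint; the optional JSON body
# shows up in the runbook as WebhookData
curl -X POST "$WEBHOOK_URL" \
  -H "Content-Type: application/json" \
  -d '{"environment": "prod"}'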

So, back to the migration: in the previous post, I talked about migrating Azure Functions inside Kubernetes. While Functions fit many use cases, the overhead of the Functions runtime is too high for simple, or even complex, automation.

Kubernetes has the concept of Jobs and CronJobs, and KEDA can even trigger jobs based on different event sources with its ScaledJobs feature. I've talked a bit about this subject in the past, but I only went so far as deploying it to production. Keep in mind that the option is open to us; we just don't need it right now.
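
To make the Job/CronJob relationship concrete: a CronJob simply stamps out Jobs on its schedule, and you can also fire one off ad-hoc from the CLI. A quick sketch, with a hypothetical CronJob name:

# List CronJobs and the Jobs they have spawned so far
kubectl get cronjobs,jobs -n default

# Trigger a one-off run from an existing CronJob (name is hypothetical)
kubectl create job manual-test --from=cronjob/automations-cleanup-job -n default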

Getting started

The principle remains the same: I want to copy-paste, focus on code, and press commit.
First, the building blocks need to be in place to achieve that, and afterward, we can have the cookie.

Let's start!

We need to centralize everything we have, and while we're at it, we can fix some of the code we wrote years ago; with the extra experience we've gained, we can make it better and cleaner.

The folder naming convention can be whatever we want; however, it's best to be consistent and keep everything in a folder named cronjobs, or something that makes it obvious to everybody what these scripts are. Inside, the scripts should be split by scripting language, e.g., PowerShell, Python, Perl, Ruby, or whatever floats your boat.
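
To make this concrete, here's one possible layout; the script folder names are hypothetical, except for doSomethingScript, which appears later in the values file:

$ tree .
.
├── cronjobs
│   ├── powershell
│   │   ├── Dockerfile
│   │   └── doSomethingScript
│   │       └── run.ps1
│   └── python
│       ├── Dockerfile
│       ├── requirements.txt
│       └── tagUntaggedResources
│           └── main.py
└── helm
    └── automations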

Each scripting language folder should also contain a Dockerfile that builds an image with all of that language's scripts.

FROM mcr.microsoft.com/powershell:7.2.2-ubuntu-20.04 AS linux
# Copy every PowerShell script into the image; they run ad-hoc when invoked
COPY . .
Dockerfile for building the PowerShell image
FROM registry.access.redhat.com/ubi9/python-39
# Copy every Python script and install the shared dependencies
COPY . .
RUN pip install -r requirements.txt --no-cache-dir
Dockerfile for building the Python image

The Dockerfiles are pretty simple; you copy everything into the image, as the scripts run ad-hoc when called.
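
Before wiring anything into a pipeline, it's worth building and running an image locally; a minimal sketch, assuming the PowerShell Dockerfile above and the folder layout from earlier:

# Build the PowerShell automation image from its language folder
docker build -t acrimages.azurecr.io/schedules/powershell:local ./cronjobs/powershell

# Run one script ad-hoc, the same way the CronJob will invoke it
docker run --rm acrimages.azurecr.io/schedules/powershell:local pwsh ./doSomethingScript/run.ps1
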
The next part is the hard part: the Helm charts. As with the previous post, we will need a folder called helm containing all the templates we require to instantiate these cronjobs inside the cluster. The main difference here is that we don't need a complex template for a significant deployment; this is where we say that we need to run a cronjob on a particular schedule, instantiate a set of credentials, and use an identity if required.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: {{ .Release.Name }}-_automation_-job
{{ include "automation.namespace" . | indent 2 }}
spec:
  schedule: {{ .Values.cronjobs._automation_.schedule | quote }}
  jobTemplate:
    spec:
      template:
        spec:
          {{- if eq .Values.cronjobs._automation_.azIdentity true }}
          serviceAccountName: {{ .Values.cronjobs._automation_.serviceAccount.name }}
          {{- end }}
          containers:
            - name: {{ .Release.Name }}-_automation_-job
              image: "{{ .Values.cronjobs.image.repository }}/{{ .Values.cronjobs._automation_.scriptType }}:{{ .Values.cronjobs.image.tag | default .Chart.AppVersion }}"
              imagePullPolicy: Always
              command:
                {{- toYaml .Values.cronjobs._automation_.commands | nindent 16 }}
              volumeMounts:
                - name: secrets-store
                  mountPath: '/mnt/secrets-store'
                  readOnly: true
              env:
                - name: APPLICATION-INSIGHTS-KEY
                  valueFrom:
                    configMapKeyRef:
                      name: kv-secrets
                      key: application-insights-key
          volumes:
            - name: secrets-store
              csi:
                driver: secrets-store.csi.k8s.io
                readOnly: true
                volumeAttributes:
                  secretProviderClass: 'azure-kvname'
          restartPolicy: OnFailure

The Helm template needs more work, as it's not copy-pasteable, but it covers what is required. The template is simple: it's deployed in a namespace that you define in the helm upgrade step, it uses workload identity to connect to Azure if needed, and it uses the CSI secret store provider to pull secrets.
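
The template assumes that both of those pieces already exist in the target namespace, so it's worth a quick sanity check before deploying; the names come from the template above and the values file below:

# The service account used for workload identity federation
kubectl get serviceaccount workload-identity -n default -o yaml

# The SecretProviderClass referenced by the CSI volume
kubectl get secretproviderclass azure-kvname -n default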

cronjobs:
  image:
    repository: acrimages.azurecr.io/schedules
    pullPolicy: Always
    tag: 'latest'

  automationCronJob:
    schedule: '0 2/2 * * *' # at minute 0, every 2 hours starting at 02:00
    scriptType: 'powershell'
    imagePullSecrets: []
    nameOverride: ''
    fullnameOverride: ''

    azIdentity: true

    serviceAccount:
      name: 'workload-identity'

    commands:
      - pwsh
      - ./doSomethingScript/run.ps1
values.yaml

The values file contains all the information the cronjob needs to run: the script that has to be called, when it should be called, and whether the workload identity should be injected. For the cron schedules, I use https://crontab.guru/#
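
One thing to note: the values file falls back to the latest tag, but the tag can be overridden per deployment if you'd rather pin the exact image a pipeline just built; a hypothetical example:

# Pin the deployment to a specific image tag instead of relying on 'latest'
helm upgrade automations ./helm/automations -n default --install --atomic \
  --set cronjobs.image.tag=20240115.3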

From there, you know everything that remains to be done:

  • Start testing the helm chart (a quick local check is sketched after this list).
  • Fix all the errors.
  • Build a pipeline that builds the image and deploys the helm chart.
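
The first item is mostly a render-and-read loop; a minimal sketch, assuming the chart lives under helm/automations and the _automation_ placeholders have been filled in:

# Catch chart-level mistakes first
helm lint ./helm/automations

# Render the manifests locally and validate them without touching the cluster
helm template automations ./helm/automations -n default | kubectl apply --dry-run=client -f -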

Building the CI/CD pipeline should be a breeze if everything checks out. The primary purpose of that pipeline is to run docker build on the Dockerfile, push that image to an ACR, and then run helm upgrade --install from a deployment job.

      - task: Docker@2
        displayName: Build automations powershell container image
        condition: ${{ parameters.condition }}
        inputs:
          containerRegistry: "acrimages.azurecr.io"
          repository: "schedules"
          command: "build"
          Dockerfile: "location_/powershell/Dockerfile"
          tags: |
            $(Build.BuildId)
            latest

      - task: Docker@2
        displayName: Push automations powershell container image
        condition: ${{ parameters.condition }}
        inputs:
          containerRegistry: "acrimages.azurecr.io"
          repository: "schedules"
          command: "push"
          Dockerfile: "location_/powershell/Dockerfile"
          tags: |
            $(Build.BuildId)
            latest

The tasks that build and publish the image are native to Azure DevOps: build and push to an ACR. Nothing sophisticated.

- task: AzureCLI@2
  displayName: "Deploy automations Helm Chart"
  inputs:
    azureSubscription: ${{ parameters.serviceConnection }}
    scriptType: "bash"
    scriptLocation: "inlineScript"
    workingDirectory: "$(Pipeline.Workspace)/helm/automations"
    inlineScript: |
      helm upgrade automations . -n default --atomic --create-namespace --install
Snippet from AzDevOps

The deployment job should connect to the AKS cluster, do its work, and run the helm upgrade --install command to apply the new configuration inside the cluster.
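
Outside the pipeline, the same deployment can be reproduced from a workstation; a minimal sketch, with the resource group and cluster names as placeholders:

# Pull kubeconfig credentials for the cluster (names are placeholders)
az aks get-credentials --resource-group rg-automations --name aks-automations

# Deploy or update the chart, then confirm the CronJobs landed
helm upgrade automations ./helm/automations -n default --atomic --create-namespace --install
kubectl get cronjobs -n default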

Takeaway

After the migration to Kubernetes, the infrastructure footprint remained the same, but by centralizing everything under one roof and removing extras such as Automation Accounts and VMs, costs were reduced.

The adoption of AKS allowed for more efficient management, a single pane of glass using OpenLens, and out-of-the-box scaling on the compute side with the cluster autoscaler and inside the cluster with KEDA.

The idea of copy-pasting things and focusing on functionality can provide a valuable framework for quick implementation. I've used it many times so far without any regrets-ish. Nevertheless, over-engineering can lead to code redundancy, architectural complexity, and unnecessary dependencies. Striking the right balance between functionality and over-engineering is crucial to successfully implementing and maintaining a system.

That being said, I hope you learned something new, and as always, have a good one!