Serverless anywhere with Kubernetes

Serverless anywhere with Kubernetes

KEDA (Kubernetes-based Event-Driven Autoscaling) is an opensource project built by Microsoft in collaboration with Red Hat, which provides event-driven autoscaling to containers running on an AKS (Azure Kubernetes Service), EKS ( Elastic Kubernetes Service), GKE (Google Kubernetes Engine) or on-premises Kubernetes clusters 😉

KEDA allows for fine-grained autoscaling (including to/from zero) for event-driven Kubernetes workloads. KEDA serves as a Kubernetes Metrics Server and allows users to define autoscaling rules using a dedicated Kubernetes custom resource definition. Most of the time, we scale systems (manually or automatically) using some metrics that get triggered.

For example, if CPU > 60% for 5 minutes, scale our app service out to a second instance. By the time we’ve raised the trigger and completed the scale-out, the burst of traffic/events has passed. KEDA, on the other hand, exposes rich events data like Azure Message Queue length to the horizontal pod auto-scaler so that it can manage the scale-out for us. Once one or more pods have been deployed to meet the event demands, events (or messages) can be consumed directly from the source, which in our example is the Azure Queue.

Container security in the Cloud. A state of uncertainty.

Container security in the Cloud. A state of uncertainty.

Thinking that you managed to escape security? Guess again.

In this post, I will focus on containers with Azure Kubernetes in mind but these examples apply to all the worlds out there -EKS, GKE, and on-premises.

Containers gained popularity because they made publishing and updating applications a straightforward process. You can go from build to release very fast with little overhead, whether it’s the cloud or on-premises, but there’s still more to do.

The adoption of containers in recent years has skyrocketed in such a way that we see enterprises in banking, public sector, and health looking into or are already deploying containers. The main difference here would be that these companies have a more extensive list of security requirements and they need to be applied to everything. Containers included.

AKS – Working with GPUs

AKS – Working with GPUs

I’ve discussed about AKS before but recently I have been doing a lot of production deployments of AKS, and the recent deployment I’ve done was with Nvidia GPUs. 

This blog post will take you through my learnings after dealing with a deploying of this type because boy some things are not that simple as they look. 

The first problems come after deploying the cluster. Most of the times if not all, the NVIDIA driver doesn’t get installed and you cannot deploy any type of GPU constrained resources. The solution is to basically install an NVIDIA daemon and go from there but that also depends on the AKS version.

For example, if your AKS is running version 1.10 or 1.11 then the NVIDIA Daemon plugin must be 1.10 or 1.11 or anything that matches your version located here

apiVersion: extensions/v1beta1
kind: DaemonSet
  labels: "true"
  name: nvidia-device-plugin
  namespace: gpu-resources
      # Mark this pod as a critical add-on; when enabled, the critical add-on scheduler
      # reserves resources for critical add-on pods so that they can be rescheduled after
      # a failure.  This annotation works in tandem with the toleration below.
      annotations: ""
        name: nvidia-device-plugin-ds
      # Allow this pod to be rescheduled while the node is in "critical add-ons only" mode.
      # This, along with the annotation above marks this pod as a critical add-on.
      - key: CriticalAddonsOnly
        operator: Exists
      - image: nvidia/k8s-device-plugin:1.10 # Update this tag to match your Kubernetes version
        name: nvidia-device-plugin-ctr
          allowPrivilegeEscalation: false
            drop: ["ALL"]
          - name: device-plugin
            mountPath: /var/lib/kubelet/device-plugins
        - name: device-plugin
            path: /var/lib/kubelet/device-plugins
      nodeSelector: linux
        accelerator: nvidia

The code snip from above creates a DaemonSet that installs the NVIDIA driver on all the nodes that are provisioned in your cluster. So for three nodes, you will have 3 Nvidia pods.

The problem that can appear is when you upgrade your cluster. You go to Azure and upgrade the cluster and guess what, you forgot to update the yaml file and everything that relies on those GPUs dies on you.

The best example I can give is the TensorFlow Serving container which crashed with a very “informative” error that the Nvidia version was wrong.

Other problems that appear is monitoring. How can I monitor GPU usage? What tools should I use?

Here you have a good solution which can be deployed via Helm. If you do a helm search for prometheus-operator you will find the best solution to monitor your cluster and your GPU 🙂

The prometheus-operator chart comes with Prometheus, Grafana and Alertmanager but out of the box you will not get the GPU metrics that are required for monitoring because of an error in the Helm chart which sets the cAdvisor metrics with https, the solution would be to modify the exporter HTTPS to false.

  enabled: true
  namespace: kube-system

    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    interval: ""

    ## Enable scraping the kubelet over https. For requirements to enable this see
    https: false

And import the dashboard required to monitor your GPUs which you can find here: and set it up as a configmap.

In most cases, you will want to monitor your cluster from outside and for that you will need to install / upgrade the prometheus-operator chart with the grafana.ingress.enabled value as true and grafana.ingress.hosts={domain.tld}

Next in line, you have to deploy your actual containers that use the GPU. As a rule, a container cannot use a part of a GPU but only the whole GPU so thread carefully when you’re deploying your cluster because you can only scale horizontally as of now.

When you’re defining the POD, add in the container spec the following snip below:

      limits: 1

End result would look like this deployment:

apiVersion: apps/v1beta1
kind: Deployment
  name: tensorflow
  replicas: 2
        app: tensorflow
      - name: tensorflow
        image: tensorflow/serving:latest
        imagePullPolicy: IfNotPresent

What happens if everything blows up and nothing is working?

In some rare cases, the Nvidia driver may blow up your data nodes. Yes that happened to me and needed to solved it.

The manifestation looks like this. The ingress controller works randomly, cluster resources show as evicted. The nvidia device restarts frequently and your GPU containers are stuck in pending.

The way to fix it is first by deleting the evicted / error status pods by running this command:

kubectl get pods --all-namespaces --field-selector 'status.phase==Failed' -o json | kubectl delete -f -

And then restart all the data nodes from Azure. You can find them in the Resource Group called MC_<ClusterRG><ClusterName><Region>

That being said, it’s fun times to run AKS in production 🙂

Signing out.

Integrating Azure Container Instances in AKS

In a previous blog post, I talked about how excellent the managed Kubernetes service is in Azure and in another blog post I spoke about Azure Container Instances. In this blog post, we will be combining them so that we get the best of both worlds.

We know that we can use ACI for some simple scenarios like task automation, CI/CD agents like VSTS agents (Windows or Linux), simple web servers and so on but it’s another thing that we need to manage. Even though that ACI has almost no strings attached, e.g. no VM management, custom resource sizing and fast startup, we still may want to control them from a single pane of glass.

ACI doesn’t provide you with auto-scaling, rolling upgrades, load balancing and affinity/anti-affinity, that’s the work of a container orchestrator. So if we want the best of both worlds, we need an ACI connector.

The ACI Connector is a virtual kubelet that get’s installed on your AKS cluster, and from there you can deploy containers just by merely referencing the node.

If you’re interested in the project, you can take a look here.

To install the ACI Connector, we need to cover some prerequisites.
The first thing that we need to do is to do is to create a service principal for the ACI connector. You can follow this document here on how to do it.

When you’ve created the SPN, grant it contributor rights on your AKS Resource Group and then continue with the setup.

I won’t be covering the Windows Subsystem for Linux or any other bash system as those have different prerequisites. What I will cover in this blog post is how to get started using the Azure Cloud Shell.

So pop open an Azure Cloud Shell and (assuming you already have an AKS cluster) get the credentials.

az aks get-credentials -g RG -n AKSNAME

After that, you will need to install helm and upgrade tiller. For that, you will run the following.

helm init
helm init --upgrade

The reason that you need to initialize helm and upgrade tiller is not very clear to me but I believe that helm and tiller should be installed and upgraded to the latest version every time.

Once those are installed, you’re ready to install the ACI connector as a virtual kubelet. Azure CLI installs the connector using a helm chart. Type in the command below using the SPN you created.

az aks install-connector -g <AKS RG> -n <AKS name> --connector-name aciconnector --location westeurope --service-principal <applicationID>  --client-secret <applicationSecret> --os-type both 

As you can see the in command from above, I typed both for the –os-type. ACI supports Windows and Linux containers so there’s no reason not to get both 🙂

After the install, you can query the Kubernetes cluster for the ACI Connector.

kubectl --namespace=default get pods -l "app=aciconnector-windows-virtual-kubelet-for-aks"       # Windows
kubectl --namespace=default get pods -l "app=aciconnector-linux-virtual-kubelet-for-aks"         # Linux

Now that the kubelet is installed, all you need to do is just to run kubectl -f create YAML file, and you’re done 🙂

If you want to target the ACI Connector with the YAML file, you need to reference a nodeName of virtual-kubelet-ACICONNECTORNAME-linux or windows.

apiVersion: apps/v1beta1
kind: Deployment
  name: nginx-deployment
  replicas: 3
        app: nginx
      - name: nginx
        image: nginx:1.7.9
        - containerPort: 80
            memory: 1G
            cpu: 250m
      nodeName: virtual-kubelet-aciconnector-linux

You run that example from above and the AKS cluster will provision an ACI for you.

What you should know

The ACI connector allows the Kubernetes cluster to connect to Azure and provision Container Instances for you. That doesn’t mean that it will provision the containers in the same VNET as the K8 is so you can do some burst processing or those types of workloads. This is let’s say an alpha concept which is being built upon and new ways of using it are being presented every day. I have been asked by people, what’s the purpose of this thing because I cannot connect to it, but the fact is that you cannot expect that much from a preview product. I have given suggestions on how to improve it, and I suggest you should too.

Well that’s it for today. As always have a good one!

Deploying your containers to Kubernetes using VSTS

Visual Studio Team Services or VSTS is Microsoft’s cloud offering that provides a complete set of tools and services that ease the life of small teams or enterprises when they are developing software.

I don’t want to get into a VSTS introduction in this blog post, but what we need to know about VSTS is that it’s the most integrated CI/CD system with Azure. The beautiful part is that Microsoft has a marketplace with lots of excellent add-ons that extend the functionally of VSTS.

Creating a CI/CD pipeline in VSTS to deploy containers to Kubernetes is quite easy. I will show in this blog post a straightforward pipeline design to build the container and deploy it to the AKS cluster.

The prerequisites for are the following:
VSTS Tenant and Project – Create for free here with a Microsoft Account that has access to the Azure subscription
VSTS Task installed – Replace Tokens Task
AKS Cluster
Azure Container Registry

Before we even start building the VSTS pipeline, we need to get some connection prerequisites out of the way. To deploy containers to the Kubernetes cluster, we need to have a working connection with it.

Open a Cloud Shell in Azure and type in:

az aks get-credentials -g AKS_RG -n AKS_NAME

It will tell you that the current context is located in “/home/NAME/.kube/config.”

Now open the /home/NAME/.kube/config with nano or cat and paste everything from there in a notepad. You need that wall of text to establish the connection to the cluster using VSTS.

Let’s go to VSTS where we will create a service endpoint to our Kubernetes cluster.
At the project dashboard, press on the whell icon and press on services.

Press on the New Service Endpoint and select Kubernetes.

Paste in the details from the .kube/config file in the kubeconfig box and the https://aksdns

Create a repository and add the following files and contents to it:
*I know it would be easier to clone from my Github Repo but when I’m learning I like doing copying and pasting stuff in VSCode, analyse and then upload.

apiVersion: apps/v1beta1
kind: Deployment
  name: nginxdemo
  replicas: 5
        app: nginxdemo
      - name: nginxdemo
        image: __ACR_DNS__/nginxdemo:__BUILD_ID__
        - containerPort: 80
apiVersion: v1
kind: Service
  name: nginxdemo
  type: LoadBalancer
  - port: 80
    app: nginxdemo
FROM alpine:3.7

LABEL maintainer="NGINX Docker Maintainers <[email protected]>"


RUN GPG_KEYS=B0F4253373F8F6F510D42178520A9993A1C052F8 \
	&& CONFIG="\
		--prefix=/etc/nginx \
		--sbin-path=/usr/sbin/nginx \
		--modules-path=/usr/lib/nginx/modules \
		--conf-path=/etc/nginx/nginx.conf \
		--error-log-path=/var/log/nginx/error.log \
		--http-log-path=/var/log/nginx/access.log \
		--pid-path=/var/run/ \
		--lock-path=/var/run/nginx.lock \
		--http-client-body-temp-path=/var/cache/nginx/client_temp \
		--http-proxy-temp-path=/var/cache/nginx/proxy_temp \
		--http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp \
		--http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp \
		--http-scgi-temp-path=/var/cache/nginx/scgi_temp \
		--user=nginx \
		--group=nginx \
		--with-http_ssl_module \
		--with-http_realip_module \
		--with-http_addition_module \
		--with-http_sub_module \
		--with-http_dav_module \
		--with-http_flv_module \
		--with-http_mp4_module \
		--with-http_gunzip_module \
		--with-http_gzip_static_module \
		--with-http_random_index_module \
		--with-http_secure_link_module \
		--with-http_stub_status_module \
		--with-http_auth_request_module \
		--with-http_xslt_module=dynamic \
		--with-http_image_filter_module=dynamic \
		--with-http_geoip_module=dynamic \
		--with-threads \
		--with-stream \
		--with-stream_ssl_module \
		--with-stream_ssl_preread_module \
		--with-stream_realip_module \
		--with-stream_geoip_module=dynamic \
		--with-http_slice_module \
		--with-mail \
		--with-mail_ssl_module \
		--with-compat \
		--with-file-aio \
		--with-http_v2_module \
	" \
	&& addgroup -S nginx \
	&& adduser -D -S -h /var/cache/nginx -s /sbin/nologin -G nginx nginx \
	&& apk add --no-cache --virtual .build-deps \
		gcc \
		libc-dev \
		make \
		openssl-dev \
		pcre-dev \
		zlib-dev \
		linux-headers \
		curl \
		gnupg \
		libxslt-dev \
		gd-dev \
		geoip-dev \
	&& curl -fSL$NGINX_VERSION.tar.gz -o nginx.tar.gz \
	&& curl -fSL$NGINX_VERSION.tar.gz.asc  -o nginx.tar.gz.asc \
	&& export GNUPGHOME="$(mktemp -d)" \
	&& found=''; \
	for server in \ \
		hkp:// \
		hkp:// \ \
	; do \
		echo "Fetching GPG key $GPG_KEYS from $server"; \
		gpg --keyserver "$server" --keyserver-options timeout=10 --recv-keys "$GPG_KEYS" && found=yes && break; \
	done; \
	test -z "$found" && echo >&2 "error: failed to fetch GPG key $GPG_KEYS" && exit 1; \
	gpg --batch --verify nginx.tar.gz.asc nginx.tar.gz \
	&& rm -r "$GNUPGHOME" nginx.tar.gz.asc \
	&& mkdir -p /usr/src \
	&& tar -zxC /usr/src -f nginx.tar.gz \
	&& rm nginx.tar.gz \
	&& cd /usr/src/nginx-$NGINX_VERSION \
	&& ./configure $CONFIG --with-debug \
	&& make -j$(getconf _NPROCESSORS_ONLN) \
	&& mv objs/nginx objs/nginx-debug \
	&& mv objs/ objs/ \
	&& mv objs/ objs/ \
	&& mv objs/ objs/ \
	&& mv objs/ objs/ \
	&& ./configure $CONFIG \
	&& make -j$(getconf _NPROCESSORS_ONLN) \
	&& make install \
	&& rm -rf /etc/nginx/html/ \
	&& mkdir /etc/nginx/conf.d/ \
	&& mkdir -p /usr/share/nginx/html/ \
	&& install -m644 html/index.html /usr/share/nginx/html/ \
	&& install -m644 html/50x.html /usr/share/nginx/html/ \
	&& install -m755 objs/nginx-debug /usr/sbin/nginx-debug \
	&& install -m755 objs/ /usr/lib/nginx/modules/ \
	&& install -m755 objs/ /usr/lib/nginx/modules/ \
	&& install -m755 objs/ /usr/lib/nginx/modules/ \
	&& install -m755 objs/ /usr/lib/nginx/modules/ \
	&& ln -s ../../usr/lib/nginx/modules /etc/nginx/modules \
	&& strip /usr/sbin/nginx* \
	&& strip /usr/lib/nginx/modules/*.so \
	&& rm -rf /usr/src/nginx-$NGINX_VERSION \
	# Bring in gettext so we can get `envsubst`, then throw
	# the rest away. To do this, we need to install `gettext`
	# then move `envsubst` out of the way so `gettext` can
	# be deleted completely, then move `envsubst` back.
	&& apk add --no-cache --virtual .gettext gettext \
	&& mv /usr/bin/envsubst /tmp/ \
	&& runDeps="$( \
		scanelf --needed --nobanner --format '%n#p' /usr/sbin/nginx /usr/lib/nginx/modules/*.so /tmp/envsubst \
			| tr ',' '\n' \
			| sort -u \
			| awk 'system("[ -e /usr/local/lib/" $1 " ]") == 0 { next } { print "so:" $1 }' \
	)" \
	&& apk add --no-cache --virtual .nginx-rundeps $runDeps \
	&& apk del .build-deps \
	&& apk del .gettext \
	&& mv /tmp/envsubst /usr/local/bin/ \
	# Bring in tzdata so users could set the timezones through the environment
	# variables
	&& apk add --no-cache tzdata \
	# forward request and error logs to docker log collector
	&& ln -sf /dev/stdout /var/log/nginx/access.log \
	&& ln -sf /dev/stderr /var/log/nginx/error.log

COPY nginx.conf /etc/nginx/nginx.conf
COPY nginx.vh.default.conf /etc/nginx/conf.d/default.conf
COPY index.html /usr/share/nginx/html/



CMD ["nginx", "-g", "daemon off;"]
user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/;

events {
    worker_connections  1024;

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
server {
    listen       80;
    server_name  localhost;

    #charset koi8-r;
    #access_log  /var/log/nginx/host.access.log  main;

    location / {
        root   /usr/share/nginx/html;
        index  index.html index.htm;

    #error_page  404              /404.html;

    # redirect server error pages to the static page /50x.html
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;

    # proxy the PHP scripts to Apache listening on
    #location ~ \.php$ {
    #    proxy_pass;

    # pass the PHP scripts to FastCGI server listening on
    #location ~ \.php$ {
    #    root           html;
    #    fastcgi_pass;
    #    fastcgi_index  index.php;
    #    fastcgi_param  SCRIPT_FILENAME  /scripts$fastcgi_script_name;
    #    include        fastcgi_params;

    # deny access to .htaccess files, if Apache's document root
    # concurs with nginx's one
    #location ~ /\.ht {
    #    deny  all;

You will need to add an “index.html” with whatever you want it to be written in it. I went with “One does not simply push changes to containers. Said no one ever.”

You’re done building the repository; now it’s time to setup the build definition.

Go to Build and Releases, Press on New and select the new Git repository that you just created.

At the template screen press on Container and press Apply.

On the new screen, go to variables and create two new variables

ACR_DNS with the value of your ACR registry link in the form of
BUILD_ID with the value $(Build.BuildId)

Now go back to the task pane by pressing the cross where the phase 1 task says and add the Replace Tokens task and Publish Artifacts task. The result should look like the screenshot below.

For each task fill in the following:

Build an Image
Container Registry Type = Azure Container Registry
Azure Subscription = Your Subscription
Azure Container Registry = Select what you created
Action = Build an Image
Docker File = **\Dockerfile
Use Default Build Context = Checked
ImageName = nginxdemo
Qualify Image Name = Checked
Additional Image Tags = $(Build.BuildId)

Push an Image
Container Registry Type = Azure Container Registry
Azure Subscription = Your Subscription
Azure Container Registry = Select what you created
Action = Puysh an image
ImageName = nginxdemo
Qualify Image Name = Checked
Additional Image Tags = $(Build.BuildId)

Replace Tokens
Target Files = **/*.yaml
Files Encoding = auto
Token Prefix = __ (Double Underscore)
Token Suffix = __ (Double Underscore)

Publish Artifact
Path to publish = deploy.yaml
Artifact name = deploy
Artifact publish location = Visual Studio Team Service/TFS

Now go to triggers, select Continuous integration and check “Enable continuous integration” then press the arrow on Save & queue and press save.

The build has been defined; now we need to create a release.

Go to Build and Releases and press on Release

Press on the cross and then on the “Create release definition”

In the New Release Definition pane, select the “Deploy to Kubernetes Cluster template and press on Apply

Now that the template is pre-populated to deploy to the Kubernetes Cluster, you need to add an artifact, select the Build Definition and add it.

Now it’s time to enable Continuous deployment so press on the lightning bolt that’s located in the upper right corner of the artifact and enable the CD trigger.

Now go to the Tasks tab located near the Pipeline and modify the kubectl apply command.
kubectl apply
Kubernetes Service Connection = Select the K8 connection that you created
Command = Apply
Use Configuration files = Checked
Configuration File = press on the three dots and reference the deploy.yaml or copy what is below.

Now press save, queue a new build and wait for the container to get deployed and when it’s done just type in the Azure Cloud Shell kubectl get services and the IP will pop.

Final Thoughts

So you finished configuring the CI/CD pipeline and deployed your first container to an AKS cluster. This might seem complicated at first but once you do this a couple of times, you will be a pro at it, and the problems you will face will be on how to make it more modular. I do similar things at clients most of the times when I’m automating application deployments for cloud-ready or legacy applications. This type of CI/CD deployment is quite easy to deploy, when you want to automate a full blow microservices infrastructure, then you will have a lot more tasks to do jobs. My most significant CI/CD pipeline consisted of 150 tasks that were needed to automate a legacy application.

What I would consider some best practices for CI/CD pipelines in VSTS or any other CI/CD tool is to never hard code parameters into tasks and make use of variables/variable groups. Tasks like the “Replace Tokens” one permit you to reference those variables so when one changes or you create one dynamically, they just get filled in the code. This is very useful when your release pipeline deploys to more than one environment, and you can have global variables and environment specific variables.

Well, I hope this was useful.

Until next time!

Pin It on Pinterest