Did this problem ever happen to you? If yes, then you know that the way to solve it is to boot the distro into single user mode. But how do you do that in Azure? Well, Serial Console to the rescue!
Usually this is easily solved with Run Command or the Reset Password blade, but imagine that in this case they don’t work. This is the case with SAP deployments on RHEL VMs. You cannot do anything if you’ve lost access, and if the VM crashes it’s even worse.
You need to get to GRUB so you can boot the VM in single user mode. The problem is that the VM boots too fast for the Serial Console to connect and for you to press the ESC key at the magic moment.
So what can you do?
The solution to that problem is to stop the VM without de-allocating it. The VM is then preserved on the Hyper-V server in the backend instead of being deleted, which means you can keep the Serial Console on standby and have a chance at that magic moment. How do you know that? Check figs 1 and 2.
# Stop the VM without de-allocating it (on the newer Az module the cmdlet is Stop-AzVM).
Stop-AzureRmVM -Name $VMNAME -ResourceGroupName $RESOURCEGROUP -StayProvisioned -Verbose
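If you would rather work from a shell than from PowerShell, the same stop and start cycle could be sketched with the Azure CLI. This is only a sketch under assumptions: it expects az to be installed and already logged in, and the function names and the example resource names are made up for illustration.

```shell
# Sketch only: assumes the Azure CLI ("az") is installed and logged in.
# Function names and arguments are illustrative, not from the original post.

# Stop the VM but keep it provisioned on its host.
# "az vm stop" powers the VM off without de-allocating it;
# "az vm deallocate" is the command that would release the host.
stop_vm_keep_provisioned() {
  # $1 = resource group, $2 = VM name
  az vm stop --resource-group "$1" --name "$2"
}

# Start the VM again once the Serial Console is open and waiting.
start_vm() {
  # $1 = resource group, $2 = VM name
  az vm start --resource-group "$1" --name "$2"
}

# Example (hypothetical names):
#   stop_vm_keep_provisioned my-resource-group my-rhel-vm
#   start_vm my-resource-group my-rhel-vm
```

Keeping the two steps as separate functions leaves you time to open the Serial Console between the stop and the start.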
Once the Serial Console shows the VM starting, watch for the GRUB boot screen and then mash the ESC key.
Once you’ve managed to enter GRUB, you’re home free to reset the password using the steps below.
Press e in the Serial Console to edit the first OS entry.
Go to the kernel line, which starts with linux16.
Add rd.break to the end of the line, which will break the boot cycle. If SELinux is enabled, add rd.break enforcing=0 instead.
Press Ctrl+X to leave the GRUB editor and boot with the edited line.
During this boot, the VM will drop into Emergency Mode, where you have to remount the system root read-write using the “mount -o remount,rw /sysroot” command.
Then type chroot /sysroot to switch into the sysroot jail and reset the password for the root user with passwd.
Still in the Serial Console, edit the sshd_config file with your preferred editor (for example “nano /etc/ssh/sshd_config”) and set PermitRootLogin yes to enable root access over SSH.
Once you’re done, reboot the VM and you have root access.
After you’re done resetting all the passwords and installing all the agents so you’re not confronted with this again, set PermitRootLogin no and you’re golden 🙂
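Flipping PermitRootLogin on and then off again, as in the steps above, could be scripted roughly like this. It is a minimal sketch and not a required part of the procedure: set_permit_root_login is a hypothetical helper name, it assumes GNU sed (which is what RHEL ships), and it takes the config file path as an argument so you can try it on a copy first.

```shell
# Sketch only: hypothetical helper, assumes GNU sed (sed -i) as found on RHEL.
# Usage: set_permit_root_login FILE VALUE
# Rewrites an existing (possibly commented-out) PermitRootLogin line in FILE,
# or appends the directive if the file does not contain one.
set_permit_root_login() {
  config_file="$1"
  value="$2"
  if grep -Eq '^[[:space:]]*#?[[:space:]]*PermitRootLogin[[:space:]]' "$config_file"; then
    # Replace the existing line, dropping any leading "#".
    sed -i -E "s/^[[:space:]]*#?[[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin ${value}/" "$config_file"
  else
    # No directive present: add one at the end of the file.
    printf 'PermitRootLogin %s\n' "$value" >> "$config_file"
  fi
}

# Example:
#   set_permit_root_login /etc/ssh/sshd_config yes   # while you fix things
#   set_permit_root_login /etc/ssh/sshd_config no    # once you are done
```

sshd only picks up the change after it is restarted or the VM is rebooted.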
For a long time Azure has had a feature that lets users see what was happening while a VM was booting, which allowed them to do root cause analysis when a VM crashed, stopped booting, or hit any other issue in the boot process. This feature is called boot diagnostics, and it takes screenshots of your VM console and captures the serial output so you can do your debugging.
The problem was that you had the information, you knew what happened and knew exactly what to do to fix the issue, but the only way you could apply any fix was to download the VHD, boot the VM in Hyper-V, apply the fix, and then re-upload the VM to Azure and continue. While you might say that you should have had backups and just done a simple restore, this is something that’s not always possible.
Microsoft came up with a solution to this problem with a feature called Azure Serial Console, which provides you with a text-based console via COM1 that allows you to run simple diagnostic operations or start a Bash / PowerShell session and get on working. The only thing you need for it to simply work is to have boot diagnostics enabled on the VM.
You might ask yourself, why did Microsoft take so much time to develop this while others had it? The answer is security. Other providers were using NPAPI to tunnel the traffic to the VM, which was deprecated in all the major browsers. The problem is that in a hyper-scale environment you share the underlying infrastructure with others, and a feature like this could be used to siphon data from one VM, or all of them for that matter. Basically, Microsoft solved this problem by developing a new secure way to tunnel the COM1 traffic to the specific user interface via the Hyper-V VMBus, so that you have access to the VM that you own and not others.
How to use it?
First of all, the VM must have boot diagnostics on. If it’s not enabled, Serial Console will not work.
Then you need to have Contributor rights on the VM and on the storage account where you enabled boot diagnostics.
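If boot diagnostics is not on yet, it can also be switched on from a shell instead of the portal. The following is a minimal sketch, assuming the Azure CLI (az) is installed and logged in. The function names and the example resource names are made up for illustration.

```shell
# Sketch only: assumes the Azure CLI ("az") is installed and logged in.
# Function names and arguments are illustrative, not from the original post.

# Turn on boot diagnostics for a VM, writing its output to a storage account.
enable_boot_diagnostics() {
  # $1 = resource group, $2 = VM name, $3 = storage account name or blob URI
  az vm boot-diagnostics enable \
    --resource-group "$1" \
    --name "$2" \
    --storage "$3"
}

# Print the serial boot log that boot diagnostics has captured so far.
show_boot_log() {
  # $1 = resource group, $2 = VM name
  az vm boot-diagnostics get-boot-log --resource-group "$1" --name "$2"
}

# Example (hypothetical names):
#   enable_boot_diagnostics my-resource-group my-vm mydiagstorage
#   show_boot_log my-resource-group my-vm
```

Checking the boot log first is a quick way to confirm that diagnostics are flowing before you open the Serial Console.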
Whether it is a Linux or Windows VM, simply go to the portal and click Serial Console in the Support + Troubleshooting section of the VM blade.
This pops up a screen which shows the dmesg output if it’s a Linux VM, or some VM health reports if it’s Windows. When you press the Enter key, it takes you to a login screen where you need to provide the admin credentials to log in to the console.
On a Windows machine, this process is a bit different because Windows by default doesn’t send output to the COM1 port, and Microsoft had to develop a Special Administrative Console (SAC for short).
The SAC allows you to do simple RCA steps, and if needed you can pop up a PowerShell console and rock on fixing it!
The login experience is the same as with a PuTTY session. The nice part is that you have the option of sending NMIs or SysRq commands 🙂
When you get to the SAC> prompt, you will need a few commands to open up a PowerShell session.
If you type in help and / or ch -h, you will get a list of help items that will allow you to navigate through the console.
SAC has some useful commands for Windows, and if you want to start a PS session, input the commands as shown below:
cmd          # Start a cmd instance on the first available channel, in this case 1
ch -si 1     # Change the channel to channel 1
             # Enter admin credentials
powershell   # Type it in :)
Cool, huh? This is a great step forward when it comes to debugging virtual machines, because in the past you only had the one way to do that, which I mentioned above. Is it perfect? No, there are some quirks to it, but it’s better than nothing 🙂