In the last couple of weeks, due to a bug that has since been fixed, I ran into split brain on a host. Every time I placed that host into maintenance mode, it would begin to move VMs to other hosts in the cluster. However, a couple of VMs would flip between hosts in the vCenter GUI. I was able to confirm the VMs were running on other hosts, but this host still could not enter maintenance mode because of the split brain. The fix: kill the VM processes running on the split brain host, which allows it to enter maintenance mode and be rebooted.
To get the World ID of a VM:
#esxcli vm process list
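If the host is running a lot of VMs, picking the right World ID out of that listing by eye gets tedious. Here is a minimal sketch of a helper that pulls it out by display name. The function name world_id_for and the VM name "my-vm" are made up for illustration, and it assumes each VM block in the output has a "World ID:" line followed later by a "Display Name:" line, which is what I have seen on my hosts.

```shell
# world_id_for NAME
# Hypothetical helper: reads `esxcli vm process list` output on stdin and
# prints the World ID of the VM whose Display Name is exactly NAME.
world_id_for() {
  awk -v name="$1" '
    # Remember the most recent World ID seen in the current VM block.
    $1 == "World" && $2 == "ID:" { wid = $NF }
    # On the Display Name line, strip the label and compare the rest.
    $1 == "Display" && $2 == "Name:" {
      sub(/^[ \t]*Display Name:[ \t]*/, "")
      if ($0 == name) print wid
    }
  '
}

# Usage on the ESXi host ("my-vm" is a placeholder):
#   esxcli vm process list | world_id_for "my-vm"
```

Always double check the number it prints against the full listing before you kill anything.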
To kill the VM or its processes running on a host:
#esxcli vm process kill --type=[soft,hard,force] --world-id=WorldNumber
*Soft - attempts a graceful shutdown of the VM; the preferred method
**Hard - an immediate shutdown of the VM
***Force - a hard kill of the VM; use only as a last resort
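The soft, then hard, then force order can be sketched as a small script. This is only an illustration and not something from VMware: the function names vm_running and kill_vm_escalating are made up, and WAIT_SECS is an assumed grace period between attempts (30 seconds by default here, purely a guess). It only uses the two esxcli commands shown in this post.

```shell
# Hypothetical helper: succeeds if the given World ID still shows up
# in the process list on this host.
vm_running() {
  esxcli vm process list | grep -q "World ID: $1\$"
}

# Hypothetical helper: try soft, then hard, then force, stopping as soon
# as the VM process is gone. Returns non-zero if it survives all three.
kill_vm_escalating() {
  wid="$1"
  for kill_type in soft hard force; do
    vm_running "$wid" || return 0
    echo "Trying --type=$kill_type on world $wid"
    esxcli vm process kill --type="$kill_type" --world-id="$wid"
    # Assumed grace period before checking again; tune for your environment.
    sleep "${WAIT_SECS:-30}"
  done
  ! vm_running "$wid"
}

# Usage on the ESXi host (WorldNumber comes from the process list):
#   kill_vm_escalating WorldNumber
```

Personally, I would still run each step by hand the first time so you can see what happens before letting a script escalate for you.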
As you can imagine, killing the process of a VM in production is never a great thing to do. Sometimes you are left with no choice. I hope that if you are ever putting a host into maintenance mode to reboot it, you have a change control in place and can bounce VMs if needed. As always, I hope y’all found this article useful.