Monday, 11 August 2014

RHEV and RHEL Clustering - Fencing without RHEVM



UPDATE: I've added the script; see the post here.


For an upcoming project I'll be using a Red Hat Cluster inside a RHEV environment.
At first glance I didn't see any problems, since RHEL's High Availability add-on already includes a fencing script for RHEV-M.
But what happens when RHEV-M is down or unresponsive and the cluster needs to fence one of the nodes?

This could mean trouble, since the cluster would stop every service it manages, resulting in potential downtime for our applications.

After some research I've come up with a possible solution that allows for the fencing of a VM without a RHEV-M.

The process is quite simple but needs a few steps:

1 - Get a list of all the hypervisors inside your RHEV system where the VM can run

2 - For each of these hosts do the following operations until we find the VM:

  • Connect to the host as root.
  • Check whether there is a QEMU process for our VM on the current host. If there is, proceed with the following commands; if not, try the next hypervisor.
  • Create a new set of credentials to interact with libvirt:

    saslpasswd2 -p -a libvirt fenceagent

    (fenceagent is a username, and this command will ask for a password)

  • Restart the VM with the following command:

    virsh qemu-monitor-command --hmp VM_NAME system_reset

    (Replace VM_NAME with the name of the VM as it appears in RHEV)
  • Remove the user you created.
  • Log off from the hypervisor.
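
As a rough illustration, the steps above could be sketched in Python along these lines. This is only a sketch under several assumptions, not the final script: it assumes passwordless ssh access as root to each hypervisor, that the QEMU command line contains "-name VM_NAME", and that "saslpasswd2 -a libvirt -d fenceagent" is an acceptable way to remove the temporary user. The function names (ssh_run, vm_check_command, fence_vm) are made up for this example. How virsh is handed the new credentials without an interactive prompt is not covered by the steps above and is left out here.

```python
import shlex
import subprocess

SASL_USER = "fenceagent"  # username taken from the steps above


def ssh_run(host, remote_command, stdin_text=None):
    """Run remote_command on host as root over ssh; return the exit status."""
    argv = ["ssh", "-o", "BatchMode=yes", "root@" + host, remote_command]
    result = subprocess.run(argv, input=stdin_text, text=True,
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode


def vm_check_command(vm_name):
    """Build a command that exits 0 when a QEMU process for vm_name exists."""
    # Assumption: the QEMU command line contains "-name VM_NAME".
    # "[q]emu" stops the pattern from matching the shell that runs pgrep.
    return "pgrep -f " + shlex.quote("[q]emu.*-name " + vm_name)


def fence_vm(vm_name, hypervisors, password, run=ssh_run):
    """Reset vm_name on the first hypervisor that runs it.

    Returns the hypervisor name on success, or None if the VM was not
    found or the reset failed.
    """
    for host in hypervisors:
        if run(host, vm_check_command(vm_name)) != 0:
            continue  # no QEMU process for this VM here, try the next host
        # Temporary libvirt credentials; -p reads the password from stdin.
        if run(host, "saslpasswd2 -p -a libvirt " + SASL_USER, password + "\n") != 0:
            return None
        try:
            reset = "virsh qemu-monitor-command --hmp %s system_reset" % shlex.quote(vm_name)
            status = run(host, reset)
        finally:
            # Assumption: -d deletes the SASL user created above.
            run(host, "saslpasswd2 -a libvirt -d " + SASL_USER)
        return host if status == 0 else None
    return None
```

The run parameter exists so the loop can be tried with a stand-in function before pointing it at real hypervisors. The temporary user is removed in a finally block so it does not linger if the reset step fails.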

In a few days I'll transform this into a Python script so I can add it to the Cluster. 
I've already validated this process manually so I think there will be no major issues with it.

But there is a potential issue: since this requires iterating over all the hypervisors (or at least until you find the VM), it can take a lot of time if there are lots of hypervisors. At least your cluster won't go berserk :D

This will also need some extra configuration, such as a list of hypervisors where the VM can run, and the VM name also needs to be passed as an argument to the fence script.
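
One possible way to pass that configuration in is sketched below. It assumes the fence script receives its options as name=value lines on standard input, which is how RHEL cluster fence agents are normally called. The option names "port" (for the VM name) and "hypervisors" (a comma-separated list) are hypothetical choices for this example, as are the function names.

```python
def parse_fence_options(lines):
    """Parse name=value lines into a dict, skipping blanks and comments."""
    options = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:  # ignore lines that have no "=" at all
            options[name.strip()] = value.strip()
    return options


def read_config(lines):
    """Return (vm_name, hypervisors) from the parsed options."""
    options = parse_fence_options(lines)
    # "port" and "hypervisors" are hypothetical option names for this sketch.
    vm_name = options["port"]
    hypervisors = [h.strip() for h in options["hypervisors"].split(",") if h.strip()]
    return vm_name, hypervisors
```

In a real script this would be called as read_config(sys.stdin), and the resulting VM name and host list would feed the loop over the hypervisors.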


For future reference, I based this "algorithm" on the following information:



More updates on this to follow.


