MGD Blog: RHEV and RHEL Clustering - Fencing without RHEVM

UPDATE: I've added the script; see the post here.

For an upcoming project I'll be using a Red Hat Cluster inside a RHEV environment.
At first glance I didn't see any problems since RHEL's High Availability add-on already includes a fencing script for the RHEV-M.
But what happens when the RHEV-M is down or unresponsive and the cluster needs to fence one of the nodes?

This could mean trouble: if fencing fails, the cluster would stop every service it manages, resulting in potential downtime for our applications.

After some research I've come up with a possible solution that allows for the fencing of a VM without a RHEV-M.

The process is quite simple but needs a few steps:

1 - Get a list of all the hypervisors inside your RHEV system where the VM can run

2 - For each of these hosts do the following operations until we find the VM:

Connect to the host as root
Check whether there is a QEMU process for our VM on the current host. If there is, proceed with the following commands; if not, try the next hypervisor.
Create a new set of credentials to interact with libvirt:

saslpasswd2 -p -a libvirt fenceagent

(fenceagent is a username; the command expects a password on standard input, and because of -p it reads it without showing a prompt or asking for confirmation)

Restart the VM with the following command:

virsh qemu-monitor-command --hmp VM_NAME system_reset

(Replace VM_NAME with the name of the VM as it appears in RHEV)

Remove the user you created.
Log off from the hypervisor.
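
As a rough idea of how step 2 could look in Python, here is a minimal sketch. It is only an illustration under a few assumptions, not the final script: the names run_remote, find_vm_host, fence_commands and SSH_TIMEOUT are made up for this example, it assumes passwordless ssh as root to each hypervisor, and it assumes the QEMU command line contains "-name VM_NAME" (or "-name guest=VM_NAME" on newer versions). It only locates the VM and builds the commands from the steps above; how virsh is given the SASL credentials non-interactively is left out.

```python
import re
import subprocess

SSH_TIMEOUT = 10  # seconds; assumed value so one dead host cannot stall the loop


def run_remote(host, command):
    """Run a shell command on a hypervisor as root over ssh; return its exit code."""
    try:
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", f"root@{host}", command],
            capture_output=True, text=True, timeout=SSH_TIMEOUT,
        )
    except subprocess.TimeoutExpired:
        return 255  # treat a host that does not answer like a failed ssh
    return result.returncode


def qemu_check_command(vm_name):
    """Build a pgrep call that exits 0 if a QEMU process for vm_name exists."""
    # Assumption: VM names only use these characters, so quoting stays simple.
    if not re.fullmatch(r"[A-Za-z0-9_.-]+", vm_name):
        raise ValueError(f"unexpected characters in VM name: {vm_name!r}")
    escaped = vm_name.replace(".", r"\.")
    # "[q]emu" keeps the pattern from matching the shell that runs pgrep itself.
    pattern = "[q]emu.*-name (guest=)?" + escaped + "( |,|$)"
    return f"pgrep -f '{pattern}'"


def find_vm_host(hosts, vm_name, run=run_remote):
    """Return the first hypervisor with a QEMU process for vm_name, or None."""
    check = qemu_check_command(vm_name)
    for host in hosts:
        if run(host, check) == 0:
            return host
    return None


def fence_commands(vm_name, user="fenceagent"):
    """Commands to run on the hypervisor that hosts the VM, in order."""
    return [
        f"saslpasswd2 -p -a libvirt {user}",   # create user, password on stdin
        f"virsh qemu-monitor-command --hmp {vm_name} system_reset",
        f"saslpasswd2 -a libvirt -d {user}",   # -d deletes the user again
    ]
```

Passing the runner in as a parameter makes it possible to try the host loop with a fake function before pointing it at real hypervisors.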

In a few days I'll transform this into a Python script so I can add it to the Cluster.

I've already validated this process manually so I think there will be no major issues with it.

There is one potential issue, though: since this requires iterating over all the hypervisors (or at least until you find the VM), it can take a lot of time if there are lots of hypervisors. But at least your cluster won't go berserk :D

This will also need some extra configuration: a list of hypervisors where the VM can run, and the VM name, which also needs to be passed as an argument to the fence script.
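
A small sketch of how the script could read that configuration. As far as I know, fence agents started by the cluster usually receive their options as key=value lines on standard input, with "port" as the usual key for the VM name. The "hypervisors" key and the function names parse_fence_options and fence_targets are invented for this example, so treat them as placeholders.

```python
def parse_fence_options(lines):
    """Turn key=value lines into a dict, skipping blanks, comments and junk."""
    options = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an "=" sign
            options[key.strip()] = value.strip()
    return options


def fence_targets(options):
    """Extract the hypervisor list and the VM name from the parsed options."""
    # "hypervisors" is a hypothetical key: a comma-separated list of hosts.
    hosts = [h.strip() for h in options.get("hypervisors", "").split(",") if h.strip()]
    vm_name = options.get("port", "")
    if not hosts or not vm_name:
        raise ValueError("both 'hypervisors' and 'port' must be set")
    return hosts, vm_name
```

In the real script this would be called as fence_targets(parse_fence_options(sys.stdin)) after importing sys.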

For future reference, I based this "algorithm" on the following information:

http://comments.gmane.org/gmane.comp.emulators.ovirt.user/6551

http://blog.vmsplice.net/2011/03/how-to-access-qemu-monitor-through.html

More updates on this to follow.

Monday, 11 August 2014