[libvirt-users] Right way to do SAN-based shared storage?

urgrue urgrue at bulbous.org
Tue Feb 18 20:28:51 UTC 2014


On 12/2/14 22:29, Franky Van Liedekerke wrote:
> On Wed, 12 Feb 2014 21:51:53 +0100
> urgrue <urgrue at bulbous.org> wrote:
>
>> I'm trying to set up SAN-based shared storage in KVM, key word being
>> "shared" across multiple KVM servers for a) live migration and b)
>> clustering purposes. But it's surprisingly sparsely documented. For
>> starters, what type of pool should I be using?
> It's indeed not documented at all.
> After much trial and error, this is what my experience boils down to:
>
> - set up a basic cluster using cman and pacemaker (when using Red Hat
>    or CentOS). If unsure about the multicast performance of your
>    switches, use unicast (I needed this in some cases).
> - don't use a shared FS for your virtual machine images. GFS2 works
>    OK, but the I/O performance of your virtual machines drops a lot.
> - because you have a cluster anyway, you can use clvmd. Even if you're
>    not using clustered logical volumes, you can still stop a volume on
>    one server and start it on another via the pacemaker/heartbeat agents.
> - use pacemaker to manage the virtual machines (and, if not using
>    clvmd, to stop/start your LVs using tags). For the XML files
>    describing your VMs you'll unfortunately either need a small GFS2
>    partition or rsync between the two servers. Use the VirtualDomain
>    resource agent from git, though: it contains a lot of fixes (even
>    some from me :-) ). Also compile libvirtd from source (1.2.1 is
>    very stable with a small extra patch to talk to older qemu
>    versions); the reason is that you can then run more than 20 (or is
>    it 25?) virtual machines on one KVM host without issues, you get
>    lots of memory leak fixes, and it provides an add-on: virtlockd.
>    And since you don't touch the qemu shipped with your release, it's
>    not that big a deal.
> - as an extra layer of protection, you can use virtlockd (to be sure a
>    VM doesn't run on 2 nodes at the same time). The disadvantage is
>    that you need a small shared GFS2 partition, but that's fine if you
>    don't want to use rsync for your XML files anyway.
>
> I'm open to any questions and/or bashing :-)
>
> Franky
>

Hi Franky,

Thanks for sharing your experience. I considered using clvm, but I was 
hoping for something simpler, for these reasons:
- clvm refuses its initial start if all cluster nodes aren't up, which 
is a little problematic in some scenarios.
- I'd rather not have fencing enabled (imagine only one VM using clvm 
while the rest are on local disk), but Red Hat support requires fencing.
- KISS...
- oVirt/RHEV uses plain old non-clustered LVM.

I like this idea because it's super simple. Granted, there is no 
protection against something on the host using the disk simultaneously, 
but that's why I'd use HA-LVM (or clvm) inside the VMs themselves.
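
For what it's worth, the tag-based LVM activation Franky mentions boils 
down to an lvm.conf filter along these lines (the VG and tag names here 
are made up, and pacemaker's LVM agent would manage the tags):

```
# /etc/lvm/lvm.conf on each KVM host
activation {
    # only auto-activate the local root VG plus VGs tagged for this host;
    # untagged shared VGs stay inactive until the cluster activates them
    volume_list = [ "vg_root", "@kvmhost1" ]
}
```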

My only real concern with non-clustered LVM is that the libvirt 
"logical" pool type doesn't handle it at all. I'm not sure what it does 
differently compared to the standard LVM commands, but it will only 
start the pool on the first KVM node, and errors out on the others.
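
For reference, a "logical" pool is defined with XML along these lines 
(the multipath device path and VG name are placeholders, not my actual 
setup):

```xml
<pool type='logical'>
  <name>san_vg</name>
  <source>
    <!-- the SAN LUN (here a multipath device) backing the volume group -->
    <device path='/dev/mapper/mpatha'/>
    <name>san_vg</name>
    <format type='lvm2'/>
  </source>
  <target>
    <path>/dev/san_vg</path>
  </target>
</pool>
```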

I'm still investigating (I also have a case open with Red Hat), so if I 
find out anything interesting I'll report back.
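
For anyone else trying this: as far as I can tell, the virtlockd 
protection Franky describes comes down to two config entries (the 
lockspace path below is just an example, and must sit on the shared 
GFS2 partition so both hosts see the same locks):

```
# /etc/libvirt/qemu.conf
lock_manager = "lockd"

# /etc/libvirt/qemu-lockd.conf
file_lockspace_dir = "/shared/gfs2/libvirt/lockd"
```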




More information about the libvirt-users mailing list