[libvirt-users] Researching why different cache modes result in 'some' guest filesystem corruption..

vincent at cojot.name
Tue Jul 30 20:40:49 UTC 2019


Hi All,

I've been chasing down an issue in recent weeks (my own lab, so no prod 
here) and I'm reaching out in case someone might have some guidance to 
share.

I'm running fairly large VMs (RHOSP underclouds: 8 vCPUs, 32 GB RAM, a
single ~200 GB disk as a growable qcow2) on some RHEL 7.6 hypervisors
(kernel 3.10.0-927.2x.y, libvirt 4.5.0, qemu-kvm-1.5.3) on top of
SSD/NVMe drives with various filesystems (vxfs, zfs, etc.) and using
ECC RAM.

The issue can be described as follows:

- the guest VMs work fine for a while (days, weeks), but after a
   z-stream kernel update comes in, I am often greeted by the following
   message immediately after rebooting (or attempting to reboot) into
   the new kernel:

"error: not a correct xfs inode"

- booting the previous kernel works fine, and regenerating the
   initramfs for the new kernel (from the n-1 kernel) does not solve
   anything.

- if booted from an ISO, xfs_repair does not find any errors (commands
   for both checks are sketched after this list).

- guests on ext4 seem to show some kind of corruption as well.
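
For completeness, these are the kinds of commands I've been using for
those two checks (the kernel version below is a placeholder; substitute
the actual z-stream release):

   # from the n-1 kernel: rebuild the new kernel's initramfs
   KVER=3.10.0-927.xx.y.el7.x86_64   # <- placeholder release
   dracut --force /boot/initramfs-${KVER}.img ${KVER}

   # from a rescue ISO, with the VG activated: read-only XFS check
   xfs_repair -n /dev/rootdg/lv_root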

I'm building the initial qcow2 guest image for those VMs this way:

1) start with a rhel-guest image (currently 
rhel-server-7.6-update-5-x86_64-kvm.qcow2)

2) convert it to LVM by doing this:
  # create a 512G growable qcow2 (1 MiB clusters, metadata preallocated)
  qemu-img create -f qcow2 -o preallocation=metadata,cluster_size=1048576,lazy_refcounts=off final_guest.qcow2 512G
  # partition it (MBR), set up LVM and create an XFS root LV
  virt-format -a final_guest.qcow2 --partition=mbr --lvm=/dev/rootdg/lv_root --filesystem=xfs
  # stream the guest image's root filesystem into the new image
  guestfish --ro -a rhel_guest.qcow2 -m /dev/sda1 -- tar-out / - | \
  guestfish --rw -a final_guest.qcow2 -m /dev/rootdg/lv_root -- tar-in - /

3) use "final_guest.qcow2" as the basis for my guests with LVM.

After chasing this issue down some more and attempting various things
(building the image on Fedora 29, building a real XFS filesystem inside
a VM and using the resulting qcow2 as a basis instead of virt-format)...

...I noticed that the SATA disk of each of those guests was using the
'directsync' cache mode (instead of 'Hypervisor Default'). As soon as I
switched to 'none', the XFS issues disappeared, and I've since applied
several consecutive kernel updates without issue. 'directsync' and
'writethrough', while providing decent performance, both exhibited the
XFS 'corruption' behaviour; only 'none' seems to have solved it.
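
For anyone who wants to check or reproduce this: the change amounts to
editing the disk's <driver> element in the domain XML (via 'virsh
edit'), roughly as follows (stanza abbreviated; only the cache
attribute matters here):

   <!-- before (shows the problem here): -->
   <driver name='qemu' type='qcow2' cache='directsync'/>

   <!-- after (no more XFS errors across kernel updates): -->
   <driver name='qemu' type='qcow2' cache='none'/>

If virt-xml is available, I believe 'virt-xml GUESTNAME --edit --disk
cache=none' makes the same change. My understanding of the QEMU cache
modes is that 'none' opens the image with O_DIRECT only (leaving
flushes to the guest), while 'directsync' adds O_DSYNC on top of
O_DIRECT and 'writethrough' syncs every write through the host page
cache - so the two modes that fail for me are the ones doing *more*
syncing, not less.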

I've read the docs, and I thought it was OK to use those modes given my
setup (UPS, battery-backed RAID, etc.). If anything, 'directsync' and
'writethrough' are supposed to be the *safer* choices, so seeing
corruption with them but not with 'none' is the opposite of what I'd
expect.

Does anyone have any idea what's going on or what I may be doing wrong?

Thanks for reading,

Vincent



