[Linux-cluster] Finding the bottleneck between SAN and GFS2
rpeterso at redhat.com
Wed Jul 1 12:54:19 UTC 2015
----- Original Message -----
> We are experiencing slow VMs on our OpenNebula architecture:
> The short result is that bare metal access to the GFS2 without any cache
> is terribly slow, around 2Mb/s and 90 requests/s.
> Is there a way to find out if the problem comes from my
> GFS2/corosync/pacemaker configuration or from the SAN?
Diagnosing and solving GFS2 performance problems is a long and complex
process. Many things can cause a slowdown, but there are usually one or
two bottlenecks that need to be identified and solved. We have a lot of
tools in our arsenal for this, so it's not something that can be
explained easily. There are several articles on the Red Hat Customer
Portal if you're a customer.
If you don't have access to the portal, you can try to determine your
bottleneck by performing some tests yourself, such as:
1. Use filefrag to see if your file system is severely fragmented.
2. Test the raw speed of the device without any file system (as Steve Whitehouse suggested).
3. Test the throughput of the same device using a different file system.
4. Test the network throughput.
5. Test DLM throughput (via dlm_klock).
6. See if you're running out of memory and swapping (via top).
7. See if there is GFS2 glock contention (via glocktop).
8. Check for NUMA-related problems.
There's a good recent two-hour talk about performance tuning and NUMA
from this year's Red Hat Summit: https://www.youtube.com/watch?v=ckarvGJE8Qc
9. Slowdowns due to small block sizes (4K is recommended).
10. Slowdowns due to journals being too small (128MB is recommended).
11. Slowdowns due to resource groups being too big (avoid 2GB rgrps for now).
12. Slowdowns due to backups pushing all the dlm lock masters to one node.
(We have specific backup recommendations to avoid this. Or you can unmount
after doing a backup).
13. Check how many glocks are in slab (via slabtop and such).
14. Check whether you're CPU bound (via "perf top -g" and such).
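To illustrate step 1, here's a rough sketch. The path /mnt/gfs2/disk.img
is a placeholder I made up; point it at one of your actual VM images on
the GFS2 mount:

```shell
# Step 1 sketch: check fragmentation of a file on the GFS2 mount.
# /mnt/gfs2/disk.img is a placeholder; substitute a real VM image path.
F=${1:-/mnt/gfs2/disk.img}
# Fall back to a scratch file so the commands can be tried anywhere.
[ -e "$F" ] || { F=$(mktemp); dd if=/dev/zero of="$F" bs=1M count=8 2>/dev/null; }
if command -v filefrag >/dev/null 2>&1; then
    # "1 extent found" is ideal; thousands of extents mean heavy fragmentation.
    filefrag "$F" || echo "filefrag could not map $F (filesystem may lack FIEMAP support)"
else
    echo "filefrag not found; it ships in the e2fsprogs package"
fi
```

A severely fragmented VM image can, by itself, explain sequential
throughput in the low MB/s range.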
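For step 2, dd with O_DIRECT gives a quick raw-read number to compare
against the ~2 MB/s you're seeing through GFS2. This is only a sketch:
/dev/sdX is a placeholder for your SAN LUN, and reading it needs root.

```shell
# Step 2 sketch: raw sequential read speed of the device under GFS2.
# /dev/sdX is a placeholder; substitute your SAN LUN and run as root.
DEV=${1:-/dev/sdX}
if [ -b "$DEV" ]; then
    # iflag=direct bypasses the page cache so the device itself is measured.
    dd if="$DEV" of=/dev/null bs=1M count=1024 iflag=direct 2>&1 | tail -n 1
else
    # No such block device here; demonstrate the same invocation on a
    # scratch file (cached, so the number says nothing about a SAN).
    F=$(mktemp)
    dd if=/dev/zero of="$F" bs=1M count=32 2>/dev/null
    dd if="$F" of=/dev/null bs=1M 2>&1 | tail -n 1
fi
```

If the raw number is also around 2 MB/s, the problem is below GFS2
(SAN, multipath, HBA); if the raw device is fast, look at the cluster
layers instead.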
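Step 4 can be checked with iperf3 between two cluster nodes. Another
sketch: "node1" is a placeholder hostname, and you'd start "iperf3 -s"
on that node first.

```shell
# Step 4 sketch: TCP throughput and latency between cluster nodes.
# "node1" is a placeholder; start "iperf3 -s" on the peer node first.
PEER=${1:-node1}
if command -v iperf3 >/dev/null 2>&1; then
    iperf3 -c "$PEER" -t 10 || echo "could not reach $PEER; is iperf3 -s running there?"
else
    echo "iperf3 not installed (yum/dnf install iperf3)"
fi
# Corosync is sensitive to latency as well as bandwidth:
ping -c 3 "$PEER" 2>/dev/null || echo "ping to $PEER failed"
```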
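Steps 6 and 13 can be spot-checked from the shell in a few seconds
(reading /proc/slabinfo typically needs root):

```shell
# Steps 6 and 13 sketch: memory pressure and glock slab counts.
if command -v free >/dev/null 2>&1; then
    free -m                 # any swap in use? free memory exhausted?
else
    head -n 3 /proc/meminfo
fi
# GFS2 glocks live in the kernel slab; a huge or ever-growing count
# suggests lock contention or a cache working set larger than memory.
grep -i gfs2 /proc/slabinfo 2>/dev/null \
    || echo "no gfs2 slabs visible (GFS2 not mounted here, or reading /proc/slabinfo needs root)"
```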
I'd like to add that Red Hat has done a tremendous amount of work to speed
up GFS2 in the past couple of years. We've drastically reduced fragmentation.
We've added an Orlov block allocator. We've done a lot of other things, too,
and newer is better.
Newer versions of RHEL6 are going to be faster than older RHEL6. Older RHEL6
is going to be faster than RHEL5. RHEL7 is faster than RHEL6, and so on.
That's because we tend to focus our development efforts on the newest release
and don't often port performance improvements back to older releases. But
like many things, GFS2 is only as strong as its weakest link, so you need to
identify what the weakest link is. I hope this helps.
Red Hat File Systems