[linux-lvm] cache on SSD makes system unresponsive

Oleg Cherkasov o1e9 at member.fsf.org
Thu Oct 19 21:59:15 UTC 2017


On 19 Oct 2017 21:09, John Stoffel wrote:
 >
 > Oleg> Recently I have decided to try out LVM cache feature on one of
 > Oleg> our Dell NX3100 servers running CentOS 7.4.1708 with 110Tb disk
 > Oleg> array (hardware RAID5 with H710 and H830 Dell adapters).  Two
 > Oleg> SSD disks each 256Gb are in hardware RAID1 using H710 adapter
 > Oleg> with primary and extended partitions so I decided to make ~240Gb
 > Oleg> LVM cache to see if system I/O may be improved.  The server is
 > Oleg> running Bareos storage daemon and beside sshd and Dell
 > Oleg> OpenManage monitoring does not have any other services.
 > Oleg> Unfortunately testing went not as I expected nonetheless at the
 > Oleg> end system is up and running with no data corrupted.
 >
 > Can you give more details about the system.  Is this providing storage
 > services (NFS) or is it just a backup server?

It is just a backup server: Bareos Storage Daemon plus Dell OpenManage 
for the LSI RAID cards (Dell's H7XX and H8XX are LSI based).  That host 
deliberately does not share any files or resources for security 
reasons, so no NFS or SMB.

The server has 2x 256GB SSD drives and 10x 3TB drives.  In addition 
there are two MD1200 disk arrays attached, with 12x 4TB disks each.  
All disks are exposed to CentOS as virtual disks, so there are 4 disks 
in total:

NAME                                      MAJ:MIN RM   SIZE RO TYPE
sda                                         8:0    0 278.9G  0 disk
├─sda1                                      8:1    0   500M  0 part /boot
├─sda2                                      8:2    0  36.1G  0 part
│ ├─centos-swap                           253:0    0  11.7G  0 lvm  [SWAP]
│ └─centos-root                           253:1    0  24.4G  0 lvm
├─sda3                                      8:3    0     1K  0 part
└─sda5                                      8:5    0 242.3G  0 part
sdb                                         8:16   0    30T  0 disk
└─primary_backup_vg-primary_backup_lv     253:5    0 110.1T  0 lvm
sdc                                         8:32   0    40T  0 disk
└─primary_backup_vg-primary_backup_lv     253:5    0 110.1T  0 lvm
sdd                                         8:48   0    40T  0 disk
└─primary_backup_vg-primary_backup_lv     253:5    0 110.1T  0 lvm

RAM is 12GB, with around 12GB of swap as well.  /dev/sda is a hardware 
RAID1; the rest are RAID5.

I created the cache and cache_meta LVs on /dev/sda5.  That partition 
had been the Bareos spool for quite some time, but after upgrading to a 
10Gb network I no longer need the spooler, so I decided to try LVM 
cache instead.
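
For completeness, setting up a cache like this amounts to something 
along these lines (a sketch only; the lv_cache/lv_cache_meta names and 
the sizes are placeholders, while the VG and origin LV names are the 
ones from the lsblk output above):

# add the SSD partition to the backup VG
pvcreate /dev/sda5
vgextend primary_backup_vg /dev/sda5
# carve out data and metadata LVs on the SSD PV
lvcreate -L 239G -n lv_cache primary_backup_vg /dev/sda5
lvcreate -L 1G -n lv_cache_meta primary_backup_vg /dev/sda5
# combine them into a cache pool and attach it to the big LV
lvconvert --type cache-pool --poolmetadata primary_backup_vg/lv_cache_meta \
    primary_backup_vg/lv_cache
lvconvert --type cache --cachepool primary_backup_vg/lv_cache \
    --cachemode writethrough primary_backup_vg/primary_backup_lv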

 > How did you setup your LVM config and your cache config?  Did you
 > mirror the two SSDs using MD, then add the device into your VG and use
 > that to setup the lvcache?

All configs are stock CentOS 7.4 at the moment (incrementally upgraded 
from 7.0, of course), so I have not customized the configuration or 
tried to optimize it.
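
If it helps, it is easy to double-check that nothing deviates from the 
defaults with lvmconfig (in the lvm2 package):

# settings that differ from the compiled-in defaults
lvmconfig --type diff
# full effective configuration, for posting if needed
lvmconfig --type current
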
 > I ask because I'm running lvcache at home on my main file/kvm server
 > and I've never seen this problem.  But!  I suspect you're running a
 > much older kernel, lvm config, etc.  Please post the full details of
 > your system if you can.

3.10.0-693.2.2.el7.x86_64

CentOS 7.4, as Xen pointed out, was released about a month ago, and I 
updated the box about a week ago while doing planned maintenance on the 
network, so I had a good excuse to reboot it.
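
If more details are needed, the rest can be pulled with something like 
the following (outputs not included here):

uname -r
cat /etc/centos-release
rpm -q lvm2 device-mapper-persistent-data kernel
lvs -a -o +devices,segtype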

 > Oleg> Initially I have tried the default writethrough mode and after
 > Oleg> running dd reading test with 250Gb file got system unresponsive
 > Oleg> for roughly 15min with cache allocation around 50%.  Writing to
 > Oleg> disks it seems speed up the system however marginally, so around
 > Oleg> 10% on my tests and I did manage to pull more than 32Tb via
 > Oleg> backup from different hosts and once system became unresponsive
 > Oleg> to ssh and icmp requests however for a very short time.
 >
 > Can you run 'top' or 'vmstat -admt 10' on the console while you're
 > running your tests to see what the system does?  How does memory look
 > on this system when you're NOT runnig lvcache?

Well, it is a production system and I am not planning to enable the 
cache on it again just for testing; however, if any patches become 
available I will try to run a similar test on a spare box before 
converting it to FreeBSD with ZFS.
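
Since the cache is no longer attached (see the sar notes below), for 
reference the usual way to back out of lvmcache safely is one of these 
(both flush dirty blocks before detaching):

# drop the cache pool entirely
lvconvert --uncache primary_backup_vg/primary_backup_lv
# or detach it but keep the pool LV around
lvconvert --splitcache primary_backup_vg/primary_backup_lv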

Nonetheless, I did run top during the dd read test, and in the first 
few minutes I did not notice any issues with RAM.  The system was using 
less than 2GB of the 12GB and the rest was in cache/buffers.  After a 
few minutes the system became unresponsive, even dropping ICMP ping 
requests; the ssh session froze and then timed out, so there was no way 
to keep watching top.
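
For clarity, the dd read test mentioned above was essentially one big 
sequential read of a ~250GB file through the cached LV, something like 
(the file name and block size here are illustrative, not the exact ones 
used):

dd if=/backup/testfile_250G of=/dev/null bs=1M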

I have recovered some of the SAR records.  For the last 20 minutes, 
from 2:40pm to 3:00pm, SAR did not manage to log anything before the 
system was rebooted and came back online at 3:10pm:

User stat:
02:00:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
02:10:01 PM     all      0.22      0.00      0.08      0.05      0.00     99.64
02:20:35 PM     all      0.21      0.00      5.23     20.58      0.00     73.98
02:30:51 PM     all      0.23      0.00      0.43     31.06      0.00     68.27
02:40:02 PM     all      0.06      0.00      0.15     18.55      0.00     81.24
Average:        all      0.19      0.00      1.54     17.67      0.00     80.61

I/O stat:
02:00:01 PM       tps      rtps      wtps   bread/s   bwrtn/s
02:10:01 PM      5.27      3.19      2.08    109.29    195.38
02:20:35 PM   4404.80   3841.22    563.58 971542.00 140195.66
02:30:51 PM   1110.49    586.67    523.83 148206.31 131721.52
02:40:02 PM    510.72    211.29    299.43  51321.12  76246.81
Average:      1566.86   1214.43    352.43 306453.67  88356.03

DMs:
02:00:01 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
Average:       dev8-0    370.04    853.43  88355.91    241.08     85.32    230.56      1.61     59.54
Average:      dev8-16      0.02      0.14      0.02      8.18      0.00      3.71      3.71      0.01
Average:      dev8-32   1196.77 305599.78      0.04    255.35      4.26      3.56      0.09     11.28
Average:      dev8-48      0.02      0.35      0.06     18.72      0.00     17.77     17.77      0.04
Average:     dev253-0    151.59    118.15   1094.56      8.00     13.60     89.71      2.07     31.36
Average:     dev253-1     15.01    722.81     53.73     51.73      3.08    204.85     28.35     42.56
Average:     dev253-2   1259.48 218411.68      0.07    173.41      0.21      0.16      0.08      9.98
Average:     dev253-3    681.29      1.27  87189.52    127.98    163.02    239.29      0.84     57.12
Average:     dev253-4      3.83     11.09     18.09      7.61      0.09     22.59     10.72      4.11
Average:     dev253-5   1940.54 305599.86      0.07    157.48      8.47      4.36      0.06     11.24

dev253-2 is the cache, or actually was ...
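
To map the dev253-N names in the sar output back to device-mapper / LV 
names, the minor numbers can be matched up with e.g.:

dmsetup ls                    # mapped devices with their (major, minor)
lsblk -o NAME,MAJ:MIN,TYPE    # same mapping seen from the block layer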

Queue stat:
02:00:01 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
02:10:01 PM         1       302      0.09      0.05      0.05         0
02:20:35 PM         0       568      6.87      9.72      5.28         3
02:30:51 PM         1       569      5.46      6.83      5.83         2
02:40:02 PM         0       568      0.18      2.41      4.26         1
Average:            0       502      3.15      4.75      3.85         2

RAM stat:
02:00:01 PM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
02:10:01 PM    256304  11866580     97.89     66860   9181100   2709288     11.10   5603576   5066808        32
02:20:35 PM    185160  11937724     98.47     56712     39104   2725476     11.17    299256    292604        16
02:30:51 PM    175220  11947664     98.55     56712     29640   2730732     11.19    113912    113552        24
02:40:02 PM  11195028    927856      7.65     57504     62416   2696248     11.05    119488    164076        16
Average:      2952928   9169956     75.64     59447   2328065   2715436     11.12   1534058   1409260        22

SWAP stat:
02:00:01 PM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
02:10:01 PM  12010984    277012      2.25     71828     25.93
02:20:35 PM  11048040   1239956     10.09     88696      7.15
02:30:51 PM  10723456   1564540     12.73     38272      2.45
02:40:02 PM  10716884   1571112     12.79     77928      4.96
Average:     11124841   1163155      9.47     69181      5.95
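
For reference, the figures above come out of the sysstat archive 
roughly like this (the saDD file name depends on the day of the month; 
sa19 here is a guess):

sar -u -f /var/log/sa/sa19    # CPU utilization
sar -b -f /var/log/sa/sa19    # I/O transfer rates
sar -d -f /var/log/sa/sa19    # per-device (dev8-N / dev253-N) stats
sar -q -f /var/log/sa/sa19    # run queue and load averages
sar -r -f /var/log/sa/sa19    # memory
sar -S -f /var/log/sa/sa19    # swap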



Cheers,
Oleg



