[linux-lvm] System hangs after snapshot autoextend, 'dmsetup info' confirms device is SUSPENDED

Tom Crane TPClvm at mklab.ph.rhul.ac.uk
Wed Jul 31 14:36:01 UTC 2019


For some time I have suffered occasional system hangs following a snapshot 
autoextend operation.  The system becomes unusable, with many processes 
stuck in the uninterruptible 'D' wait state.

My setup is quite basic: a bare-metal system, a plain HD, conventional/thick 
volumes, no lvmetad, and plenty of free space on the PV & VG, e.g.:

# pvs
   PV         VG  Fmt  Attr PSize  PFree
   /dev/sda2  VG0 lvm2 a--  <1.82t <368.47g
# vgs
   VG  #PV #LV #SN Attr   VSize  VFree
   VG0   1  15  10 wz--n- <1.82t <368.47g
# lvs
   LV                     VG  Attr       LSize   Pool Origin Data%  Meta% Move Log Cpy%Sync Convert
   data                   VG0 -wi-ao---- 600.00g
   home                   VG0 owi-aos--- 100.00g
   home_daily_snapshot    VG0 swi-a-s---   5.00g      home   27.29
   home_daily_snapshot2   VG0 swi-a-s---   5.00g      home   41.83
   home_monthly_snapshot  VG0 swi-a-s--- <27.86g      home   45.95
   home_monthly_snapshot2 VG0 swi-a-s---  40.80g      home   48.52
   root                   VG0 owi-aos--- 100.00g
   root_daily_snapshot    VG0 swi-a-s---   5.00g      root   8.84
   root_daily_snapshot2   VG0 swi-a-s---   5.00g      root   26.53
   root_monthly_snapshot  VG0 swi-a-s--- <37.09g      root   45.83
   root_monthly_snapshot2 VG0 swi-a-s---  53.43g      root   45.60
   swap                   VG0 -wi-ao----  10.00g
   tmp                    VG0 owi-aos--- 400.00g
   tmp_daily_snapshot     VG0 swi-a-s---  35.44g      tmp    47.81
   tmp_daily_snapshot2    VG0 swi-a-s---  69.08g      tmp    49.72

I have the following settings in the activation{} section of /etc/lvm/lvm.conf:

         snapshot_autoextend_threshold=50
         snapshot_autoextend_percent=10

Typically the last thing I see in syslog before the hang is, e.g.:

lvm[1028]: Size of logical volume VG0/root_monthly_snapshot changed from 33.71 GiB (8631 extents) to <37.09 GiB (9495 extents).
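
The size change in that log line is consistent with 
snapshot_autoextend_percent=10. A minimal sketch of the arithmetic, assuming 
a 4 MiB extent size (the LVM default, not stated in this post) and that the 
new size is rounded up to a whole number of extents. Both are assumptions, 
although they do reproduce the logged figures. The function names are 
hypothetical.

```python
EXTENT_MIB = 4  # assumed extent size in MiB (LVM default); not stated in the post


def autoextend_extents(current_extents, percent):
    """Grow by `percent` of the current size, rounded up to whole extents."""
    return (current_extents * (100 + percent) + 99) // 100


def gib(extents):
    """Convert an extent count to GiB under the assumed extent size."""
    return extents * EXTENT_MIB / 1024


new_extents = autoextend_extents(8631, 10)
print(new_extents)  # 9495
print(f"{gib(8631):.2f} GiB -> {gib(new_extents):.2f} GiB")  # 33.71 GiB -> 37.09 GiB
```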

To investigate the problem I set up a job, started at boot time, which runs 
'dmsetup info' hourly around the clock and logs the output to /data (which 
has no snapshots).  It showed that the root LV had been suspended after the 
hang/resize and remained in that state indefinitely, e.g.:

Name:              VG0-root
State:             SUSPENDED
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 1
Number of targets: 1
UUID: LVM-hVoK8kkkqDj1vfBDfvIhg7vdMospBTT4a9rD2V1t3dKHgp4igXP0uny8bFOQ2sya

All the other devices were in the ACTIVE state as normal.
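
For anyone wanting to reproduce this kind of check, here is a minimal sketch 
of picking the suspended devices out of 'dmsetup info' output. It is not the 
job I actually ran; the function name is hypothetical, and the second record 
in the sample text (VG0-data, ACTIVE) is only illustrative.

```python
def suspended_devices(info_text):
    """Return the names of devices reported as SUSPENDED in 'dmsetup info' text."""
    suspended = []
    name = None
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line between device records
        key, value = key.strip(), value.strip()
        if key == "Name":
            name = value  # start of a new device record
        elif key == "State" and value == "SUSPENDED" and name is not None:
            suspended.append(name)
    return suspended


# Sample input: the first record is abridged from the log above,
# the second is an illustrative ACTIVE device.
SAMPLE = """\
Name:              VG0-root
State:             SUSPENDED
Read Ahead:        256
Tables present:    LIVE
Major, minor:      253, 1

Name:              VG0-data
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
"""

print(suspended_devices(SAMPLE))  # ['VG0-root']
```

On a live system the text would come from running 'dmsetup info' as root, 
for instance via subprocess.run(["dmsetup", "info"], capture_output=True, 
text=True).stdout, with the result written somewhere that is not itself 
behind a snapshot origin.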

Any thoughts on further diagnosis/fixing this problem?

Additional system/LVM details:
# lvm version
   LVM version:     2.02.177(2) (2017-12-18)
   Library version: 1.02.146 (2017-12-18)
   Driver version:  4.39.0
   Configuration:   ./configure --disable-readline --enable-cmdlib 
--enable-dmeventd --enable-applib --libdir=/usr/lib64 
--with-usrlibdir=/usr/lib64 --mandir=/usr/man --enable-realtime 
--with-lvm1=internal --enable-pkgconfig --enable-udev_sync 
--enable-udev_rules --with-udev-prefix= --with-device-uid=0 
--with-device-gid=6 --with-device-mode=0660 
--with-default-locking-dir=/run/lock/lvm --with-default-run-dir=/run/lvm 
--with-default-dm-run-dir=/run/lvm --with-clvmd-pidfile=/run/lvm/clvmd.pid 
--with-cmirrord-pidfile=/run/lvm/cmirrord.pid 
--with-dmeventd-pidfile=/run/lvm/dmeventd.pid 
--build=x86_64-slackware-linux

# uname -a
Linux mklab.ph.rhul.ac.uk 4.19.59 #2 SMP Sun Jul 14 16:07:23 CDT 2019 
x86_64 AMD FX-8320E Eight-Core Processor AuthenticAMD GNU/Linux

Thanks
Tom Crane
-- 
Tom Crane, Dept. Physics, Royal Holloway, University of London, Egham Hill,
Egham, Surrey, TW20 0EX, England.
Email:  T.Crane at rhul.ac.uk



