[vdo-devel] dmsetup stuck for more than one day

Łukasz Michalski lm at zork.pl
Thu Nov 5 15:20:52 UTC 2020


Hi,

Is it possible to upgrade only vdo and stick with CentOS 7.5.1804 for the rest of the packages?
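
Would something along these lines be enough (just a sketch on my side; it assumes the 6.1.3.23 builds are published in the repos we already have enabled)?

    # Update only the VDO userspace tools and the kernel module,
    # leaving all other packages at their CentOS 7.5.1804 versions.
    yum update vdo kmod-kvdo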

Regards,
Łukasz

On 05/11/2020 16.17, Sweet Tea Dorminy wrote:
> Greetings Łukasz;
>
> I think this may be an instance of BZ 1821275 <https://bugzilla.redhat.com/show_bug.cgi?id=1821275>, fixed in 6.1.3.23. Is it feasible to restart the machine (unfortunately there's no other way to stop a presumably hung attempt to start VDO), upgrade to at least that version, and try again?
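>
> Roughly, the sequence would be something like this (a sketch only; it assumes the fixed builds are available from the repos you have configured):
>
>     systemctl reboot                  # clear the hung dmsetup/VDO start
>     yum update vdo kmod-kvdo          # should bring in >= 6.1.3.23
>     vdo start --all                   # retry starting the VDO volumes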
>
> Thanks!
>
> Sweet Tea Dorminy
>
>
> On Thu, Nov 5, 2020 at 9:54 AM Łukasz Michalski <lm at zork.pl> wrote:
>
>     Details below.
>
>     Now I see that I was looking at the wrong block device. My VDO is on /dev/sda, and atop shows no activity for it.
>
>     Thanks,
>     Łukasz
>
>     On 05/11/2020 15.26, Andrew Walsh wrote:
>>     Hi Lukasz,
>>
>>     Can you please confirm a few details?  These will help us understand what may be going on.  We may end up needing additional information, but this will help us identify a starting point for the investigation.
>>
>>     **Storage Stack Configuration:**
>>     High Level Configuration: [e.g. SSD -> MD RAID 5 -> VDO -> XFS]
>
>     Two servers, on each:
>     Hardware RAID6, 54 TiB -> LVM -> VDO -> GlusterFS (XFS for bricks) -> Samba shares.
>     Currently, Samba and Gluster are disabled.
>
>>     Output of `blockdev --report`:
>     [root@ixmed1 /]# blockdev --report
>
>     RO    RA   SSZ   BSZ   StartSec            Size   Device
>     rw   256   512  4096          0  59999990579200   /dev/sda
>     rw   256   512  4096          0    238999830528   /dev/sdb
>     rw   256   512  4096       2048      1073741824   /dev/sdb1
>     rw   256   512  4096    2099200    216446009344   /dev/sdb2
>     rw   256   512  4096  424845312     21479030784   /dev/sdb3
>     rw   256   512  4096          0    119810293760   /dev/dm-0
>     rw   256   512  4096          0     21470642176   /dev/dm-1
>     rw   256   512  4096          0     32212254720   /dev/dm-2
>     rw   256   512  4096          0     42949672960   /dev/dm-3
>     rw   256   512  4096          0     21474836480   /dev/dm-4
>     rw   256   512  4096          0  21990232555520   /dev/dm-5
>     rw   256   512  4096          0     21474144256   /dev/drbd999
>
>>     Output of `lsblk -o name,maj:min,kname,type,fstype,state,sched,uuid`:
>     [root@ixmed1 /]# lsblk -o name,maj:min,kname,type,fstype,state,sched,uuid
>     lsblk: dm-6: failed to get device path
>     lsblk: dm-6: failed to get device path
>     NAME              MAJ:MIN KNAME   TYPE FSTYPE   STATE SCHED    UUID
>     sda                 8:0   sda     disk LVM2_mem runni deadline ggCzji-1O8d-BWCa-XwLe-BJ94-fwHa-cOseC0
>     └─vgStorage-LV_vdo_Rada--ixmed
>                       253:5   dm-5    lvm  vdo      runni          b668b2d9-96bf-4840-a43d-6b7ab0a7f235
>     sdb                 8:16  sdb     disk          runni deadline
>     ├─sdb1              8:17  sdb1    part xfs            deadline f89ef6d8-d9f4-4061-8f48-3ffae8e23b1e
>     ├─sdb2              8:18  sdb2    part LVM2_mem       deadline pHO0UQ-aGWu-Hg6g-siiq-TGPT-kw4B-gD0fgs
>     │ ├─vgSys-root    253:0   dm-0    lvm  xfs      runni          4f48e2c7-6324-4465-953a-c1a9512ab782
>     │ ├─vgSys-swap    253:1   dm-1    lvm  swap     runni          97234c91-7804-43b2-944f-0122c90fc962
>     │ ├─vgSys-cluster 253:2   dm-2    lvm  xfs      runni          97b4c285-4bfe-4d4f-8c3c-ca716157bf52
>     │ └─vgSys-var     253:3   dm-3    lvm  xfs      runni          6f5c860b-88e0-4d28-bc09-2e365299f86e
>     └─sdb3              8:19  sdb3    part LVM2_mem       deadline nvBfNi-qm2u-bt5T-dyCL-3FgQ-DSic-z8dUDq
>       └─vgSys-pgsql   253:4   dm-4    lvm  xfs      runni          5c3e18cc-9e0f-4c81-906b-3e68f196cafe
>         └─drbd999     147:999 drbd999 disk xfs                     5c3e18cc-9e0f-4c81-906b-3e68f196cafe
>>
>>     **Hardware Information:**
>>      - CPU: [e.g. 2x Intel Xeon E5-1650 v2 @ 3.5GHz]
>>      - Memory: [e.g. 128G]
>>      - Storage: [e.g. Intel Optane SSD 900P]
>>      - Other: [e.g. iSCSI backed storage]
>
>     Huawei 5288 V5
>     64GB RAM
>     2 x Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz
>     RAID: Symbios Logic MegaRAID SAS-3 3008 [Fury] (rev 02) (from lspci, megaraid_sas driver)
>
>>
>>     **Distro Information:**
>>      - OS: [e.g. RHEL-7.5]
>     CentOS Linux release 7.5.1804 (Core)
>>      - Architecture: [e.g. x86_64]
>     x86_64
>>      - Kernel: [e.g. kernel-3.10.0-862.el7]
>     3.10.0-862.el7
>>      - VDO Version: [e.g. vdo-6.2.0.168-18.el7, or a commit hash]
>>      - KVDO Version: [e.g. kmod-kvdo-6.2.0.153-15.el7, or a commit hash]
>     [root@ixmed1 /]# yum list |grep vdo
>     kmod-kvdo.x86_64                          6.1.0.168-16.el7_5           @updates
>     vdo.x86_64                                6.1.0.168-18                 @updates
>>      - LVM Version: [e.g. 2.02.177-4.el7]
>     2.02.177(2)-RHEL7 (2018-01-22)
>>      - Output of `uname -a`: [e.g. Linux localhost.localdomain 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux]
>
>     Linux ixmed1 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>
>>
>>     On Thu, Nov 5, 2020 at 6:49 AM Łukasz Michalski <lm at zork.pl> wrote:
>>
>>         Hi,
>>
>>         I have two 20 TiB VDO devices that crashed during a power outage, one on each of two servers.
>>
>>         After the server restart, I see the following in the logs on the first server:
>>
>>         [root@ixmed1 /]# dmesg |grep vdo
>>         [   11.223770] kvdo: modprobe: loaded version 6.1.0.168
>>         [   11.904949] kvdo0:dmsetup: starting device 'vdo_test' device instantiation 0 write policy auto
>>         [   11.904979] kvdo0:dmsetup: underlying device, REQ_FLUSH: not supported, REQ_FUA: not supported
>>         [   11.904985] kvdo0:dmsetup: Using mode sync automatically.
>>         [   11.905017] kvdo0:dmsetup: zones: 1 logical, 1 physical, 1 hash; base threads: 5
>>         [   11.966414] kvdo0:journalQ: Device was dirty, rebuilding reference counts
>>         [   12.452589] kvdo0:logQ0: Finished reading recovery journal
>>         [   12.458550] kvdo0:logQ0: Highest-numbered recovery journal block has sequence number 70548140, and the highest-numbered usable block is 70548140
>>         [   12.458556] kvdo0:logQ0: Replaying entries into slab journals
>>         [   13.538099] kvdo0:logQ0: Replayed 5568767 journal entries into slab journals
>>         [   14.174984] kvdo0:logQ0: Recreating missing journal entries
>>         [   14.175025] kvdo0:journalQ: Synthesized 0 missing journal entries
>>         [   14.177768] kvdo0:journalQ: Saving recovery progress
>>         [   14.636416] kvdo0:logQ0: Replaying 2528946 recovery entries into block map
>>
>>         [root@ixmed1 /]# uptime
>>          12:41:33 up 1 day,  4:07,  2 users,  load average: 1.06, 1.05, 1.16
>>
>>         [root@ixmed1 /]# ps ax |grep vdo
>>           1135 ?        Ss     0:00 /usr/bin/python /usr/bin/vdo start --all --confFile /etc/vdoconf.yml
>>           1210 ?        R    21114668:39 dmsetup create vdo_Rada-ixmed --uuid VDO-b668b2d9-96bf-4840-a43d-6b7ab0a7f235 --table 0 72301908952 vdo /dev/disk/by-id/dm-name-vgStorage-LV_test 4096 disabled 0 32768 16380 on auto vdo_test ack=1,bio=4,bioRotationInterval=64,cpu=2,hash=1,logical=1,physical=1
>>           1213 ?        S      1:51 [kvdo0:dedupeQ]
>>           1214 ?        S      1:51 [kvdo0:journalQ]
>>           1215 ?        S      1:51 [kvdo0:packerQ]
>>           1216 ?        S      1:51 [kvdo0:logQ0]
>>           1217 ?        S      1:51 [kvdo0:physQ0]
>>           1218 ?        S      1:50 [kvdo0:hashQ0]
>>           1219 ?        S      1:52 [kvdo0:bioQ0]
>>           1220 ?        S      1:51 [kvdo0:bioQ1]
>>           1221 ?        S      1:51 [kvdo0:bioQ2]
>>           1222 ?        S      1:51 [kvdo0:bioQ3]
>>           1223 ?        S      1:48 [kvdo0:ackQ]
>>           1224 ?        S      1:49 [kvdo0:cpuQ0]
>>           1225 ?        S      1:49 [kvdo0:cpuQ1]
>>
>>         The only activity I see is small writes, shown in 'atop', to the device underlying the VDO volume.
>>
>>         On the first server, dmsetup takes 100% CPU (one core); on the second server, dmsetup seems to be idle.
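>>
>>         In case it helps, I could also capture the kernel stack of the dmsetup process (pid 1210 in the ps output above) and a dump of blocked tasks; a sketch of what I would run:
>>
>>         cat /proc/1210/stack          # kernel stack of the spinning dmsetup
>>         echo w > /proc/sysrq-trigger  # log blocked (D-state) tasks to dmesg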
>>
>>         What should I do in this situation?
>>
>>         Regards,
>>         Łukasz
>>
>
>
