[vdo-devel] dmsetup stuck for more than one day

Sweet Tea Dorminy sweettea at redhat.com
Fri Nov 6 11:47:42 UTC 2020


Awesome, glad to hear it worked out!

On Fri, Nov 6, 2020 at 6:24 AM Łukasz Michalski <lm at zork.pl> wrote:

> Ok, no problem. I built 6.1.3.23 for CentOS 7.5, updated kmod-kvdo and
> vdo, and everything seems to be working now.
> I only had to install vdo with --no-deps and link python2.7 to
> /usr/libexec/platform-python, because CentOS 7.5 does not have the
> platform-python concept.
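>
> For reference, roughly what I did (a sketch from memory; the exact RPM
> file name depends on the rebuild, and note that rpm spells the flag
> --nodeps):
>
> # install the rebuilt vdo package without its platform-python dependency
> rpm -Uvh --nodeps vdo-6.1.3.23-*.el7.x86_64.rpm
> # provide the platform-python path that the vdo scripts expect
> ln -s /usr/bin/python2.7 /usr/libexec/platform-python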
>
> Many thanks for your support,
> Łukasz
>
> On 05/11/2020 18.55, Andrew Walsh wrote:
>
> Hi Lukasz,
>
> Version 6.1.3.7 is the latest available as of RHEL-7.8, and 6.1.3.23 is
> the latest available as of RHEL-7.9.  Perhaps the CentOS repos haven't been
> updated to include RHEL-7.9 content just yet.
>
> Unfortunately, the fix for the issue you encountered isn't available in
> 6.1.3.7, as it was actually fixed in 6.1.3.23.
>
> Andy Walsh
>
>
> On Thu, Nov 5, 2020 at 11:57 AM Łukasz Michalski <lm at zork.pl> wrote:
>
>> Hmmm, looking at http://mirror.centos.org/centos/7/os/x86_64/Packages/ I
>> see kmod-kvdo-6.1.3.7-5.el7.x86_64.rpm.
>>
>> Is 6.1.3.23 available somewhere?
>>
>>
>> On 05/11/2020 17.50, Sweet Tea Dorminy wrote:
>>
>> No, I believe you'd need to update the kernel as well, to go along with
>> the updated kmod-kvdo.
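>>
>> In other words, update them together rather than kmod-kvdo alone (a
>> sketch, assuming the newer packages are in your configured repos):
>>
>> # update the kernel together with the VDO packages
>> yum update kernel kmod-kvdo vdo
>> # confirm the installed versions line up before rebooting
>> rpm -q kernel kmod-kvdo vdo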
>>
>> On Thu, Nov 5, 2020 at 10:21 AM Łukasz Michalski <lm at zork.pl> wrote:
>>
>>> Hi,
>>>
>>> Is it possible to upgrade only VDO and stick with CentOS 7.5.1804 for
>>> the rest of the packages?
>>>
>>> Regards,
>>> Łukasz
>>>
>>> On 05/11/2020 16.17, Sweet Tea Dorminy wrote:
>>>
>>> Greetings Łukasz;
>>>
>>> I think this may be an instance of BZ 1821275
>>> <https://bugzilla.redhat.com/show_bug.cgi?id=1821275>, fixed in
>>> 6.1.3.23. Is it feasible to restart the machine (unfortunately there's no
>>> other way to stop a presumably hung attempt to start VDO), upgrade to at
>>> least that version, and try again?
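>>>
>>> Roughly, the sequence would be (a sketch; this assumes the stock vdo
>>> systemd unit and that 6.1.3.23 or later is available from your repos):
>>>
>>> # keep VDO from auto-starting into the same hang on the next boot
>>> systemctl disable vdo
>>> reboot
>>> # update the kernel and VDO packages as a matched set
>>> yum update kernel kmod-kvdo vdo
>>> reboot
>>> systemctl enable vdo
>>> vdo start --all --confFile /etc/vdoconf.yml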
>>>
>>> Thanks!
>>>
>>> Sweet Tea Dorminy
>>>
>>>
>>> On Thu, Nov 5, 2020 at 9:54 AM Łukasz Michalski <lm at zork.pl> wrote:
>>>
>>>> Details below.
>>>>
>>>> Now I see that I was looking at the wrong block device. My VDO is on
>>>> /dev/sda, and atop shows no activity for it.
>>>>
>>>> Thanks,
>>>> Łukasz
>>>>
>>>> On 05/11/2020 15.26, Andrew Walsh wrote:
>>>>
>>>> Hi Lukasz,
>>>>
>>>> Can you please confirm a few details?  These will help us understand
>>>> what may be going on.  We may end up needing additional information, but
>>>> this will help us identify a starting point for the investigation.
>>>>
>>>> **Storage Stack Configuration:**
>>>> High Level Configuration: [e.g. SSD -> MD RAID 5 -> VDO -> XFS]
>>>>
>>>> Two servers, on each:
>>>> Hardware RAID6, 54TB -> LVM -> VDO -> GlusterFS (XFS for bricks) ->
>>>> Samba shares.
>>>> Currently Samba and Gluster are disabled.
>>>>
>>>> Output of `blockdev --report`:
>>>>
>>>> [root@ixmed1 /]# blockdev --report
>>>>
>>>> RO    RA   SSZ   BSZ   StartSec            Size   Device
>>>> rw   256   512  4096          0  59999990579200   /dev/sda
>>>> rw   256   512  4096          0    238999830528   /dev/sdb
>>>> rw   256   512  4096       2048      1073741824   /dev/sdb1
>>>> rw   256   512  4096    2099200    216446009344   /dev/sdb2
>>>> rw   256   512  4096  424845312     21479030784   /dev/sdb3
>>>> rw   256   512  4096          0    119810293760   /dev/dm-0
>>>> rw   256   512  4096          0     21470642176   /dev/dm-1
>>>> rw   256   512  4096          0     32212254720   /dev/dm-2
>>>> rw   256   512  4096          0     42949672960   /dev/dm-3
>>>> rw   256   512  4096          0     21474836480   /dev/dm-4
>>>> rw   256   512  4096          0  21990232555520   /dev/dm-5
>>>> rw   256   512  4096          0     21474144256   /dev/drbd999
>>>>
>>>> Output of `lsblk -o name,maj:min,kname,type,fstype,state,sched,uuid`:
>>>>
>>>> [root@ixmed1 /]# lsblk -o name,maj:min,kname,type,fstype,state,sched,uuid
>>>> lsblk: dm-6: failed to get device path
>>>> lsblk: dm-6: failed to get device path
>>>> NAME                           MAJ:MIN KNAME   TYPE FSTYPE   STATE SCHED    UUID
>>>> sda                              8:0   sda     disk LVM2_mem runni deadline ggCzji-1O8d-BWCa-XwLe-BJ94-fwHa-cOseC0
>>>> └─vgStorage-LV_vdo_Rada--ixmed 253:5   dm-5    lvm  vdo      runni          b668b2d9-96bf-4840-a43d-6b7ab0a7f235
>>>> sdb                              8:16  sdb     disk          runni deadline
>>>> ├─sdb1                           8:17  sdb1    part xfs            deadline f89ef6d8-d9f4-4061-8f48-3ffae8e23b1e
>>>> ├─sdb2                           8:18  sdb2    part LVM2_mem       deadline pHO0UQ-aGWu-Hg6g-siiq-TGPT-kw4B-gD0fgs
>>>> │ ├─vgSys-root                 253:0   dm-0    lvm  xfs      runni          4f48e2c7-6324-4465-953a-c1a9512ab782
>>>> │ ├─vgSys-swap                 253:1   dm-1    lvm  swap     runni          97234c91-7804-43b2-944f-0122c90fc962
>>>> │ ├─vgSys-cluster              253:2   dm-2    lvm  xfs      runni          97b4c285-4bfe-4d4f-8c3c-ca716157bf52
>>>> │ └─vgSys-var                  253:3   dm-3    lvm  xfs      runni          6f5c860b-88e0-4d28-bc09-2e365299f86e
>>>> └─sdb3                           8:19  sdb3    part LVM2_mem       deadline nvBfNi-qm2u-bt5T-dyCL-3FgQ-DSic-z8dUDq
>>>>   └─vgSys-pgsql                253:4   dm-4    lvm  xfs      runni          5c3e18cc-9e0f-4c81-906b-3e68f196cafe
>>>>     └─drbd999                  147:999 drbd999 disk xfs            5c3e18cc-9e0f-4c81-906b-3e68f196cafe
>>>>
>>>>
>>>> **Hardware Information:**
>>>>  - CPU: [e.g. 2x Intel Xeon E5-1650 v2 @ 3.5GHz]
>>>>  - Memory: [e.g. 128G]
>>>>  - Storage: [e.g. Intel Optane SSD 900P]
>>>>  - Other: [e.g. iSCSI backed storage]
>>>>
>>>> Huawei 5288 V5
>>>> 64GB RAM
>>>> 2 X Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz
>>>> RAID: Symbios Logic MegaRAID SAS-3 3008 [Fury] (rev 02) (from lspci,
>>>> megaraid_sas driver)
>>>>
>>>>
>>>> **Distro Information:**
>>>>  - OS: [e.g. RHEL-7.5]
>>>>
>>>> CentOS Linux release 7.5.1804 (Core)
>>>>
>>>>  - Architecture: [e.g. x86_64]
>>>>
>>>> x86_64
>>>>
>>>>  - Kernel: [e.g. kernel-3.10.0-862.el7]
>>>>
>>>> 3.10.0-862.el7
>>>>
>>>>  - VDO Version: [e.g. vdo-6.2.0.168-18.el7, or a commit hash]
>>>>  - KVDO Version: [e.g. kmod-kvdo6.2.0.153-15.el7, or a commit hash]
>>>>
>>>> [root@ixmed1 /]# yum list | grep vdo
>>>> kmod-kvdo.x86_64                          6.1.0.168-16.el7_5   @updates
>>>> vdo.x86_64                                6.1.0.168-18         @updates
>>>>
>>>>  - LVM Version: [e.g. 2.02.177-4.el7]
>>>>
>>>> 2.02.177(2)-RHEL7 (2018-01-22)
>>>>
>>>>  - Output of `uname -a`: [e.g. Linux localhost.localdomain
>>>> 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64
>>>> x86_64 GNU/Linux]
>>>>
>>>> Linux ixmed1 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018
>>>> x86_64 x86_64 x86_64 GNU/Linux
>>>>
>>>>
>>>> On Thu, Nov 5, 2020 at 6:49 AM Łukasz Michalski <lm at zork.pl> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have two 20TB VDO devices that crashed during a power outage, one
>>>>> on each of two servers.
>>>>>
>>>>> After restarting the servers, I see this in the logs on the first
>>>>> server:
>>>>>
>>>>> [root@ixmed1 /]# dmesg | grep vdo
>>>>> [   11.223770] kvdo: modprobe: loaded version 6.1.0.168
>>>>> [   11.904949] kvdo0:dmsetup: starting device 'vdo_test' device
>>>>> instantiation 0 write policy auto
>>>>> [   11.904979] kvdo0:dmsetup: underlying device, REQ_FLUSH: not
>>>>> supported, REQ_FUA: not supported
>>>>> [   11.904985] kvdo0:dmsetup: Using mode sync automatically.
>>>>> [   11.905017] kvdo0:dmsetup: zones: 1 logical, 1 physical, 1 hash;
>>>>> base threads: 5
>>>>> [   11.966414] kvdo0:journalQ: Device was dirty, rebuilding reference
>>>>> counts
>>>>> [   12.452589] kvdo0:logQ0: Finished reading recovery journal
>>>>> [   12.458550] kvdo0:logQ0: Highest-numbered recovery journal block
>>>>> has sequence number 70548140, and the highest-numbered usable block is
>>>>> 70548140
>>>>> [   12.458556] kvdo0:logQ0: Replaying entries into slab journals
>>>>> [   13.538099] kvdo0:logQ0: Replayed 5568767 journal entries into slab
>>>>> journals
>>>>> [   14.174984] kvdo0:logQ0: Recreating missing journal entries
>>>>> [   14.175025] kvdo0:journalQ: Synthesized 0 missing journal entries
>>>>> [   14.177768] kvdo0:journalQ: Saving recovery progress
>>>>> [   14.636416] kvdo0:logQ0: Replaying 2528946 recovery entries into
>>>>> block map
>>>>>
>>>>> [root@ixmed1 /]# uptime
>>>>>  12:41:33 up 1 day,  4:07,  2 users,  load average: 1.06, 1.05, 1.16
>>>>>
>>>>> [root@ixmed1 /]# ps ax | grep vdo
>>>>>   1135 ?        Ss     0:00 /usr/bin/python /usr/bin/vdo start --all
>>>>> --confFile /etc/vdoconf.yml
>>>>>   1210 ?        R    21114668:39 dmsetup create vdo_Rada-ixmed --uuid VDO-b668b2d9-96bf-4840-a43d-6b7ab0a7f235 --table 0 72301908952 vdo /dev/disk/by-id/dm-name-vgStorage-LV_test 4096 disabled 0 32768 16380 on auto vdo_test ack=1,bio=4,bioRotationInterval=64,cpu=2,hash=1,logical=1,physical=1
>>>>>   1213 ?        S      1:51 [kvdo0:dedupeQ]
>>>>>   1214 ?        S      1:51 [kvdo0:journalQ]
>>>>>   1215 ?        S      1:51 [kvdo0:packerQ]
>>>>>   1216 ?        S      1:51 [kvdo0:logQ0]
>>>>>   1217 ?        S      1:51 [kvdo0:physQ0]
>>>>>   1218 ?        S      1:50 [kvdo0:hashQ0]
>>>>>   1219 ?        S      1:52 [kvdo0:bioQ0]
>>>>>   1220 ?        S      1:51 [kvdo0:bioQ1]
>>>>>   1221 ?        S      1:51 [kvdo0:bioQ2]
>>>>>   1222 ?        S      1:51 [kvdo0:bioQ3]
>>>>>   1223 ?        S      1:48 [kvdo0:ackQ]
>>>>>   1224 ?        S      1:49 [kvdo0:cpuQ0]
>>>>>   1225 ?        S      1:49 [kvdo0:cpuQ1]
>>>>>
>>>>> The only activity I see is small writes, shown in 'atop', to the
>>>>> device underlying the VDO.
>>>>>
>>>>> On the first server, dmsetup is using 100% of one CPU core; on the
>>>>> second server, dmsetup appears to be idle.
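>>>>>
>>>>> If it helps, I can capture where that process is spinning (a guess at
>>>>> a useful next step; 1210 is the dmsetup PID from the ps output above):
>>>>>
>>>>> [root@ixmed1 /]# cat /proc/1210/stack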
>>>>>
>>>>> What should I do in this situation?
>>>>>
>>>>> Regards,
>>>>> Łukasz
>>>>>
>>>>>
>>>>>
>>>
>>>
>>
>