[vdo-devel] dmsetup stuck for more than one day
Sweet Tea Dorminy
sweettea at redhat.com
Fri Nov 6 11:47:42 UTC 2020
Awesome, glad to hear it worked out!
On Fri, Nov 6, 2020 at 6:24 AM Łukasz Michalski <lm at zork.pl> wrote:
> Ok, no problem. I built 6.1.3.23 for CentOS 7.5, updated kmod-kvdo and vdo,
> and everything seems to be working now.
> I only had to install vdo with --nodeps and link python2.7 to
> /usr/libexec/platform-python, because CentOS 7.5 does not have the
> platform-python concept.
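The workaround described above can be sketched as follows. The rpm filename is an assumption, and on a real CentOS 7.5 box the install and symlink would run as root against / ; here the symlink step is demonstrated in a scratch prefix so the sketch is safe to execute.

```shell
# Sketch of the CentOS 7.5 workaround (assumed rpm filename). On the real box:
#   rpm -Uvh --nodeps vdo-6.1.3.23-*.el7.x86_64.rpm
#   ln -s /usr/bin/python2.7 /usr/libexec/platform-python
# Below, the symlink is created under a temporary prefix instead of /.
prefix=$(mktemp -d)
mkdir -p "$prefix/usr/libexec" "$prefix/usr/bin"
touch "$prefix/usr/bin/python2.7"   # stand-in for the real interpreter
ln -s "$prefix/usr/bin/python2.7" "$prefix/usr/libexec/platform-python"
readlink "$prefix/usr/libexec/platform-python"
```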
>
> Many thanks for your support,
> Łukasz
>
> On 05/11/2020 18.55, Andrew Walsh wrote:
>
> Hi Lukasz,
>
> Version 6.1.3.7 is the latest available as of RHEL-7.8, and 6.1.3.23 is
> the latest available as of RHEL-7.9. Perhaps the CentOS repos haven't been
> updated to include RHEL-7.9 content just yet.
>
> Unfortunately the fix for the issue you encountered isn't available in
> 6.1.3.7 as it was actually fixed in 6.1.3.23.
>
> Andy Walsh
>
>
> On Thu, Nov 5, 2020 at 11:57 AM Łukasz Michalski <lm at zork.pl> wrote:
>
>> Hmmm looking at http://mirror.centos.org/centos/7/os/x86_64/Packages/ I
>> see kmod-kvdo-6.1.3.7-5.el7.x86_64.rp
>>
>> Is 6.1.3.23 available somewhere?
>>
>>
>> On 05/11/2020 17.50, Sweet Tea Dorminy wrote:
>>
>> No, I believe you'd need to update the kernel also to go along with the
>> updated kmod-kvdo.
>>
>> On Thu, Nov 5, 2020 at 10:21 AM Łukasz Michalski <lm at zork.pl> wrote:
>>
>>> Hi,
>>>
>>> Is it possible to upgrade only VDO and stick with CentOS 7.5.1804 for the
>>> rest of the packages?
>>>
>>> Regards,
>>> Łukasz
>>>
>>> On 05/11/2020 16.17, Sweet Tea Dorminy wrote:
>>>
>>> Greetings Łukasz;
>>>
>>> I think this may be an instance of BZ 1821275
>>> <https://bugzilla.redhat.com/show_bug.cgi?id=1821275>, fixed in
>>> 6.1.3.23. Is it feasible to restart the machine (unfortunately there's no
>>> other way to stop a presumably hung attempt to start VDO), upgrade to at
>>> least that version, and try again?
>>>
>>> Thanks!
>>>
>>> Sweet Tea Dorminy
>>>
>>>
>>> On Thu, Nov 5, 2020 at 9:54 AM Łukasz Michalski <lm at zork.pl> wrote:
>>>
>>>> Details below.
>>>>
>>>> Now I see that I was looking at the wrong block device. My VDO is on
>>>> /dev/sda, and atop shows no activity for it.
>>>>
>>>> Thanks,
>>>> Łukasz
>>>>
>>>> On 05/11/2020 15.26, Andrew Walsh wrote:
>>>>
>>>> Hi Lukasz,
>>>>
>>>> Can you please confirm a few details? These will help us understand
>>>> what may be going on. We may end up needing additional information, but
>>>> this will help us identify a starting point for the investigation.
>>>>
>>>> **Storage Stack Configuration:**
>>>> High Level Configuration: [e.g. SSD -> MD RAID 5 -> VDO -> XFS]
>>>>
>>>> Two servers, on each:
>>>> Hardware RAID6, 54 TB -> LVM -> VDO -> GlusterFS (XFS for bricks) ->
>>>> Samba shares.
>>>> Currently samba and gluster are disabled.
>>>>
>>>> Output of `blockdev --report`:
>>>>
>>>> [root at ixmed1 /]# blockdev --report
>>>>
>>>> RO RA SSZ BSZ StartSec Size Device
>>>> rw 256 512 4096 0 59999990579200 /dev/sda
>>>> rw 256 512 4096 0 238999830528 /dev/sdb
>>>> rw 256 512 4096 2048 1073741824 /dev/sdb1
>>>> rw 256 512 4096 2099200 216446009344 /dev/sdb2
>>>> rw 256 512 4096 424845312 21479030784 /dev/sdb3
>>>> rw 256 512 4096 0 119810293760 /dev/dm-0
>>>> rw 256 512 4096 0 21470642176 /dev/dm-1
>>>> rw 256 512 4096 0 32212254720 /dev/dm-2
>>>> rw 256 512 4096 0 42949672960 /dev/dm-3
>>>> rw 256 512 4096 0 21474836480 /dev/dm-4
>>>> rw 256 512 4096 0 21990232555520 /dev/dm-5
>>>> rw 256 512 4096 0 21474144256 /dev/drbd999
>>>>
>>>> Output of `lsblk -o name,maj:min,kname,type,fstype,state,sched,uuid`:
>>>>
>>>> [root at ixmed1 /]# lsblk -o name,maj:min,kname,type,fstype,state,sched,uuid
>>>> lsblk: dm-6: failed to get device path
>>>> lsblk: dm-6: failed to get device path
>>>> NAME MAJ:MIN KNAME TYPE FSTYPE STATE SCHED UUID
>>>> sda 8:0 sda disk LVM2_mem runni deadline ggCzji-1O8d-BWCa-XwLe-BJ94-fwHa-cOseC0
>>>> └─vgStorage-LV_vdo_Rada--ixmed 253:5 dm-5 lvm vdo runni b668b2d9-96bf-4840-a43d-6b7ab0a7f235
>>>> sdb 8:16 sdb disk runni deadline
>>>> ├─sdb1 8:17 sdb1 part xfs deadline f89ef6d8-d9f4-4061-8f48-3ffae8e23b1e
>>>> ├─sdb2 8:18 sdb2 part LVM2_mem deadline pHO0UQ-aGWu-Hg6g-siiq-TGPT-kw4B-gD0fgs
>>>> │ ├─vgSys-root 253:0 dm-0 lvm xfs runni 4f48e2c7-6324-4465-953a-c1a9512ab782
>>>> │ ├─vgSys-swap 253:1 dm-1 lvm swap runni 97234c91-7804-43b2-944f-0122c90fc962
>>>> │ ├─vgSys-cluster 253:2 dm-2 lvm xfs runni 97b4c285-4bfe-4d4f-8c3c-ca716157bf52
>>>> │ └─vgSys-var 253:3 dm-3 lvm xfs runni 6f5c860b-88e0-4d28-bc09-2e365299f86e
>>>> └─sdb3 8:19 sdb3 part LVM2_mem deadline nvBfNi-qm2u-bt5T-dyCL-3FgQ-DSic-z8dUDq
>>>>   └─vgSys-pgsql 253:4 dm-4 lvm xfs runni 5c3e18cc-9e0f-4c81-906b-3e68f196cafe
>>>>     └─drbd999 147:999 drbd999 disk xfs 5c3e18cc-9e0f-4c81-906b-3e68f196cafe
>>>>
>>>>
>>>> **Hardware Information:**
>>>> - CPU: [e.g. 2x Intel Xeon E5-1650 v2 @ 3.5GHz]
>>>> - Memory: [e.g. 128G]
>>>> - Storage: [e.g. Intel Optane SSD 900P]
>>>> - Other: [e.g. iSCSI backed storage]
>>>>
>>>> Huawei 5288 V5
>>>> 64GB RAM
>>>> 2 X Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz
>>>> RAID: Symbios Logic MegaRAID SAS-3 3008 [Fury] (rev 02) (from lspci,
>>>> megaraid_sas driver)
>>>>
>>>>
>>>> **Distro Information:**
>>>> - OS: [e.g. RHEL-7.5]
>>>>
>>>> CentOS Linux release 7.5.1804 (Core)
>>>>
>>>> - Architecture: [e.g. x86_64]
>>>>
>>>> x86_64
>>>>
>>>> - Kernel: [e.g. kernel-3.10.0-862.el7]
>>>>
>>>> 3.10.0-862.el7
>>>>
>>>> - VDO Version: [e.g. vdo-6.2.0.168-18.el7, or a commit hash]
>>>> - KVDO Version: [e.g. kmod-kvdo6.2.0.153-15.el7, or a commit hash]
>>>>
>>>> [root at ixmed1 /]# yum list |grep vdo
>>>> kmod-kvdo.x86_64 6.1.0.168-16.el7_5 @updates
>>>> vdo.x86_64 6.1.0.168-18 @updates
>>>>
>>>> - LVM Version: [e.g. 2.02.177-4.el7]
>>>>
>>>> 2.02.177(2)-RHEL7 (2018-01-22)
>>>>
>>>> - Output of `uname -a`: [e.g. Linux localhost.localdomain
>>>> 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64
>>>> x86_64 GNU/Linux]
>>>>
>>>> Linux ixmed1 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018
>>>> x86_64 x86_64 x86_64 GNU/Linux
>>>>
>>>>
>>>> On Thu, Nov 5, 2020 at 6:49 AM Łukasz Michalski <lm at zork.pl> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have two 20T VDO devices, one on each of two servers, that crashed
>>>>> during a power outage.
>>>>>
>>>>> After server restart I see in logs on the first server:
>>>>>
>>>>> [root at ixmed1 /]# dmesg |grep vdo
>>>>> [ 11.223770] kvdo: modprobe: loaded version 6.1.0.168
>>>>> [ 11.904949] kvdo0:dmsetup: starting device 'vdo_test' device
>>>>> instantiation 0 write policy auto
>>>>> [ 11.904979] kvdo0:dmsetup: underlying device, REQ_FLUSH: not
>>>>> supported, REQ_FUA: not supported
>>>>> [ 11.904985] kvdo0:dmsetup: Using mode sync automatically.
>>>>> [ 11.905017] kvdo0:dmsetup: zones: 1 logical, 1 physical, 1 hash;
>>>>> base threads: 5
>>>>> [ 11.966414] kvdo0:journalQ: Device was dirty, rebuilding reference
>>>>> counts
>>>>> [ 12.452589] kvdo0:logQ0: Finished reading recovery journal
>>>>> [ 12.458550] kvdo0:logQ0: Highest-numbered recovery journal block
>>>>> has sequence number 70548140, and the highest-numbered usable block is
>>>>> 70548140
>>>>> [ 12.458556] kvdo0:logQ0: Replaying entries into slab journals
>>>>> [ 13.538099] kvdo0:logQ0: Replayed 5568767 journal entries into slab
>>>>> journals
>>>>> [ 14.174984] kvdo0:logQ0: Recreating missing journal entries
>>>>> [ 14.175025] kvdo0:journalQ: Synthesized 0 missing journal entries
>>>>> [ 14.177768] kvdo0:journalQ: Saving recovery progress
>>>>> [ 14.636416] kvdo0:logQ0: Replaying 2528946 recovery entries into
>>>>> block map
>>>>>
>>>>> [root at ixmed1 /]# uptime
>>>>> 12:41:33 up 1 day, 4:07, 2 users, load average: 1.06, 1.05, 1.16
>>>>>
>>>>> [root at ixmed1 /]# ps ax |grep vdo
>>>>> 1135 ? Ss 0:00 /usr/bin/python /usr/bin/vdo start --all
>>>>> --confFile /etc/vdoconf.yml
>>>>> 1210 ? R 21114668:39 dmsetup create vdo_Rada-ixmed --uuid
>>>>> VDO-b668b2d9-96bf-4840-a43d-6b7ab0a7f235 --table 0 72301908952 vdo
>>>>> /dev/disk/by-id/dm-name-vgStorage-LV_test 4096 disabled 0 32768 16380 on
>>>>> auto vdo_test
>>>>> ack=1,bio=4,bioRotationInterval=64,cpu=2,hash=1,logical=1,physical=1
>>>>> 1213 ? S 1:51 [kvdo0:dedupeQ]
>>>>> 1214 ? S 1:51 [kvdo0:journalQ]
>>>>> 1215 ? S 1:51 [kvdo0:packerQ]
>>>>> 1216 ? S 1:51 [kvdo0:logQ0]
>>>>> 1217 ? S 1:51 [kvdo0:physQ0]
>>>>> 1218 ? S 1:50 [kvdo0:hashQ0]
>>>>> 1219 ? S 1:52 [kvdo0:bioQ0]
>>>>> 1220 ? S 1:51 [kvdo0:bioQ1]
>>>>> 1221 ? S 1:51 [kvdo0:bioQ2]
>>>>> 1222 ? S 1:51 [kvdo0:bioQ3]
>>>>> 1223 ? S 1:48 [kvdo0:ackQ]
>>>>> 1224 ? S 1:49 [kvdo0:cpuQ0]
>>>>> 1225 ? S 1:49 [kvdo0:cpuQ1]
>>>>>
>>>>> The only activity I see is small writes, shown in 'atop', to the VDO
>>>>> underlying device.
>>>>>
>>>>> On the first server dmsetup takes 100% CPU (one core); on the second
>>>>> server dmsetup seems to be idle.
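A quick way to see what a stuck `dmsetup` like PID 1210 is doing is to check its state and kernel stack under /proc. This is a sketch, not from the original thread; it uses the current shell's own PID as a safe stand-in for the hung process, and reading the kernel stack requires root.

```shell
# Sketch: inspect a busy or stuck process via /proc. On the affected box,
# pid would be 1210 (the hung dmsetup); here the shell's own PID stands in.
pid=$$
grep '^State:' "/proc/$pid/status"   # R = busy-looping on CPU, D = uninterruptible I/O wait
# The kernel stack shows where a D-state task is blocked (readable by root only):
cat "/proc/$pid/stack" 2>/dev/null || echo "(need root to read /proc/$pid/stack)"
```

A process pinned at 100% CPU in R state (as on the first server) suggests a busy loop, while an idle dmsetup (as on the second) is more likely blocked in D state waiting on I/O; the kernel stack distinguishes the two.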
>>>>>
>>>>> What should I do in this situation?
>>>>>
>>>>> Regards,
>>>>> Łukasz
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> vdo-devel mailing list
>>>>> vdo-devel at redhat.com
>>>>> https://www.redhat.com/mailman/listinfo/vdo-devel
>>>>>
>>>>
>>>>
>>>
>>>
>>
>