[linux-lvm] lv raid - how to read this?

Zdenek Kabelac zkabelac at redhat.com
Fri Sep 8 09:54:03 UTC 2017


On 8.9.2017 at 11:39, lejeczek wrote:
> 
> 
> On 08/09/17 10:34, Zdenek Kabelac wrote:
>> On 8.9.2017 at 11:22, lejeczek wrote:
>>>
>>>
>>> On 08/09/17 09:49, Zdenek Kabelac wrote:
>>>> On 7.9.2017 at 15:12, lejeczek wrote:
>>>>>
>>>>>
>>>>> On 07/09/17 10:16, Zdenek Kabelac wrote:
>>>>>> On 7.9.2017 at 10:06, lejeczek wrote:
>>>>>>> hi fellas
>>>>>>>
>>>>>>> I'm setting up an lvm raid0 across 4 devices. I want raid0 and I understand &
>>>>>>> expect there will be four stripes; all I care about is speed.
>>>>>>> I do:
>>>>>>> $ lvcreate --type raid0 -i 4 -I 16 -n 0 -l 96%pv intel.raid0-0 
>>>>>>> /dev/sd{c..f} # explicitly four stripes
>>>>>>>
>>>>>>> I see:
>>>>>>> $ mkfs.xfs /dev/mapper/intel.sataA-0 -f
>>>>>>> meta-data=/dev/mapper/intel.sataA-0 isize=512 agcount=32, 
>>>>>>> agsize=30447488 blks
>>>>>>>           =                       sectsz=512 attr=2, projid32bit=1
>>>>>>>           =                       crc=1 finobt=0, sparse=0
>>>>>>> data     =                       bsize=4096 blocks=974319616, imaxpct=5
>>>>>>>           =                       sunit=4 swidth=131076 blks
>>>>>>> naming   =version 2              bsize=4096 ascii-ci=0 ftype=1
>>>>>>> log      =internal log           bsize=4096 blocks=475744, version=2
>>>>>>>           =                       sectsz=512 sunit=4 blks, lazy-count=1
>>>>>>> realtime =none                   extsz=4096 blocks=0, rtextents=0
>>>>>>>
>>>>>>> What puzzles me is xfs's:
>>>>>>>   sunit=4      swidth=131076 blks
>>>>>>> and I think - what the hexx?
>>>>>>
>>>>>>
>>>>>> Unfortunately, 'swidth' in XFS has a different meaning than lvm2's
>>>>>> stripe size parameter.
>>>>>>
>>>>>> In lvm2 -
>>>>>>
>>>>>>
>>>>>> -i | --stripes       - how many disks
>>>>>> -I | --stripesize    - how much data is written before moving to the next disk.
>>>>>>
>>>>>> So  -i 4  & -I 16 gives a  64KiB  total stripe width
>>>>>>
>>>>>> ----
>>>>>>
>>>>>> XFS meaning:
>>>>>>
>>>>>> sunit  = <RAID controller's stripe size in BYTES (or KiB when used
>>>>>> with k)>
>>>>>> swidth = <# of data disks (don't count parity disks)>
>>>>>>
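As a minimal sketch of the same point, mkfs.xfs can also be given this geometry explicitly via its -d su=/sw= options: su takes the stripe unit in bytes (or KiB with a k suffix) and sw takes the number of data stripes. Assuming the -i4 -I16 layout and the /dev/vg/r0 device from the example that follows:

# mkfs.xfs -f -d su=16k,sw=4 /dev/vg/r0     # 16KiB stripe unit, 4 data stripes

This is redundant when autodetection reports sane values, but it pins the geometry regardless of what the device advertises.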
>>>>>> ----
>>>>>>
>>>>>> ---- so real-world example ----
>>>>>>
>>>>>> # lvcreate --type striped -i4 -I16 -L1G -n r0 vg
>>>>>>
>>>>>> or
>>>>>>
>>>>>> # lvcreate --type raid0  -i4 -I16 -L1G -n r0 vg
>>>>>>
>>>>>> # mkfs.xfs  /dev/vg/r0 -f
>>>>>> meta-data=/dev/vg/r0             isize=512 agcount=8, agsize=32764 blks
>>>>>>          =                       sectsz=512   attr=2, projid32bit=1
>>>>>>          =                       crc=1 finobt=1, sparse=0, rmapbt=0, 
>>>>>> reflink=0
>>>>>> data     =                       bsize=4096 blocks=262112, imaxpct=25
>>>>>>          =                       sunit=4 swidth=16 blks
>>>>>> naming   =version 2              bsize=4096 ascii-ci=0 ftype=1
>>>>>> log      =internal log           bsize=4096 blocks=552, version=2
>>>>>>          =                       sectsz=512   sunit=4 blks, lazy-count=1
>>>>>> realtime =none                   extsz=4096 blocks=0, rtextents=0
>>>>>>
>>>>>>
>>>>>> ---- and we have ----
>>>>>>
>>>>>> sunit=4         ...  4 * 4096 = 16KiB        (matching lvm2 -I16 here)
>>>>>> swidth=16 blks  ... 16 * 4096 = 64KiB
>>>>>>    so total width 64KiB / single strip size (sunit) 16KiB  ->  4 disks
>>>>>>    (matching the lvm2 -i4 option here)
>>>>>>
>>>>>> Yep complex, don't ask... ;)
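Spelled out as shell arithmetic (a small sketch; the 4096-byte block size and the sunit/swidth values are the ones reported by mkfs.xfs above):

$ echo $((4 * 4096))                  # sunit:  4 blks * 4096 B = 16384 B = 16KiB  (lvm2 -I16)
$ echo $((16 * 4096))                 # swidth: 16 blks * 4096 B = 65536 B = 64KiB full stripe
$ echo $(((16 * 4096) / (4 * 4096))) # swidth / sunit = 4 stripes              (lvm2 -i4)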
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> In an LVM non-raid stripe scenario I've always remembered it as: swidth =
>>>>>>> sunit * Y, where Y = number of stripes, right?
>>>>>>>
>>>>>>> I'm hoping some expert could shed some light and help me (and maybe others too)
>>>>>>> understand what LVM is doing there? I'd appreciate it.
>>>>>>> many thanks, L.
>>>>>>
>>>>>>
>>>>>> Well, in the first place there is a major discrepancy in the naming:
>>>>>>
>>>>>> You use the VG name intel.raid0-0
>>>>>> and then you mkfs the device /dev/mapper/intel.sataA-0  ??
>>>>>>
>>>>>> While you should be accessing: /dev/intel.raid0/0
>>>>>>
>>>>>> Are you sure you are not trying to overwrite some unrelated device here?
>>>>>>
>>>>>> (As the numbers you show look unrelated, or you have a buggy kernel or
>>>>>> blkid...)
>>>>>>
>>>>>
>>>>> hi,
>>>>> I renamed the VG in the meantime,
>>>>> I get the xfs intricacy..
>>>>> so.. the question still stands..
>>>>> why does xfs format not do what I remember it always did in the past (on lvm
>>>>> non-raid but striped), like in your example
>>>>>
>>>>>           =                       sunit=4 swidth=16 blks
>>>>> but I see instead:
>>>>>
>>>>>           =                       sunit=4 swidth=4294786316 blks
>>>>>
>>>>> a whole lot:
>>>>>
>>>>> $ xfs_info /__.aLocalStorages/0
>>>>> meta-data=/dev/mapper/intel.raid0--0-0 isize=512 agcount=32, 
>>>>> agsize=30768000 blks
>>>>>           =                       sectsz=512   attr=2, projid32bit=1
>>>>>           =                       crc=1        finobt=0 spinodes=0
>>>>> data     =                       bsize=4096 blocks=984576000, imaxpct=5
>>>>>           =                       sunit=4 swidth=4294786316 blks
>>>>> naming   =version 2              bsize=4096 ascii-ci=0 ftype=1
>>>>> log      =internal               bsize=4096 blocks=480752, version=2
>>>>>           =                       sectsz=512   sunit=4 blks, lazy-count=1
>>>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>>>>
>>>>> $ lvs -a -o +segtype,stripe_size,stripes,devices intel.raid0-0
>>>>>    LV           VG            Attr       LSize   Pool Origin Data% Meta%  
>>>>> Move Log Cpy%Sync Convert Type Stripe #Str Devices
>>>>>    0            intel.raid0-0 rwi-aor--- 3.67t raid0 16.00k    4 
>>>>> 0_rimage_0(0),0_rimage_1(0),0_rimage_2(0),0_rimage_3(0)
>>>>>    [0_rimage_0] intel.raid0-0 iwi-aor--- 938.96g linear 0     1 /dev/sdc(0)
>>>>>    [0_rimage_1] intel.raid0-0 iwi-aor--- 938.96g linear 0     1 /dev/sdd(0)
>>>>>    [0_rimage_2] intel.raid0-0 iwi-aor--- 938.96g linear 0     1 /dev/sde(0)
>>>>>    [0_rimage_3] intel.raid0-0 iwi-aor--- 938.96g linear 0     1 /dev/sdf(0)
>>>>>
>>>>
>>>>
>>>> Hi
>>>>
>>>> I've even checked a 128TiB sized device, created with -i4 -I16, with mkfs.xfs
>>>>
>>>> # lvs -a vg
>>>>
>>>>   LV             VG             Attr       LSize   Pool Origin Data%  
>>>> Meta% Move Log Cpy%Sync Convert
>>>>   LV1            vg rwi-a-r--- 128.00t
>>>>   [LV1_rimage_0] vg iwi-aor---  32.00t
>>>>   [LV1_rimage_1] vg iwi-aor---  32.00t
>>>>   [LV1_rimage_2] vg iwi-aor---  32.00t
>>>>   [LV1_rimage_3] vg iwi-aor---  32.00t
>>>>
>>>> # mkfs.xfs -f /dev/vg/LV1
>>>> meta-data=/dev/vg/LV1 isize=512  agcount=128, agsize=268435452 blks
>>>>          =                       sectsz=512   attr=2, projid32bit=1
>>>>          =                       crc=1        finobt=1, sparse=0, 
>>>> rmapbt=0, reflink=0
>>>> data     =                       bsize=4096 blocks=34359737856, imaxpct=1
>>>>          =                       sunit=4      swidth=16 blks
>>>> naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
>>>> log      =internal log           bsize=4096 blocks=521728, version=2
>>>>          =                       sectsz=512   sunit=4 blks, lazy-count=1
>>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>>>
>>>>
>>>>
>>>> and all seems to be working just about right.
>>>> From your 'swidth' number it looks like some 32-bit overflow?
>>>>
>>>> So aren't you using some ancient kernel/lvm2 version?
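One hedged way to check where the bogus number comes from (these commands are not in the thread, and the dm-* wildcard is only illustrative): mkfs.xfs derives its default sunit/swidth from the minimum/optimal I/O sizes the block device advertises, so reading them back for the LV mentioned earlier should show whether the kernel itself reports a broken value. For reference, 4294786316 sits just below 2^32 = 4294967296, which fits the 32-bit overflow suspicion.

$ lsblk -o NAME,MIN-IO,OPT-IO /dev/intel.raid0/0   # expected roughly 16K / 64K for a 4 x 16KiB stripe set
$ cat /sys/block/dm-*/queue/minimum_io_size        # values the kernel advertises, in bytes
$ cat /sys/block/dm-*/queue/optimal_io_size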
>>>>
>>>
>>> hi guys, not ancient - quite the contrary, I'd like to think.
>>>
>>> $ lvm version
>>>    LVM version:     2.02.166(2)-RHEL7 (2016-11-16)
>>>    Library version: 1.02.135-RHEL7 (2016-11-16)
>>>    Driver version:  4.34.0
>>>
>>> but perhaps a bug; if so, then a heads-up for kernel-lt, which I got from elrepo:
>>>
>>> $ rpm -qa kernel-lt
>>> kernel-lt-4.4.81-1.el7.elrepo.x86_64
>>> kernel-lt-4.4.83-1.el7.elrepo.x86_64
>>> kernel-lt-4.4.82-1.el7.elrepo.x86_64
>>> kernel-lt-4.4.84-1.el7.elrepo.x86_64
>>>
>>> everything else is centos 7.3
>>>
>>
>> Hi
>>
>> I assume you can retry with the original CentOS kernel then?
>> Alternatively, try the latest/greatest upstream (4.13).
>>
> 
> I can try, but I'll still have to stick to those kernel versions.
> For you guys it should be worth investigating, as this is a long-term support
> kernel, no?
> 


The investigation was done a long time ago - and the resolution was to NOT use 4.4 with
md-raid, sorry...

And yes - we provide support - but simply for different kernels...
We can't be fixing every possible combination of Linux kernel in the universe - so
my best advice: simply start using a fixed kernel version.

Regards

Zdenek



