[linux-lvm] lvm metadata sequence number reverts

Aaron Young aaron.young at ctl.io
Thu Sep 17 15:39:09 UTC 2015


Actually, I made a mistake and failed to drop the system cache between dds
when generating the comparison. There is one difference, in sector 2056 of
the device! This must be the key.
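
For anyone repeating the comparison, here is a minimal sketch of how the differing sectors could be listed from the two dumps. It assumes 512-byte sectors. Because the dumps were taken with bs=1M skip=1, the first byte of each dump is device byte 1048576, i.e. sector 2048. The function name and the base_sector parameter are made up for illustration.

```python
SECTOR_SIZE = 512  # assumed logical sector size


def differing_sectors(path_a, path_b, base_sector=0, sector_size=SECTOR_SIZE):
    """Return the device sector numbers at which two dump files differ.

    base_sector is the device sector that corresponds to offset 0 of the
    dumps (2048 for a dump taken with bs=1M skip=1 and 512-byte sectors).
    """
    diffs = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        index = 0
        while True:
            a = fa.read(sector_size)
            b = fb.read(sector_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                diffs.append(base_sector + index)
            index += 1
    return diffs


if __name__ == "__main__":
    # File names as used in the dd commands quoted below.
    print(differing_sectors("sbd13.nocache", "sbd13.cache", base_sector=2048))
```

If the partition sbd13p1 starts at sector 2048 (an assumption, the partition table is not shown here), device sector 2056 is byte 4096 inside the partition, which is the start of the "area at 4096" that lvscan reports below.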



On Wed, 16 Sep 2015 at 16:31 Aaron Young <aaron.young at ctl.io> wrote:

> Yes, I have lots of data to share, I thought first to open at high level.
> This is all happening inside a single VM. Archive is available, I will post
> them shortly. No lvmetad. No errors that I can tell (at least not on
> console or syslog).
>
> root at VA1CTLT-SRN2-03:/etc/lvm/archive# grep seqno test_dvol-13-vg_00*
> test_dvol-13-vg_00261-1410850844.vg: seqno = 0  <---- before vgcreate
> test_dvol-13-vg_00262-1188507802.vg: seqno = 1   <-- before lvcreate 1
> test_dvol-13-vg_00263-1818746321.vg: seqno = 2   <---- before lvcreate 2
> test_dvol-13-vg_00264-1122545952.vg: seqno = 3   <--- before lvcreate 3
> test_dvol-13-vg_00265-1497145254.vg: seqno = 4  <---- before lvcreate 4
> test_dvol-13-vg_00266-1300493675.vg: seqno = 5  <--- before lvs
> test_dvol-13-vg_00267-490193445.vg: seqno = 4   <----- disabled device cache, lvs
> test_dvol-13-vg_00268-2051497792.vg: seqno = 4  <----- disabled device cache, lvs
> test_dvol-13-vg_00269-370016695.vg: seqno = 5   <---- enabled device cache, lvs
>
> The contents of the metadata area seem to be the same (both contain seqno
> 5):
>
> dd if=/dev/sbd13 bs=1M count=1 skip=1 of=sbd13.nocache
> dd if=/dev/sbd13 bs=1M count=1 skip=1 of=sbd13.cache
>
> cmp sbd13.nocache sbd13.cache
>
> I tracked down these sectors by running strace on
> pvcreate/vgcreate/lvcreate. As far as I can tell, all the sectors involved
> are being written correctly.
>
> Random facts:
> 1. Devicemapper still correctly lists the logical volume that is missing
> from lvs
> 2. 3.13.0-44-generic, Ubuntu 14.04
> 3. LVM version: 2.02.98(2) (2012-10-15) Library version: 1.02.77
> (2012-10-15) Driver version: 4.27.0
>
> Random suspicious snippet generated by lvscan -vvv
>
> /dev/mapper/sbd13p1: lvm2 label detected at sector 1
> lvmcache: /dev/mapper/sbd13p1: now in VG #orphans_lvm2 (#orphans_lvm2) with 1 mdas
> /dev/mapper/sbd13p1: Found metadata at 8704 size 1749 (in area at 4096 size 1044480) for test_dvol-13-vg (DFvQDG-nYVS-QQlT-Uv35-aPr4-2pY0-zMQ0dr)
> lvmcache: /dev/mapper/sbd13p1: now in VG test_dvol-13-vg with 1 mdas
> lvmcache: /dev/mapper/sbd13p1: setting test_dvol-13-vg VGID to DFvQDGnYVSQQlTUv35aPr42pY0zMQ0dr
> lvmcache: /dev/mapper/sbd13p1: VG test_dvol-13-vg: Set creation host to VA1CTLT-SRN2-03.
> Allocated VG test_dvol-13-vg at 0x257bc00.
> Using cached label for /dev/mapper/sbd13p1
> Read test_dvol-13-vg metadata (4) from /dev/mapper/sbd13p1 at 8704 size 1749
> /dev/mapper/sbd13p1 0: 0 19: VM-test_dvol-13-0-hard-drive-0(0:0)
> /dev/mapper/sbd13p1 1: 19 19: VM-test_dvol-13-0-hard-drive-1(0:0)
> /dev/mapper/sbd13p1 2: 38 19: VM-test_dvol-13-1-hard-drive-0(0:0)
> /dev/mapper/sbd13p1 3: 57 42: NULL(0:0) *<----missing logical volume*
>
> I don't understand how this is possible if that sector (8704) is identical
> in both cases.
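
One way to check this from the dumps is sketched below, with the caveats stated. LVM2 metadata is stored as text, and the archive files above show lines of the form "seqno = N", so a raw dump can be scanned for every such line together with its byte offset. As far as I understand, the metadata area behaves like a ring buffer, so older copies can stay on disk next to the newest one, and a header at the start of the area records which copy is current. The 8704 in the log appears to be a byte offset into the device rather than a sector number. The function names are made up for illustration.

```python
import re

# Matches the "seqno = N" lines seen in the /etc/lvm/archive files.
SEQNO_RE = re.compile(rb"seqno\s*=\s*(\d+)")


def find_seqnos(raw):
    """Return (byte_offset, seqno) for every 'seqno = N' found in raw bytes."""
    return [(m.start(), int(m.group(1))) for m in SEQNO_RE.finditer(raw)]


def find_seqnos_in_file(path):
    """Scan a dump file (for example one produced by dd) for seqno lines."""
    with open(path, "rb") as f:
        return find_seqnos(f.read())


if __name__ == "__main__":
    # File name as used in the dd commands above.
    for offset, seqno in find_seqnos_in_file("sbd13.nocache"):
        print(f"seqno {seqno} at dump offset {offset}")
```

If both seqno 4 and seqno 5 show up in the same dump, the text copies alone would not say which one LVM reads. That would depend on what the header of the metadata area points to.
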
>
> Attached are two verbose straces of vgdisplay, one of which discovers 3
> logical volumes and one of which discovers 4.
> I am looking for insight into the disk contents that are necessary for
> this discovery. Thank you very much.
>
> Aaron
>
>
>
> On Wed, 16 Sep 2015 at 03:05 Zdenek Kabelac <zkabelac at redhat.com> wrote:
>
>> Dne 15.9.2015 v 23:18 Aaron Young napsal(a):
>> > Hello, I'm deep into debugging an issue we have with a disk driver of
>> > ours and LVM.
>> >
>> > Long story short:
>> >
>> > create vg -> seqno 1
>> > create lv1 -> seqno 2
>> > create lv2 -> seqno 3
>> > create lv3 -> seqno 4
>> > create lv4 -> seqno 5
>> > <clear our device cache> (note, this generates no IO)
>> > vgdisplay: seqno = 4, lv4 is missing
>> >
>> > * This happens only after dozens to hundreds of iterations. Most of the
>> > time it is fine.
>> >
>> > I dd all the metadata blocks off of the pv, yep, seqno5 is on disk
>> > metadata area perfectly fine. But the system believes 4 is the current
>> > version. Shouldn't the system be using the highest value? Or is it stored
>> > somewhere? What mechanism is responsible for changing the seqno? And where
>> > does it change it? (Not the metadata contents, just the number)
>>
>>
>> Hi
>>
>> Your email is quite 'mystic' - I'd need lots of crystal balls to see your
>> surrounding conditions.
>>
>>
>> 1.) Is this a 'clustered' environment or a 'single' host setup?
>>
>> 2.) Do you have 'archive' backup enabled? Can you check what the last
>> operations in the history were before the problem happened?
>>
>> 3.) Are you using 'lvmetad'? (If so, try use_lvmetad=0.)
>>
>> 4.) Kernel version, lvm2 version?
>>
>> 5.) Was there any lvm2 command error?
>> (vgdisplay may just make a backup of the most recent metadata in case it
>> is missing after some command failure.)
>>
>> Zdenek
>>
>>