[linux-lvm] lvm2 cluster aware

su liu liusu8788 at gmail.com
Thu Aug 25 08:49:25 UTC 2016


Hi Digimer, thanks for your reply and the explanation.

My use case is:

There is one admin node and several compute nodes, and all of the nodes share
the same storage (e.g. an FC SAN).
I create a VG on the shared storage, and that VG can be seen on all of the
nodes.
The admin node is responsible for managing the logical volumes it creates,
while the compute nodes attach these LVs to VMs directly, not through the
admin node.
With the LVM driver of the OpenStack Cinder project, the LVM volumes are
attached to VMs through the Cinder node via iSCSI, so I want to make sure
whether I can attach LVM volumes to the VMs directly instead.

To achieve this goal, should I use the method you mentioned before (cluster
locking via DLM and clvmd)?
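
If so, I guess the setup on each node would look roughly like the following
(this is only my reading of the clvmd approach; the package names are the
CentOS/RHEL ones and "my_shared_vg" is just a placeholder, so please correct
me if I am wrong):

  # install the cluster stack and the clustered LVM pieces
  yum install corosync pacemaker fence-agents-all lvm2-cluster

  # switch LVM to cluster-wide locking (sets locking_type = 3 in /etc/lvm/lvm.conf)
  lvmconf --enable-cluster

  # with corosync/pacemaker (and stonith) running, start clvmd
  # and mark the shared VG as clustered
  systemctl start clvmd
  vgchange --clustered y my_shared_vg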

Thanks very much!


2016-08-25 15:46 GMT+08:00 <linux-lvm-request at redhat.com>:

>
> Today's Topics:
>
>    1. sata cable disconnect + hotplug after (Xen)
>    2. Re: lvm2 raid volumes (Xen)
>    3. Re: Snapshots & data security (Zdenek Kabelac)
>    4. lvm2 cluster aware (su liu)
>    5. Re: lvm2 cluster aware (Digimer)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 09 Aug 2016 11:58:35 +0200
> From: Xen <list at xenhideout.nl>
> To: Linux lvm <linux-lvm at redhat.com>
> Subject: [linux-lvm] sata cable disconnect + hotplug after
> Message-ID: <16a575c3fb60f2579ab1687540cea01e at dds.nl>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
>
> What am I supposed to do when a sata cable disconnects and reconnects as
> another device?
>
> I had a disk at probably /dev/sda.
>
> At a certain point the filesystem had become read-only. I realized the
> cable must have disconnected and after fixing it the same device was now
> at /dev/sdf.
>
> Now the device had gone missing from the system, but LVM would not find it
> again.
>
> # pvscan
>    WARNING: Device for PV fEGbBn-tbIp-rL7y-m22b-1rQh-r9i5-Qwlqz7 not
> found or rejected by a filter.
>    PV unknown device                VG xenpc1          lvm2 [600,00 GiB /
> 158,51 GiB free]
>
> pvck clearly found it and lvmdiskscan also found it. Nothing happened
> until I did pvscan --cache /dev/sdf:
>
> # pvscan --cache /dev/sdf
> # vgscan
>    Reading all physical volumes.  This may take a while...
>    Duplicate of PV fEGbBn-tbIp-rL7y-m22b-1rQh-r9i5-Qwlqz7 dev /dev/sdf
> exists on unknown device 8:0
>    Found volume group "xenpc1" using metadata type lvm2
>
> Now I was able to activate it again and it was no longer flagged as
> partial (but now it is duplicate).
>
> The unknown device 8:0 is clearly going to be /dev/sda, which is no
> longer there.
>
> How can I dump this reference to 8:0 or should something else be done?
>
> Oh right.... pvscan --cache without a parameter....
>
> I wonder if I can run this while the thing is still activated....
>
> I was beginning to think there'd be some hidden filter rule, but it was
> "just" the cache.
>
> Should this thing be automatically resolved? Is running pvscan --cache
> enough?
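>
> Presumably rebuilding the whole cache (pvscan --cache with no device
> argument) would also drop the stale 8:0 entry, i.e. something like the
> following, though I have not tried it yet while the VG is still active
> (pvs is only there to double-check the result):
>
> # pvscan --cache
> # vgscan
> # pvs -o +pv_uuid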
>
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 15 Aug 2016 15:38:06 +0200
> From: Xen <list at xenhideout.nl>
> To: linux-lvm at redhat.com
> Subject: Re: [linux-lvm] lvm2 raid volumes
> Message-ID: <e2f9476f73b7ab05cc1e7a21643ed29e at dds.nl>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
>
> Heinz Mauelshagen wrote on 03-08-2016 15:10:
>
> > The Cpy%Sync field tells you about the resynchronization progress,
> > i.e. the initial mirroring of
> > all data blocks in a raid1/10 or the initial calculation and storing
> > of parity blocks in raid4/5/6.
>
> Heinz, can I perhaps ask you here, if I may.
>
> I have put a root volume on RAID 1. Maybe "of course", the LVM volumes on
> the second disk are not available at system boot:
>
> aug 15 14:09:19 xenpc2 kernel: device-mapper: raid: Loading target
> version 1.7.0
> aug 15 14:09:19 xenpc2 kernel: device-mapper: raid: Failed to read
> superblock of device at position 1
> aug 15 14:09:19 xenpc2 kernel: md/raid1:mdX: active with 1 out of 2
> mirrors
> aug 15 14:09:19 xenpc2 kernel: created bitmap (15 pages) for device mdX
> aug 15 14:09:19 xenpc2 kernel: mdX: bitmap initialized from disk: read 1
> pages, set 19642 of 30040 bits
> aug 15 14:09:19 xenpc2 kernel: EXT4-fs (dm-6): mounted filesystem with
> ordered data mode. Opts: (null)
>
>
> This could be because I am using a PV directly on disk (no partition
> table) for *some* volumes (actually the first disk, the one that is booted
> from). However, I force a start of the LVM2 service by enabling it in
> SystemD:
>
> aug 15 14:09:19 xenpc2 systemd[1]: Starting LVM2...
>
> This is further down the log, so LVM is actually started after the RAID
> is loading.
>
> At that point normally, from my experience, only the root LV is
> available.
>
> Then at a certain point more devices become available:
>
> aug 15 14:09:22 xenpc2 systemd[1]: Found device /dev/mapper/msata-boot.
> aug 15 14:09:22 xenpc2 systemd[1]: Started LVM2.
>
> aug 15 14:09:22 xenpc2 systemd[1]: Found device /dev/raid/tmp.
> aug 15 14:09:22 xenpc2 systemd[1]: Found device /dev/raid/swap.
> aug 15 14:09:22 xenpc2 systemd[1]: Found device /dev/raid/var.
>
> But just before that happens, there are some more RAID1 errors:
>
> aug 15 14:09:22 xenpc2 kernel: device-mapper: raid: Failed to read
> superblock of device at position 1
> aug 15 14:09:22 xenpc2 kernel: md/raid1:mdX: active with 1 out of 2
> mirrors
> aug 15 14:09:22 xenpc2 kernel: created bitmap (1 pages) for device mdX
> aug 15 14:09:22 xenpc2 kernel: mdX: bitmap initialized from disk: read 1
> pages, set 320 of 480 bits
> aug 15 14:09:22 xenpc2 kernel: device-mapper: raid: Failed to read
> superblock of device at position 1
> aug 15 14:09:22 xenpc2 kernel: md/raid1:mdX: active with 1 out of 2
> mirrors
> aug 15 14:09:22 xenpc2 kernel: created bitmap (15 pages) for device mdX
> aug 15 14:09:22 xenpc2 kernel: mdX: bitmap initialized from disk: read 1
> pages, set 19642 of 30040 bits
>
> Well small wonder if the device isn't there yet. There are no messages
> for it, but I will assume the mirror LVs came online at the same time as
> the other "raid" volume group LVs, which means the RAID errors preceded
> that.
>
> Hence, with no secondary mirror volumes available, the RAID could not be
> started with both mirrors, right?
>
> However after logging in, the Cpy%Sync behaviour seems normal:
>
>    boot     msata  rwi-aor--- 240,00m                                100,00
>    root     msata  rwi-aor---  14,67g                                100,00
>
> Devices are shown as:
>
>    boot msata rwi-aor--- 240,00m 100,00  boot_rimage_0(0),boot_rimage_1(0)
>    root msata rwi-aor---  14,67g 100,00  root_rimage_0(0),root_rimage_1(0)
>
> dmsetup table seems normal:
>
> # dmsetup table | grep msata | sort
> coll-msata--lv: 0 60620800 linear 8:36 2048
> msata-boot: 0 491520 raid raid1 3 0 region_size 1024 2 252:14 252:15 - -
> msata-boot_rimage_0: 0 491520 linear 8:16 4096
> msata-boot_rimage_1: 0 491520 linear 252:12 10240
> msata-boot_rimage_1-missing_0_0: 0 491520 error
> msata-boot_rmeta_0: 0 8192 linear 8:16 495616
> msata-boot_rmeta_1: 0 8192 linear 252:12 2048
> msata-boot_rmeta_1-missing_0_0: 0 8192 error
> msata-root: 0 30760960 raid raid1 3 0 region_size 1024 2 252:0 252:1 - -
> msata-root_rimage_0: 0 30760960 linear 8:16 512000
> msata-root_rimage_1: 0 30760960 linear 252:12 509952
> msata-root_rimage_1-missing_0_0: 0 30760960 error
> msata-root_rmeta_0: 0 8192 linear 8:16 503808
> msata-root_rmeta_1: 0 8192 linear 252:12 501760
> msata-root_rmeta_1-missing_0_0: 0 8192 error
>
> But actually it is not normal, because it should reference four devices,
> not two. Apologies.
>
> It only references the volumes of the first disk (image and meta).
>
> E.g. 252:0 and 252:1 are:
>
> lrwxrwxrwx 1 root root       7 aug 15 14:09 msata-root_rmeta_0 ->
> ../dm-0
> lrwxrwxrwx 1 root root       7 aug 15 14:09 msata-root_rimage_0 ->
> ../dm-1
>
> Whereas the volumes from the other disk are:
>
> lrwxrwxrwx 1 root root       7 aug 15 14:09 msata-root_rmeta_1 ->
> ../dm-3
> lrwxrwxrwx 1 root root       7 aug 15 14:09 msata-root_rimage_1 ->
> ../dm-5
>
> If I unmount /boot, lvchange -an msata/boot, then lvchange -ay msata/boot,
> it loads correctly:
>
> aug 15 14:56:23 xenpc2 kernel: md/raid1:mdX: active with 1 out of 2
> mirrors
> aug 15 14:56:23 xenpc2 kernel: created bitmap (1 pages) for device mdX
> aug 15 14:56:23 xenpc2 kernel: mdX: bitmap initialized from disk: read 1
> pages, set 320 of 480 bits
> aug 15 14:56:23 xenpc2 kernel: RAID1 conf printout:
> aug 15 14:56:23 xenpc2 kernel:  --- wd:1 rd:2
> aug 15 14:56:23 xenpc2 kernel:  disk 0, wo:0, o:1, dev:dm-15
> aug 15 14:56:23 xenpc2 kernel:  disk 1, wo:1, o:1, dev:dm-19
> aug 15 14:56:23 xenpc2 kernel: RAID1 conf printout:
> aug 15 14:56:23 xenpc2 kernel:  --- wd:1 rd:2
> aug 15 14:56:23 xenpc2 kernel:  disk 0, wo:0, o:1, dev:dm-15
> aug 15 14:56:23 xenpc2 kernel:  disk 1, wo:1, o:1, dev:dm-19
> aug 15 14:56:23 xenpc2 kernel: md: recovery of RAID array mdX
> aug 15 14:56:23 xenpc2 kernel: md: minimum _guaranteed_  speed: 1000
> KB/sec/disk.
> aug 15 14:56:23 xenpc2 kernel: md: using maximum available idle IO
> bandwidth (but not more than 200000 KB/sec) for recovery.
> aug 15 14:56:23 xenpc2 kernel: md: using 128k window, over a total of
> 245760k.
> aug 15 14:56:23 xenpc2 systemd[1]: Starting File System Check on
> /dev/mapper/msata-boot...
> aug 15 14:56:23 xenpc2 systemd[1]: Started File System Check Daemon to
> report status.
> aug 15 14:56:23 xenpc2 systemd-fsck[6938]: /dev/mapper/msata-boot:
> clean, 310/61440 files, 121269/245760 blocks
> aug 15 14:56:23 xenpc2 systemd[1]: Started File System Check on
> /dev/mapper/msata-boot.
> aug 15 14:56:23 xenpc2 systemd[1]: Mounting /boot...
> aug 15 14:56:23 xenpc2 kernel: EXT4-fs (dm-20): mounting ext2 file
> system using the ext4 subsystem
> aug 15 14:56:23 xenpc2 kernel: EXT4-fs (dm-20): mounted filesystem
> without journal. Opts: (null)
> aug 15 14:56:23 xenpc2 systemd[1]: Mounted /boot.
> aug 15 14:56:26 xenpc2 kernel: md: mdX: recovery done.
> aug 15 14:56:26 xenpc2 kernel: RAID1 conf printout:
> aug 15 14:56:26 xenpc2 kernel:  --- wd:2 rd:2
> aug 15 14:56:26 xenpc2 kernel:  disk 0, wo:0, o:1, dev:dm-15
> aug 15 14:56:26 xenpc2 kernel:  disk 1, wo:0, o:1, dev:dm-19
>
>
>
> Maybe this whole thing is just caused by the first disk being
> partitionless.
>
> After the reactivation, the table does reference all four devices:
>
> msata-boot: 0 491520 raid raid1 3 0 region_size 1024 2 252:14 252:15 252:17 252:19
>
>
> I don't know how LVM is activated under SystemD. This is currently the
> initial startup:
>
>    dev-mapper-msata\x2droot.device (2.474s)
>    init.scope
>    -.mount
>    -.slice
>    swap.target
>    dm-event.socket
>    system.slice
>    lvm2-lvmpolld.socket
>    systemd-udevd-kernel.socket
>    systemd-initctl.socket
>    systemd-journald-audit.socket
>    lvm2-lvmetad.socket
>    systemd-journald.socket
>    systemd-modules-load.service (199ms)
>    dev-mqueue.mount (66ms)
>    ufw.service (69ms)
>    systemd-fsckd.socket
>    proc-sys-fs-binfmt_misc.automount
>    lvm2.service (3.237s)
>
> I have had to enable the SysV lvm2 service and replace vgchange -aay
> --sysinit with just vgchange -aay or it wouldn't work.
>
> lvmetad is started later, but without the manual lvm2 activation, my
> devices just wouldn't get loaded (on multiple systems, i.e. when using a
> PV directly on disk to boot from).
>
> No one else boots directly from PV, so I may be the only one to ever
> have experienced this.
>
>
>
> But at this point, my mirrors don't work. They do work when not started
> as the main system, so if I boot from another disk, they work.
>
> At least I think they do.
>
>
> Basically, my LVM volumes are not available yet by the time the RAID
> is being started.
>
> LVM itself gives no indication of error. lvchange --syncaction check
> msata/root does not produce any data. It seems it doesn't notice that
> the RAID hasn't been started.
>
> Again, there is zero output from commands such as lvs -a -o+lv_all and
> yet this is the output from dmsetup table:
>
> msata-root: 0 30760960 raid raid1 3 0 region_size 1024 2 252:0 252:1 - -
>
>
> So my first question is really: can I restore the RAID array while the
> system is running? E.g. while root is mounted?
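>
> My guess is that, once the missing PV is visible again, simply refreshing
> the LV should restart the resync without unmounting anything, i.e. something
> like this (untested, LV names as above):
>
> # lvchange --refresh msata/root
> # lvs -a -o +devices msata
>
> but I would like to hear whether that is actually the supported way.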
>
> I haven't explored the initramfs yet to see why my LVM volumes are not
> getting loaded.
>
> Regards.
>
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 16 Aug 2016 11:44:23 +0200
> From: Zdenek Kabelac <zdenek.kabelac at gmail.com>
> To: LVM general discussion and development <linux-lvm at redhat.com>,
>         stuart at gathman.org
> Subject: Re: [linux-lvm] Snapshots & data security
> Message-ID: <56821cec-70f8-086a-7e5b-6db8f0a9b780 at gmail.com>
> Content-Type: text/plain; charset=windows-1252; format=flowed
>
> On 27.7.2016 at 21:17, Stuart Gathman wrote:
> > On 07/19/2016 11:28 AM, Scott Sullivan wrote:
> >>
> >> Could someone please clarify if there is a legitimate reason to worry
> >> about data security of a old (removed) LVM snapshot?
> >>
> >> For example, when you lvremove a LVM snapshot, is it possible for data
> >> to be recovered if you create another LVM and it happens to go into
> >> the same area as the old snapshot we lvremoved?
> >>
> >> If this helps clarify, do we have to worry about security scrubbing a
> >> LVM snapshot for data security ?
> >>
> > Another idea: if your VG is on SSD, and properly aligned, then DISCARD
> > on the new LV will effectively zero it as far as any guest VMs are
> > concerned.  (The data is still on the flash until erased by the
> > firmware, of course.)  If VG and PE size do not align with the SSD erase
> > block, then you can still zero the "edges" of the new LV, which is much
> > faster (and less wear on the SSD) than zeroing the whole thing.  You
> > could always read-verify that the data is actually all zero.
>
> Yes - as already suggested - once you create a new LV,
> you can 'blkdiscard /dev/vg/lv'.
>
> Note - an SSD may not always ensure blocks are zeroed - it could just
> move trimmed blocks into a reuse list with undefined content.
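>
> If you want to be really sure, you can discard the fresh LV and then
> read-verify it, e.g. (just a sketch, substitute your own VG/LV name):
>
> # blkdiscard /dev/vg/new_lv
> # cmp /dev/zero /dev/vg/new_lv
>
> If cmp reports nothing but EOF on the LV, every byte read back as zero.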
>
> Anyway - lvm2 is not a tool for data protection and it is up to the system
> admin to ensure there are no data leaks.
>
> So pick the solution which best fits your needs - lvm2 provides all the
> tooling for it.
>
> Regards
>
> Zdenek
>
>
>
> ------------------------------
>
> Message: 4
> Date: Thu, 25 Aug 2016 09:50:24 +0800
> From: su liu <liusu8788 at gmail.com>
> To: linux-lvm <linux-lvm at redhat.com>
> Subject: [linux-lvm] lvm2 cluster aware
> Message-ID:
>         <CAN2gjWSEzL+M45UGFEkEA_i5bgNpKLXTQ+DAMPJjJJvBBvaTKQ@
> mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> I have a question about lvm2 clustering. The scenario is that I try to
> imitate an FC SAN by mapping an rbd volume to two compute nodes, and then I
> use the rbd volume to create a PV and a VG. I stopped the lvmetad daemon on
> the compute nodes. I then find that when I operate on the VG on one compute
> node, the changes are also visible on the other compute node.
>
> But this document (http://www.tldp.org/HOWTO/LVM-HOWTO/sharinglvm1.html)
> says that "LVM is not cluster aware".
>
> My question is: can I use this method so that I can create or delete an LV
> on one node while the other compute nodes use those LVs?
>
> Can anybody explain this?
>
> thx very much!
>
> ------------------------------
>
> Message: 5
> Date: Thu, 25 Aug 2016 03:37:08 -0400
> From: Digimer <lists at alteeve.ca>
> To: LVM general discussion and development <linux-lvm at redhat.com>
> Subject: Re: [linux-lvm] lvm2 cluster aware
> Message-ID: <a0c7bedf-23d1-26c5-3853-0bd36f528185 at alteeve.ca>
> Content-Type: text/plain; charset=UTF-8
>
> On 24/08/16 09:50 PM, su liu wrote:
> > I have a question about lvm2 clustering. The scenario is that I try to
> > imitate an FC SAN by mapping an rbd volume to two compute nodes, and then
> > I use the rbd volume to create a PV and a VG. I stopped the lvmetad daemon
> > on the compute nodes. I then find that when I operate on the VG on one
> > compute node, the changes are also visible on the other compute node.
> >
> > But this document (http://www.tldp.org/HOWTO/LVM-HOWTO/sharinglvm1.html)
> > says that "LVM is not cluster aware".
> >
> > My question is: can I use this method so that I can create or delete an LV
> > on one node while the other compute nodes use those LVs?
> >
> > Can anybody explain this?
> >
> > thx very much!
>
> You can use DLM (distributed lock manager) to provide cluster locking to
> LVM (locking_type = 3 via clvmd). This requires a cluster though, which
> you can get with corosync + pacemaker (and use stonith!).
>
> Further details/advice requires more understanding of your environment
> and use-case. Feel free to pop over to Clusterlabs[1], as many of us do
> this setup regularly. I use clustered LVM in all our HA setups and have
> done so successfully for years.
>
> digimer
>
> 1. http://clusterlabs.org/mailman/listinfo/users - freenode #clusterlabs
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
>
>
>
> ------------------------------
>
>
> End of linux-lvm Digest, Vol 150, Issue 2
> *****************************************
>

