[lvm-devel] master - man: add lvmraid(7)

David Teigland teigland at fedoraproject.org
Tue Sep 6 19:50:37 UTC 2016


Gitweb:        http://git.fedorahosted.org/git/?p=lvm2.git;a=commitdiff;h=a1fb7b51b799f8169a4dc03c7e414c9c88934f2d
Commit:        a1fb7b51b799f8169a4dc03c7e414c9c88934f2d
Parent:        c8a14a29cdcf7e800af16f9b6a8fa9ed49250d30
Author:        David Teigland <teigland at redhat.com>
AuthorDate:    Tue Sep 6 14:50:08 2016 -0500
Committer:     David Teigland <teigland at redhat.com>
CommitterDate: Tue Sep 6 14:50:08 2016 -0500

man: add lvmraid(7)

---
 man/lvmraid.7.in | 1314 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 1314 insertions(+), 0 deletions(-)

diff --git a/man/lvmraid.7.in b/man/lvmraid.7.in
new file mode 100644
index 0000000..9725cbf
--- /dev/null
+++ b/man/lvmraid.7.in
@@ -0,0 +1,1314 @@
+.TH "LVMRAID" "7" "LVM TOOLS #VERSION#" "Red Hat, Inc" "\""
+
+.SH NAME
+lvmraid \(em LVM RAID
+
+.SH DESCRIPTION
+
+LVM RAID is a way to create logical volumes (LVs) that use multiple physical
+devices to improve performance or tolerate device failure.  How blocks of
+data in an LV are placed onto physical devices is determined by the RAID
+level.  RAID levels are commonly referred to by number, e.g. raid1, raid5.
+Selecting a RAID level involves tradeoffs among physical device
+requirements, fault tolerance, and performance.  A description of the RAID
+levels can be found at
+.br
+www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf
+
+LVM RAID uses both Device Mapper (DM) and Multiple Device (MD) drivers
+from the Linux kernel.  DM is used to create and manage visible LVM
+devices, and MD is used to place data on physical devices.
+
+.SH Create a RAID LV
+
+To create a RAID LV, use lvcreate and specify an LV type.
+The LV type corresponds to a RAID level.
+The basic RAID levels that can be used are:
+.B raid0, raid1, raid4, raid5, raid6, raid10.
+
+.B lvcreate \-\-type
+.I RaidLevel
+[\fIOPTIONS\fP]
+.B \-\-name
+.I Name
+.B \-\-size
+.I Size
+.I VG
+[\fIPVs\fP]
+
+To display the LV type of an existing LV, run:
+
+.B lvs -o name,segtype
+\fIVG\fP/\fILV\fP
+
+(The LV type is also referred to as "segment type" or "segtype".)
+
+LVs can be created with the following types:
+
+.SS raid0
+
+\&
+
+Also called striping, raid0 spreads LV data across multiple devices in
+units of stripe size.  This is used to increase performance.  LV data will
+be lost if any of the devices fail.
+
+.B lvcreate \-\-type raid0
+[\fB\-\-stripes\fP \fINumber\fP \fB\-\-stripesize\fP \fISize\fP]
+\fIVG\fP
+[\fIPVs\fP]
+
+.HP
+.B \-\-stripes
+specifies the number of devices to spread the LV across.
+
+.HP
+.B \-\-stripesize
+specifies the size of each stripe in kilobytes.  This is the amount of
+data that is written to one device before moving to the next.
+.P
+
+\fIPVs\fP specifies the devices to use.  If not specified, lvm will choose
+\fINumber\fP devices, one for each stripe.
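+
+For example, the following creates a raid0 LV that stripes data across two
+devices using a 64KiB stripe size (VG, LV, and device names are
+illustrative):
+
+.nf
+lvcreate --type raid0 --stripes 2 --stripesize 64k \\
+         --name lv0 --size 100g vg /dev/sda /dev/sdb
+.fi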
+
+.SS raid1
+
+\&
+
+Also called mirroring, raid1 uses multiple devices to duplicate LV data.
+The LV data remains available if all but one of the devices fail.
+The minimum number of devices required is 2.
+
+.B lvcreate \-\-type raid1
+[\fB\-\-mirrors\fP \fINumber\fP]
+\fIVG\fP
+[\fIPVs\fP]
+
+.HP
+.B \-\-mirrors
+specifies the number of mirror images in addition to the original LV
+image, e.g. \-\-mirrors 1 means there are two images of the data, the
+original and one mirror image.
+.P
+
+\fIPVs\fP specifies the devices to use.  If not specified, lvm will choose
+\fINumber\fP devices, one for each image.
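+
+For example, the following creates a raid1 LV with two images of the LV
+data, one on each of two devices (VG, LV, and device names are
+illustrative):
+
+.nf
+lvcreate --type raid1 --mirrors 1 --name lv0 \\
+         --size 100g vg /dev/sda /dev/sdb
+.fi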
+
+.SS raid4
+
+\&
+
+raid4 is a form of striping that uses an extra device dedicated to storing
+parity blocks.  The LV data remains available if one device fails.  The
+parity is used to recalculate data that is lost from a single device.  The
+minimum number of devices required is 3.
+
+.B lvcreate \-\-type raid4
+[\fB\-\-stripes\fP \fINumber\fP \fB\-\-stripesize\fP \fISize\fP]
+\fIVG\fP
+[\fIPVs\fP]
+
+.HP
+.B \-\-stripes
+specifies the number of devices to use for LV data.  This does not include
+the extra device lvm adds for storing parity blocks.  \fINumber\fP stripes
+requires \fINumber\fP+1 devices.  \fINumber\fP must be 2 or more.
+
+.HP
+.B \-\-stripesize
+specifies the size of each stripe in kilobytes.  This is the amount of
+data that is written to one device before moving to the next.
+.P
+
+\fIPVs\fP specifies the devices to use.  If not specified, lvm will choose
+\fINumber\fP+1 separate devices.
+
+raid4 is called non-rotating parity because the parity blocks are always
+stored on the same device.
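+
+For example, the following creates a raid4 LV that stripes data across two
+devices and uses a third device for dedicated parity (VG, LV, and device
+names are illustrative):
+
+.nf
+lvcreate --type raid4 --stripes 2 --name lv0 \\
+         --size 100g vg /dev/sda /dev/sdb /dev/sdc
+.fi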
+
+.SS raid5
+
+\&
+
+raid5 is a form of striping that uses an extra device for storing parity
+blocks.  LV data and parity blocks are stored on each device.  The LV data
+remains available if one device fails.  The parity is used to recalculate
+data that is lost from a single device.  The minimum number of devices
+required is 3.
+
+.B lvcreate \-\-type raid5
+[\fB\-\-stripes\fP \fINumber\fP \fB\-\-stripesize\fP \fISize\fP]
+\fIVG\fP
+[\fIPVs\fP]
+
+.HP
+.B \-\-stripes
+specifies the number of devices to use for LV data.  This does not include
+the extra device lvm adds for storing parity blocks.  \fINumber\fP stripes
+requires \fINumber\fP+1 devices.  \fINumber\fP must be 2 or more.
+
+.HP
+.B \-\-stripesize
+specifies the size of each stripe in kilobytes.  This is the amount of
+data that is written to one device before moving to the next.
+.P
+
+\fIPVs\fP specifies the devices to use.  If not specified, lvm will choose
+\fINumber\fP+1 separate devices.
+
+raid5 is called rotating parity because the parity blocks are placed on
+different devices in a round-robin sequence.  There are variations of
+raid5 with different algorithms for placing the parity blocks.  The
+default variant is raid5_ls (raid5 left symmetric, which is a rotating
+parity 0 with data restart.)  See \fBRAID5 variants\fP below.
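+
+For example, the following creates a raid5 LV that stripes data across two
+devices, with parity distributed across all three (VG, LV, and device
+names are illustrative):
+
+.nf
+lvcreate --type raid5 --stripes 2 --stripesize 64k \\
+         --name lv0 --size 100g vg /dev/sda /dev/sdb /dev/sdc
+.fi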
+
+.SS raid6
+
+\&
+
+raid6 is a form of striping like raid5, but uses two extra devices for
+parity blocks.  LV data and parity blocks are stored on each device.  The
+LV data remains available if up to two devices fail.  The parity is used
+to recalculate data that is lost from one or two devices.  The minimum
+number of devices required is 5.
+
+.B lvcreate \-\-type raid6
+[\fB\-\-stripes\fP \fINumber\fP \fB\-\-stripesize\fP \fISize\fP]
+\fIVG\fP
+[\fIPVs\fP]
+
+.HP
+.B \-\-stripes
+specifies the number of devices to use for LV data.  This does not include
+the extra two devices lvm adds for storing parity blocks.  \fINumber\fP
+stripes requires \fINumber\fP+2 devices.  \fINumber\fP must be 3 or more.
+
+.HP
+.B \-\-stripesize
+specifies the size of each stripe in kilobytes.  This is the amount of
+data that is written to one device before moving to the next.
+.P
+
+\fIPVs\fP specifies the devices to use.  If not specified, lvm will choose
+\fINumber\fP+2 separate devices.
+
+Like raid5, there are variations of raid6 with different algorithms for
+placing the parity blocks.  The default variant is raid6_zr (raid6 zero
+restart, aka left symmetric, which is a rotating parity 0 with data
+restart.)  See \fBRAID6 variants\fP below.
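+
+For example, the following creates a raid6 LV that stripes data across
+three devices, with parity distributed across all five devices (VG, LV,
+and device names are illustrative):
+
+.nf
+lvcreate --type raid6 --stripes 3 --name lv0 \\
+         --size 100g vg /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde
+.fi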
+
+.SS raid10
+
+\&
+
+raid10 is a combination of raid1 and raid0, striping data across mirrored
+devices.  LV data remains available if one or more devices remain in each
+mirror set.  The minimum number of devices required is 4.
+
+.B lvcreate \-\-type raid10
+.RS
+[\fB\-\-mirrors\fP \fINumberMirrors\fP]
+.br
+[\fB\-\-stripes\fP \fINumberStripes\fP \fB\-\-stripesize\fP \fISize\fP]
+.br
+\fIVG\fP
+[\fIPVs\fP]
+.RE
+
+.HP
+.B \-\-mirrors
+specifies the number of mirror images within each stripe.  e.g.
+\-\-mirrors 1 means there are two images of the data, the original and one
+mirror image.
+
+.HP
+.B \-\-stripes
+specifies the total number of devices to use in all raid1 images (not the
+number of raid1 devices to spread the LV across, even though that is the
+effective result).  The number of devices in each raid1 mirror will be
+NumberStripes/(NumberMirrors+1), e.g. mirrors 1 and stripes 4 will stripe
+data across two raid1 mirrors, where each mirror consists of two devices.
+
+.HP
+.B \-\-stripesize
+specifies the size of each stripe in kilobytes.  This is the amount of
+data that is written to one device before moving to the next.
+.P
+
+\fIPVs\fP specifies the devices to use.  If not specified, lvm will choose
+the necessary devices.  Devices are used to create mirrors in the
+order listed, e.g. for mirrors 1, stripes 2, listing PV1 PV2 PV3 PV4
+results in mirrors PV1/PV2 and PV3/PV4.
+
+RAID10 is not mirroring on top of stripes (that would be RAID01, which is
+less tolerant of device failures).
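+
+For example, the following creates a raid10 LV on four devices, where data
+is mirrored across device pairs (in the order listed) and striped across
+the two mirrors (VG, LV, and device names are illustrative):
+
+.nf
+lvcreate --type raid10 --mirrors 1 --stripes 2 \\
+         --name lv0 --size 100g vg /dev/sda /dev/sdb /dev/sdc /dev/sdd
+.fi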
+
+
+.SH Synchronization
+
+Synchronization makes all the devices in a RAID LV consistent with each
+other.
+
+In a RAID1 LV, all mirror images should have the same data.  When a new
+mirror image is added, or a mirror image is missing data, then images need
+to be synchronized.  Data blocks are copied from an existing image to a
+new or outdated image to make them match.
+
+In a RAID 4/5/6 LV, parity blocks and data blocks should match based on
+the parity calculation.  When the devices in a RAID LV change, the data
+and parity blocks can become inconsistent and need to be synchronized.
+Correct blocks are read, parity is calculated, and recalculated blocks are
+written.
+
+The RAID implementation keeps track of which parts of a RAID LV are
+synchronized.  This uses a bitmap saved in the RAID metadata.  The bitmap
+can exclude large parts of the LV from synchronization to reduce the
+amount of work.  Without this, the entire LV would need to be synchronized
+every time it was activated.  When a RAID LV is first created and
+activated, the first synchronization is called initialization.
+
+Automatic synchronization happens when a RAID LV is activated, but it is
+usually partial because the bitmaps reduce the areas that are checked.
+A full sync may become necessary when devices in the RAID LV are changed.
+
+The synchronization status of a RAID LV is reported by the
+following command, where "image synced" means sync is complete:
+
+.B lvs -a -o name,sync_percent
+
+
+.SS Scrubbing
+
+Scrubbing is a full scan/synchronization of the RAID LV requested by a user.
+Scrubbing can find problems that are missed by partial synchronization.
+
+Scrubbing assumes that RAID metadata and bitmaps may be inaccurate, so it
+verifies all RAID metadata, LV data, and parity blocks.  Scrubbing can
+find inconsistencies caused by hardware errors or degradation.  These
+kinds of problems may be undetected by automatic synchronization which
+excludes areas outside of the RAID write-intent bitmap.
+
+The command to scrub a RAID LV can operate in two different modes:
+
+.B lvchange \-\-syncaction
+.BR check | repair
+.IR VG / LV
+
+.HP
+.B check
+Check mode is read\-only and only detects inconsistent areas in the RAID
+LV; it does not correct them.
+
+.HP
+.B repair
+Repair mode checks and writes corrected blocks to synchronize any
+inconsistent areas.
+
+.P
+
+Scrubbing can consume a lot of bandwidth and slow down application I/O on
+the RAID LV.  To control the I/O rate used for scrubbing, use:
+
+.HP
+.B \-\-maxrecoveryrate
+.BR \fIRate [ b | B | s | S | k | K | m | M | g | G ]
+.br
+Sets the maximum recovery rate for a RAID LV.  \fIRate\fP is specified as
+an amount per second for each device in the array.  If no suffix is given,
+then KiB/sec/device is assumed.  Setting the recovery rate to \fB0\fP
+means it will be unbounded.
+
+.HP
+.BR \-\-minrecoveryrate
+.BR \fIRate [ b | B | s | S | k | K | m | M | g | G ]
+.br
+Sets the minimum recovery rate for a RAID LV.  \fIRate\fP is specified as
+an amount per second for each device in the array.  If no suffix is given,
+then KiB/sec/device is assumed.  Setting the recovery rate to \fB0\fP
+means it will be unbounded.
+
+.P
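+
+For example, to limit the scrubbing rate and then start a read-only check
+of a RAID LV (the LV name and rate are illustrative):
+
+.nf
+lvchange --maxrecoveryrate 128k vg/lv0
+
+lvchange --syncaction check vg/lv0
+.fi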
+
+To display the current scrubbing in progress on an LV, including
+the syncaction mode and percent complete, run:
+
+.B lvs -a -o name,raid_sync_action,sync_percent
+
+After scrubbing is complete, to display the number of inconsistent blocks
+found, run:
+
+.B lvs -o name,raid_mismatch_count
+
+Also, if mismatches were found, the lvs attr field will display the letter
+"m" (mismatch) in the 9th position, e.g.
+
+.nf
+# lvs -o name,vgname,segtype,attr vg/lvol0
+  LV    VG   Type  Attr
+  lvol0 vg   raid1 Rwi-a-r-m- 
+.fi
+
+
+.SS Scrubbing Limitations
+
+The \fBcheck\fP mode can only report the number of inconsistent blocks, it
+cannot report which blocks are inconsistent.  This makes it impossible to
+know which device has errors, or if the errors affect file system data,
+metadata or nothing at all.
+
+The \fBrepair\fP mode can make the RAID LV data consistent, but it does
+not know which data is correct.  The result may be consistent but
+incorrect data.  When two different blocks of data must be made
+consistent, it chooses the block from the device that would be used during
+RAID initialization.  However, if the PV holding corrupt data is known,
+lvchange \-\-rebuild can be used to reconstruct the data on the bad
+device.
+
+Future developments might include:
+
+Allowing a user to choose the correct version of data during repair.
+
+Using a majority of devices to determine the correct version of data to
+use in a three-way RAID1 or RAID6 LV.
+
+Using a checksumming device to pin-point when and where an error occurs,
+allowing it to be rewritten.
+
+
+.SH SubLVs
+
+An LV is often a combination of other hidden LVs called SubLVs.  The
+SubLVs either use physical devices, or are built from other SubLVs
+themselves.  SubLVs hold LV data blocks, RAID parity blocks, and RAID
+metadata.  SubLVs are generally hidden, so the lvs \-a option is required to
+display them:
+
+.B lvs -a -o name,segtype,devices
+
+SubLV names begin with the visible LV name, and have an automatic suffix
+indicating their role:
+
+.IP \(bu 3
+SubLVs holding LV data or parity blocks have the suffix _rimage_#.
+These SubLVs are sometimes referred to as DataLVs.
+
+.IP \(bu 3
+SubLVs holding RAID metadata have the suffix _rmeta_#.  RAID metadata
+includes superblock information, RAID type, bitmap, and device health
+information.  These SubLVs are sometimes referred to as MetaLVs.
+
+.P
+
+SubLVs are an internal implementation detail of LVM.  The way they are
+used, constructed and named may change.
+
+The following examples show the SubLV arrangement for each of the basic
+RAID LV types, using the fewest devices allowed for each.
+
+.SS Examples
+
+.B raid0
+.br
+Each rimage SubLV holds a portion of LV data.  No parity is used.
+No RAID metadata is used.
+
+.nf
+lvcreate --type raid0 --stripes 2 --name lvr0 ...
+
+lvs -a -o name,segtype,devices
+  lvr0            raid0  lvr0_rimage_0(0),lvr0_rimage_1(0)
+  [lvr0_rimage_0] linear /dev/sda(...)
+  [lvr0_rimage_1] linear /dev/sdb(...)
+.fi
+
+.B raid1
+.br
+Each rimage SubLV holds a complete copy of LV data.  No parity is used.
+Each rmeta SubLV holds RAID metadata.
+
+.nf
+lvcreate --type raid1 --mirrors 1 --name lvr1 ...
+
+lvs -a -o name,segtype,devices
+  lvr1            raid1  lvr1_rimage_0(0),lvr1_rimage_1(0)
+  [lvr1_rimage_0] linear /dev/sda(...)
+  [lvr1_rimage_1] linear /dev/sdb(...)
+  [lvr1_rmeta_0]  linear /dev/sda(...)
+  [lvr1_rmeta_1]  linear /dev/sdb(...)
+.fi
+
+.B raid4
+.br
+Two rimage SubLVs each hold a portion of LV data and one rimage SubLV
+holds parity.  Each rmeta SubLV holds RAID metadata.
+
+.nf
+lvcreate --type raid4 --stripes 2 --name lvr4 ...
+
+lvs -a -o name,segtype,devices
+  lvr4            raid4  lvr4_rimage_0(0),\\
+                         lvr4_rimage_1(0),\\
+                         lvr4_rimage_2(0)
+  [lvr4_rimage_0] linear /dev/sda(...)
+  [lvr4_rimage_1] linear /dev/sdb(...)
+  [lvr4_rimage_2] linear /dev/sdc(...)
+  [lvr4_rmeta_0]  linear /dev/sda(...)
+  [lvr4_rmeta_1]  linear /dev/sdb(...)
+  [lvr4_rmeta_2]  linear /dev/sdc(...)
+.fi
+
+.B raid5
+.br
+Three rimage SubLVs each hold a portion of LV data and parity.
+Each rmeta SubLV holds RAID metadata.
+
+.nf
+lvcreate --type raid5 --stripes 2 --name lvr5 ...
+
+lvs -a -o name,segtype,devices
+  lvr5            raid5  lvr5_rimage_0(0),\\
+                         lvr5_rimage_1(0),\\
+                         lvr5_rimage_2(0)
+  [lvr5_rimage_0] linear /dev/sda(...)                                     
+  [lvr5_rimage_1] linear /dev/sdb(...)                           
+  [lvr5_rimage_2] linear /dev/sdc(...)                                      
+  [lvr5_rmeta_0]  linear /dev/sda(...)                                     
+  [lvr5_rmeta_1]  linear /dev/sdb(...)                           
+  [lvr5_rmeta_2]  linear /dev/sdc(...)                                      
+.fi
+
+.B raid6
+.br
+Six rimage SubLVs each hold a portion of LV data and parity.
+Each rmeta SubLV holds RAID metadata.
+
+.nf
+lvcreate --type raid6 --stripes 3 --name lvr6
+
+lvs -a -o name,segtype,devices
+  lvr6            raid6  lvr6_rimage_0(0),\\
+                         lvr6_rimage_1(0),\\
+                         lvr6_rimage_2(0),\\
+                         lvr6_rimage_3(0),\\
+                         lvr6_rimage_4(0),\\
+                         lvr6_rimage_5(0)
+  [lvr6_rimage_0] linear /dev/sda(...)
+  [lvr6_rimage_1] linear /dev/sdb(...)
+  [lvr6_rimage_2] linear /dev/sdc(...)
+  [lvr6_rimage_3] linear /dev/sdd(...)
+  [lvr6_rimage_4] linear /dev/sde(...)
+  [lvr6_rimage_5] linear /dev/sdf(...)
+  [lvr6_rmeta_0]  linear /dev/sda(...)
+  [lvr6_rmeta_1]  linear /dev/sdb(...)
+  [lvr6_rmeta_2]  linear /dev/sdc(...)
+  [lvr6_rmeta_3]  linear /dev/sdd(...)
+  [lvr6_rmeta_4]  linear /dev/sde(...)
+  [lvr6_rmeta_5]  linear /dev/sdf(...)
+.fi
+
+.B raid10
+.br
+Four rimage SubLVs each hold a portion of LV data.  No parity is used.
+Each rmeta SubLV holds RAID metadata.
+
+.nf
+lvcreate --type raid10 --stripes 2 --mirrors 1 --name lvr10
+
+lvs -a -o name,segtype,devices
+  lvr10            raid10 lvr10_rimage_0(0),\\
+                          lvr10_rimage_1(0),\\
+                          lvr10_rimage_2(0),\\
+                          lvr10_rimage_3(0)
+  [lvr10_rimage_0] linear /dev/sda(...)
+  [lvr10_rimage_1] linear /dev/sdb(...)
+  [lvr10_rimage_2] linear /dev/sdc(...)
+  [lvr10_rimage_3] linear /dev/sdd(...)
+  [lvr10_rmeta_0]  linear /dev/sda(...)
+  [lvr10_rmeta_1]  linear /dev/sdb(...)
+  [lvr10_rmeta_2]  linear /dev/sdc(...)
+  [lvr10_rmeta_3]  linear /dev/sdd(...)
+.fi
+
+
+.SH Device Failure
+
+Physical devices in a RAID LV can fail or be lost for multiple reasons.
+A device could fail permanently, or be only temporarily
+disconnected.  The purpose of RAID LVs (levels 1 and higher) is to
+continue operating in a degraded mode, without losing LV data, even after
+a device fails.  The number of devices that can fail without the loss of
+LV data depends on the RAID level:
+
+.IP \[bu] 3
+RAID0 (striped) LVs cannot tolerate losing any devices.  LV data will be
+lost if any devices fail.
+
+.IP \[bu] 3
+RAID1 LVs can tolerate losing all but one device without LV data loss.
+
+.IP \[bu] 3
+RAID4 and RAID5 LVs can tolerate losing one device without LV data loss.
+
+.IP \[bu] 3
+RAID6 LVs can tolerate losing two devices without LV data loss.
+
+.IP \[bu] 3
+RAID10 is variable, and depends on which devices are lost.  It can
+tolerate losing all but one device in a single raid1 mirror without
+LV data loss.
+
+.P
+
+If a RAID LV is missing devices, or has other device-related problems, lvs
+reports this in the health_status (and attr) fields:
+
+.B lvs -o name,lv_health_status
+
+.B partial
+.br
+Devices are missing from the LV.  This is also indicated by the letter "p"
+(partial) in the 9th position of the lvs attr field.
+
+.B refresh needed
+.br
+A device was temporarily missing but has returned.  The LV needs to be
+refreshed to use the device again (which will usually require
+partial synchronization).  This is also indicated by the letter "r" (refresh
+needed) in the 9th position of the lvs attr field.  See
+\fBRefreshing an LV\fP.  This could also indicate a problem with the
+device, in which case it should be replaced; see
+\fBReplacing Devices\fP.
+
+.B mismatches exist
+.br
+See
+.BR Scrubbing .
+
+Most commands will also print a warning if a device is missing, e.g.
+.br
+.nf
+WARNING: Device for PV uItL3Z-wBME-DQy0-... not found or rejected ...
+.fi
+
+This warning will go away if the device returns or is removed from the
+VG (see \fBvgreduce \-\-removemissing\fP).
+
+
+.SS Activating an LV with missing devices
+
+A RAID LV that is missing devices may be activated or not, depending on
+the "activation mode" used in lvchange:
+
+.B lvchange \-ay \-\-activationmode
+.RB { complete | degraded | partial }
+.IR VG / LV
+
+.B complete
+.br
+The LV is only activated if all devices are present.
+
+.B degraded
+.br
+The LV is activated with missing devices if the RAID level can
+tolerate the number of missing devices without LV data loss.
+
+.B partial
+.br
+The LV is always activated, even if portions of the LV data are missing
+because of the missing device(s).  This should only be used to perform
+recovery or repair operations.
+
+.BR lvm.conf (5)
+.B activation/activation_mode
+.br
+controls the activation mode when not specified by the command.
+
+The default value is printed by:
+.nf
+lvmconfig --type default activation/activation_mode
+.fi
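+
+For example, to activate a RAID LV that is missing devices, provided the
+RAID level can still provide all of the LV data (the LV name is
+illustrative):
+
+.nf
+lvchange -ay --activationmode degraded vg/lv0
+.fi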
+
+.SS Replacing Devices
+
+Devices in a RAID LV can be replaced with other devices in the VG.  When
+replacing devices that are no longer visible on the system, use lvconvert
+\-\-repair.  When replacing devices that are still visible, use lvconvert
+\-\-replace.  The repair command will attempt to restore the same number
+of data LVs that were previously in the LV.  The replace option can be
+repeated to replace multiple PVs.  Replacement devices can be optionally
+listed with either option.
+
+.B lvconvert \-\-repair
+.IR VG / LV
+[\fINewPVs\fP]
+
+.B lvconvert \-\-replace
+\fIOldPV\fP
+.IR VG / LV
+[\fINewPV\fP]
+
+.B lvconvert
+.B \-\-replace
+\fIOldPV1\fP
+.B \-\-replace
+\fIOldPV2\fP
+...
+.IR VG / LV
+[\fINewPVs\fP]
+
+New devices require synchronization with existing devices, see
+.BR Synchronization .
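+
+For example, to replace a device that is still visible with a specific new
+device, or to repair an LV after a device has disappeared (device names
+are illustrative):
+
+.nf
+lvconvert --replace /dev/sdb vg/lv0 /dev/sdf
+
+lvconvert --repair vg/lv0 /dev/sdf
+.fi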
+
+.SS Refreshing an LV
+
+Refreshing a RAID LV clears any transient device failures (device was
+temporarily disconnected) and returns the LV to its fully redundant mode.
+Restoring a device will usually require at least partial synchronization
+(see \fBSynchronization\fP).  Failure to clear a transient failure results
+in the RAID LV operating in degraded mode until it is reactivated.  Use
+the lvchange command to refresh an LV:
+
+.B lvchange \-\-refresh
+.IR VG / LV
+
+.nf
+# lvs -o name,vgname,segtype,attr,size vg
+  LV    VG   Type  Attr       LSize
+  raid1 vg   raid1 Rwi-a-r-r- 100.00g
+
+# lvchange --refresh vg/raid1
+
+# lvs -o name,vgname,segtype,attr,size vg
+  LV    VG   Type  Attr       LSize
+  raid1 vg   raid1 Rwi-a-r--- 100.00g
+.fi
+
+.SS Automatic repair
+
+If a device in a RAID LV fails, device-mapper in the kernel notifies the
+.BR dmeventd (8)
+monitoring process (see \fBMonitoring\fP).
+dmeventd can be configured to automatically respond using:
+
+.BR lvm.conf (5)
+.B activation/raid_fault_policy
+
+Possible settings are:
+
+.B warn
+.br
+A warning is added to the system log indicating that a device has
+failed in the RAID LV.  It is left to the user to repair the LV, e.g.
+replace failed devices.
+
+.B allocate
+.br
+dmeventd automatically attempts to repair the LV using spare devices
+in the VG.  Note that even a transient failure is handled as a permanent
+failure; a new device is allocated and full synchronization is started.
+
+The specific command run by dmeventd to warn or repair is:
+.br
+.B lvconvert \-\-repair \-\-use\-policies
+.IR VG / LV
+
+
+.SS Corrupted Data
+
+Data on a device can be corrupted due to hardware errors, without the
+device ever being disconnected, and without any fault in the software.
+This should be rare, and can be detected (see \fBScrubbing\fP).
+
+
+.SS Rebuild specific PVs
+
+If specific PVs in a RAID LV are known to have corrupt data, the data on
+those PVs can be reconstructed with:
+
+.B lvchange \-\-rebuild PV
+.IR VG / LV
+
+The rebuild option can be repeated with different PVs to replace the data
+on multiple PVs.
+
+For example, in a raid1 LV, the master mirror image on PV1 may have
+corrupt data due to a transient disk error.  In this case, \-\-rebuild PV1
+reconstructs data on the master image rather than rebuilding all other
+images from the master image.
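+
+For example, to reconstruct the data held on a single PV that is known to
+be corrupt (names are illustrative):
+
+.nf
+lvchange --rebuild /dev/sdb vg/lv0
+.fi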
+
+
+.SH Monitoring
+
+When a RAID LV is activated the \fBdmeventd\fP(8) process is started to
+monitor the health of the LV.  Various events detected in the kernel can
+cause a notification to be sent from device-mapper to the monitoring
+process, including device failures and synchronization completion (e.g.
+for initialization or scrubbing).
+
+The LVM configuration file contains options that affect how the monitoring
+process will respond to failure events (e.g. raid_fault_policy).  It is
+possible to turn on and off monitoring with lvchange, but it is not
+recommended to turn this off unless you have a thorough knowledge of the
+consequences.
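+
+For example, monitoring of an LV can be turned off and back on with
+lvchange (the LV name is illustrative):
+
+.nf
+lvchange --monitor n vg/lv0
+
+lvchange --monitor y vg/lv0
+.fi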
+
+
+.SH Configuration Options
+
+There are a number of options in the LVM configuration file that affect
+the behavior of RAID LVs.  The tunable options are listed
+below.  A detailed description of each can be found in the LVM
+configuration file itself.
+.nf
+mirror_segtype_default
+raid10_segtype_default
+raid_region_size
+raid_fault_policy
+activation_mode
+.fi
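+
+The default value of each setting can be displayed with lvmconfig, e.g.:
+
+.nf
+lvmconfig --type default activation/raid_fault_policy
+.fi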
+
+
+.SH RAID1 Tuning
+
+A RAID1 LV can be tuned so that certain devices are avoided for reading
+while all devices are still written to.
+
+.B lvchange
+.BR \-\- [ raid ] writemostly
+.BR \fIPhysicalVolume [ : { y | n | t }]
+.IR VG / LV
+
+The specified device will be marked as "write mostly", which means that
+reading from this device will be avoided, and other devices will be
+preferred for reading (unless no other devices are available.)  This
+minimizes the I/O to the specified device.
+
+If the PV name has no suffix, the write mostly attribute is set.  If the
+PV name has the suffix \fB:n\fP, the write mostly attribute is cleared,
+and the suffix \fB:t\fP toggles the current setting.
+
+The write mostly option can be repeated on the command line to change
+multiple devices at once.
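+
+For example, to mark one device in a raid1 LV write mostly, and later
+clear the setting again (names are illustrative):
+
+.nf
+lvchange --writemostly /dev/sdb vg/lv0
+
+lvchange --writemostly /dev/sdb:n vg/lv0
+.fi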
+
+To report the current write mostly setting, the lvs attr field will show
+the letter "w" in the 9th position when write mostly is set:
+
+.B lvs -a -o name,attr
+
+When a device is marked write mostly, the maximum number of outstanding
+writes to that device can be configured.  Once the maximum is reached,
+further writes become synchronous.  When synchronous, a write to the LV
+will not complete until writes to all the mirror images are complete.
+
+.B lvchange
+.BR \-\- [ raid ] writebehind
+.IR IOCount
+.IR VG / LV
+
+To report the current write behind setting, run:
+
+.B lvs -o name,raid_write_behind
+
+When write behind is not configured, or set to 0, all LV writes are
+synchronous.
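+
+For example, to allow up to 512 outstanding writes to write mostly
+devices before writes to the LV become synchronous (names are
+illustrative):
+
+.nf
+lvchange --writebehind 512 vg/lv0
+.fi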
+
+
+.SH RAID Takeover
+
+RAID takeover is converting a RAID LV from one RAID level to another, e.g.
+raid5 to raid6.  Changing the RAID level is usually done to increase or
+decrease resilience to device failures.  This is done using lvconvert and
+specifying the new RAID level as the LV type:
+
+.B lvconvert --type
+.I RaidLevel
+\fIVG\fP/\fILV\fP
+[\fIPVs\fP]
+
+The most common and recommended RAID takeover conversions are:
+
+.HP
+\fBlinear\fP to \fBraid1\fP
+.br
+Linear is a single image of LV data, and
+converting it to raid1 adds a mirror image which is a direct copy of the
+original linear image.
+
+.HP
+\fBstriped\fP/\fBraid0\fP to \fBraid4/5/6\fP
+.br
+Adding parity devices to a
+striped volume results in raid4/5/6.
+
+.P
+
+Unnatural conversions that are not recommended include converting between
+striped and non-striped types.  This is because file systems often
+optimize I/O patterns based on device striping values.  If those values
+change, it can decrease performance.
+
+Converting to a higher RAID level requires allocating new SubLVs to hold
+RAID metadata, and new SubLVs to hold parity blocks for LV data.
+Converting to a lower RAID level removes the SubLVs that are no longer
+needed.
+
+Conversion often requires full synchronization of the RAID LV (see
+\fBSynchronization\fP).  Converting to RAID1 requires copying all LV data
+blocks to a new image on a new device.  Converting to a parity RAID level
+requires reading all LV data blocks, calculating parity, and writing the
+new parity blocks.  Synchronization can take a long time and degrade
+performance (rate controls also apply to conversion, see
+\fB\-\-maxrecoveryrate\fP.)
+
+.P
+
+The following takeover conversions are currently possible:
+.br
+.IP \(bu 3
+between linear and raid1.
+.IP \(bu 3
+between striped and raid4.
+
+.SS Examples
+
+1. Converting an LV from \fBlinear\fP to \fBraid1\fP.
+
+.nf
+# lvs -a -o name,segtype,size vg
+  LV   Type   LSize
+  lv   linear 300.00g
+
+# lvconvert --type raid1 --mirrors 1 vg/lv
+
+# lvs -a -o name,segtype,size vg
+  LV            Type   LSize
+  lv            raid1  300.00g
+  [lv_rimage_0] linear 300.00g
+  [lv_rimage_1] linear 300.00g
+  [lv_rmeta_0]  linear   3.00m
+  [lv_rmeta_1]  linear   3.00m
+.fi
+
+2. Converting an LV from \fBmirror\fP to \fBraid1\fP.
+
+.nf
+# lvs -a -o name,segtype,size vg
+  LV            Type   LSize
+  lv            mirror 100.00g
+  [lv_mimage_0] linear 100.00g
+  [lv_mimage_1] linear 100.00g
+  [lv_mlog]     linear   3.00m
+
+# lvconvert --type raid1 vg/lv
+
+# lvs -a -o name,segtype,size vg
+  LV            Type   LSize
+  lv            raid1  100.00g
+  [lv_rimage_0] linear 100.00g
+  [lv_rimage_1] linear 100.00g
+  [lv_rmeta_0]  linear   3.00m
+  [lv_rmeta_1]  linear   3.00m
+.fi
+
+3. Converting an LV from \fBstriped\fP (with 4 stripes) to \fBraid6_nc\fP.
+
+.nf
+Start with a striped LV:
+
+# lvcreate --stripes 4 -L64M -n my_lv vg
+
+Convert the striped LV to raid6_nc:
+
+# lvconvert --type raid6_nc vg/my_lv
+
+# lvs -a -o lv_name,segtype,sync_percent,data_copies
+  LV               Type      Cpy%Sync #Cpy
+  my_lv            raid6_n_6 100.00      3
+  [my_lv_rimage_0] linear
+  [my_lv_rimage_1] linear
+  [my_lv_rimage_2] linear
+  [my_lv_rimage_3] linear
+  [my_lv_rimage_4] linear
+  [my_lv_rimage_5] linear
+  [my_lv_rmeta_0]  linear
+  [my_lv_rmeta_1]  linear
+  [my_lv_rmeta_2]  linear
+  [my_lv_rmeta_3]  linear
+  [my_lv_rmeta_4]  linear
+  [my_lv_rmeta_5]  linear
+.fi
+
+This conversion begins by allocating MetaLVs (rmeta_#) for each of the
+existing stripe devices.  It then creates 2 additional MetaLV/DataLV pairs
+(rmeta_#/rimage_#) for dedicated raid6 parity.
+
+If rotating data/parity is required, such as with raid6_nr, it must be
+done by reshaping (see below).
+
+4. Converting an LV from \fBlinear\fP to \fBraid1\fP (with 3 images).
+
+.nf
+Start with a linear LV:
+
+# lvcreate -L1G -n my_lv vg
+
+Convert the linear LV to raid1 with three images
+(original linear image plus 2 mirror images):
+
+# lvconvert --type raid1 --mirrors 2 vg/my_lv
+.fi
+
+
+.SH RAID Reshaping
+
+RAID reshaping is changing attributes of a RAID LV while keeping the same
+RAID level, in contrast to takeover, which changes the RAID level.
+Reshaping includes changing the RAID layout, stripe size, or number of
+stripes.
+
+When changing the RAID layout or stripe size, no new SubLVs (MetaLVs or
+DataLVs) need to be allocated, but DataLVs are extended by a small amount
+(typically 1 extent).  The extra space allows blocks in a stripe to be
+updated safely, and not corrupted in case of a crash.  If a crash occurs,
+reshaping can just be restarted.
+
+(If blocks in a stripe were updated in place, a crash could leave them
+partially updated and corrupted.  Instead, an existing stripe is quiesced,
+read, changed in layout, and the new stripe written to free space.  Once
+that is done, the new stripe is unquiesced and used.)
+
+.SS Examples
+
+1. Converting raid6_n_6 to raid6_nr with rotating data/parity.
+
+This conversion naturally follows a previous conversion from striped to
+raid6_n_6 (shown above).  It completes the transition to a more
+traditional RAID6.
+
+.nf
+# lvs -o lv_name,segtype,sync_percent,data_copies
+  LV               Type      Cpy%Sync #Cpy
+  my_lv            raid6_n_6 100.00      3
+  [my_lv_rimage_0] linear
+  [my_lv_rimage_1] linear
+  [my_lv_rimage_2] linear
+  [my_lv_rimage_3] linear
+  [my_lv_rimage_4] linear
+  [my_lv_rimage_5] linear
+  [my_lv_rmeta_0]  linear
+  [my_lv_rmeta_1]  linear
+  [my_lv_rmeta_2]  linear
+  [my_lv_rmeta_3]  linear
+  [my_lv_rmeta_4]  linear
+  [my_lv_rmeta_5]  linear
+
+# lvconvert --type raid6_nr vg/my_lv
+
+# lvs -a -o lv_name,segtype,sync_percent,data_copies
+  LV               Type     Cpy%Sync #Cpy
+  my_lv            raid6_nr 100.00      3
+  [my_lv_rimage_0] linear
+  [my_lv_rimage_0] linear
+  [my_lv_rimage_1] linear
+  [my_lv_rimage_1] linear
+  [my_lv_rimage_2] linear
+  [my_lv_rimage_2] linear
+  [my_lv_rimage_3] linear
+  [my_lv_rimage_3] linear
+  [my_lv_rimage_4] linear
+  [my_lv_rimage_5] linear
+  [my_lv_rmeta_0]  linear
+  [my_lv_rmeta_1]  linear
+  [my_lv_rmeta_2]  linear
+  [my_lv_rmeta_3]  linear
+  [my_lv_rmeta_4]  linear
+  [my_lv_rmeta_5]  linear
+.fi
+
+The DataLVs are larger (an additional segment in each), which provides space
+for out-of-place reshaping.  The result is:
+
+FIXME: did the lv name change from my_lv to r?
+.br
+FIXME: should we change device names in the example to sda,sdb,sdc?
+.br
+FIXME: include -o devices or seg_pe_ranges above also?
+
+.nf
+# lvs -a -o lv_name,segtype,seg_pe_ranges,dataoffset
+  LV           Type     PE Ranges          data
+  r            raid6_nr r_rimage_0:0-32 \\
+                        r_rimage_1:0-32 \\
+                        r_rimage_2:0-32 \\
+                        r_rimage_3:0-32
+  [r_rimage_0] linear   /dev/sda:0-31      2048
+  [r_rimage_0] linear   /dev/sda:33-33
+  [r_rimage_1] linear   /dev/sdaa:0-31     2048
+  [r_rimage_1] linear   /dev/sdaa:33-33
+  [r_rimage_2] linear   /dev/sdab:1-33     2048
+  [r_rimage_3] linear   /dev/sdac:1-33     2048
+  [r_rmeta_0]  linear   /dev/sda:32-32
+  [r_rmeta_1]  linear   /dev/sdaa:32-32
+  [r_rmeta_2]  linear   /dev/sdab:0-0
+  [r_rmeta_3]  linear   /dev/sdac:0-0
+.fi
+
+All segments with PE ranges '33-33' provide the out-of-place reshape space.
+The dataoffset column shows that the data was moved from initial offset 0 to
+2048 sectors on each component DataLV.
+
+
+.SH RAID5 Variants
+
+raid5_ls
+.br
+\[bu]
+RAID5 left symmetric
+.br
+\[bu]
+Rotating parity N with data restart
+
+raid5_la
+.br
+\[bu]
+RAID5 left asymmetric
+.br
+\[bu]
+Rotating parity N with data continuation
+
+raid5_rs
+.br
+\[bu]
+RAID5 right symmetric
+.br
+\[bu]
+Rotating parity 0 with data restart
+
+raid5_ra
+.br
+\[bu]
+RAID5 right asymmetric
+.br
+\[bu]
+Rotating parity 0 with data continuation
+
+raid5_n
+.br
+\[bu]
+RAID5 striping
+.br
+\[bu]
+Same layout as raid4, with a dedicated parity device N and striped data.
+.br
+\[bu]
+Used for
+.B RAID Takeover
+
+.SH RAID6 Variants
+
+raid6
+.br
+\[bu]
+RAID6 zero restart (aka left symmetric)
+.br
+\[bu]
+Rotating parity 0 with data restart
+.br
+\[bu]
+Same as raid6_zr
+
+raid6_zr
+.br
+\[bu]
+RAID6 zero restart (aka left symmetric)
+.br
+\[bu]
+Rotating parity 0 with data restart
+
+raid6_nr
+.br
+\[bu]
+RAID6 N restart (aka right symmetric)
+.br
+\[bu]
+Rotating parity N with data restart
+
+raid6_nc
+.br
+\[bu]
+RAID6 N continue
+.br
+\[bu]
+Rotating parity N with data continuation
+
+raid6_n_6
+.br
+\[bu]
+RAID6 N continue
+.br
+\[bu]
+Fixed P-Syndrome N-1 and Q-Syndrome N with striped data
+.br
+\[bu]
+Used for
+.B RAID Takeover
+
+raid6_ls_6
+.br
+\[bu]
+RAID6 N continue
+.br
+\[bu]
+Same as raid5_ls for N-1 disks with fixed Q-Syndrome N
+.br
+\[bu]
+Used for
+.B RAID Takeover
+
+raid6_la_6
+.br
+\[bu]
+RAID6 N continue
+.br
+\[bu]
+Same as raid5_la for N-1 disks with fixed Q-Syndrome N
+.br
+\[bu]
+Used for
+.B RAID Takeover
+
+raid6_rs_6
+.br
+\[bu]
+RAID6 N continue
+.br
+\[bu]
+Same as raid5_rs for N-1 disks with fixed Q-Syndrome N
+.br
+\[bu]
+Used for
+.B RAID Takeover
+
+raid6_ra_6
+.br
+\[bu]
+RAID6 N continue
+.br
+\[bu]
+Same as raid5_ra for N-1 disks with fixed Q-Syndrome N
+.br
+\[bu]
+Used for
+.B RAID Takeover
+
+
+
+.SH RAID Duplication
+
+RAID LV conversion (takeover or reshaping) can be done out\-of\-place by
+copying the LV data onto new devices while changing the RAID properties.
+Copying avoids modifying the original LV but requires additional devices.
+Once the LV data has been copied/converted onto the new devices, there are
+multiple options:
+
+1. The RAID LV can be switched over to run from just the new devices, and
+the original copy of the data removed.  The converted LV then has the new
+RAID properties, and exists on new devices.  The old devices holding the
+original data can be removed or reused.
+
+2. The new copy of the data can be dropped, leaving the original RAID LV
+unchanged and using its original devices.
+
+3. The new copy of the data can be separated and used as a new independent
+LV, leaving the original RAID LV unchanged on its original devices.
+
+The command to start duplication is:
+
+.B lvconvert \-\-type
+.I RaidLevel
+[\fB\-\-stripes\fP \fINumber\fP \fB\-\-stripesize\fP \fISize\fP]
+.RS
+.B \-\-duplicate
+.IR VG / LV
+[\fIPVs\fP]
+.RE
+
+.HP
+.B \-\-duplicate
+.br
+Specifies that the LV conversion should be done out\-of\-place, copying
+LV data to new devices while converting. 
+
+.HP
+.BR \-\-type , \-\-stripes , \-\-stripesize
+.br
+Specifies the RAID properties to use when creating the copy.
+
+.P
+\fIPVs\fP specifies the new devices to use.
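+
+For example, following the syntax above, a command of this form would
+start duplicating an LV onto five new devices as a raid6 copy with three
+stripes (LV and device names are illustrative):
+
+.nf
+lvconvert --type raid6 --stripes 3 --duplicate vg/lv0 \\
+          /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj
+.fi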
+
+The steps in the duplication process:
+
+.IP \(bu 3
+LVM creates a new LV on new devices using the specified RAID properties
+(type, stripes, etc) and optionally specified devices.
+
+.IP \(bu 3
+LVM changes the visible RAID LV to type raid1, making the original LV the
+first raid1 image (SubLV 0), and the new LV the second raid1 image
+(SubLV 1).
+
+.IP \(bu 3
+The RAID1 synchronization process copies data from the original LV
+image (SubLV 0) to the new LV image (SubLV 1).
+
+.IP \(bu 3
+When synchronization is complete, the original and new LVs are
+mirror images of each other and can be separated.
+
+.P
+
+The duplication process retains both the original and new LVs (both
+SubLVs) until an explicit unduplicate command is run to separate them.  The
+unduplicate command specifies if the original LV should use the old
+devices (SubLV 0) or the new devices (SubLV 1).
+
+To make the RAID LV use the data on the old devices, and drop the copy on
+the new devices, specify the name of SubLV 0 (suffix _dup_0):
+
+.B lvconvert \-\-unduplicate
+.BI \-\-name
+.IB LV _dup_0
+.IR VG / LV
+
+To make the RAID LV use the data copy on the new devices, and drop the old
+devices, specify the name of SubLV 1 (suffix _dup_1):
+
+.B lvconvert \-\-unduplicate
+.BI \-\-name
+.IB LV _dup_1
+.IR VG / LV
+
+FIXME: To make the LV use the data on the original devices, but keep the
+data copy as a new LV, ...
+
+FIXME: include how splitmirrors can be used.
+
+
+.SH RAID1E
+
+TODO
+
+.SH History
+
+The 2.6.38-rc1 version of the Linux kernel introduced a device-mapper
+target to interface with the software RAID (MD) personalities.  This
+provided device-mapper with RAID 4/5/6 capabilities and a larger
+development community.  Later, support for RAID1, RAID10, and RAID1E (RAID
+10 variants) was added.  Support for these new kernel RAID targets was
+added to LVM version 2.02.87.  The capabilities of the LVM \fBraid1\fP
+type have surpassed the old \fBmirror\fP type.  raid1 is now recommended
+instead of mirror.  raid1 became the default for mirroring in LVM version
+2.02.100.
+



