[lvm-devel] [RFC PATCH v2] change default alignment of pe_start to 1MB
Mike Snitzer
snitzer at redhat.com
Fri Aug 6 04:11:39 UTC 2010
The switch to a 1MB default alignment causes various tests in the LVM2
testsuite to fail -- not a big deal but the tests would need updating.
Of more concern is that the existing LVM2 set_pe_align() code doesn't
always properly respect the alignment determined from
'devices/md_chunk_alignment' or 'devices/data_alignment_detection'.
With the previous default alignment of 64k it would generally do the
right thing -- use the detected values. But switching the default to
the larger value exposes the fact that MAX() of the MD or I/O Topology
detected values will generally always be 1MB -- when they are compared
to 1MB.
The following revised patch changes the LVM alignment detection
semantics to model what fdisk has elected to do:
- If the default value (1MB) is a multiple of the specified/detected
alignment then just use the default.
- Otherwise, use the specified/detected value.
In practice this means we'll almost always use 1MB -- that is unless:
- the specified --dataalignment, MD's full stripe width, or the
optimal_io_size exceeds 1MB
- the specified/detected value is not a power-of-2
NOTE: even with a default of 64k the old set_pe_align code would result
in incorrect alignment if a value < 64k were used for --dataalignment
---
doc/example.conf.in | 2 +-
lib/metadata/metadata.c | 30 ++++++++++++++++++------------
2 files changed, 19 insertions(+), 13 deletions(-)
diff --git a/doc/example.conf.in b/doc/example.conf.in
index 850b7e2..f7dcc63 100644
--- a/doc/example.conf.in
+++ b/doc/example.conf.in
@@ -113,7 +113,7 @@ devices {
# Alignment (in KB) of start of data area when creating a new PV.
# If a PV is placed directly upon an md device and md_chunk_alignment or
# data_alignment_detection is enabled this parameter is ignored.
- # Set to 0 for the default alignment of 64KB or page size, if larger.
+ # Set to 0 for the default alignment of 1MB or page size, if larger.
data_alignment = 0
# By default, the start of the PV's aligned data area will be shifted by
diff --git a/lib/metadata/metadata.c b/lib/metadata/metadata.c
index d7edf54..469473a 100644
--- a/lib/metadata/metadata.c
+++ b/lib/metadata/metadata.c
@@ -64,13 +64,16 @@ const char _really_init[] =
unsigned long set_pe_align(struct physical_volume *pv, unsigned long data_alignment)
{
+ unsigned long temp_pe_align, default_pe_align = 2048;
+
if (pv->pe_align)
goto out;
if (data_alignment)
pv->pe_align = data_alignment;
else
- pv->pe_align = MAX(65536UL, lvm_getpagesize()) >> SECTOR_SHIFT;
+ pv->pe_align = MAX((default_pe_align << SECTOR_SHIFT),
+ lvm_getpagesize()) >> SECTOR_SHIFT;
if (!pv->dev)
goto out;
@@ -79,10 +82,11 @@ unsigned long set_pe_align(struct physical_volume *pv, unsigned long data_alignm
* Align to stripe-width of underlying md device if present
*/
if (find_config_tree_bool(pv->fmt->cmd, "devices/md_chunk_alignment",
- DEFAULT_MD_CHUNK_ALIGNMENT))
- pv->pe_align = MAX(pv->pe_align,
- dev_md_stripe_width(pv->fmt->cmd->sysfs_dir,
- pv->dev));
+ DEFAULT_MD_CHUNK_ALIGNMENT)) {
+ temp_pe_align = dev_md_stripe_width(pv->fmt->cmd->sysfs_dir, pv->dev);
+ if (temp_pe_align && (default_pe_align % temp_pe_align))
+ pv->pe_align = temp_pe_align;
+ }
/*
* Align to topology's minimum_io_size or optimal_io_size if present
@@ -94,13 +98,15 @@ unsigned long set_pe_align(struct physical_volume *pv, unsigned long data_alignm
if (find_config_tree_bool(pv->fmt->cmd,
"devices/data_alignment_detection",
DEFAULT_DATA_ALIGNMENT_DETECTION)) {
- pv->pe_align = MAX(pv->pe_align,
- dev_minimum_io_size(pv->fmt->cmd->sysfs_dir,
- pv->dev));
-
- pv->pe_align = MAX(pv->pe_align,
- dev_optimal_io_size(pv->fmt->cmd->sysfs_dir,
- pv->dev));
+ temp_pe_align = dev_minimum_io_size(pv->fmt->cmd->sysfs_dir,
+ pv->dev);
+ if (temp_pe_align && (default_pe_align % temp_pe_align))
+ pv->pe_align = temp_pe_align;
+
+ temp_pe_align = dev_optimal_io_size(pv->fmt->cmd->sysfs_dir,
+ pv->dev);
+ if (temp_pe_align && (default_pe_align % temp_pe_align))
+ pv->pe_align = temp_pe_align;
}
log_very_verbose("%s: Setting PE alignment to %lu sectors.",
More information about the lvm-devel
mailing list