[linux-lvm] [RFC] dmraid design 1.0.3
mauelshagen at redhat.com
Fri May 28 15:19:53 UTC 2004
Attached is an RFC on the design of my dmraid tool/lib, which provides
read-only support (discover, activate, deactivate, display properties, ...)
for various RAID devices (eg, ATARAID) in Linux 2.6 using the generic
device-mapper.
Read-write support for such devices is subject to future extensions.
FYI: Implementation takes advantage of Søren Schmidt's work in FreeBSD
and Carl-Daniel Hailfinger's on raiddetect; thanks guys :)
Any helpful comments appreciated (please cc me, I'm not subscribed).
Code to comment on will follow ASAP.
Heinz -- The LVM Guy --
Heinz Mauelshagen Red Hat GmbH
Consulting Development Engineer Am Sonnenhang 11
Mauelshagen at RedHat.com +49 2626 141200
-------------- next part --------------
dmraid tool design document v1.0.3 Heinz Mauelshagen 2004.05.26
The dmraid tool supports RAID devices (RD) such as ATARAID with
device-mapper (dm) in Linux 2.6 avoiding the need to install a
vendor specific (binary) driver to access them.
It supports multiple on-disk RAID metadata formats and is open for
extension with new ones.
First drop aims to support RDs read-only and doesn't support
*updates* of the ondisk metadata (eg, to record disk failures).
See future enhancements at the end.
1. dmraid must be able to read multiple vendor specific ondisk
RAID metadata formats:
- Highpoint 37x/45x
- LSI Logic MegaRaid
- Silicon Image
- Promise FastTrak
2. dmraid shall be open to future extensions by other ondisk RAID formats:
o Intel ICHraid (ATARAID solution on mainboard)
o SNIA DDF
3. dmraid shall generate the necessary dm table(s) defining
the needed mappings to address the particular data.
4. Device discovery, activation, deactivation and property display
shall be supported.
5. Spanning of disks, RAID0, RAID1 and RAID10 shall be supported
(in order to be able to support SNIA DDF, higher RAID levels need
implementing in the form of respective dm targets; eg, RAID5).
Some vendors already support RAID5, which is outside the scope
of dmraid because of the lack of a RAID5 target in device-mapper!
Feature set definition:
Feature set summarizes as: Discover, Activate, Deactivate, Display.
o Discover (1-n RD)
1 scan active disk devices identifying RD
2 try to find an RD signature and if recognized add the device to the list
of RDs found
o Activate (1-n RD)
This shall be achieved by abstracting the internal metadata describing
the RAID layout and translating the vendor specific representation
into such abstracted form.
1 group devices into sets conforming to their respective layout
(SPAN, RAID0, RAID1, RAID10).
2 generate dm mapping tables for the set(s).
3 create one or more dm devices for each set to activate and
load the generated table(s) into them.
o Deactivate (1-n RD)
1 remove the dm device(s) making up an RD; can be a hierarchy of devices
(eg, RAID10: RAID1 on top of n RAID0 devices).
o Display (1-n RD)
1 display RAID properties of the device
(eg, display information kept with RAID sets such as size and type)
o RAID metadata format handler
Tool calls the following function to register a vendor specific
format handler; in case of success, a new instance with methods is
accessible to the high level metadata handling functions (see below):
- int register_format(struct dmraid_format *dmraid_format);
x returns !0 on successful format handler registration
x returns 0 on failure.
- Format handler methods:
x struct dmraid_dev *(*read)(struct disk_info *disk_info);
- returns 'struct dmraid_dev *' describing the RD (eg, offset, length)
- returns NULL on error
x struct dmraid_set *(*add)(struct dmraid_dev *dmraid_dev);
- returns pointer to RAID set structure on success
- returns NULL on error
x int (*check)(struct dmraid_set *dmraid_set)
- returns !0 in case the RAID set is consistent
- returns 0 on inconsistency
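The registration interface above could be sketched in C roughly as follows.
This is only an illustration under assumptions: the 'name' field, the fixed
size registry (MAX_FORMATS) and the stub handlers are hypothetical and not
part of the design; only register_format() and the read/add/check method
signatures come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Named in the design, not defined there; left opaque in this sketch. */
struct disk_info;
struct dmraid_dev;
struct dmraid_set;

/* Assumed layout: the three handler methods plus a hypothetical name. */
struct dmraid_format {
    const char *name;
    struct dmraid_dev *(*read)(struct disk_info *disk_info);
    struct dmraid_set *(*add)(struct dmraid_dev *dmraid_dev);
    int (*check)(struct dmraid_set *dmraid_set);
};

#define MAX_FORMATS 16 /* assumption: small fixed-size registry */

static struct dmraid_format *formats[MAX_FORMATS];
static size_t nr_formats;

/* Returns !0 on successful registration, 0 on failure. */
int register_format(struct dmraid_format *dmraid_format)
{
    assert(nr_formats <= MAX_FORMATS);

    /* Reject missing or incomplete handlers. */
    if (!dmraid_format || !dmraid_format->read ||
        !dmraid_format->add || !dmraid_format->check)
        return 0;
    if (nr_formats == MAX_FORMATS)
        return 0; /* registry full */

    formats[nr_formats++] = dmraid_format;
    return 1;
}

/* Placeholder methods that recognize nothing; real ones parse metadata. */
static struct dmraid_dev *stub_read(struct disk_info *di) { (void)di; return NULL; }
static struct dmraid_set *stub_add(struct dmraid_dev *dev) { (void)dev; return NULL; }
static int stub_check(struct dmraid_set *set) { (void)set; return 0; }

/* A complete handler and one with a missing method. */
struct dmraid_format stub_format = { "stub", stub_read, stub_add, stub_check };
struct dmraid_format broken_format = { "broken", stub_read, NULL, stub_check };
```

With these definitions, register_format(&stub_format) succeeds, while the
handler lacking an add method and a NULL pointer are both rejected.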
o Discover
1 retrieve block device information from sysfs for all disk
devices by scanning /SYSFS_MOUNTPOINT/block/[sh]d*;
keep information about the device path, size and the disk geometry which
is the base to find the RAID signature on the device in a linked list
of type 'struct disk_info *'.
(FIXME: bogus Linux 2.6 disk geometry reported)
2 walk the list and try to read RAID signature off the device trying vendor
specific read methods (eg, Highpoint...) in turn; library exposes interface
to register format handlers for vendor specific RAID formats in order
to be open for future extensions (see register_format() above).
Tool calls the following high level function which hides
the iteration through all the registered format handler methods:
x struct dmraid_dev *dmraid_read(struct disk_info *disk_info);
- returns 'struct dmraid_dev *' in case of a RAID device hit;
'struct dmraid_dev *' contains information such as the data area start
and length, the name of the RAID device and its status
(operational etc.), the sequence # of the device in the set and
the layout (eg, SPAN, RAID0, ...) with layout specifics
(eg, stride size in case of RAID); shall be linkable
to an ordered list which makes up the RAID set
- returns NULL if no RAID disk device discovered
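As a rough illustration of how dmraid_read() might hide the iteration over
the registered read methods, here is a minimal C sketch. The struct fields,
the 'readers' table and both stand-in handlers (read_never, read_fake) are
assumptions made for the example; a real handler would read a vendor
signature off the disk instead of comparing path names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct disk_info {
    const char *path;           /* eg, /dev/sda */
    unsigned long long sectors; /* device size in sectors */
};

struct dmraid_dev {
    const char *format;         /* handler that recognized the disk */
    unsigned long long offset;  /* data area start, in sectors */
    unsigned long long length;  /* data area length, in sectors */
};

typedef struct dmraid_dev *(*read_method)(struct disk_info *disk_info);

/* Stand-in handler that never finds a signature. */
static struct dmraid_dev *read_never(struct disk_info *di)
{
    (void)di;
    return NULL;
}

/* Stand-in handler: pretends only /dev/sda carries a (fictitious) signature. */
static struct dmraid_dev *read_fake(struct disk_info *di)
{
    static struct dmraid_dev dev;

    if (strcmp(di->path, "/dev/sda") || di->sectors < 2)
        return NULL;
    dev.format = "fake";
    dev.offset = 0;
    dev.length = di->sectors - 1; /* assume metadata in the last sector */
    return &dev;
}

static const read_method readers[] = { read_never, read_fake };

/* Try each read method in turn; the first hit wins. */
struct dmraid_dev *dmraid_read(struct disk_info *disk_info)
{
    size_t i;

    if (!disk_info || !disk_info->path)
        return NULL;

    for (i = 0; i < sizeof(readers) / sizeof(readers[0]); i++) {
        struct dmraid_dev *dev = readers[i](disk_info);

        if (dev) {
            /* The data area must fit on the disk. */
            assert(dev->offset + dev->length <= disk_info->sectors);
            return dev;
        }
    }
    return NULL; /* no RAID disk device discovered */
}
```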
o Activate 1
x struct dmraid_set *dmraid_add(struct dmraid_dev* dmraid);
- returns pointer to the RAID set structure on success;
RAID device got added to an existing set or a new set got
created on the fly
- returns NULL on error
x struct dmraid_set *get_set(void);
- get a RAID set off the list of created sets using an iterator;
set is defined as an ordered linked list of the devices making
up the set; in case of RAID10 a 2 level set hierarchy is used.
- returns NULL in case list is empty
x void rewind_set(void);
- rewind the list iterator;
next call to get_set() will return the first set on the list
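The set handling above (dmraid_add(), get_set(), rewind_set()) might look
like the following C sketch. It assumes, purely for illustration, that each
device carries the name of its set and its sequence number; the field names
are hypothetical, and the 2 level hierarchy needed for RAID10 is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct dmraid_dev {
    const char *set_name;    /* assumption: set name taken from the metadata */
    unsigned int seq;        /* sequence # of the device in the set */
    struct dmraid_dev *next; /* link in the ordered member list */
};

struct dmraid_set {
    const char *name;
    struct dmraid_dev *devs; /* members, ordered by seq */
    struct dmraid_set *next;
};

static struct dmraid_set *sets;   /* all sets created so far */
static struct dmraid_set *cursor; /* next set get_set() will return */
static int at_start = 1;          /* iterator not yet positioned */

/* Add a device to an existing set or create a new set on the fly. */
struct dmraid_set *dmraid_add(struct dmraid_dev *dev)
{
    struct dmraid_set *set, **tail = &sets;
    struct dmraid_dev **pos;

    if (!dev || !dev->set_name)
        return NULL;

    /* Look for a set with the same name, remembering the list tail. */
    for (set = sets; set; set = set->next) {
        if (!strcmp(set->name, dev->set_name))
            break;
        tail = &set->next;
    }
    if (!set) {
        set = calloc(1, sizeof(*set));
        if (!set)
            return NULL;
        set->name = dev->set_name;
        *tail = set;
    }

    /* Insert the device keeping members ordered by sequence number. */
    for (pos = &set->devs; *pos && (*pos)->seq < dev->seq; pos = &(*pos)->next)
        ;
    dev->next = *pos;
    *pos = dev;
    assert(set->devs != NULL);
    return set;
}

/* Return the next set on the list, or NULL when none is left. */
struct dmraid_set *get_set(void)
{
    struct dmraid_set *set;

    if (at_start) {
        cursor = sets;
        at_start = 0;
    }
    set = cursor;
    if (set)
        cursor = set->next;
    return set;
}

/* Next call to get_set() returns the first set on the list again. */
void rewind_set(void)
{
    at_start = 1;
}
```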
o Activate 2+3
- don't activate non-RAID1 devices which have an invalid set check result
- create the ASCII dm mapping table by iterating through the list
of RD in a particular set, retrieving the layout (SPAN, ...)
the device path, the offset into the device and the length to map
and the stripe size in case of RAID
- create a unique device_name
- call device-mapper library to create the mapped device and load
the mapping table
x int activate_set(struct dmraid_set *dmraid_set);
- returns 1 in case of successful RAID set activation
- returns 0 on error
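To make the table generation step more concrete, here is a hedged C sketch
that builds the ASCII mapping table for a SPAN set (one device-mapper
'linear' target per member) and a RAID0 set (a single 'striped' target).
The types and the build_table() helper are hypothetical; the sketch assumes
equally sized RAID0 members, does no validation beyond basic sanity checks,
and leaves out RAID1/RAID10 as well as the actual device-mapper library
calls.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

enum layout { SPAN, RAID0 };

struct member {
    const char *path;           /* block device path */
    unsigned long long offset;  /* data area start, in sectors */
    unsigned long long length;  /* data area length, in sectors */
};

struct raid_set {
    enum layout layout;
    unsigned int stripe_size;   /* in sectors; used for RAID0 only */
    const struct member *members;
    size_t count;
};

/* Append formatted text to buf; returns 0 if it does not fit. */
static int append(char *buf, size_t size, size_t *used, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(*used < size);
    va_start(ap, fmt);
    n = vsnprintf(buf + *used, size - *used, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= size - *used)
        return 0;
    *used += (size_t)n;
    return 1;
}

/* Returns 1 and fills buf with the table on success, 0 on error. */
int build_table(const struct raid_set *set, char *buf, size_t size)
{
    unsigned long long start = 0;
    size_t used = 0, i;

    if (!set || !set->members || !set->count || !buf || !size)
        return 0;
    buf[0] = '\0';

    if (set->layout == RAID0) {
        /* One striped target: start length striped #stripes chunk dev off ... */
        if (!set->stripe_size)
            return 0;
        if (!append(buf, size, &used, "0 %llu striped %zu %u",
                    set->members[0].length * set->count,
                    set->count, set->stripe_size))
            return 0;
        for (i = 0; i < set->count; i++)
            if (!append(buf, size, &used, " %s %llu",
                        set->members[i].path, set->members[i].offset))
                return 0;
        return append(buf, size, &used, "\n");
    }

    /* SPAN: one linear target per member, concatenated back to back. */
    for (i = 0; i < set->count; i++) {
        if (!append(buf, size, &used, "%llu %llu linear %s %llu\n",
                    start, set->members[i].length,
                    set->members[i].path, set->members[i].offset))
            return 0;
        start += set->members[i].length;
    }
    return 1;
}
```

The resulting text has the same shape as a table one would hand to
'dmsetup create <name>' when testing a mapping manually.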
o Deactivate
- check if the RAID set is active and call device-mapper library to
remove the mapped device (recursively in case of a mapped-device hierarchy)
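The recursive removal could be sketched as below. The 'struct mapped_dev'
type and remove_mapped_device() are hypothetical stand-ins (the latter only
logs names instead of calling the device-mapper library). The sketch removes
the top-level device before its children, on the assumption that a mapped
device keeps the devices underneath it open while it exists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 8 /* assumption: small fixed fan-out */

struct mapped_dev {
    const char *name;
    int active;
    struct mapped_dev *children[MAX_CHILDREN];
    size_t nr_children;
};

/* Records the removal order so the sketch can be checked. */
char removal_log[128];

/* Stand-in for the real removal call; returns 1 on success, 0 on error. */
static int remove_mapped_device(const char *name)
{
    if (strlen(removal_log) + strlen(name) + 2 > sizeof(removal_log))
        return 0;
    strcat(removal_log, name);
    strcat(removal_log, " ");
    return 1;
}

/* Returns 1 if the device and everything below it got deactivated. */
int deactivate(struct mapped_dev *dev)
{
    size_t i;
    int ok = 1;

    if (!dev)
        return 0;
    assert(dev->nr_children <= MAX_CHILDREN);

    /* Remove the top-level mapping first so it releases its children. */
    if (dev->active) {
        if (!remove_mapped_device(dev->name))
            return 0;
        dev->active = 0;
    }

    /* Then walk down the hierarchy (eg, the RAID0 sets below a RAID1). */
    for (i = 0; i < dev->nr_children; i++)
        if (!deactivate(dev->children[i]))
            ok = 0;
    return ok;
}
```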
o Display
- list all block devices found
- list all (in)active RD
- display properties of a particular RD or of all RDs
(eg, members of the set by block device name and offset/length mapped)
Code directory tree:
| |-/format ---/ataraid
Future enhancements:
o write support to update ondisk metadata
- to initialize RAID disks
- to record disk failures
o support to log state (eg, sector failures) in ondisk logs
o status daemon to keep track of RAID set sanity
(eg, disk failure, hot spare rebuild, ...) and
frontend with CLI
o do we need to support partitions on RAID sets ?
o do we need to prioritize device-mapper targets for higher RAID levels
(in particular we'd need RAID5 to support some ATARAID formats) ?