[dm-devel] Bug in dm-stripe.c driver
Wood, Brian J
brian.j.wood at intel.com
Wed Nov 21 19:15:52 UTC 2007
Hello everyone,
I believe that I've found a bug in the dm-stripe.c driver. I've been
working on getting event alerting put into the driver so that it matches
what is found in the dm-raid1.c driver. During my implementation efforts
I ran into a problem where the kernel would oops out when I tried to
call the queue_work() function. After putting some printk's in the
driver to monitor the pointer addresses of the stripe_c structure
members I finally think I've got an answer. I found that in its
declaration the struct stripe stripe[0] is only accounting for a stripe
with one drive present; in my case I have 2 drives in the stripe volume.
Here is the stripe struct (and as you can see it only allows for 1
drive):
struct stripe_c {
uint32_t stripes;
/* The size of this target / num. stripes */
sector_t stripe_width;
/* stripe chunk size */
uint32_t chunk_shift;
sector_t chunk_mask;
/***** Here's the problem spot *****/
struct stripe stripe[0];
/***** New member elements below problem declaration *****/
/* Needed for handling events */
struct dm_target *ti;
/* New work_queue setup for triggering events*/
struct work_struct kstriped_ws;
};
So, by adding the new declarations of member elements after that
stripe[0] declaration there is the possibility of overwriting memory
addresses when the processing code for the get_stripe() function is ran.
Here's a snippet of the printk code I used to figure this out and its
resulting syslog data:
/*
* Construct a striped mapping.
* <number of stripes> <chunk size (2^^n)> [<dev_path> <offset>]+
*/
static int stripe_ctr(struct dm_target *ti, unsigned int argc, char
**argv)
{
.
.
.
/*
* Get the stripe destinations.
*/
for (i = 0; i < stripes; i++) {
argv += 2;
printk("EVENT: 7:%d Address of sc->kstriped_ws.entry=%p, next=%p,
prev=%p\n", i, &sc->kstriped_ws.entry, sc->kstriped_ws.entry.next,
sc->kstriped_ws.entry.prev);
r = get_stripe(ti, sc, i, argv);
printk("EVENT: 8:%d Address of sc->kstriped_ws.entry=%p, next=%p,
prev=%p\n", i, &sc->kstriped_ws.entry, sc->kstriped_ws.entry.next,
sc->kstriped_ws.entry.prev);
if (r < 0) {
ti->error = "Couldn't parse stripe destination";
while (i--)
dm_put_device(ti, sc->stripe[i].dev);
kfree(sc);
return r;
}
}
.
.
.
}
/*
* Parse a single <dev> <sector> pair
*/
static int get_stripe(struct dm_target *ti, struct stripe_c *sc,
unsigned int stripe, char **argv)
{
unsigned long long start;
if (sscanf(argv[1], "%llu", &start) != 1)
return -EINVAL;
printk("EVENT: 7:%d%d.1 Address of sc->kstriped_ws.entry=%p, next=%p,
prev=%p\n", stripe, stripe, &sc->kstriped_ws.entry,
sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev);
printk(" argv[0]=%s, stripe=%d, sc->stripe_width=%u\n",
argv[0], stripe, (uint32_t)sc->stripe_width);
if (dm_get_device(ti, argv[0], start, sc->stripe_width,
dm_table_get_mode(ti->table),
&sc->stripe[stripe].dev))
return -ENXIO;
printk("EVENT: 7:%d%d.2 Address of sc->kstriped_ws.entry=%p, next=%p,
prev=%p\n",stripe, stripe, &sc->kstriped_ws.entry,
sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev);
sc->stripe[stripe].physical_start = start;
printk("EVENT: 7:%d%d.3 Address of sc->kstriped_ws.entry=%p, next=%p,
prev=%p\n",stripe, stripe, &sc->kstriped_ws.entry,
sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev);
return 0;
}
Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:0 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0
Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:00.1 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0
Nov 21 01:55:19 dmraid-devhost kernel: argv[0]=/dev/sdb,
stripe=0, sc->stripe_width=488391936
Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:00.2 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0
Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:00.3 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0
Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 8:0 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0
Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:1 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0
Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:11.1 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0
Nov 21 01:55:19 dmraid-devhost kernel: argv[0]=/dev/sdc,
stripe=1, sc->stripe_width=488391936
Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:11.2 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81003c47dec0,
prev=ffff81001d8ca5b0
Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:11.3 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81003c47dec0,
prev=0000000000000000
Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 8:1 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81003c47dec0,
prev=0000000000000000
As you can see from the output when it hits the second drive of the
stripe it is overwriting the memory addresses for the work_struct
list_head "prev" pointer; leading to my oops condition.
To verify this I decided to up the number in stripe[] to match the
number of drives present:
struct stripe stripe[1];
After this change the memory corruption is gone:
Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:0 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040
Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:00.1 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040
Nov 21 02:21:45 dmraid-devhost kernel: argv[0]=/dev/sdb,
stripe=0, sc->stripe_width=488391936
Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:00.2 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040
Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:00.3 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040
Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 8:0 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040
Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:1 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040
Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:11.1 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040
Nov 21 02:21:45 dmraid-devhost kernel: argv[0]=/dev/sdc,
stripe=1, sc->stripe_width=488391936
Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:11.2 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040
Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:11.3 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040
Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 8:1 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040
My question now is how should this be patched? Do I just use some
arbitrary number in the stripe[] declaration? Say something like 6
(since that is the maximum number of on-board SATA ports for an Intel
based controller at the moment). (Probably not, since this is making it
platform specific.) How about maybe changing this to an
array-of-pointers to a "struct stripe" and setting this to something
like "struct stripe *stripe[256]"? This would only be wasting memory for
the unused pointers which is comparatively small.
I look forward to hearing everyone's ideas on how this should be solved,
thank you.
Brian Wood
Software Engineer
Intel Corp., Manageability & Platform Software Division
brian.j.wood at intel.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/dm-devel/attachments/20071121/f6e4847b/attachment.htm>
More information about the dm-devel
mailing list