[dm-devel] Bug in dm-stripe.c driver

Wood, Brian J brian.j.wood at intel.com
Wed Nov 21 19:15:52 UTC 2007


Hello everyone, 

 

I believe that I've found a bug in the dm-stripe.c driver. I've been
working on getting event alerting put into the driver so that it matches
what is found in the dm-raid1.c driver. During my implementation efforts
I ran into a problem where the kernel would oops out when I tried to
call the queue_work() function. After putting some printk's in the
driver to monitor the pointer addresses of the stripe_c structure
members I finally think I've got an answer. I found that in its
declaration the struct stripe stripe[0] is only accounting for a stripe
with one drive present; in my case I have 2 drives in the stripe volume.
Here is the stripe struct (and as you can see it only allows for 1
drive):

 

struct stripe_c {

       uint32_t stripes;

 

       /* The size of this target / num. stripes */

       sector_t stripe_width;

 

       /* stripe chunk size */

       uint32_t chunk_shift;

       sector_t chunk_mask;

 

/***** Here's the problem spot *****/ 

       struct stripe stripe[0];

       

/***** New member elements below problem declaration *****/

       /* Needed for handling events */

       struct dm_target *ti;

       

       /* New work_queue setup for triggering events*/

       struct work_struct kstriped_ws;

 

};

 

So, by adding the new declarations of member elements after that
stripe[0] declaration there is the possibility of overwriting memory
addresses when the processing code for the get_stripe() function is ran.
Here's a snippet of the printk code I used to figure this out and its
resulting syslog data:

 

/*

 * Construct a striped mapping.

 * <number of stripes> <chunk size (2^^n)> [<dev_path> <offset>]+

 */

static int stripe_ctr(struct dm_target *ti, unsigned int argc, char
**argv)

{

.

.

.

       /*

        * Get the stripe destinations.

        */

       for (i = 0; i < stripes; i++) {

              argv += 2;

printk("EVENT: 7:%d Address of sc->kstriped_ws.entry=%p, next=%p,
prev=%p\n", i, &sc->kstriped_ws.entry, sc->kstriped_ws.entry.next,
sc->kstriped_ws.entry.prev);

              r = get_stripe(ti, sc, i, argv);

printk("EVENT: 8:%d Address of sc->kstriped_ws.entry=%p, next=%p,
prev=%p\n", i, &sc->kstriped_ws.entry, sc->kstriped_ws.entry.next,
sc->kstriped_ws.entry.prev);

              if (r < 0) {

                     ti->error = "Couldn't parse stripe destination";

                     while (i--)

                             dm_put_device(ti, sc->stripe[i].dev);

                     kfree(sc);

                     return r;

              }

       }

.

.

.

}

       

/*

 * Parse a single <dev> <sector> pair

 */

static int get_stripe(struct dm_target *ti, struct stripe_c *sc,

                    unsigned int stripe, char **argv)

{

       unsigned long long start;

 

       if (sscanf(argv[1], "%llu", &start) != 1)

              return -EINVAL;

 

printk("EVENT: 7:%d%d.1 Address of sc->kstriped_ws.entry=%p, next=%p,
prev=%p\n", stripe, stripe, &sc->kstriped_ws.entry,
sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev);

printk("           argv[0]=%s, stripe=%d, sc->stripe_width=%u\n",
argv[0], stripe, (uint32_t)sc->stripe_width);      

       if (dm_get_device(ti, argv[0], start, sc->stripe_width,

                       dm_table_get_mode(ti->table),

                       &sc->stripe[stripe].dev))

              return -ENXIO;

printk("EVENT: 7:%d%d.2 Address of sc->kstriped_ws.entry=%p, next=%p,
prev=%p\n",stripe, stripe, &sc->kstriped_ws.entry,
sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev);

 

       sc->stripe[stripe].physical_start = start;

printk("EVENT: 7:%d%d.3 Address of sc->kstriped_ws.entry=%p, next=%p,
prev=%p\n",stripe, stripe, &sc->kstriped_ws.entry,
sc->kstriped_ws.entry.next, sc->kstriped_ws.entry.prev);

 

       return 0;

}

 

 

 

Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:0 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0

Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:00.1 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0

Nov 21 01:55:19 dmraid-devhost kernel:            argv[0]=/dev/sdb,
stripe=0, sc->stripe_width=488391936

Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:00.2 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0

Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:00.3 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0

Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 8:0 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0

Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:1 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0

Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:11.1 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81001d8ca5b0,
prev=ffff81001d8ca5b0

Nov 21 01:55:19 dmraid-devhost kernel:            argv[0]=/dev/sdc,
stripe=1, sc->stripe_width=488391936

Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:11.2 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81003c47dec0,
prev=ffff81001d8ca5b0

Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 7:11.3 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81003c47dec0,
prev=0000000000000000

Nov 21 01:55:19 dmraid-devhost kernel: EVENT: 8:1 Address of
sc->kstriped_ws.entry=ffff81001d8ca5b0, next=ffff81003c47dec0,
prev=0000000000000000

 

As you can see from the output when it hits the second drive of the
stripe it is overwriting the memory addresses for the work_struct
list_head "prev" pointer; leading to my oops condition.

 

To verify this I decided to up the number in stripe[] to match the
number of drives present:

struct stripe stripe[1];

 

After this change the memory corruption is gone:

 

Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:0 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040

Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:00.1 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040

Nov 21 02:21:45 dmraid-devhost kernel:            argv[0]=/dev/sdb,
stripe=0, sc->stripe_width=488391936

Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:00.2 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040

Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:00.3 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040

Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 8:0 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040

Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:1 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040

Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:11.1 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040

Nov 21 02:21:45 dmraid-devhost kernel:            argv[0]=/dev/sdc,
stripe=1, sc->stripe_width=488391936

Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:11.2 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040

Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 7:11.3 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040

Nov 21 02:21:45 dmraid-devhost kernel: EVENT: 8:1 Address of
sc->kstriped_ws.entry=ffff81001f4d8040, next=ffff81001f4d8040,
prev=ffff81001f4d8040

 

 

My question now is how should this be patched? Do I just use some
arbitrary number in the stripe[] declaration? Say something like 6
(since that is the maximum number of on-board SATA ports for an Intel
based controller at the moment). (Probably not, since this is making it
platform specific.) How about maybe changing this to an
array-of-pointers to a "struct stripe" and setting this to something
like "struct stripe *stripe[256]"? This would only be wasting memory for
the unused pointers which is comparatively small. 

 

I look forward to hearing everyone's ideas on how this should be solved,
thank you.

 

 

Brian Wood

Software Engineer

Intel Corp., Manageability & Platform Software Division

brian.j.wood at intel.com

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/dm-devel/attachments/20071121/f6e4847b/attachment.htm>


More information about the dm-devel mailing list