[dm-devel] raid 5 recovery kernel warnings

Fri Jul 6 18:35:40 UTC 2018

We hit a RAID 5 issue during failure testing that caused a flood of kernel
warnings and minor but problematic data corruption.

Setup: RAID 5 with 7 drives + 1 hot spare
OS: RHEL 7.5
Kernel: linux-3.10.0-862.3.3.el7

Scenario: We pulled a single data drive, and the array automatically started
recovering onto the hot spare. The system was immediately flooded with the
following kernel messages, which made it nearly unusable. Setting
kernel.printk="2 4 1 7" did not stop the messages. Once the recovery
completed, the machine became usable again, and we then noticed some minor
data corruption.
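
For context, a minimal sketch of what the kernel.printk value above encodes.
It assumes the field order given in the kernel sysctl documentation
(console_loglevel, default_message_loglevel, minimum_console_loglevel,
default_console_loglevel) and that WARN_ON output is logged at KERN_WARNING
(level 4). The helper names parse_printk and reaches_console are hypothetical
and exist only for this illustration.

```python
# Illustrative only: decode a kernel.printk value and check console filtering.
# Field names follow the kernel sysctl documentation; helper names are made up.
from typing import NamedTuple

KERN_WARNING = 4  # assumed severity of WARN_ON()-style messages


class Printk(NamedTuple):
    console_loglevel: int
    default_message_loglevel: int
    minimum_console_loglevel: int
    default_console_loglevel: int


def parse_printk(value: str) -> Printk:
    """Parse a string such as "2 4 1 7" into its four named fields."""
    fields = [int(tok) for tok in value.split()]
    if len(fields) != 4:
        raise ValueError(f"expected 4 integers, got {len(fields)}")
    return Printk(*fields)


def reaches_console(level: int, cfg: Printk) -> bool:
    """A message is shown on the console only if level < console_loglevel."""
    return level < cfg.console_loglevel


cfg = parse_printk("2 4 1 7")
print(cfg)
print("WARNING reaches console:", reaches_console(KERN_WARNING, cfg))
```

Under these assumptions a console_loglevel of 2 only filters what is printed
on the console. The records are still written to the kernel ring buffer and
read from /dev/kmsg by systemd-journald, which would be consistent with the
"buffer overrun" lines in the log and may be why the setting appeared to have
no effect.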

We have not been able to reproduce the issue since.

[258091.028244] Workqueue: raid5wq raid5_do_work [raid456]
[258091.028245] Call Trace:
[258091.028248]  [<ffffffff9fb0e78e>] dump_stack+0x19/0x1b
[258091.028250]  [<ffffffff9f491998>] __warn+0xd8/0x100
[258091.028253]  [<ffffffff9f491add>] warn_slowpath_null+0x1d/0x20
[258091.028257]  [<ffffffffc0834677>] handle_stripe+0x2367/0x23f0 [raid456]
[258091.028258] systemd-journald[166084]: /dev/kmsg buffer overrun, some messages lost.
[258091.028262]  [<ffffffff9f72bad1>] ? blk_mq_sched_dispatch_requests+0x181/0x1c0
[258091.028266]  [<ffffffffc0834aad>] handle_active_stripes.isra.55+0x3ad/0x530 [raid456]
[258091.028271]  [<ffffffffc08354bf>] raid5_do_work+0x9f/0x150 [raid456]
[258091.028271] systemd-journald[166084]: /dev/kmsg buffer overrun, some messages lost.
[258091.028274]  [<ffffffff9f4b312f>] process_one_work+0x17f/0x440
[258091.028276]  [<ffffffff9f4b3df6>] worker_thread+0x126/0x3c0
[258091.028279]  [<ffffffff9f4b3cd0>] ? manage_workers.isra.24+0x2a0/0x2a0
[258091.028280]  [<ffffffff9f4bb161>] kthread+0xd1/0xe0
[258091.028281] systemd-journald[166084]: /dev/kmsg buffer overrun, some messages lost.
[258091.028284]  [<ffffffff9f4bb090>] ? insert_kthread_work+0x40/0x40
[258091.028287]  [<ffffffff9fb2065d>] ret_from_fork_nospec_begin+0x7/0x21
[258091.028290]  [<ffffffff9f4bb090>] ? insert_kthread_work+0x40/0x40
[258091.028291] systemd-journald[166084]: /dev/kmsg buffer overrun, some messages lost.
[258091.028292] ---[ end trace 3232975a123b52bf ]---
[258091.028297] ------------[ cut here ]------------
[258091.028301] WARNING: CPU: 4 PID: 155463 at drivers/md/raid5.c:4672 handle_stripe+0x2367/0x23f0 [raid456]

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0]
md126 : active raid0 md127[0]
      4657247232 blocks super 1.2 512k chunks

md127 : active raid5 dm-7[7](R) dm-6[6](F) dm-5[5] dm-4[4] dm-3[3] dm-2[2] dm-1[1] dm-0[0]
      4657379328 blocks super 1.2 level 5, 16k chunk, algorithm 2 [7/6] [UUUUUU_]
      [=>...................]  recovery =  9.6% (75100712/776229888) finish=601.8min speed=19414K/sec
      bitmap: 1/6 pages [4KB], 65536KB chunk

unused devices: <none>
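
As a sanity check on the numbers in the mdstat output above, the following
sketch recomputes the progress, the remaining time and the array size from
the recovery line. It assumes the counters are in 1 KiB blocks and that the
kernel's "finish" estimate is simply remaining blocks divided by the current
speed; the variable names are ours.

```python
# Illustrative only: recompute the figures on the mdstat recovery line.
import re

# Recovery line copied from the /proc/mdstat output (joined onto one line).
LINE = ("[=>...................]  recovery =  9.6% (75100712/776229888) "
        "finish=601.8min speed=19414K/sec")

m = re.search(r"\((\d+)/(\d+)\)\s+finish=([\d.]+)min\s+speed=(\d+)K/sec", LINE)
if m is None:
    raise ValueError("unrecognised recovery line")

done_kib = int(m.group(1))            # blocks already rebuilt on the spare
total_kib = int(m.group(2))           # per-member size being rebuilt
reported_finish_min = float(m.group(3))
speed_kib_s = int(m.group(4))

percent_done = 100.0 * done_kib / total_kib
eta_min = (total_kib - done_kib) / speed_kib_s / 60.0

# RAID 5 across 7 members holds data on 6 of them per stripe.
members = 7
array_kib = (members - 1) * total_kib

print(f"progress: {percent_done:.2f}%")
print(f"estimated finish: {eta_min:.1f} min (mdstat reported {reported_finish_min})")
print(f"array size: {array_kib} blocks")
```

The computed array size, 6 x 776229888 = 4657379328 blocks, matches the size
reported for md127, and the recomputed finish time is within a fraction of a
minute of the reported value.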

*drivers/md/raid5.c:4672*

    /* maybe we need to check and possibly fix the parity for this stripe
     * Any reads will already have been scheduled, so we just see if enough
     * data is available.  The parity check is held off while parity
     * dependent operations are in flight.
     */
    if (sh->check_state ||
        (s.syncing && s.locked == 0 &&
         !test_bit(STRIPE_COMPUTE_RUN, &sh->state) &&
         !test_bit(STRIPE_INSYNC, &sh->state))) {
        if (conf->level == 6)
            handle_parity_checks6(conf, sh, &s, disks);
        else
            handle_parity_checks5(conf, sh, &s, disks);
    }

    if ((s.replacing || s.syncing) && s.locked == 0
        && !test_bit(STRIPE_COMPUTE_RUN, &sh->state)
        && !test_bit(STRIPE_REPLACED, &sh->state)) {
        /* Write out to replacement devices where possible */
        for (i = 0; i < conf->raid_disks; i++)
            if (test_bit(R5_NeedReplace, &sh->dev[i].flags)) {
                /* this is the WARN_ON reported at raid5.c:4672 */
                WARN_ON(!test_bit(R5_UPTODATE, &sh->dev[i].flags));
                set_bit(R5_WantReplace, &sh->dev[i].flags);
                set_bit(R5_LOCKED, &sh->dev[i].flags);
                s.locked++;
            }
        if (s.replacing)
            set_bit(STRIPE_INSYNC, &sh->state);
        set_bit(STRIPE_REPLACED, &sh->state);
    }
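
To make the condition behind the warning easier to see, here is a toy model
of the replacement loop in the excerpt above. It is not kernel code: the flag
names are copied from the excerpt, while the set-based representation and the
function name schedule_replacement_writes are hypothetical and only serve the
illustration.

```python
# Toy model of the replacement loop in the excerpt; not kernel code.
# Flag names come from the excerpt; everything else is made up for illustration.
from typing import List, Set, Tuple


def schedule_replacement_writes(devs: List[Set[str]]) -> Tuple[int, int]:
    """Mirror the loop and return (warnings_raised, devices_locked)."""
    warnings = 0
    locked = 0
    for flags in devs:
        if "R5_NeedReplace" in flags:
            if "R5_UPTODATE" not in flags:
                warnings += 1  # corresponds to the WARN_ON firing
            # As in the excerpt, the flags are set whether or not it fired.
            flags.add("R5_WantReplace")
            flags.add("R5_LOCKED")
            locked += 1
    return warnings, locked


devs = [
    {"R5_UPTODATE"},                    # ordinary member, no replacement
    {"R5_NeedReplace", "R5_UPTODATE"},  # expected case: data ready to copy
    {"R5_NeedReplace"},                 # case that triggers the warning
]
result = schedule_replacement_writes(devs)
print(result)
```

In this model the third device raises the warning and is still marked
R5_WantReplace, because WARN_ON only logs and does not change control flow.
Whether that is related to the corruption we observed cannot be concluded
from the excerpt alone.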