<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">I've run into some terrible performance
      when I've had a lot of add/remove actions on the filesystem in
      parallel.  They were mostly due to fragmentation.  Alas, XFS can
      get some horrid fragmentation.<br>
      <br>
      xfs_db -c frag -r /dev/<node><br>
      <br>
      should give you the stats on its fragmentation.<br>
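      <br>
      In your case (going by the mount info quoted below), that would
      be:<br>
      <br>
      xfs_db -c frag -r /dev/md127<br>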
      <br>
      I can't speak for others, but I've got 'xfs_fsr' linked into
      /etc/cron.weekly/ on my personal systems with large XFS
      filesystems.<br>
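      <br>
      Concretely, that setup is just a symlink (path assumes the stock
      xfsprogs install location):<br>
      <br>
      ln -s /usr/sbin/xfs_fsr /etc/cron.weekly/xfs_fsr<br>
      <br>
      With no arguments, xfs_fsr walks the mounted XFS filesystems
      listed in /etc/mtab and defragments files in place; by default it
      stops after two hours and resumes where it left off on the next
      run.<br>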
      <br>
      Pat<br>
      <br>
      <br>
      <br>
      <br>
      On 04/15/2013 07:58 AM, Daryl Herzmann wrote:<br>
    </div>
    <blockquote
cite="mid:CALOVXPd0SLWh6Li87HDpKeXWfYBH2RD8iS0DC0G1DsVJZ3N8Tg@mail.gmail.com"
      type="cite">
      <div dir="ltr">Good morning,
        <div><br>
        </div>
        <div>Thanks for the response, and the fun never stops!  This
          system crashed on Saturday morning with the following:</div>
        <div><br>
        </div>
        <div>
          <div><4>------------[ cut here ]------------</div>
          <div><2>kernel BUG at include/linux/swapops.h:126!</div>
          <div><4>invalid opcode: 0000 [#1] SMP </div>
          <div><4>last sysfs file: /sys/kernel/mm/ksm/run</div>
          <div><4>CPU 7 </div>
          <div><4>Modules linked in: iptable_filter ip_tables nfsd
            nfs lockd fscache auth_rpcgss nfs_acl sunrpc bridge stp llc
            ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state
            nf_conntrack ip6table_filter ip6_tables ipv6 xfs exportfs
            vhost_net macvtap macvlan tun kvm_intel kvm raid456
            async_raid6_recov async_pq power_meter raid6_pq async_xor
            dcdbas xor microcode serio_raw async_memcpy async_tx
            iTCO_wdt iTCO_vendor_support i7core_edac edac_core sg bnx2
            ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi
            ata_generic ata_piix wmi mpt2sas scsi_transport_sas
            raid_class dm_mirror dm_region_hash dm_log dm_mod [last
            unloaded: speedstep_lib]</div>
          <div><4></div>
          <div><4>Pid: 4581, comm: ssh Not tainted
            2.6.32-358.2.1.el6.x86_64 #1 Dell Inc. PowerEdge T410/0Y2G6P</div>
          <div><4>RIP: 0010:[<ffffffff8116c501>]
             [<ffffffff8116c501>] migration_entry_wait+0x181/0x190</div>
          <div><4>RSP: 0000:ffff8801c1703c88  EFLAGS: 00010246</div>
          <div><4>RAX: ffffea0000000000 RBX: ffffea0003bf6f58 RCX:
            ffff880236437580</div>
          <div><4>RDX: 00000000001121fd RSI: ffff8801c040e5d8 RDI:
            000000002243fa3e</div>
          <div><4>RBP: ffff8801c1703ca8 R08: ffff8801c040e5d8 R09:
            0000000000000029</div>
          <div><4>R10: ffff8801d6850200 R11: 00002ad7d96cbf5a R12:
            ffffea0007bdec18</div>
          <div><4>R13: 0000000236437580 R14: 0000000236437067 R15:
            00002ad7d76b0000</div>
          <div><4>FS:  00002ad7dace2880(0000)
            GS:ffff880028260000(0000) knlGS:0000000000000000</div>
          <div><4>CS:  0010 DS: 0000 ES: 0000 CR0:
            0000000080050033</div>
          <div><4>CR2: 00002ad7d76b0000 CR3: 00000001bb686000 CR4:
            00000000000007e0</div>
          <div><4>DR0: 0000000000000000 DR1: 0000000000000000 DR2:
            0000000000000000</div>
          <div><4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
            0000000000000400</div>
          <div><4>Process ssh (pid: 4581, threadinfo
            ffff8801c1702000, task ffff880261aa7500)</div>
          <div><4>Stack:</div>
          <div><4> ffff88024b5f22d8 0000000000000000
            000000002243fa3e ffff8801c040e5d8</div>
          <div><4><d> ffff8801c1703d88 ffffffff811441b8
            0000000000000000 ffff8801c1703d08</div>
          <div><4><d> ffff8801c1703eb8 ffff8801c1703dc8
            ffff880328cb48c0 0000000000000040</div>
          <div><4>Call Trace:</div>
          <div><4> [<ffffffff811441b8>]
            handle_pte_fault+0xb48/0xb50</div>
          <div><4> [<ffffffff81437dbb>] ?
            sock_aio_write+0x19b/0x1c0</div>
          <div><4> [<ffffffff8112c6d4>] ?
            __pagevec_free+0x44/0x90</div>
          <div><4> [<ffffffff811443fa>]
            handle_mm_fault+0x23a/0x310</div>
          <div><4> [<ffffffff810474c9>]
            __do_page_fault+0x139/0x480</div>
          <div><4> [<ffffffff81194fb2>] ?
            vfs_ioctl+0x22/0xa0</div>
          <div>
            <4> [<ffffffff811493a0>] ?
            unmap_region+0x110/0x130</div>
          <div><4> [<ffffffff81195154>] ?
            do_vfs_ioctl+0x84/0x580</div>
          <div><4> [<ffffffff8151339e>]
            do_page_fault+0x3e/0xa0</div>
          <div><4> [<ffffffff81510755>] page_fault+0x25/0x30</div>
          <div><4>Code: e8 f5 2f fc ff e9 59 ff ff ff 48 8d 53 08
            85 c9 0f 84 44 ff ff ff 8d 71 01 48 63 c1 48 63 f6 f0 0f b1
            32 39 c1 74 be 89 c1 eb e3 <0f> 0b eb fe 66 66 2e 0f
            1f 84 00 00 00 00 00 55 48 89 e5 48 83 </div>
          <div><1>RIP  [<ffffffff8116c501>]
            migration_entry_wait+0x181/0x190</div>
          <div><4> RSP <ffff8801c1703c88></div>
          <div><br>
          </div>
          <div style="">It rebooted itself, now I must have some
            filesytem corruption as this is being dumped frequently:</div>
          <div style=""><br>
          </div>
          <div style="">
            <div>XFS (md127): page discard on page ffffea0003c95018,
              inode 0x849ec442, offset 0.</div>
            <div>XFS: Internal error XFS_WANT_CORRUPTED_RETURN at line
              342 of file fs/xfs/xfs_alloc.c.  Caller 0xffffffffa02986c2</div>
            <div><br>
            </div>
            <div>Pid: 1304, comm: xfsalloc/7 Not tainted
              2.6.32-358.2.1.el6.x86_64 #1</div>
            <div>Call Trace:</div>
            <div> [<ffffffffa02c20cf>] ?
              xfs_error_report+0x3f/0x50 [xfs]</div>
            <div> [<ffffffffa02986c2>] ?
              xfs_alloc_ag_vextent_size+0x482/0x630 [xfs]</div>
            <div> [<ffffffffa0296a69>] ?
              xfs_alloc_lookup_eq+0x19/0x20 [xfs]</div>
            <div> [<ffffffffa0296d16>] ?
              xfs_alloc_fixup_trees+0x236/0x350 [xfs]</div>
            <div> [<ffffffffa02986c2>] ?
              xfs_alloc_ag_vextent_size+0x482/0x630 [xfs]</div>
            <div> [<ffffffffa029943d>] ?
              xfs_alloc_ag_vextent+0xad/0x100 [xfs]</div>
            <div> [<ffffffffa0299e8c>] ?
              xfs_alloc_vextent+0x2bc/0x610 [xfs]</div>
            <div> [<ffffffffa02a4587>] ?
              xfs_bmap_btalloc+0x267/0x700 [xfs]</div>
            <div> [<ffffffff8105e759>] ?
              find_busiest_queue+0x69/0x150</div>
            <div> [<ffffffffa02a4a2e>] ? xfs_bmap_alloc+0xe/0x10
              [xfs]</div>
            <div> [<ffffffffa02a4b0a>] ?
              xfs_bmapi_allocate_worker+0x4a/0x80 [xfs]</div>
            <div> [<ffffffffa02a4ac0>] ?
              xfs_bmapi_allocate_worker+0x0/0x80 [xfs]</div>
            <div> [<ffffffff81090ae0>] ? worker_thread+0x170/0x2a0</div>
            <div> [<ffffffff81096ca0>] ?
              autoremove_wake_function+0x0/0x40</div>
            <div> [<ffffffff81090970>] ? worker_thread+0x0/0x2a0</div>
            <div> [<ffffffff81096936>] ? kthread+0x96/0xa0</div>
            <div> [<ffffffff8100c0ca>] ? child_rip+0xa/0x20</div>
            <div> [<ffffffff810968a0>] ? kthread+0x0/0xa0</div>
            <div> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20</div>
            <div>XFS (md127): page discard on page ffffea0003890fa0,
              inode 0x849ec441, offset 0.</div>
            <div><br>
            </div>
            <div style="">Anyway, to respond to your questions:</div>
          </div>
          <div class="gmail_extra"><br>
            <br>
            <div class="gmail_quote">On Mon, Apr 15, 2013 at 3:50 AM,
              Jussi Silvennoinen <span dir="ltr"><<a
                  moz-do-not-send="true"
                  href="mailto:jussi_rhel6@silvennoinen.net"
                  target="_blank">jussi_rhel6@silvennoinen.net</a>></span>
              wrote:<br>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                <div class="im">
                  <blockquote class="gmail_quote" style="margin:0px 0px
                    0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">avg-cpu:
                     %user   %nice %system %iowait  %steal   %idle<br>
                              11.12    0.03    2.70    3.60    0.00  
                    82.56<br>
                    <br>
                    Device:            tps   Blk_read/s   Blk_wrtn/s  
                    Blk_read   Blk_wrtn<br>
                    md127           134.36     10336.87     11381.45
                    19674692141 21662893316<br>
                  </blockquote>
                  <br>
                </div>
                Do use iostat -x to see more details; it will give a
                better indication of how busy the disks are.</blockquote>
              <div><br>
              </div>
              <div>
                <pre># iostat -x
Linux 2.6.32-358.2.1.el6.x86_64 (iem21.local)   04/15/2013   _x86_64_   (16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.33    0.00    3.31    2.24    0.00   84.11

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               3.48  1002.05   22.42   33.26  1162.56  8277.06   169.55     6.52  117.17   2.49  13.86
sdc            3805.96   173.47  292.94   28.83 33747.35  1611.10   109.89     3.47   10.74   0.82  26.46
sde            3814.91   174.53  285.98   29.92 33761.01  1628.96   112.03     5.70   17.97   0.97  30.63
sdb            3813.98   173.45  284.85   28.66 33745.12  1609.93   112.77     4.07   12.94   0.91  28.48
sdd            3805.78   174.18  294.19   29.35 33754.41  1621.14   109.34     3.81   11.73   0.84  27.32
sdf            3813.80   173.68  285.46   29.04 33751.91  1614.36   112.45     4.70   14.91   0.93  29.17
md127             0.00     0.00   21.75   45.85  4949.72  5919.63   160.78     0.00    0.00   0.00   0.00</pre>
              </div>
              <div><br>
              </div>
              <div style="">but I suspect this is inflated, since it
                just completed a raid5 resync.</div>
              <div><br>
              </div>
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                <div class="im">
                  <blockquote class="gmail_quote" style="margin:0px 0px
                    0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">I
                    have other similar filesystems on ext4, with
                    similar hardware and millions of small files as
                    well.  I don't see such sluggishness with small
                    files and directories there.  I guess I picked XFS
                    for this filesystem initially because of its fast
                    fsck times.<br>
                  </blockquote>
                  <br>
                </div>
                Are those other systems also employing software raid? In
                my experience, swraid is painfully slow with random
                writes. And your workload in this use case is exactly
                that.</blockquote>
              <div><br>
              </div>
              <div><br>
              </div>
              <div style="">
                Some of them are and some aren't.  I have an opportunity
                to move this workload to a hardware RAID5, so I may just
                do that and cut my losses :)</div>
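              <div><br>
              </div>
              <div style="">Before moving it, I may try to quantify the
                random-write pain first; a hypothetical fio run (fio is
                packaged in EPEL) along these lines should show it:</div>
              <div style="">
                <pre># 4k random writes against the XFS mount (hypothetical test)
fio --name=randwrite --directory=/mesonet --rw=randwrite \
    --bs=4k --size=1g --numjobs=4 --runtime=60 --time_based \
    --group_reporting</pre>
              </div>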
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                <div class="im">
                  <blockquote class="gmail_quote" style="margin:0px 0px
                    0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                    <pre># grep md127 /proc/mounts
/dev/md127 /mesonet xfs rw,noatime,attr2,delaylog,sunit=1024,swidth=4096,noquota 0 0</pre>
                  </blockquote>
                  <br>
                </div>
                inode64 is not used; I suspect it would have helped a
                lot. Enabling it afterwards will not help data which is
                already on disk, but it will help with new
                files.</blockquote>
              <div><br>
              </div>
              <div style="">Thanks for the tip, I'll try that out.</div>
              <div style=""><br>
              </div>
              <div style="">daryl</div>
              <div><br>
              </div>
            </div>
          </div>
        </div>
      </div>
      <br>
      <br>
      <pre wrap="">_______________________________________________
rhelv6-list mailing list
<a class="moz-txt-link-abbreviated" href="mailto:rhelv6-list@redhat.com">rhelv6-list@redhat.com</a>
<a class="moz-txt-link-freetext" href="https://www.redhat.com/mailman/listinfo/rhelv6-list">https://www.redhat.com/mailman/listinfo/rhelv6-list</a></pre>
    </blockquote>
    <br>
    <br>
    <pre class="moz-signature" cols="78">-- 
Pat Riehecky

Scientific Linux developer
<a class="moz-txt-link-freetext" href="http://www.scientificlinux.org/">http://www.scientificlinux.org/</a></pre>
  </body>
</html>