Disk IO issues

Mike McGrath mmcgrath at redhat.com
Mon Jan 19 16:02:18 UTC 2009


On Wed, 31 Dec 2008, Mike McGrath wrote:

> Let's pool some knowledge together, because at this point I'm missing
> something.
>
> I've been doing all measurements with sar, since bonnie and the like cause
> builds to time out.
>
> Problem: We're seeing slower than normal disk IO.  At least I think we
> are.  This is a PERC5/E and MD1000 array.
>
> When I try to do a normal copy "cp -adv /mnt/koji/packages /tmp/" I get
> around 4-6MBytes/s
>
> When I do a cp of a large file "cp /mnt/koji/out /tmp/" I get
> 30-40MBytes/s.
>
> When I run "dd if=/dev/sde of=/dev/null" I get around 60-70 MBytes/s read.
>
> If I "cat /dev/sde > /dev/null" I get between 225-300MBytes/s read.
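
My guess on the dd/cat gap, for what it's worth: dd reads in 512-byte
blocks by default, while cat uses much larger buffers, so dd ends up
issuing far more, smaller requests.  A run with a bigger block size
(hypothetical, not one of the tests above) should look a lot more like cat:

    dd if=/dev/sde of=/dev/null bs=1M count=4096
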
>
> The above tests are pretty consistent.  /dev/sde is a raid5 array,
> hardware raid.
>
> So my question here is, wtf?  I've been working on a backup, which I would
> think would max out either network utilization or disk IO.  I'm not seeing
> either.  Sar says the disks are 100% utilized, but I can cause major
> increases in actual disk reads and writes just by running additional
> commands.  Also, if the disks were 100% utilized I'd expect to see lots
> more iowait.  We're not seeing that, though; iowait on the box is only
> 0.06% today.
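
For anyone who wants to poke at the same numbers: the utilization, await
and service time figures come from sar's per-device stats.  Something
along these lines should show the same fields:

    sar -d -p 1 10
    iostat -x 1
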
>
> So, long story short, we're seeing much better performance when just
> reading or writing lots of data (though dd is many times slower than cat).
> But with our real-world traffic, we're just seeing crappy crappy IO.
>
> Thoughts, theories or opinions?  Some of the sysadmin noc guys have access
> to run diagnostic commands; if you want more info about a setting, let me
> know.
>
> I should also mention there's a lot going on with this box: for example,
> it's hardware RAID plus LVM, and I've got Xen running on it (though the
> tests above were not run in a Xen guest).
>

We all talked about this quite a bit, so I felt the need to let everyone
know the latest status.  One of our goals was to lower utilization on the
netapp.  While high utilization itself isn't a problem (it's just a
measurement, after all), we did decide other problems could be solved if we
could get utilization to go down.

So after a bunch of tweaking on the share and in the scripts we run,
average utilization has dropped significantly.  Take a look here:

http://mmcgrath.fedorapeople.org/util.html

That's the latest 30-day view (from a couple of days ago).  You'll notice it
was around 90-100% pretty much all the time, and it went on like that for
MONTHS.  Even Christmas day was pretty busy, even though during that whole
period we generally saw low traffic everywhere else in Fedora.

Now we're sitting pretty with a 20% utilization average.  You'll also
notice that our service time and await are generally lower.  I'm trying to
get a bigger view of those numbers over time, so we'll see whether that's
an actual trend or not.

The big changes?  1) Better use of the share in our scripts.  2) A larger
readahead value (set with blockdev); see the example below.
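
For reference, readahead is per-device and is set with blockdev.  The exact
value we're running isn't the point here (4096 below is just an example, in
512-byte sectors), but the commands look like this:

    blockdev --getra /dev/sde       # show current readahead
    blockdev --setra 4096 /dev/sde  # bump it (4096 sectors = 2 MB)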

Some smaller changes included switching the IO scheduler from cfq to
deadline (and now noop).
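
The scheduler can be checked and changed at runtime through sysfs; using
/dev/sde from the tests above as the example device:

    cat /sys/block/sde/queue/scheduler          # current one shows in brackets
    echo noop > /sys/block/sde/queue/scheduler

Making it stick across reboots means passing elevator=noop on the kernel
command line.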

Longer term, there are still two things I'd like to do:

1) Move our snapshots to different devices to reduce seeks.
2) A full re-index of the filesystem (requiring around 24-36 hours of
downtime), which I'm going to schedule sometime after the Alpha ships.

	-Mike



