[dm-devel] [RFC] directio and aio-max-nr

Bryn M. Reeves bmr at redhat.com
Thu Jul 2 17:12:19 UTC 2009


The fs.aio-max-nr sysctl limits the total number of aio events that can
be reserved system-wide across all aio contexts. This means it's
possible for the io_setup() call that the directio path checker uses in
its libcheck_init() to fail with EAGAIN if fs.aio-nr has already reached
fs.aio-max-nr.

On a system with several users of the aio interface it's possible for
this to happen, e.g. when running multipath -ll, leading to output like
this:

sdb: ownership set to 200a0d1c5f449bf01
sdb: not found in pathvec
sdb: mask = 0xc
sdb: path checker = directio (config file default)
io_setup failed
sda: ownership set to 200a0d1c5f449bf01
sda: not found in pathvec
sda: mask = 0xc
sda: path checker = directio (config file default)
io_setup failed
[...]
create: 200a0d1c5f449bf01 n/a Inventec,IX3150-FS200
[size=1.6T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][undef]
 \_ 1:0:0:0 sdb 8:16  [undef][faulty]
\_ round-robin 0 [prio=0][undef]
 \_ 0:0:0:0 sda 8:0   [undef][faulty]

There's nothing wrong with the paths but since the checker cannot
allocate an aio context it is unable to determine the path status and
reports them as faulty.

Of course, the administrator should be monitoring the use of aio on the
system and setting fs.aio-max-nr to an appropriate value. However, we're
seeing quite frequent reports of this problem (particularly from Oracle
users with NetApp storage, as the NetApp configuration defaults to the
directio checker), and the use of aio by the directio checker is not
currently documented.
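
As a rough illustration of the kind of monitoring meant here, the
following Python sketch compares fs.aio-nr with fs.aio-max-nr by reading
them from /proc/sys/fs. The function names and the "needed" parameter
are made up for this example and are not part of multipath-tools. The
kernel may account for more events than a caller requests, so the result
should be treated as an estimate.

```python
import os


def read_fs_sysctl(name, base="/proc/sys/fs"):
    """Return the integer value of a sysctl file such as aio-nr."""
    with open(os.path.join(base, name)) as f:
        return int(f.read().strip())


def aio_headroom(aio_nr, aio_max_nr):
    """Events that can still be reserved before io_setup() hits the limit."""
    return max(aio_max_nr - aio_nr, 0)


def aio_report(needed, base="/proc/sys/fs"):
    """Summarise aio usage; 'needed' is a hypothetical number of events
    the caller expects to reserve (e.g. one per path checker)."""
    aio_nr = read_fs_sysctl("aio-nr", base)
    aio_max_nr = read_fs_sysctl("aio-max-nr", base)
    headroom = aio_headroom(aio_nr, aio_max_nr)
    return {
        "aio-nr": aio_nr,
        "aio-max-nr": aio_max_nr,
        "headroom": headroom,
        "enough": headroom >= needed,
    }
```

Calling aio_report(needed=2) on a system with two paths would show
whether there is likely to be room for two more small contexts. If
"enough" is False, raising fs.aio-max-nr with sysctl is the usual remedy.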

I think it's worth documenting this interaction in the FAQ (patch
attached). I think it may also be worth trying to make this more robust,
or at least warning the administrator of the problem.

So far, the reports I've seen of this problem have affected users
running multipath -ll manually, since this will create new checkers for
each path. Since multipathd starts before most aio-using applications
start, it's able to grab the contexts that it needs. In principle,
though, it would be possible for multipathd to run into problems if,
for example, a user added new storage to the system post-boot.

In this situation, I'm not sure whether it's best to keep trying to
allocate an aio context indefinitely in the checker thread, or to fail
immediately and warn the administrator of the situation.
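
One possible middle ground between the two options is a bounded retry
that warns on each EAGAIN and then gives up. The Python sketch below only
illustrates that control flow; it is not the checker's code. The
"setup" callable stands in for whatever wraps io_setup(), and the
function name, attempt count and delays are all assumptions made for the
example.

```python
import errno
import time


def setup_with_retry(setup, attempts=5, delay=1.0,
                     sleep=time.sleep, warn=print):
    """Call setup() until it succeeds or 'attempts' tries are used up.

    setup is a hypothetical callable that raises OSError with errno
    EAGAIN when no aio context can be allocated. Returns the result of
    setup(), or None if every attempt failed with EAGAIN.
    """
    for attempt in range(1, attempts + 1):
        try:
            return setup()
        except OSError as exc:
            if exc.errno != errno.EAGAIN:
                raise  # only the aio limit is worth retrying
            warn("io_setup failed with EAGAIN (attempt %d of %d); "
                 "check fs.aio-max-nr" % (attempt, attempts))
            if attempt < attempts:
                sleep(delay)
                delay *= 2  # back off so the checker thread does not spin
    return None
```

A caller that gets None back could then mark the path state as unknown
rather than faulty, since the failure says nothing about the path itself.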

Regards,
Bryn.

-------------- next part --------------
An embedded message was scrubbed...
From: Bryn M. Reeves <bmr at redhat.com>
Subject: [PATCH] Document aio-max-nr in FAQ
Date: Thu, 2 Jul 2009 16:39:31 +0100
Size: 2199
URL: <http://listman.redhat.com/archives/dm-devel/attachments/20090702/45dd9943/attachment.eml>

