[dm-devel] [RFC] Message ioctl direct to device

Fri Jul 9 16:52:00 UTC 2004

On Thursday 08 July 2004 9:34 pm, Daniel Phillips wrote:
> Hi all,
>
> The specific problem I need to solve was, userspace needs to pass a
> socket fd to my virtual device, so that it may connect (or reconnect) to
> a cluster.  I've played with a few different ways of doing this,
> including using the proposed libdevmapper DM_TARGET_MSG interface and
> hacking my own custom socket connection interface.
>
> The proposed DM_TARGET_MSG interface will in fact do what I want, but it
> has some irritations:
>
>   - Why bother ioctling the dm control device when we can ioctl
>     the virtual device directly?

We can certainly discuss adding ioctl support for the block-devices in 
addition to the control device. As Alasdair mentioned the other day, it could 
be handy when all you know about the device is the major:minor.

But, we definitely don't want to get rid of the char-device. You only need to 
look at some of the MD weirdness to understand how useful and simple a single 
char-device can be for ioctl management.

>   - What is the advantage of passing the sector address instead
>     a direct index into the target table?

Either way...doesn't matter much to me. In theory, using the "index" of the 
target is easier that figuring out the corresponding sector value. However, 
in practice, you don't often have devices that have multiple targets. When 
you do, they're usually just combintions of linear and/or striped mappings, 
which aren't likely to need any private messages. When you have something 
like multipath or snapshot (where the message ioctl is most likely to be 
used), the device almost always has a single target, which means you can just 
specify zero either for the target index or the sector offset.

>   - Not passing the length of the message string is evil

Perhaps....but that's the way all the other DM ioctls work. The message is 
just a string, so you'll know when you hit the end. Plus, the ioctl header 
(struct dm_ioctl) contains the full size of the data passed in, so you should 
be able to figure out if you've somehow overrun the end of the buffer.

>   - I don't see the purpose of the  __dev_status call in the
>     target_message function

Like in many of the other DM ioctls, it's there to make sure all the header 
fields are filled in (things like open-count and event-nr) so that user-space 
doesn't need to do another ioctl just to get the basic status. However, it 
might make more sense to make that call after the call to the target's 
message routine.

>   - That poor abused ioctl struct doesn't need more cruft grafted
>     onto it.

I disagree. I really like the way the ioctl interface works. It provides a 
nice header structure to give you all the basic information you need about 
the desired device, and allows for easily adding new ioctls without having to 
break the existing interface. If we decide to do block-device ioctls, I 
really think we need to keep this header structure for all the ioctls, or at 
least something very similar.

> I talked this over with Alasdair and, finding him receptive to at least
> some of these points, thought I might as well take a run at rerolling my
> custom socket interface as a more general message interface.  The result
> of that was only a few lines longer than the single-purpose original,
> and pleases me at least.  Let's see what you think of it.
>
> Some notes:
>
>   - The message is copied via the kernel stack, so it has to be
>     fairly small.  The current concensus is seems to be that 64
>     bytes isn't too abusive.  Otherwise we have to go kmalloc
>     things, which can get scary in PF_MEMALLOC situations, or
>     fuss around with preallocation.  In my opinion, if we have
>     more than 64 bytes to pass an ioctl is the wrong way to do
>     it, but you can always pass a user space pointerin the
>     message if you absolutely must.  I for one am completely
>     satisfied with a 64 byte message limit, and actually only
>     need 4 or 8 bytes depending on platform.

Embedded user-space pointers suck. Another thing I really like about the DM 
ioctls....only one copy_to/from_user is needed since everything is always in 
a flat buffer.

>   - Constructing a proper ioctl number is problematic since the
>     length of the passed structure is dynamic.  I used the length
>     of the fixed part of this ioctl, which is 3 ints.  The ioctl
>     number construction kit stuff is stupid anyway, but at least
>     I've made a valiant attempt to fit in.

The current ioctl interface mechanisms provide a simple way to determine the 
ioctl number, since the header is always a fixed size. And then the header 
specifies how much (if any) extra information follows the header struct. And 
there's no significant limit on how much extra info you can pass in after the 
header.

>   - I index the desired target directly instead of passing a
>     sector address as in the DM_TARGET_MSG interface, because
>     that way the application doesn't have to do arithmetic on
>     the device table to figure out how to message the right
>     target.  This is _much_ saner from the application's point
>     of view. In implementation, it's the same, one line either
>     way.

Again, either way is fine with me. It's certainly easier in general to 
determine the "index" of the target rather than the starting offset.

>   - I ioctl the virtual device directly, not the device mapper
>     control device.  This is much easier for the application,
>     more efficient and less racy.

What about the character interface is racy?

>   - This is 2.4 patch.  If I wasn't  on 2.4 at the time I
>     probably would have used the udm MSG interface, and just
>     put up with the warts, but I wasn't, so I decided to try
>     an alternate approach.  (Figuring out libdevmapper was,
>     um, fun.)
>
>   - For 2.6 I'd have to restore the virtual device ioctl hook
>     that got dropped somewhere along the way.  That's ok, it's
>     easy.

I don't believe the block-device ioctl stuff ever got dropped in 2.6, because 
it was probably never there in the first place. Once the block-layer starting 
handling the BLK* ioctls in common code (which was pretty early on in 2.5) 
there was no need for block-device ioctls in DM (except for the LV_BMAP mess 
for lilo, which also wasn't necessary once we found a way to make lilo talk 
to libdevmapper instead).

> Here is the prototype for the .message target method:
>
>   int (*dm_message_fn)(struct dm_target *target, int tag, void *msg, int
> len);
>
> Here is an example call from userspace.
>
>    ioctl(fd, DM_DEVMESSAGE, (int[4]){ 0, 0, sizeof(int), sock})

Ick. At the very least this should be using "uint32_t" instead of "int". 
Always use size-specific fields when doing ioctls. I have no desire to see 
any 32/64 ioctl-compat junk for DM.

> It's just one line, there's no need for a library.  The hardest part is
> finding the header file that defines DM_DEVMESSAGE.  The second 0 is
> the message tag.  Actually, I must pass two different kinds of sockets,
> so there's no getting away from some kind of tag.  The only question
> is, should it be in the message, or explicitly handled as a parameter
> to the target method.  If it were done the second way, the call would
> look like:
>
>    ioctl(fd, DM_DEVMESSAGE, (int[4]){ 0, 2*sizeof(int), 0, sock})
>
> So it is marginally nicer with the tag handled explicitly, and only a
> miniscule amount of extra code.  A message generally needs a tag, so...
>
> After I get my socket connected there's no further need for ioctl
> traffic to my device since further commands to go over the socket,
> except that if the socket breaks, it needs to be reconnected via the
> message ioctl.

Doing ioctls directly to the block-devices is fine, but I think we need to use 
the same/similar setup as with the char-device. I think Alasdair's 
target-message ioctl patch is the right way to go, or at least on the right 
track.

-- 
Kevin Corry
kevcorry at us.ibm.com
http://evms.sourceforge.net/