[dm-devel] [RFC] Message ioctl direct to device

Fri Jul 9 19:28:24 UTC 2004

On Friday 09 July 2004 12:52, Kevin Corry wrote:
> On Thursday 08 July 2004 9:34 pm, Daniel Phillips wrote:
> > Hi all,
> >
> > The specific problem I need to solve was, userspace needs to pass a
> > socket fd to my virtual device, so that it may connect (or
> > reconnect) to a cluster.  I've played with a few different ways of
> > doing this, including using the proposed libdevmapper DM_TARGET_MSG
> > interface and hacking my own custom socket connection interface.
> >
> > The proposed DM_TARGET_MSG interface will in fact do what I want,
> > but it has some irritations:
> >
> >   - Why bother ioctling the dm control device when we can ioctl
> >     the virtual device directly?
>
> We can certainly discuss adding ioctl support for the block-devices
> in addition to the control device. As Alasdair mentioned the other
> day, it could be handy when all you know about the device is the
> major:minor.
>
> But, we definitely don't want to get rid of the char-device.

I didn't suggest that.  For one thing, there has to be some way to 
create a device, you can't ioctl a device that doesn't exist yet.

> You only 
> need to look at some of the MD weirdness to understand how useful and
> simple a single char-device can be for ioctl management.

I haven't done so but it sounds like good sport, I'll go trolling 
through the archives.  Anything in particular I should be looking for?

> >   - What is the advantage of passing the sector address instead
> >     a direct index into the target table?
>
> Either way...doesn't matter much to me. In theory, using the "index"
> of the target is easier that figuring out the corresponding sector
> value. However, in practice, you don't often have devices that have
> multiple targets. When you do, they're usually just combintions of
> linear and/or striped mappings, which aren't likely to need any
> private messages. When you have something like multipath or snapshot
> (where the message ioctl is most likely to be used), the device
> almost always has a single target, which means you can just specify
> zero either for the target index or the sector offset.

Except that you can't just duck out and not handle the bizarre case 
where somebody does put two different snapshots on the same virtual 
device, or something equally silly.  This needs more research, see 
below.

> >   - Not passing the length of the message string is evil
>
> Perhaps....but that's the way all the other DM ioctls work. The
> message is just a string, so you'll know when you hit the end. Plus,
> the ioctl header (struct dm_ioctl) contains the full size of the data
> passed in, so you should be able to figure out if you've somehow
> overrun the end of the buffer.

It's also evil because you then force ascii conversions on both ends, 
appropriate or not.  Passing the length is just good and correct, 
having traditionally done it sloppily is not a powerful argument.  The 
existing 'create' interface is modeled on c mainline argument passing, 
this new one is not, and has no need to be.

> >   - I don't see the purpose of the  __dev_status call in the
> >     target_message function
>
> Like in many of the other DM ioctls, it's there to make sure all the
> header fields are filled in (things like open-count and event-nr) so
> that user-space doesn't need to do another ioctl just to get the
> basic status. However, it might make more sense to make that call
> after the call to the target's message routine.

Yes.  The normal errno return should provide everything needed for this 
interface.

> >   - That poor abused ioctl struct doesn't need more cruft grafted
> >     onto it.
>
> I disagree. I really like the way the ioctl interface works

Hmm, I'm having a tough time choosing between:

  A) then that makes you part of the problem

or

  B) an interface so ugly it could only be loved by its mother.

;-)

> It provides a nice header structure to give you all the basic
> information you need about the desired device, and allows for easily
> adding new ioctls without having to break the existing interface. If
> we decide to do block-device ioctls, I really think we need to keep
> this header structure for all the ioctls, or at least something very
> similar.

It adds only obfuscation for the message ioctl, particularly if we are 
not returning anything more elaborate than an error code.

> > I talked this over with Alasdair and, finding him receptive to at
> > least some of these points, thought I might as well take a run at
> > rerolling my custom socket interface as a more general message
> > interface.  The result of that was only a few lines longer than the
> > single-purpose original, and pleases me at least.  Let's see what
> > you think of it.
> >
> > Some notes:
> >
> >   - The message is copied via the kernel stack, so it has to be
> >     fairly small.  The current concensus is seems to be that 64
> >     bytes isn't too abusive.  Otherwise we have to go kmalloc
> >     things, which can get scary in PF_MEMALLOC situations, or
> >     fuss around with preallocation.  In my opinion, if we have
> >     more than 64 bytes to pass an ioctl is the wrong way to do
> >     it, but you can always pass a user space pointerin the
> >     message if you absolutely must.  I for one am completely
> >     satisfied with a 64 byte message limit, and actually only
> >     need 4 or 8 bytes depending on platform.
>
> Embedded user-space pointers suck. Another thing I really like about
> the DM ioctls....only one copy_to/from_user is needed since
> everything is always in a flat buffer.

In this case, the additional copy adds one line of code, saves a little 
copying work, and makes the user space code _much_ nicer.

Note that the uuid and name are entirely redundant if you ioctl the 
device directly, as is every other field in the dm_ioctl as far as I 
can see.  For example, we do not need to version the message interface 
if it's done correctly in the first place.  So what is the purpose of 
using the struct dm_ioctl when the interface works perfectly well 
without it, is perfectly well protected, and uses less code?

> >   - Constructing a proper ioctl number is problematic since the
> >     length of the passed structure is dynamic.  I used the length
> >     of the fixed part of this ioctl, which is 3 ints.  The ioctl
> >     number construction kit stuff is stupid anyway, but at least
> >     I've made a valiant attempt to fit in.
>
> The current ioctl interface mechanisms provide a simple way to
> determine the ioctl number, since the header is always a fixed size.
> And then the header specifies how much (if any) extra information
> follows the header struct. And there's no significant limit on how
> much extra info you can pass in after the header.

I didn't waste much time on that issue, my solution is exactly as 
useless as the mechanism itself is.

> >   - I index the desired target directly instead of passing a
> >     sector address as in the DM_TARGET_MSG interface, because
> >     that way the application doesn't have to do arithmetic on
> >     the device table to figure out how to message the right
> >     target.  This is _much_ saner from the application's point
> >     of view. In implementation, it's the same, one line either
> >     way.
>
> Again, either way is fine with me. It's certainly easier in general
> to determine the "index" of the target rather than the starting
> offset.

OK, but I need to quiz Alasdair about his statement that the index could 
quietly change on the fly.  If that's so, then the sector offset would 
definitely be better, but at this point, it's not clear to me why it 
would be so.

> >   - I ioctl the virtual device directly, not the device mapper
> >     control device.  This is much easier for the application,
> >     more efficient and less racy.
>
> What about the character interface is racy?

Somebody could remove and recreate the device between any two of your 
calls, there's nothing to prevent it.  Talking directly to the virtual 
device, the only unavoidable race is the one between creating the 
virtual device and opening it for ioctling: 

> > Here is the prototype for the .message target method:
> >
> >   int (*dm_message_fn)(struct dm_target *target, int tag, void
> > *msg, int len);
> >
> > Here is an example call from userspace.
> >
> >    ioctl(fd, DM_DEVMESSAGE, (int[4]){ 0, 0, sizeof(int), sock})
>
> Ick. At the very least this should be using "uint32_t" instead of
> "int". Always use size-specific fields when doing ioctls. I have no
> desire to see any 32/64 ioctl-compat junk for DM.

Yes, it should be unit32_t, and in the kernel patch as well, that was 
just sloppy.

> > It's just one line, there's no need for a library.  The hardest
> > part is finding the header file that defines DM_DEVMESSAGE.  The
> > second 0 is the message tag.  Actually, I must pass two different
> > kinds of sockets, so there's no getting away from some kind of tag.
> >  The only question is, should it be in the message, or explicitly
> > handled as a parameter to the target method.  If it were done the
> > second way, the call would look like:
> >
> >    ioctl(fd, DM_DEVMESSAGE, (int[4]){ 0, 2*sizeof(int), 0, sock})
> >
> > So it is marginally nicer with the tag handled explicitly, and only
> > a miniscule amount of extra code.  A message generally needs a tag,
> > so...
> >
> > After I get my socket connected there's no further need for ioctl
> > traffic to my device since further commands to go over the socket,
> > except that if the socket breaks, it needs to be reconnected via
> > the message ioctl.
>
> Doing ioctls directly to the block-devices is fine, but I think we
> need to use the same/similar setup as with the char-device.

OK, let's fix the char-device then ;-)

Seriously, struct dm_ioctl interface is more horrible than you know. 
Maybe I should demonstrate that by asking an innocent victim to write 
code to connect a socket to my device using both approaches.

That ioctl interface is just plain horrible, it should really stay out 
of view inside libdevmapper in case somebody notices.  And we do not 
want to make libdevmapper a dependency just to ioctl the virtual 
device.  Going through libdevmapper makes plenty of sense for things 
like suspend/resume where there is a lot of deadlock-prevention logic 
that has to be followed rigorously, but no sense whatsoever for what I 
use this interface, and what I expect other people might use it for.  
Of course, you are one of those "other people", I think, with 
multipath, so it would help if you tell me what you have in mind there.

> I think 
> Alasdair's target-message ioctl patch is the right way to go, or at
> least on the right track.

It has warts that are not justified by anything besides tradition, but 
there is no tradition.  The reason I will argue hard for a clean 
approach at this point is, the message ioctl to the device is probably 
going to stick around a lot longer and get used by more people than the 
struct dm_ioctl interface, which we can hide and hopefully replace by 
something nicer one day.  Please, let's have the direct device message 
ioctl clean and sane.

Regards,

Daniel