[libvirt] [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

Kevin Wolf kwolf at redhat.com
Wed Jul 25 08:22:46 UTC 2012


Am 25.07.2012 05:41, schrieb Corey Bryant:
>>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>>> index a172de3..5d0a801 100644
>>> --- a/block/raw-posix.c
>>> +++ b/block/raw-posix.c
>>> @@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const char *filename,
>>>   out_free_buf:
>>>       qemu_vfree(s->aligned_buf);
>>>   out_close:
>>> -    qemu_close(fd);
>>> +    qemu_close(fd, filename);
>>>       return -errno;
>>>   }
>>
>> Hm, not a nice interface where qemu_close() needs the filename and
>> (worse) could be given a wrong filename. Maybe it would be better to
>> maintain a list of fd -> fdset mappings in qemu_open/close?
>>
> 
> I agree, I don't really like it either.
> 
> We already have a list of fd -> fdset mappings (mon_fdset_fd_t -> 
> mon_fdset_t).  Would it be too costly to loop through all the fdsets/fds 
> at the beginning of every qemu_close()?

I don't think so. qemu_close() is not a fast path and happens almost
never, and the list is short enough that searching it isn't a problem
anyway.

>>> +            switch (flags & O_ACCMODE) {
>>> +            case O_RDWR:
>>> +                if ((mon_fd_flags & O_ACCMODE) == O_RDWR) {
>>> +                    return mon_fdset_fd->fd;
>>> +                }
>>> +                break;
>>> +            case O_RDONLY:
>>> +                if ((mon_fd_flags & O_ACCMODE) == O_RDONLY) {
>>> +                    return mon_fdset_fd->fd;
>>> +                }
>>> +                break;
>>> +            case O_WRONLY:
>>> +                if ((mon_fd_flags & O_ACCMODE) == O_WRONLY) {
>>> +                    return mon_fdset_fd->fd;
>>> +                }
>>> +                break;
>>> +            }
>>
>> I think you mean:
>>
>>    if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) {
>>        return mon_fdset_fd->fd;
>>    }
> 
> Yes, that would be a bit simpler wouldn't it. :)
> 
>>
>> What about other flags that cannot be set with fcntl(), like O_SYNC on
>> older kernels or possibly non-Linux? (The block layer doesn't use it any
>> more, but I think we want to keep the function generally useful)
>>
> 
> I see what you're getting at here.  Basically you could have 2 fds in an 
> fdset with the same access mode flags, but one has O_SYNC on and the 
> other has O_SYNC off.  That seems like it would make sense to implement. 
>   As a work-around, I think a user could just create a separate fdset 
> for the same file with different O_SYNC value.  But from a client 
> perspective, it would be nicer to have this taken care of for you.

Now that the block layer doesn't use O_SYNC any more, it's more of a
theoretical point. I don't think there's any other place, where we'd
need to switch O_SYNC during runtime.

Taking it into consideration is complicated by the fact that some
kernels allow to fcntl() O_SYNC and others don't, so enforcing a match
here wouldn't feel completely right either.

Maybe just leave it as it is. :-)

> 
>>> +        }
>>> +        errno = EACCES;
>>> +        return -1;
>>> +    }
>>> +    errno = ENOENT;
>>> +    return -1;
>>> +}
>>> +
>>>   /* mon_cmds and info_cmds would be sorted at runtime */
>>>   static mon_cmd_t mon_cmds[] = {
>>>   #include "hmp-commands.h"
>>
>>> @@ -75,6 +76,79 @@ int qemu_madvise(void *addr, size_t len, int advice)
>>>   #endif
>>>   }
>>>
>>> +/*
>>> + * Dups an fd and sets the flags
>>> + */
>>> +static int qemu_dup(int fd, int flags)
>>> +{
>>> +    int i;
>>> +    int ret;
>>> +    int serrno;
>>> +    int dup_flags;
>>> +    int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
>>> +                          O_NONBLOCK, 0 };
>>> +
>>> +    if (flags & O_CLOEXEC) {
>>> +        ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
>>> +        if (ret == -1 && errno == EINVAL) {
>>> +            ret = dup(fd);
>>> +            if (ret != -1 && fcntl_setfl(ret, O_CLOEXEC) == -1) {
>>> +                goto fail;
>>> +            }
>>> +        }
>>> +    } else {
>>> +        ret = dup(fd);
>>> +    }
>>> +
>>> +    if (ret == -1) {
>>> +        goto fail;
>>> +    }
>>> +
>>> +    dup_flags = fcntl(ret, F_GETFL);
>>> +    if (dup_flags == -1) {
>>> +        goto fail;
>>> +    }
>>> +
>>> +    if ((flags & O_SYNC) != (dup_flags & O_SYNC)) {
>>> +        errno = EINVAL;
>>> +        goto fail;
>>> +    }
>>
>> It's worth trying to set it before failing, newer kernels can do it. But
>> as I said above, if you can fail here, it makes sense to consider O_SYNC
>> when selecting the right file descriptor from the fdset.
>>
> 
> I'm on a 3.4.4 Fedora kernel that doesn't appear to support 
> fcntl(O_SYNC), but perhaps I'm doing something wrong.  Here's my test 
> code (shortened for simplicty): [...]

Hm, true. So it seems that patch has never made it into the kernel, in
fact...

>>> +
>>> +    qemu_set_cloexec(ret);
>>
>> Wait... If O_CLOEXEC is set, you set the flag immediately and if it
>> isn't you set it at the end of the function? What's the intended use
>> case for not using O_CLOEXEC then?
>>
> 
> This is a mistake.  I think I just need to be using qemu_set_cloexec() 
> instead of fcntl_setfl() earlier in this function and get rid of this 
> latter call to qemu_set_cloexec().

Yes, probably. And in fact, I think this shouldn't even be conditional
on flags & O_CLOEXEC. The whole reason qemu_open() was introduced
originally was to always set O_CLOEXEC.

Kevin




More information about the libvir-list mailing list