[libvirt] [PATCH 3/3] virCommand: use procfs to learn opened FDs

Michal Prívozník mprivozn at redhat.com
Sat Jul 13 07:41:59 UTC 2019


On 7/12/19 6:55 PM, Eric Blake wrote:
> On 7/3/19 2:19 AM, Michal Privoznik wrote:
>> When spawning a child process, between fork() and exec() we close
>> all file descriptors and keep only those the caller wants us to
>> pass onto the child. The problem is how we do that. Currently, we
>> get the limit of opened files and then iterate through each one
>> of them and either close() it or make it survive exec(). This
>> approach is suboptimal (although, not that much in default
>> configurations where the limit is pretty low - 1024). We have
>> /proc where we can learn what FDs we hold open and thus we can
>> selectively close only those.
>>
>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>> ---
>>  src/util/vircommand.c | 78 ++++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 70 insertions(+), 8 deletions(-)
> 
>> +# ifdef __linux__
>> +/* On Linux, we can utilize procfs and read the table of opened
>> + * FDs and selectively close only those FDs we don't want to pass
>> + * onto child process (well, the one we will exec soon since this
>> + * is called from the child). */
>> +static int
>> +virCommandMassCloseGetFDsLinux(virCommandPtr cmd ATTRIBUTE_UNUSED,
>> +                               virBitmapPtr fds)
>> +{
>> +    DIR *dp = NULL;
>> +    struct dirent *entry;
>> +    const char *dirName = "/proc/self/fd";
>> +    int rc;
>> +    int ret = -1;
>> +
>> +    if (virDirOpen(&dp, dirName) < 0)
>> +        return -1;
> 
> Unfortunately, POSIX says that opendir()/readdir() are not async-safe
> (the list of async-safe functions is
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04;
> some implementations of opendir() call malloc(), and malloc() is
> definitely not async-safe - if you ever fork() in one thread while
> another thread is locked inside a malloc, your child process is
> deadlocked if it has to malloc because the forked process no longer has
> the thread to release the malloc lock).  It also says readdir() is not
> threadafe
> (https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09)
> 
> Therefore, opendir() (hidden in virDirOpen) and readdir() (hidden in
> virDirRead) are generally unsafe to be used between fork() of a
> multi-threaded app and the subsequent exec(). But maybe you are safe
> because this code is only compiled on Linux? Are we absolutely sure this
> can't deadlock, in spite of violating POSIX constraints on async-safety?
> 
> http://austingroupbugs.net/view.php?id=696 points out some interesting
> problems from the POSIX side - readdir_r() is absolutely broken, and
> readdir() being marked as thread-unsafe may be too strict.  But then
> again, http://austingroupbugs.net/view.php?id=75 states
>    "The application shall not modify the structure to which the
>      return value of readdir() points, nor any storage areas pointed to
>      by pointers within the structure. The returned pointer, and
>      pointers within the structure, might be invalidated or the
>      structure or the storage areas might be overwritten by a
>      subsequent call to readdir() on the same directory stream.
>      They shall not be affected by a call to readdir() on a different
>      directory stream."
> where it may be the intent that if opendir() doesn't take any locks or
> other async-unsafe actions, using opendir() after fork() may be safe
> after all (readdir on a DIR* that was opened in the parent is not, but
> readdir() on a fresh DIR* opened in the child might be safe, even if not
> currently strictly portable).

D'oh. POSIX and its limitations. I could allocate the bitmap in the
parent, sure. But avoiding opendir() is not possible (that's the whole
point of this patch). While testing this - on Linux - I did not run into
any deadlock, but maybe I was just lucky. On the other hand, this is
going to run only on Linux, on anything else we'll still iterate over
all FDs.

I find it rather surprising (in a disturbing way) that there is no
better solution to this problem than iterating over all FDs and closing
them. One by one.

Michal




More information about the libvir-list mailing list