[libvirt] [PATCHv3 1/2] util: refactor virFileOpenAs

Mon Jan 30 06:43:25 UTC 2012

On 01/27/2012 02:20 PM, Eric Blake wrote:
> On 01/27/2012 10:43 AM, Laine Stump wrote:
>> virFileOpenAs previously would only try opening a file as the current
>> user, or as a different user, but wouldn't try both methods in a
>> single call. This made it cumbersome to use as a replacement for
>> open(2). Additionally, it had a lot of historical baggage that led to
>> it being difficult to understand.
>>
>> This patch refactors virFileOpenAs in the following ways:
>>
>> V3 differences - Eric Blake wondered if fchmod() issued by uid 0 of a
>> file opened as a different user on a root-squashed NFS would actually
>> work. It turns out it won't, so I put fchmod back into the child
>> process. And since there are already so many other pieces of code
>> related with this that would try to log a message on failure
>> (e.g. virSetUIDGID()), I backed off on that as well, and have put the
>> log messages in child process code back in - that's a bigger problem
>> for another day.
> Sometimes it seems like those big nasty problems are there until someone
> actually hits them in practice.  Oh well.  Even if we don't make it 100%
> perfect, I still think that your rewrite makes it easier to understand,
> and therefore easier to maintain, should we come across the day where we
> need to fix the nasty problems.
>
> And for anyone in the future re-reading this thread while trying to fix
> the problems (hello there!), here's an idea - virSetUIDGID should be
> broken into two functions, virSetUIDGIDRaw, which only uses async-safe
> functions and returns -errno on failure, and virSetUIDGID that formats
> errors into log message for normal users.
>
>>       } else {
>> -        if ((fd = virFileOpenAs(path, oflags, S_IRUSR | S_IWUSR,
>> -                                uid, gid, 0))<  0) {
>> +        if ((fd = virFileOpenAs(path, oflags, S_IRUSR | S_IWUSR, uid, gid,
>> +                                vfoflags | VIR_FILE_OPEN_NOFORK))<  0) {
>>               /* If we failed as root, and the error was permission-denied
>>                  (EACCES or EPERM), assume it's on a network-connected share
>>                  where root access is restricted (eg, root-squashed NFS). If the
>> @@ -2385,7 +2387,7 @@ qemuOpenFile(struct qemud_driver *driver, const char *path, int oflags,
>>               if ((fd = virFileOpenAs(path, oflags,
>>                                       S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP,
>>                                       driver->user, driver->group,
>> -                                    VIR_FILE_OPEN_AS_UID))<  0) {
>> +                                    vfoflags | VIR_FILE_OPEN_FORK))<  0) {
> You weren't kidding about the difference in 0600 vs. 0660 permissions
> between the two attempts.  My gut feel is that we want 0660 (that's why
> we invented virSetUIDGID in the first place - people like to use group
> permissions and membership in multiple groups as a reasonable form of
> access management); but I also agree with your decision to keep this
> patch semantically unchanged, and leave any permissions modifications
> for a later patch in isolation.
>
>> -/* return -errno on failure, or 0 on success */
>> +/* virFileOpenForked() - an internal utility function called only by
>> + * virFileOpenAs(). It forks, then the child does setuid+setgid to
>> + * given uid:gid and attempts to open the file, while the parent just
>> + * calls recvfd to get the open fd back from the child. returns the
>> + * fd, or -errno if there is an error. */
>>   static int
>> -virFileOpenAsNoFork(const char *path, int openflags, mode_t mode,
>> -                    uid_t uid, gid_t gid, unsigned int flags)
> Here, I'll just paste the code post-patch and review that in isolation,
> rather than caring about the delta:
>
>> /* virFileOpenForked() - an internal utility function called only by
>>   * virFileOpenAs(). It forks, then the child does setuid+setgid to
>>   * given uid:gid and attempts to open the file, while the parent just
>>   * calls recvfd to get the open fd back from the child. returns the
>>   * fd, or -errno if there is an error. */
>> static int
>> virFileOpenForked(const char *path, int openflags, mode_t mode,
>>                    uid_t uid, gid_t gid, unsigned int flags)
>> {
>>      pid_t pid;
>>      int waitret, status, ret = 0;
>>      int fd = -1;
>>      int pair[2] = { -1, -1 };
>>      int forkRet;
>>
>>      /* parent is running as root, but caller requested that the
>>       * file be created as some other user and/or group). The
> Comment is out-of-date; s/created/opened/
>
>>       * following dance avoids problems caused by root-squashing
>>       * NFS servers. */
>>
>>      if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair)<  0) {
>>          ret = -errno;
>>          virReportSystemError(errno,
>>                               _("failed to create socket needed for '%s'"),
>>                               path);
>>          return ret;
>>      }
>>
>>      forkRet = virFork(&pid);
>>      if (pid<  0)
>>          return -errno;
>>
>>      if (pid) { /* parent */
>>          VIR_FORCE_CLOSE(pair[1]);
>>
>>          do {
>>              fd = recvfd(pair[0], 0);
>>          } while (fd<  0&&  errno == EINTR);
>>          VIR_FORCE_CLOSE(pair[0]); /* NB: this preserves errno */
>>
>>          if (fd<  0&&  errno != EACCES) {
>>              ret = -errno;
>>              while (waitpid(pid, NULL, 0) == -1&&  errno == EINTR);
> Hmm, this might do better as virPidAbort(pid).
>
>>              return ret;
>>          }
>>
> And from here...
>
>>          /* wait for child to complete, and retrieve its exit code */
>>          while ((waitret = waitpid(pid,&status, 0) == -1)
>>                 &&  (errno == EINTR));
>>          if (waitret == -1) {
>>              ret = -errno;
>>              virReportSystemError(errno,
>>                                   _("failed to wait for child creating '%s'"),
>>                                   path);
>>              VIR_FORCE_CLOSE(fd);
>>              return ret;
>>          }
>>          if (!WIFEXITED(status) || (ret = -WEXITSTATUS(status)) == -EACCES ||
>>              fd == -1) {
> ...to here, we could probably replace a good chunk of code with the
> simpler virPidWait(pid,&status).  But both of those should be separate
> cleanups, as this part of the function was not touched by your
> refactoring of the public interface.

Agreed.

>>              /* fall back to the simpler method, which works better in
>>               * some cases */
>>              VIR_FORCE_CLOSE(fd);
>>              if ((fd = open(path, openflags, mode))<  0)
> I would tweak things to only attempt this if !(flags&
> VIR_FILE_OPEN_NOFORK).  That is, if NOFORK was requested, the we already
> tried this open once, before even deciding to call this helper function,
> so since it failed once it will fail again.  Only if NOFORK was not
> requested is it worth this fallback in the parent, but then you have the
> issue that the parent needs to honor fchmod() somehow.

Right.  I think I've decided that this bit of code (trying the open 
un-forked after trying it forked fails) is really an artifact of the old 
code that I don't want to keep around; the only existing code that took 
advantage of it in a useful way was virStorageBackendCreateRaw, and now 
that virFileOpenAs is setup to "officially" try both ways in a single 
call, that can actually be served *better* by just letting virFileOpenAs 
try both in the normal order.

However, I agree with you that, since it's a semantic change, I will 
make that a separate patch.

(The only situation I can think of where trying open() as root first 
would cause problems would be if there was a directory on root-squashed 
NFS where user nobody was allowed to create files - the initial open() 
as root would succeed, leaving a file owned by "nobody", and we would 
then be unable to chown it to "qemu". I don't think I've ever heard of 
anyone in their right mind giving user nobody write access to anything 
important on an NFS share, so this doesn't seem like a scenario we need 
to worry about.)

>>                  return -errno;
>>          }
>>          return fd;
>>      }
>>
>>      /* child */
>>
>>      VIR_FORCE_CLOSE(pair[0]); /* preserves errno */
>>      if (forkRet<  0) {
>>          /* error encountered and logged in virFork() after the fork. */
>>          ret = -errno;
>>          goto childerror;
>>      }
>>
>>      /* set desired uid/gid, then attempt to create the file */
>>
>>      if (virSetUIDGID(uid, gid)<  0) {
>>          ret = -errno;
>>          goto childerror;
>>      }
>>
>>      if ((fd = open(path, openflags, mode))<  0) {
>>          ret = -errno;
>>          virReportSystemError(errno, _("child process failed to create file '%s'"),
>>                               path);
> Repeating myself, logging in the child needs fixing someday, but it is
> no worse than the code you were refactoring, so I'm okay with leaving
> things as they were and keeping this patch focused on the interface.
>
>>          goto childerror;
>>      }
>>
>>      /* File is successfully open. Set permissions if requested. */
>>      if ((flags&  VIR_FILE_OPEN_FORCE_MODE)
>>          &&  (fchmod(fd, mode)<  0)) {
>>          ret = -errno;
>>          virReportSystemError(errno,
>>                               _("child process cannot set mode of '%s' to %04o"),
>>                               path, mode);
>>          goto childerror;
>>      }
>>
>>      do {
>>          ret = sendfd(pair[1], fd);
>>      } while (ret<  0&&  errno == EINTR);
>>
>>      if (ret<  0) {
>>          ret = -errno;
>>          virReportSystemError(errno, "%s",
>>                               _("child process failed to send fd to parent"));
>>          goto childerror;
>>      }
>>
>> childerror:
>>      /* ret tracks -errno on failure, but exit value must be positive.
>>       * If the child exits with EACCES, then the parent tries again.  */
>>      /* XXX This makes assumptions about errno being<  255, which is
>>       * not true on Hurd.  */
>>      VIR_FORCE_CLOSE(pair[1]);
>>      if (ret<  0) {
>>          VIR_FORCE_CLOSE(fd);
>>      }
>>      ret = -ret;
>>      if ((ret&  0xff) != ret) {
>>          VIR_WARN("unable to pass desired return value %d", ret);
>>          ret = 0xff;
>>      }
>>      _exit(ret);
> If we do any cleanups to fix that XXX comment, they can also be as
> separate patches.
>
>> }
>>
>> /**
>>   * virFileOpenAs:
>>   * @path: file to open or create
>>   * @openflags: flags to pass to open
>>   * @mode: mode to use on creation or when forcing permissions
>>   * @uid: uid that should own file on creation
>>   * @gid: gid that should own file
>>   * @flags: bit-wise or of VIR_FILE_OPEN_* flags
>>   *
>>   * Open @path, and return an fd to the open file. @openflags are the flags
>>   * normally passed to open(2), while @flags is used internally. If
> s/is/are/
>
>>   * @flags includes VIR_FILE_OPEN_NOFORK, then try opening the file
>>   * while executing with the current uid:gid (i.e. don't
>>   * fork+setuid+setgid before the call to open()).  If @flags includes
>>   * VIR_FILE_OPEN_FORK, then try opening the file while the effective
>>   * user id is @uid (by forking a child process); this allows one to
>>   * bypass root-squashing NFS issues ; NOFORK is always tried before
> s/ ;/;/
>
>>   * FORK (the absence of both flags is treated identically to
>>   * (VIR_FILE_OPEN_NOFORK | VIR_FILE_OPEN_FORK)). If @flags includes
>>   * VIR_FILE_OPEN_FORCE_OWNER, then ensure that @path is owned by
>>   * uid:gid before returning (even if it already existed with a
>>   * different owner). If @flags includes VIR_FILE_OPEN_FORCE_MODE,
>>   * ensure it has those permissions before returning (again, even if
>>   * the file already existed with different permissions).  The return
>>   * value (if non-negative) is the file descriptor, left open.  Returns
>>   * -errno on failure.  */
>> int
>> virFileOpenAs(const char *path, int openflags, mode_t mode,
>>                uid_t uid, gid_t gid, unsigned int flags)
>> {
>>      int ret = 0, fd = -1;
>>
>>      /* allow using -1 to mean "current value" */
>>      if (uid == (uid_t) -1)
>>         uid = getuid();
>>      if (gid == (gid_t) -1)
>>         gid = getgid();
>>
>>      /* treat absence of both flags as presence of both for simpler
>>       * calling. */
>>      if (!(flags&  (VIR_FILE_OPEN_NOFORK|VIR_FILE_OPEN_FORK)))
>>         flags |= VIR_FILE_OPEN_NOFORK|VIR_FILE_OPEN_FORK;
>>
>>      if ((flags&  VIR_FILE_OPEN_NOFORK)
>>          || (getuid() != 0)
>>          || ((uid == 0)&&  (gid == 0))) {
>>
>>          if ((fd = open(path, openflags, mode))<  0) {
>>              ret = -errno;
>>          } else {
>>              if (flags&  VIR_FILE_OPEN_FORCE_OWNER) {
>>                  /* successfully opened as current user. Try fchown if
>>                   * appropriate. */
>>                  struct stat st;
>>
>>                  if (fstat(fd,&st) == -1) {
>>                      ret = -errno;
>>                      virReportSystemError(errno, _("stat of '%s' failed"), path);
>>                      goto error;
>>                  }
>>                  if (((st.st_uid != uid) || (st.st_gid != gid))
> Safe, since we know uid and gid are not -1 at this point.  (It took me
> two tries to pick up on that point, though, so adding a comment might be
> useful).

Well, about 20 lines further up, the documentation (otherwise know as 
"source code") says "if (uid == -1) uid = getuid()". For me, that's 
pretty close to English :-P

>>                      &&  (fchown(fd, uid, gid)<  0)) {
>>                      ret = -errno;
>>                      virReportSystemError(errno, _("cannot chown '%s' to (%u, %u)"),
>>                                           path, (unsigned int) uid,
>>                                           (unsigned int) gid);
>>                      goto error;
>>                  }
>>              }
>>              /* File is successfully open. Set permissions if requested. */
>>              if ((flags&  VIR_FILE_OPEN_FORCE_MODE)
>>                  &&  (fchmod(fd, mode)<  0)) {
> I'm wondering if we should hoist the fstat, and only attempt the fchmod
> if needed.  That is, arrange the logic:
>
> } else { // fd is open
>      if (flags&  (force_owner | force_mode)) {
>          if fstat fails
>              error
>          if (flags&  force_owner&&  owner is wrong) {
>              if fchown fails
>                  error
>          }
>          if (flags&  force_mode&&  mode is wrong) {
>              if fchmod fails
>                  error
>          }
>      }
> }

Yeah, I agree on that one. What's in there is just another artifact of 
the old code that got moved around.

>>                  ret = -errno;
>>                  virReportSystemError(errno,
>>                                       _("cannot set mode of '%s' to %04o"),
>>                                       path, mode);
>>                  goto error;
>>              }
>>          }
>>      }
>>
>>      /* If we either 1) didn't try opening as current user at all, or
>>       * 2) failed, and errno/virStorageFileIsSharedFS indicate we might
>>       * be successful if we try as a different uid, then try doing
>>       * fork+setuid+setgid before opening.
>>       */
>>      if ((fd<  0)&&  (flags&  VIR_FILE_OPEN_FORK)) {
>>
>>          if (ret<  0) {
>>              /* An open(2) that failed due to insufficient permissions
>>               * could return one or the other of these depending on OS
>>               * version and circumstances. Any other errno indicates a
>>               * problem that couldn't be remedied by fork+setuid
>>               * anyway. */
>>              if (ret != -EACCES&&  ret != -EPERM)
>>                  goto error;
>>
>>              /* On Linux we can also verify the FS-type of the
>>               * directory.  (this is a NOP on other platforms). */
>>              switch (virStorageFileIsSharedFS(path)) {
>>              case 1:
>>                  /* it was on a network share, so we'll re-try */
>>                  break;
>>              case -1:
>>                  /* failure detecting fstype */
>>                  virReportSystemError(errno, _("couldn't determine fs type of"
>>                                                "of mount containing '%s'"), path);
>>                  goto error;
>>              case 0:
>>              default:
>>                  /* file isn't on a recognized network FS */
>>                  goto error;
>>              }
>>          }
>>
>>          /* passed all prerequisites - retry the open w/fork+setuid */
>>          if ((fd = virFileOpenForked(path, openflags, mode, uid, gid, flags))<  0) {
>>              ret = fd;
>>              fd = -1;
>>              goto error;
> Hmm, this means that force_mode has to be done in the helper function.
>
> I'm almost thinking you should move the fchmod/fchown code into a helper
> routine, and call it in two places:
> in virFileOpenForked, in the child, after fd is successfully opened
>
> in virFileOpenAs, _after_ you have a working fd (whether from NOFORK or
> FORK code paths)
>
> and if the fchmod/fchown code is a no-op when modes are already correct,
> then you won't have issues with root-squash being unable to fchmod in
> the parent.

I'm making a helper function, but in order to make the code more 
understandable when I later eliminate that extra "last-ditch" open(), 
I'm calling it from 3 different places - in each case immediately after 
the open() succeeds.

>>          }
>>      }
>>
>>      /* File is successfully opened */
>>      return fd;
>>
>> error:
>>      if (fd<  0) {
>>          /* whoever failed the open last has already set ret = -errno */
>>          virReportSystemError(-ret, openflags&  O_CREAT
>>                               ? _("failed to create file '%s'")
>>                               : _("failed to open file '%s'"),
>>                               path);
>>      } else {
>>          /* some other failure after the open succeeded */
>>          VIR_FORCE_CLOSE(fd);
> Hmm, should we unlink() the file if O_CREAT was specified and we were
> the ones that ended up creating it?  That's a tough call.

I greatly prefer APIs that either complete the entire task they were 
given, or leave the universe unchanged aside from returning an error 
code. The problem in this case is that (if the open was done in a child 
process), we can't always know whether it was us who created the file - 
in many cases the parent will be unable to successfully stat() the file 
beforehand even if it did already exist, so the only way to know for 
sure is if 1) a stat() of the file failed prior to a successful call to 
open() in the parent, or 2) if O_EXCL is also specified, and open() is 
successful (in the child or parent). (I guess there is also 3) if stat() 
of the file in the child failed prior to a successful open() in the 
child, but there's currently no way for the child to indicate that to 
the parent). Otherwise we're just taking a guess.

(I noticed the "needUnlink" stuff in qemuOpenFile, and actually ended up 
with the feeling that it couldn't get the answer 100% correct either)

>>      }
>>      return ret;
>> }
>>
> I still think there's a bit of work needed; are you up for trying a v4,
> or should I try it on top of what you have?

I'm going to post a v4 which basically addresses the issues you raised 
without saying "should be in a separate patch". If you don't like that 
version, maybe it will be time to eliminate more back-forth by having 
the next version from you.