[libvirt] [PATCH 05/14] Rewrite pidfile handling to be crash safe
Daniel P. Berrange
berrange at redhat.com
Mon Jul 11 09:30:21 UTC 2011
On Fri, Jul 08, 2011 at 04:33:28PM -0600, Eric Blake wrote:
> On 07/07/2011 08:17 AM, Daniel P. Berrange wrote:
> > From: "Daniel P. Berrange" <berrange at redhat.com>
> >
> > If libvirtd crashes then the pidfile may not be cleaned up,
> > making a later restart fail, even though the original process
> > no longer exists.
> >
> > Instead of simply using file creation as the test for successful
> > pidfile acquisition, actually take out a lock on the pidfile.
> >
> > To avoid races, after locking the pidfile, verify that the
> > inode hasn't changed.
> >
> > Also make sure the unprivileged libvirtd now acquires the
> > pidfile, instead of relying on the UNIX socket to ensure
> > mutual exclusion
>
> Cool idea.
>
> > -static int daemonWritePidFile(const char *pidFile, const char *argv0)
> > -{
> > +static int
> > +daemonAcquirePidFile(const char *argv0, const char *pidFile) {
>
> Why'd you hoist the { to the previous line? Our convention has been
> (slowly) leaning towards a function body starting on column 1.
>
> > int fd;
> > - FILE *fh;
> > char ebuf[1024];
> > + unsigned long long pid = (unsigned long long)getpid();
>
> See my comments in an earlier thread about our existing assumptions that
> pid_t fits in int. In fact, I would be okay with:
>
> verify(sizeof(pid_t) <= sizeof(int));
>
> int pid = getpid();
> char pidstr[INT_BUFSIZE_BOUND(pid)];
>
> ...
> snprintf(pidstr, sizeof(pidstr), "%u", pid);
>
>
> > + char pidstr[INT_BUFSIZE_BOUND(pid)];
> >
> > if (pidFile[0] == '\0')
> > return 0;
> >
> > - if ((fd = open(pidFile, O_WRONLY|O_CREAT|O_EXCL, 0644)) < 0) {
> > - VIR_ERROR(_("Failed to open pid file '%s' : %s"),
> > - pidFile, virStrerror(errno, ebuf, sizeof ebuf));
> > - return -1;
> > - }
> > + while (1) {
> > + struct stat a, b;
> > + if ((fd = open(pidFile, O_WRONLY|O_CREAT, 0644)) < 0) {
> > + VIR_ERROR(_("%s: Failed to open pid file '%s' : %s"),
> > + argv0, pidFile, virStrerror(errno, ebuf, sizeof ebuf));
> > + return -1;
> > + }
> > +
> > + if (fstat(fd, &b) < 0) {
> > + VIR_ERROR(_("%s: Pid file '%s' disappeared: %s"),
>
> Misleading error message. fstat can indeed fail (although such failures
> are rare), but not because a file disappeared - after all, you have an
> fd open to the file.
>
> > - if (fprintf(fh, "%lu\n", (unsigned long)getpid()) < 0) {
> > - VIR_ERROR(_("%s: Failed to write to pid file '%s' : %s"),
> > - argv0, pidFile, virStrerror(errno, ebuf, sizeof ebuf));
> > - VIR_FORCE_FCLOSE(fh);
> > - return -1;
> > - }
> > + snprintf(pidstr, sizeof(pidstr), "%llu", pid);
> >
> > - if (VIR_FCLOSE(fh) == EOF) {
> > - VIR_ERROR(_("%s: Failed to close pid file '%s' : %s"),
> > - argv0, pidFile, virStrerror(errno, ebuf, sizeof ebuf));
> > - return -1;
> > + if (safewrite(fd, pidstr, strlen(pidstr)) < 0) {
>
> Nice conversion from FILE* to write().
>
> > @@ -1436,6 +1453,12 @@ int main(int argc, char **argv) {
> > umask(old_umask);
> > }
> >
> > + /* Try to claim the pidfile, existing if we can't */
>
> s/existing/exiting/
>
> > + if ((pid_file_fd = daemonAcquirePidFile(argv[0], pid_file)) < 0) {
> > + ret = VIR_DAEMON_ERR_PIDFILE;
> > + goto cleanup;
> > + }
> > +
> > if (!(srv = virNetServerNew(config->min_workers,
> > config->max_workers,
> > config->max_clients,
> > @@ -1570,8 +1593,10 @@ cleanup:
> > }
> > VIR_FORCE_CLOSE(statuswrite);
> > }
> > - if (pid_file)
> > - unlink (pid_file);
> > + if (pid_file_fd != -1) {
> > + unlink(pid_file);
> > + VIR_FORCE_CLOSE(pid_file_fd);
>
> Swap these two lines. Not that flock (or even libvirtd) works on mingw,
> but mingw doesn't like unlink() on an open fd.
If you swap those two lines you open up a (small) race condition
in the locking scheme where you could unlink a pidfile that has now
been claimed by someone else. By unlinking while we still have it
open, we know we still hold the fcntl() lock.
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
More information about the libvir-list
mailing list