[PATCH] libvirt-guests.sh: fix log output with new systemd

Daniel P. Berrangé berrange at redhat.com
Thu Apr 30 17:59:26 UTC 2020


On Thu, Apr 30, 2020 at 07:36:54PM +0200, Christian Ehrhardt wrote:
> On Thu, Apr 30, 2020 at 7:23 PM Daniel P. Berrangé <berrange at redhat.com> wrote:
> >
> > On Thu, Apr 30, 2020 at 06:28:08PM +0200, Christian Ehrhardt wrote:
> > > On Thu, Apr 30, 2020 at 5:10 PM Daniel P. Berrangé <berrange at redhat.com> wrote:
> > > >
> > > > On Thu, Apr 30, 2020 at 04:58:25PM +0200, Christian Ehrhardt wrote:
> > > > > On Thu, Apr 30, 2020 at 2:51 PM Daniel P. Berrangé <berrange at redhat.com> wrote:
> > > > > Well it seems I have a reproducible symptom and a fix, but not the
> > > > > explanation why the latter fixes the former.
> > > > > I'll need to dive into some debug & analysis myself to explain it better.
> > > > > I'll be back here once I got time to do that in depth check.
> > > > >
> > > > > Until then whoever is affected (should be everyone) can give it a
> > > > > thought as well.
> > > > > Repro is as easy as
> > > > > One console:
> > > > > $ journalctl -f -u libvirt-guests
> > > > > Other console:
> > > > > $ systemctl stop libvirt-guests
> > > > > $ systemctl start libvirt-guests
> > > > >
> > > > > I see it with 245.4-4ubuntu3 (18.04) I'm not seeing it on
> > > > > 237-3ubuntu10.39 (20.04).
> > > > > Maybe it is a systemd bug after all?
> > > > > I'd be interested to hear if that is/isn't clobbering output for
> > > > > anyone else and what your systemd versions are?
> > > >
> > > > FWIW, it works correctly on Fedora 31 with systemd 243.
> > >
> > > Eoan with 242-7ubuntu3.7 is good as well.
> > > I might need to try to get some interim versions from somewhere.
> >
> > I've reproduced on Fedora 33 rawhide with systemd 245 - the first
> > place where it lists running guests is screwed up slightly:
> >
> > Apr 30 17:16:13 libvirt-fedora-rawhide systemd[1]: Stopping Suspend/Resume Running libvirt Guests...
> > Apr 30 17:16:13 libvirt-fedora-rawhide libvirt-guests.sh[69903]: Running guests on default URI:
> > Apr 30 17:16:13 libvirt-fedora-rawhide libvirt-guests.sh[69892]: Runningcore2
> > Apr 30 17:16:13 libvirt-fedora-rawhide libvirt-guests.sh[69934]: Suspending guests on default URI...
> > Apr 30 17:16:13 libvirt-fedora-rawhide libvirt-guests.sh[69892]: SSuspending core1: ...
> > Apr 30 17:16:14 libvirt-fedora-rawhide libvirt-guests.sh[69892]: Suspending core1: done
> > Apr 30 17:16:14 libvirt-fedora-rawhide libvirt-guests.sh[69892]: Suspending core2: ...
> > Apr 30 17:16:15 libvirt-fedora-rawhide libvirt-guests.sh[69892]: Suspending core2: done
> > Apr 30 17:16:15 libvirt-fedora-rawhide systemd[1]: libvirt-guests.service: Succeeded.
> > Apr 30 17:16:15 libvirt-fedora-rawhide systemd[1]: Stopped Suspend/Resume Running libvirt Guests.
> >
> > On resume it is even worse
> >
> > Apr 30 17:19:40 libvirt-fedora-rawhide systemd[1]: Starting Suspend/Resume Running libvirt Guests...
> > Apr 30 17:19:40 libvirt-fedora-rawhide libvirt-guests.sh[70041]: Resuming guests on default URI...
> > Apr 30 17:19:40 libvirt-fedora-rawhide libvirt-guests.sh[70030]: R
> > Apr 30 17:19:40 libvirt-fedora-rawhide libvirt-guests.sh[70048]: R
> > Apr 30 17:19:41 libvirt-fedora-rawhide libvirt-guests.sh[70048]: esuming guest core1:
> > Apr 30 17:19:41 libvirt-fedora-rawhide libvirt-guests.sh[70079]: esum
> > Apr 30 17:19:41 libvirt-fedora-rawhide libvirt-guests.sh[70030]: e
> > Apr 30 17:19:41 libvirt-fedora-rawhide libvirt-guests.sh[70086]: e
> > Apr 30 17:19:42 libvirt-fedora-rawhide libvirt-guests.sh[70086]: esuming guest core2:
> > Apr 30 17:19:42 libvirt-fedora-rawhide libvirt-guests.sh[70119]: esum
> > Apr 30 17:19:42 libvirt-fedora-rawhide libvirt-guests.sh[70030]: e
> > Apr 30 17:19:42 libvirt-fedora-rawhide systemd[1]: Finished Suspend/Resume Running libvirt Guests.
> >
> >
> > It is possible it isn't systemd related - could be other packages that
> > are co-incidentally affecting it.
> 
> Since launchpad holds all former builds still ready for download I
> could easily test a few versions.
> 
> I up/downgraded just the following packages on an otherwise unmodified system:
> - libnss-systemd
> - libpam-systemd
> - libsystemd0
> - systemd
> - systemd-container
> - systemd-sysv
> 
> Those versions I did check:
> 
> 245.4-4ubuntu3 bad
> 245.2-1ubuntu1 bad
> 244.3-1ubuntu1 bad
> 244.1-0ubuntu3 bad (bad on retry)
> 243-3ubuntu1 good (good (on retry)
> 242-7ubuntu3 good
> 
> Retry means that I went and installed from good -> bad -> good and the
> behavior was the same.
> So none of the installs fixed it to then be good forever. It was
> consistently good <244 and bad >=244.

Based on that, a possibly interesting change between v243..v244 was
this:

commit 09d0b46ab61bebafe5bdc1be95ee153dfb13d6bc
Author: Lorenz Bauer <lmb at cloudflare.com>
Date:   Mon Nov 4 16:35:46 2019 +0000

    journal: refresh cached credentials of stdout streams
    
    journald assumes that getsockopt(SO_PEERCRED) correctly identifies the
    process on the remote end of the socket. However, this is incorrect
    according to man 7 socket:
    
        The returned  credentials  are  those that were in effect at the
        time of the call to connect(2) or socketpair(2).
    
    This becomes a problem when a new process inherits the stdout stream
    from a parent. First, log messages from the child process will
    be attributed to the parent. Second, the struct ucred used by journald
    becomes invalid as soon as the parent exits. Further sendmsg calls then
    fail with ENOENT. Logs for the child process then vanish from the journal.
    
    Fix this by using recvmsg on the stdout stream, and refreshing the cached
    struct ucred if SCM_CREDENTIALS indicate a new process.
    

but I can't see an obvious mistake the code which would cause the
behaviour we see.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




More information about the libvir-list mailing list