Squelch 'eof from qemu monitor' error on normal VM shutdown

Jim Fehlig jfehlig at suse.com
Thu Sep 30 22:17:14 UTC 2021


On 9/30/21 11:24, Daniel P. Berrangé wrote:
> On Thu, Sep 30, 2021 at 11:15:18AM -0600, Jim Fehlig wrote:
>> On 9/29/21 21:29, Jim Fehlig wrote:
>>> Hi All,
>>>
>>> Likely Christian received a bug report that motivated commit aeda1b8c56,
>>> which was later reverted by Michal with commit 72adaf2f10. In the past,
>>> I recall being asked about "internal error: End of file from qemu
>>> monitor" on normal VM shutdown and gave a hand-wavy response using some
>>> of Michal's words from the revert commit message.
>>>
>>> I recently received a bug report (sorry, but no public link) from a
>>> concerned user about this error and wondered if there is some way to
>>> improve it? I went down some dead ends before circling back to
>>> Christian's patch. When rebased to latest master, I cannot reproduce the
>>> hangs reported by Michal [1]. Perhaps Nikolay's series to resolve
>>> hangs/crashes of libvirtd [2] has now made Christian's patch viable?
>>
>> Hmm, Nikolay's series improves thread management at daemon shutdown and
>> doesn't touch VM shutdown logic. But there has been some behavior change
>> from the time aeda1b8c56 was committed (3.4.0 dev cycle) to current git
>> master. At least I don't see libvirtd hanging when running Michal's test on
>> master + rebased aeda1b8c56.
> 
> This particular "eof" error message has been a source of never ending
> complaints from people who mistakenly think it indicates a significant
> problem.

Nod. I've probably been asked about it more times than I can remember. "error" 
and "fail" often catch the attention of users and monitoring tools.

> I've always been wary of hiding it by default as there are potentially
> scenarios in which it is interesting to see. I think I'm coming around
> to the idea though that we're better off hiding it by default. The
> scenarios which care about it will probably already need the user to
> contribute full debug level logs in order to diagnose properly.

I've started a variant of Michal's test on git master + rebased aeda1b8c56. The 
test starts 6 VMs, waits 90 seconds, shuts them down in parallel, waits for the 
shutdowns to finish, then repeats, all while another client continuously calls 
GetAllDomainStats. I'll see how that holds up before re-proposing Christian's patch.
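
For anyone who wants to run something similar, here is a minimal sketch of that 
kind of loop. It is not the actual test script. It assumes the libvirt Python 
bindings (libvirt-python) and six already-defined domains; the domain names, the 
URI default, the cycle count, and the polling intervals are all hypothetical.

```python
import threading
import time


def run_cycle(domains, settle_seconds=90, poll_interval=1.0, sleep=time.sleep):
    """Start all domains, wait, shut them down in parallel, wait until all are off."""
    for dom in domains:
        if not dom.isActive():
            dom.create()  # boot an already-defined domain
    sleep(settle_seconds)

    # One thread per domain so the shutdown requests overlap.
    threads = [threading.Thread(target=dom.shutdown) for dom in domains]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # shutdown() only asks the guest to power off; poll until it really has.
    while any(dom.isActive() for dom in domains):
        sleep(poll_interval)


def poll_stats(conn, stop_event, interval=0.1, sleep=time.sleep):
    """Call getAllDomainStats repeatedly until stop_event is set; return the call count."""
    calls = 0
    while not stop_event.is_set():
        conn.getAllDomainStats()
        calls += 1
        sleep(interval)
    return calls


def main(uri="qemu:///system", cycles=10):
    import libvirt  # imported here so the helpers above work without the bindings

    names = [f"test-vm-{i}" for i in range(1, 7)]  # hypothetical domain names
    conn = libvirt.open(uri)
    stats_conn = libvirt.open(uri)  # second connection stands in for "another client"
    stop = threading.Event()
    poller = threading.Thread(target=poll_stats, args=(stats_conn, stop))
    poller.start()
    try:
        domains = [conn.lookupByName(name) for name in names]
        for _ in range(cycles):
            run_cycle(domains)
    finally:
        stop.set()
        poller.join()
        stats_conn.close()
        conn.close()
```

Calling main() runs the cycles against a live libvirtd. The stats poller uses its 
own connection so that it competes with the shutdown path the way a separate 
client would.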

Regards,
Jim




