[PATCH 6/7] util: Check for pkttyagent availability properly

Martin Kletzander mkletzan at redhat.com
Sun Dec 12 21:15:58 UTC 2021


On Sun, Dec 12, 2021 at 10:40:46AM -0700, Jim Fehlig wrote:
>On 12/11/21 03:28, Martin Kletzander wrote:
>> On Sat, Dec 11, 2021 at 11:16:13AM +0100, Martin Kletzander wrote:
>>> On Fri, Dec 10, 2021 at 05:48:03PM -0700, Jim Fehlig wrote:
>>>> Hi Martin!
>>>>
>>>> I recently received a bug report (sorry, not public) about simple operations
>>>> like 'virsh list' hanging when invoked with an internal test tool. I found this
>>>> commit to be the culprit.
>>>>
>>
>> OK, one more thing though, the fact that pkttyagent is spawned cannot
>> cause virsh to hang.  If the authentications is not required, then it
>> will just wait there for a while and then be killed.  If authentication
>> *is* required, then either you already have an agent running and that
>> one should be used since we're starting pkttyagent with `--fallback` or
>> you do not have any agent running in which case virsh list would fail
>> to connect.  Where does the virsh hang, what's the backtrace?
>
>The last scenario you describe appears to be the case. virsh fails to connect
>then gets stuck trying to kill off pkttyagent
>
>#0  0x00007f9f07530241 in clock_nanosleep at GLIBC_2.2.5 () from /lib64/libc.so.6
>#1  0x00007f9f07535ad3 in nanosleep () from /lib64/libc.so.6
>#2  0x00007f9f07f478af in g_usleep () from /usr/lib64/libglib-2.0.so.0
>#3  0x00007f9f086694fa in virProcessAbort (pid=367) at ../src/util/virprocess.c:187
>#4  0x00007f9f0861ed9b in virCommandAbort (cmd=cmd at entry=0x55a798660c50) at
>../src/util/vircommand.c:2774
>#5  0x00007f9f08621478 in virCommandFree (cmd=0x55a798660c50) at
>../src/util/vircommand.c:3061
>#6  0x00007f9f08668581 in virPolkitAgentDestroy (agent=0x55a7986426e0) at
>../src/util/virpolkit.c:164
>#7  0x000055a797836d93 in virshConnect (ctl=ctl at entry=0x7ffc551dd980, uri=0x0,
>readonly=readonly at entry=false) at ../tools/virsh.c:187
>#8  0x000055a797837007 in virshReconnect (ctl=ctl at entry=0x7ffc551dd980,
>name=name at entry=0x0, readonly=<optimized out>, readonly at entry=false,
>force=force at entry=false) at ../tools/virsh.c:223
>#9  0x000055a7978371e0 in virshConnectionHandler (ctl=0x7ffc551dd980) at
>../tools/virsh.c:325
>#10 0x000055a797880172 in vshCommandRun (ctl=ctl at entry=0x7ffc551dd980,
>cmd=0x55a79865f580) at ../tools/vsh.c:1308
>#11 0x000055a7978367b7 in main (argc=2, argv=<optimized out>) at
>../tools/virsh.c:907
>
>Odd thing is, I attached gdb to this virsh process several minutes after
>invoking the test tool that calls 'virsh list'. I can't explain why the process
>is still blocked in g_usleep, which should only have slept for 10 milliseconds.
>Even odder, detaching from the process appears to awaken g_usleep and allows
>process shutdown to continue. The oddness can also be seen in the debug output
>
>2021-12-12 16:35:38.783+0000: 5912: debug : virCommandRunAsync:2629 : About to
>run /usr/bin/pkttyagent --process 5912 --notify-fd 4 --fallback
>
>2021-12-12 16:35:38.787+0000: 5912: debug : virCommandRunAsync:2632 : Command
>result 0, with PID 5914
>
>...
>2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:177 : aborting child
>process 5914
>
>2021-12-12 16:35:38.830+0000: 5912: debug : virProcessAbort:185 : trying SIGTERM
>to child process 5914
>
>
>Attach gdb to the process, observe above backtrace, quit gdb.
>
>2021-12-12 16:44:18.059+0000: 5912: debug : virProcessAbort:195 : trying SIGKILL
>to child process 5914
>
>2021-12-12 16:44:18.061+0000: 5912: debug : virProcessAbort:201 : process has
>ended: fatal signal 9
>

The stuck g_usleep() is weird.  Isn't there a tremendous load on the
machine?  I can't think of much else.

Also I am looking at the pkttyagent code and it looks like it blocks the
first SIGTERM and sending two of them should help in that case, but if
we want to wait for few ms between them, than we;ll be in the same
pickle.

>
>> Anyway, if just adding:
>>
>>      if (!isatty(STDIN_FILENO))
>>          return false;
>
>This indeed fixes the regression in the test tool.
>

That just means that it won't start the agent.  Let's do this, but I
would really, *really* like to figure out what the heck is happening
there, because there has to be something wrong and it might just be
waiting around the corner for us and bite us in the back in a year or
so.  Although I understand how improbable that is.

>> to top of virPolkitAgentAvailable() solves your problem I do not have a
>> particular issue with it.  In any case I would like to better understand
>> the issue as just the fact that we're running pkttyagent should not
>> cause any issues.
>
>Given the above observations, I'm having a difficult time articulating the root
>cause :-).
>
>Regards,
>Jim
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20211212/b497dd3b/attachment-0001.sig>


More information about the libvir-list mailing list