[Crash-utility] crash sometimes doesn't terminate, loops forever looking for a process that doesn't exist

Tue Nov 8 15:33:20 UTC 2011

On 11/08/2011 06:58 AM, Dave Anderson wrote:
> 
> 
> ----- Original Message -----
>> On Mon, Nov 7, 2011 at 2:24 PM, Dave Anderson <anderson at redhat.com> wrote:
>>> I would be a little hesitant to get rid of the pc->pipe_pid at this point
>>> in time.
>>>
>>> I can't seem to be able to reproduce it, but certainly there should be an
>>> escape valve in output_commands_to_pid() to recognize it and bail out.
>>> But I presume that your piped command sequence actually worked, and so it
>>> would be strange/unnecessary for setup_redirect() to do the
>>> error(FATAL_RESTART, ...) that it currently does when
>>> output_commands_to_pid() returns with a NULL?
>>
>> I am not sure either what happened exactly but as far as I can tell
>> the piped command didn't really work since it terminated before
>> reading anything from its stdin. I am not sure how to reproduce the
>> problem and it may very well be symptomatic of a problem in our
>> environment but I know it happened at least twice (cores available on
>> demand). So I think error(FATAL_RESTART) is actually appropriate (or
>> at least more appropriate than looping forever). Or do you think it's
>> important to get the return value of the child before deciding what
>> to do?
> 
> Ah OK, if the piped command actually failed, then it certainly should go
> through the error(FATAL_RESTART) path.
> 
>>
>>> Anyway, my point is to try to keep the fix as simple as possible...
>>
>> Makes sense.
> 

I see this happen on our crash analysis machine:

crash> sys | fneep
sh: fneep: command not found

and then it sits there, running up CPU time, until I hit Ctrl-C three times.
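
For what it's worth, here is a minimal sketch of the kind of escape valve
discussed above. It is not crash's actual code: spawn_shell() and
pipe_child_gone() are hypothetical names, and it assumes the piped command is
a direct child that can be polled with waitpid(). The point is only that the
search gives up after a bounded number of tries instead of spinning when the
command has already died.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: run `sh -c cmd` in a child; returns its pid or -1. */
static pid_t spawn_shell(const char *cmd)
{
    pid_t pid = fork();

    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);            /* only reached if exec itself failed */
    }
    return pid;
}

/*
 * Hypothetical bounded check: poll the child at most max_tries times,
 * sleeping interval_ms between polls.  Returns true as soon as the child
 * has exited (or can no longer be waited for), false if it is still
 * running after the last try.  A caller would take its error path on true.
 */
static bool pipe_child_gone(pid_t pid, int max_tries, long interval_ms)
{
    struct timespec ts;
    int status;

    ts.tv_sec = interval_ms / 1000;
    ts.tv_nsec = (interval_ms % 1000) * 1000000L;

    for (int i = 0; i < max_tries; i++) {
        pid_t r = waitpid(pid, &status, WNOHANG);

        if (r == pid || r < 0)
            return true;       /* exited, or no such child */
        nanosleep(&ts, NULL);
    }
    return false;
}
```

With something like this, the "sys | fneep" case would be noticed once the
shell exits, and the caller could go through the error(FATAL_RESTART) path
rather than looping.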

--Guy