Stopped detach/attach status

Oleg Nesterov oleg at redhat.com
Thu Oct 8 16:06:58 UTC 2009


Incomplete reply, just can't read/think/concentrate today...

On 10/07, Roland McGrath wrote:
>
> > We had a lengthy discussion about this.
>
> Yes.  I only ever wanted that revert then because it was too late in the
> 2.6.30 cycle to hash this all out and get it really right.  I meant that
> we should leave wrong enough alone in 2.6.30 but get it all worked out
> more properly in 2.6.31, but I forgot to follow up on it.  If we can
> iron out the behavior now and the upstream version of implementing it is
> not big new hair, it might still be possible to get it fixed in 2.6.32.
>
> That piece of implementation is 100% wrong.  But we have to figure out
> what the manifest semantics are today from the userland perspective and
> decide what exactly we want them to be before we implement those precise
> semantics in some sensible way.

Yes. In particular, ptrace(PTRACE_DETACH, SIGKILL) should cancel
SIGNAL_STOP_STOPPED, yes?

> > 	-			sig->flags = SIGNAL_STOP_STOPPED;
> > 	+			sig->flags = SIGNAL_STOP_STOPPED | SIGNAL_STOP_DEQUEUED;
>
> Boy, do I not understand why that does anything about this at all!
> But I am barely awake tonight.  Ok, I guess I do sort of if it goes
> along with some other patch to set SIGNAL_STOP_STOPPED.  But since
> you've verified you really understand what happens, you can tell us!

Take two threads, T1 and T2, both ptraced by P and both TASK_TRACED;
T2 sleeps in ptrace_signal().

P does:

	ptrace(PTRACE_DETACH, T1, NULL, SIGSTOP);
	ptrace(PTRACE_DETACH, T2, NULL, SIGSTOP);

The first DETACH wakes up T1, which dequeues SIGSTOP and calls
do_signal_stop(). T2 is still TASK_TRACED, so T1 completes the group-stop
and sets sig->flags = SIGNAL_STOP_STOPPED.

The second detach wakes up T2. It returns from ptrace_signal() and calls
do_signal_stop(), which does nothing without SIGNAL_STOP_DEQUEUED, so T2
keeps running.

But please remember, the patch above is of course not complete, and currently
I do not see a good solution. I am starting to think we should forget
about these bugs, merge utrace-ptrace, and then try to fix them.

Even the first detach can fail to stop T1, because SIGNAL_STOP_DEQUEUED
can already be cleared by the time T1 calls do_signal_stop().

I never knew what user-space actually does with ptrace, and now I am really
surprised that gdb etc. assume they can trust ptrace(SIGSTOP). Sometimes it
works, but only by accident.

Oleg.



