[PATCH 5/6] finish_utrace_stop: check ->stopped lockless

Oleg Nesterov oleg at redhat.com
Tue Aug 4 23:24:18 UTC 2009


On 08/03, Roland McGrath wrote:
>
> > But I think we do not care. What we do care is: ->stopped must be F
> > after finish_utrace_stop(), whoever checks it under utrace->lock.
>
> We do care about a false positive if it makes finish_utrace_stop() hit its
> WARN_ON and return true when there was no SIGKILL.

Yes sure. Now I see I misunderstood your previous email. I think
that this type of false positive is not possible. Please see below.

> > when utrace_do_stop() sets ->stopped, we hold ->siglock, this means
> > we can't race with SIGCONT/SIGKILL. If the signal somes after that,
> > we can rely on rq->lock, both try_to_wake_up() and schedule() take
> > this lock, it should act as a barrier. I think.
>
> Ok.  If we are in fact relying on any implicit barrier semantics like
> that, we should have a clear comment about it.

Ah. I guess I just added the unnecessary confusion. Indeed, we should
not rely on any implicit barrier semantics with rq->lock.

What I meant, we can rely on the fact that any wakeup (try_to_wakeup()
actually) must be a barrier wrt schedule(). IOW, if we do

	VAR = 1;
	try_to_wake_up(TASK);

Then TASK can safely do

	set_current_state();
	schedule();
	BUG_ON(VAR != 1);

this is true even if the task does not sleep, in case it is woken
before it enters schedule().

So. If utrace_control() path changes ->stopped and wakes up the
tracee, it must see the correct value of ->stopped. This is simple.

But what if the tracee is woken by CONT/KILL? Should the tracee
see the last change in ->stopped? The signal can race with
utrace_wakeup() or utrace_do_stop(), there are a lot of cases
when the woken tracee can look at ->stopped while it is changed
by utrace. But I think all we need is to be sure that:

	- if ->stopped == T, we have a pending SIGKILL

	  If we were woken by SIGCONT which comes right after
	  utrace_wakeup() does s/TRACED/STOPPED/, I think we must
	  see the result of "->stopped = 0" in utrace_wakeup().
	  Both utrace_wakeup() and signal_wake_up() take ->siglock,
	  the memory change in ->stopped must be visible to
	  utrace_wakeup() and the wakee.

	- If we were killed, we must see ->stopped == T _unless_
	  utrace_wakeup() is in progress or we are already woken
	  from utrace pov.

	  Now we should worry about SIGKILL racing with utrace_do_stop().
	  Again, both take ->siglock, etc.

But in short, I can't understand how utrace->lock can really help,
given that a signal changes task's state without this lock. It can
only help to avoid the races with utrace playing with ->stopped in
parallel, but I see nothing dangerous here.

> > BTW, I think that finish_utrace_stop() doesn't need ->siglock to
> > check the pending SIGKILL.
>
> Shouldn't one always hold siglock to use task->pending at all?

Why? It only checks the bit. Note that fatal_signal_pending() is
used lockless.

But this discussion makes me wonder, what report->killed and
utrace_stop's returned value buy us? Can't we kill report->killed
and make utrace_stop() return void?

With or without this patch. Suppose that (say) utrace_report_syscall_entry()
does utrace_stop() and sleeps in TASK_TRACED. Another thread starts the
group stop and sets SIGNAL_STOP_STOPPED. utrace_wakeup() clears ->stopped,
notices SIGNAL_STOP_STOPPED and doesn't wake up.

Now, we can sleep a long time. Then SIGKILL kills us, but finish_utrace_stop()
returns false.

Afaics, only utrace_get_signal() uses report.kill, and only
utrace_report_syscall_entry() uses the result of utrace_stop(). Both
can just check fatal_signal_pending().


> > 	- Currently, if we see ->stopped == T under utrace->lock we
> > 	  know that the tracee can do nothing interesting from utrace
> > 	  pov. It can't be waked up by utrace_wakeup(). If the tracee
> > 	  is killed it must take ->lock to clear ->stopped before it
> > 	  can do anything else.
>
> Right.  My original thinking was that it must always get into
> utrace_get_signal() right then, and that would be responsible for any
> synchronization that mattered.
>
> > 	  If we remove ->stopped, we can't rely on TASK_TRACED in the
> > 	  same manner. For example, a killed tracee can can call
> > 	  utrace_report_jctl()->REPORT() while utrace thinks it is
> > 	  stopped.
>
> You mean the tracehook_notify_jctl(why, CLD_CONTINUED) case, right?

Not sure. utrace_get_signal() can call ->report_signal() before it
takes utrace->lock.

> I wonder if it might not make sense anyway to suppress this for the
> SIGKILL case.  Not just the tracing, but the entire CLD_CONTINUED
> notification.  A parent doesn't really need a SIGCHLD,CLD_CONTINUED
> immediately followed by a SIGCHLD,CLD_KILLED.

Heh. Actually, I was wrong. utrace_report_jctl() is not possible if the
tracee is killed. complete_signal(SIGKILL) sets ->flags = SIGNAL_GROUP_EXIT.

> > 	- The exiting task with _UTRACE_DEATH_EVENTS can be considered
> 	      ^dead/dying       ^out
> > 	  as quiescent. But, without ->stopped, looking at this task we
> > 	  can't know if some engine wants this task to be stopped. IOW,
> > 	  if we see such a task we can't figure out was utrace_do_stop()
> > 	  called or not.
>
> (I try to reserve "exiting" for PF_EXITING and "dying" for "running
> with ->exit_state set", because the difference is important.)
>
> Who cares?

Agreed.

> OTOH we can just keep utrace_finish_jctl() and have it and
> finish_utrace_stop() just always take the utrace lock purely
> to synchronize (and then drop it without changing anything).

or we can just do spin_unlock_wait().


Anyway, I believe that this change, even if good, is low priority. I mean,
it doesn't immediately makes possible other improvements, afaics. But this
all is up to you.

Oleg.




More information about the utrace-devel mailing list