resume from another engine results in loss of singlestep req.

Wed Aug 12 10:36:44 UTC 2009

On 08/11, Roland McGrath wrote:
>
> I think this might be the right fix.  First, we change the order so that
> UTRACE_INTERRUPT prevails over UTRACE_REPORT.  (I'm really not sure why I
> ever had it the other way around.)

and this is consistent with the behaviour utrace_resume() which does nothing
if utrace->interrupt == T, it just returns and relies on utrace_get_signal().

But I wonder if this always correct, please see below.

> Next, we keep track of not only whether
> one engine wanted a report when another wanted stop, but of the prevailing
> resume action (i.e. least value still > UTRACE_STOP).  After stopping and
> resuming, we apply UTRACE_INTERRUPT or UTRACE_REPORT as needed to make sure
> we get the kind of resume report we need.

Agreed, this looks like a right change to me. In any case I think this makes
the code more understandable and logical, and ->resume_action is much better
than ->reports imho.

But, you seem to forget to fix other callers of utrace_stop(),

	utrace_report_syscall_entry
	utrace_finish_vfork
	utrace_report_exit

they all do utrace_stop(false), and with this change this "false" means
UTRACE_STOP.

> @@ -1298,14 +1305,14 @@ static void finish_report(struct utrace_
>  {
>  	bool clean = (report->takers && !report->detaches);
>
> -	if (report->action <= UTRACE_REPORT && !utrace->report) {
> -		spin_lock(&utrace->lock);
> -		utrace->report = 1;
> -		set_tsk_thread_flag(task, TIF_NOTIFY_RESUME);
> -	} else if (report->action == UTRACE_INTERRUPT && !utrace->interrupt) {
> +	if (report->action == UTRACE_INTERRUPT && !utrace->interrupt) {
>  		spin_lock(&utrace->lock);
>  		utrace->interrupt = 1;
>  		set_tsk_thread_flag(task, TIF_SIGPENDING);
> +	} else if (report->action <= UTRACE_REPORT && !utrace->report) {
> +		spin_lock(&utrace->lock);
> +		utrace->report = 1;
> +		set_tsk_thread_flag(task, TIF_NOTIFY_RESUME);

Can't we (well, partly) miss UTRACE_STOP request?

Two engines A and B. Say, utrace_report_exec() calls ->report_exec().

A returns UTRACE_STOP, B returns UTRACE_INTERRUPT.

->resume_action = UTRACE_INTERRUPT, finish_report() sets ->interrupt = 1,
utrace_report_exec() returns.

Now. We can't rely on utrace_resume()->finish_resume_report() which can
notice the stop request and do utrace_stop() to actually stop. This is
delayed until utrace_get_signal()->finish_resume_report(), and I am not
sure this is what we really want:

	- minor, but it is possible that tracehook_notify_jctl() will
	  be called before tracehook_get_signal(). Not a real problem,
	  but shouldn't UTRACE_STOP mean "stop as soon as possible" ?

	- utrace_get_signal() can return without finish_resume_report()
	  if B->report_signal() (say) returns UTRACE_SIGNAL_DELIVER.

	  Yes, in this case A->report_signal() can return UTRACE_STOP
	  again, this means utrace_get_signal() returns with ->report = T.
	  But another UTRACE_INTERRUPT can set ->interrupt, this can
	  "disable" ->report again, and so on.

In short: unless I missed something, it is hardly possible to predict when
the tracee will actually stop, perhaps UTRACE_STOP needs more respect?

Oleg.