[RFC, PATCH] teach utrace to destroy engine->data

Fri Sep 4 13:02:37 UTC 2009

On Fri, Sep 04, 2009 at 02:55:29AM -0700, Roland McGrath wrote:
> > Simple answer. Because I do not know how to implement this. At least now.
> > I tried to think of this more, but I don't see how to make the first steps.
> > 
> > (Yes, to be honest, this looks like "unnecessary complication" to me, I have
> >  to admit. But this is not the reason.)
> 
> Well, I am befuddled.  I explained the model and it seems obvious to me how
> you'd go about it, the only nontrivial corner being coping with the MT exec
> reap case.  But I won't let our mutual befuddlement stand in the way of
> making immediate progress the way you like best.
> 
> > Agreed, but nobody else cares ;)
> 
> Ananth, Srikar, Jim, Frank?  (Beuller?)  We do have some people around
> here with some experience dealing with the API.  They made me add the
> engine ref counts in the first place, for Pete's sake.  They've got to
> have some opinions!

I am not sure if all of what I say here makes any sense as I am still in
the midst of understanding the discussion threads between the two of you
(admittedly, its one hairy piece of the kernel). However, here goes:

I prefer Roland's ops->release approach. However, is there any chance
that ->ops is no longer valid at the time __utrace_release_engine() is
called? For the ptrace case at least, it'll always be valid, so there
are no issues.

The concept of how utrace engines use refcnts initially befuddled me.
The entity that does a utrace_attach_*, without it having to explicitly
ask for a utrace_engine_get(), already owns a reference. That was
counter intuitive. (That's the reason for a series of Srikar's
utrace_engine_put() fixes too, I guess).

That apart, the problem of freeing ->data is similar to what Prasad is
trying to grapple in hardware watchpoints and its ptrace interaction. I
guess its a problem unique to ptrace(?). The issue there, is that when
ptrace did a register_user_hardware_watchpoint(), and that thread later
died/exec'd, ptrace may (will?) not actually unregister the watchpoint,
and the hwbkpt layer is left to fend for itself in cleaning up the
struct hw_breakpoint in flush_thread_hardware_watchpoint(). Now, you
could have non-ptrace users of the interface (who may have passed just
a reference to a statically allocated struct hw_breakpoint) in which case,
freeing the structure in the hwbkpt layer is just wrong. Right now, the
way Prasad has fixed it is to have a check if the
bp->triggered == ptrace_triggered, in which case, free up the structure.

Yes, there is a kind of layering violation here -- hwbkpt now knows
about ptrace_triggered. But I am not sure if there is a better way of
fixing it.

Ananth