incremental arch work

Roland McGrath roland at redhat.com
Wed Nov 21 04:29:19 UTC 2007


Here are the steps I have in mind.  I think this work should be pretty well
clear to merge upstream without much controversy.  Basically, this is the
arch parts now done in the tracehook and regset patches, with a little
sugar.  Several of these steps can be done in parallel and merged upstream
independently.

First, the common groundwork (maybe 2 or 3 arch-independent patches): I
envision a new linux/regset.h defining the regset types and helpers from
linux/tracehook.h, but with utrace_regset renamed user_regset throughout.
Next, the ptrace_layout stuff and helpers from the utrace-ptrace-compat
patch go into linux/ptrace.h in fairly similar form (but no utrace references).
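
As a rough illustration of what the regset types could look like, here is a
userspace mock, not the contents of any actual patch.  Only the names
user_regset and task_user_regset_view come from this note; struct task, the
field names, the accessor signature and the user_regset_view table are
assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace mock standing in for the kernel's struct task_struct. */
struct task {
	unsigned long gpr[4];	/* pretend general-purpose registers */
};

struct user_regset;

/* Assumed accessor signature: copy `count` bytes starting at byte offset
 * `pos` within the regset into `buf`.  Returns 0 on success, -1 on error. */
typedef int user_regset_get_fn(struct task *target,
			       const struct user_regset *regset,
			       size_t pos, size_t count, void *buf);

struct user_regset {
	user_regset_get_fn *get;
	size_t n;	/* number of slots (registers) */
	size_t size;	/* size in bytes of one slot */
};

/* A "view" here is a named table of regsets for one ABI (assumed shape). */
struct user_regset_view {
	const char *name;
	const struct user_regset *regsets;
	size_t n;
};

static int gpr_get(struct task *target, const struct user_regset *regset,
		   size_t pos, size_t count, void *buf)
{
	size_t total;

	assert(target && regset && buf);
	total = regset->n * regset->size;
	if (pos > total || count > total - pos)
		return -1;	/* request falls outside the regset */
	memcpy(buf, (const char *)target->gpr + pos, count);
	return 0;
}

static const struct user_regset native_regsets[] = {
	{ .get = gpr_get, .n = 4, .size = sizeof(unsigned long) },
};

static const struct user_regset_view native_view = {
	.name = "native", .regsets = native_regsets, .n = 1,
};

/* Mock of task_user_regset_view(): always returns the one native view. */
static const struct user_regset_view *task_user_regset_view(struct task *t)
{
	(void)t;	/* an arch with compat might choose a view per task */
	return &native_view;
}
```

The point of the table is that generic code never needs to know how an arch
stores its registers; it only asks the view for a regset and calls get.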

Also, I will make another generic patch to fs/binfmt_elf.c giving it the
option to use user_regset as its only access to machine state (instead of
the current asm/elf.h macros on each arch).  
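
To make the core-dump idea concrete, here is a hedged userspace sketch of
generic code building notes purely from a regset table.  Everything in it is
hypothetical (struct task, struct note, the note_type field, fill_notes and
the simplified whole-regset accessor); the real fs/binfmt_elf.c note
formatting is more involved than this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock thread state: two register banks. */
struct task {
	unsigned long gpr[4];
	double fpr[2];
};

struct user_regset {
	/* Simplified accessor: copy the whole regset for `t` into `buf`. */
	void (*get)(const struct task *t, void *buf);
	size_t bytes;	/* total size of the regset in bytes */
	int note_type;	/* note type to emit; 0 means "no note" */
};

/* Mock note: a type plus payload, in place of real ELF note layout. */
struct note {
	int type;
	size_t len;
	unsigned char data[64];
};

static void gpr_get(const struct task *t, void *buf)
{
	memcpy(buf, t->gpr, sizeof t->gpr);
}

static void fpr_get(const struct task *t, void *buf)
{
	memcpy(buf, t->fpr, sizeof t->fpr);
}

static const struct user_regset regsets[] = {
	{ gpr_get, sizeof(((struct task *)0)->gpr), 1 /* NT_PRSTATUS */ },
	{ fpr_get, sizeof(((struct task *)0)->fpr), 2 /* NT_PRFPREG */ },
};

/* Walk the table and emit one note per regset; returns notes written. */
static size_t fill_notes(const struct task *t, struct note *out, size_t max)
{
	size_t i, n = 0;

	assert(t && out);
	for (i = 0; i < sizeof regsets / sizeof regsets[0] && n < max; i++) {
		const struct user_regset *rs = &regsets[i];

		/* Skip regsets with no note or too big for the mock buffer. */
		if (rs->note_type == 0 || rs->bytes > sizeof out[n].data)
			continue;
		out[n].type = rs->note_type;
		out[n].len = rs->bytes;
		rs->get(t, out[n].data);
		n++;
	}
	return n;
}
```

With this shape the dump code has no per-arch macros at all; an arch only
supplies the table and its accessors.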

Next, each arch has several steps.  It would be good if someone took my old
write-up http://people.redhat.com/roland/utrace/utrace-arch-porting-howto.txt
and these notes and synthesized new instructions for doing the "user_regset"
cleanups from scratch on an arch (independent of utrace).

1. asm/tracehook.h functions, i.e. single_step et al.
   Define these as user_* instead of tracehook_*, and declare them in
   the __KERNEL__ part of asm/ptrace.h.  I'll probably simplify the macro
   bits a little and drop the is-enabled hook that nothing uses.

   The tracehook_{enable,disable}_syscall_trace functions can just be
   dropped.  Everything uses TIF_SYSCALL_TRACE uniformly for that now.

   You're just taking the bits from your arch's utrace-tracehook or
   utrace-regset patch that does this and changing the names.  Most of
   those patches were just renaming existing functions anyway.  In some
   cases the arch code has to change so it can be called on current and
   not just on another stopped thread.  This is not strictly necessary for
   ptrace cleanup, but will be required for the full generality of
   something like utrace.  It's already done in your arch utrace-tracehook
   patch, since utrace requires it now, so you do have the code on hand
   and have tested it.  But convincing an arch maintainer to accept that
   part of the patch has to rely on the argument that it will be good for
   the future benefit of new debugger features and their future
   implementation details staying out of the arch maintainers' hair.

   Update the calls in your arch's old ptrace code to the new names.

   When every arch has consolidated its code into user_enable_single_step
   et al. functions with standard signatures, we can move PTRACE_SINGLESTEP,
   PTRACE_CONT, etc. from the arch ptrace code into the generic code.

   #1 depends on nothing else.

2. arch regset functions, and define task_user_regset_view (new name for
   utrace_native_view).  This is in the utrace-regset patch for your arch
   now.  How to organize this for upstream submission will depend on the
   details of the code for your arch and what your arch maintainers want.

   If your utrace-regset patch mostly adds entirely new functions for the
   utrace_regset (now user_regset) accessor style, then you can do an
   incremental patch just adding new stuff without changing anything to
   actually use it yet.  If your patch reuses old ptrace code more directly
   by turning it directly into the regset functions, then you may need to
   roll this into one patch with the ptrace_layout bits (below) to have
   something to send upstream in one patch that doesn't break anything.

   Note that these functions need to incorporate one wrinkle beyond just
   wrapping the old code in the new accessor signatures.  This is already
   handled in your utrace-regset patch (and is noted in the old porting
   howto), but may need careful attention to preserve and justify in the
   upstream arch maintainers' merge.  That is, the accessors can be called
   on current.  For the get calls, this is necessary to use them for core
   dumps, so that should be an easy argument to make.  For the set calls,
   you have to rely on the argument that it will be good for the future
   benefit of new debugger features and their future implementation
   details staying out of the arch maintainers' hair.

   #2 depends on nothing else but the generic patch adding linux/regset.h.

3. Use regset functions for core dump.  This will just be defining a new
   macro in asm/elf.h that tells the new fs/binfmt_elf.c code that's what
   you want.  For each arch it could probably be one patch adding that
   macro and a second patch removing all the asm/elf.h macros no longer
   used.  For arches with compat support, another pair of patches would do
   the same for the compat hack that #includes fs/binfmt_elf.c.

   #3 depends on #2.

4. Use regset functions for ptrace.  This is in the utrace-ptrace-compat
   patch for your arch now.  Depending on your code, you might need to roll
   this into #2 to avoid breaking anything in that patch while still
   avoiding unpleasant duplication of code that should be cleaned up directly.

   This should be trivial, and hopefully noncontroversial upstream if #2
   has any traction.  It's just making arch_ptrace use the ptrace_layout
   stuff or ptrace_regset helpers for its PEEKUSR, GETREGS, etc.

   Depending on your arch maintainers' taste, this might be several little
   patches for each read/write pair of ptrace requests, or just one patch.

   #4 depends on #2 and the generic patches providing the ptrace_layout helpers.

5. arch cruft removal.  After all those have been merged, there may be
   some unused code or macros lingering around from the old ptrace or
   core-dump support that can be cleaned out.  Make it tidy.
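
For step 1 above, the following userspace mock sketches how standard
single-step hooks let a generic resume path stay free of arch details.  Only
user_enable_single_step is named in this note; user_disable_single_step,
generic_resume, struct task and the MOCK_* names are assumptions for the
sake of the example, and the trap-bit bookkeeping is just one plausible
policy.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_TRAP_FLAG 0x100UL	/* stand-in for an arch's trace bit */

struct task {
	unsigned long flags_reg;	/* mock saved user flags register */
	bool stepping;			/* true if the debugger set the bit */
};

static void user_enable_single_step(struct task *t)
{
	assert(t);
	/* Record whether we set the bit, so that disabling later does not
	 * clobber a trap bit the user program had set for itself. */
	if (!(t->flags_reg & MOCK_TRAP_FLAG)) {
		t->flags_reg |= MOCK_TRAP_FLAG;
		t->stepping = true;
	}
}

static void user_disable_single_step(struct task *t)
{
	assert(t);
	if (t->stepping) {
		t->flags_reg &= ~MOCK_TRAP_FLAG;
		t->stepping = false;
	}
}

enum mock_request { MOCK_CONT, MOCK_SINGLESTEP };

/* Generic resume path: knows nothing about the arch beyond the two hooks. */
static void generic_resume(struct task *t, enum mock_request req)
{
	if (req == MOCK_SINGLESTEP)
		user_enable_single_step(t);
	else
		user_disable_single_step(t);
}
```

Because both hooks take only the task, the same code works whether the
target is another stopped thread or the caller itself.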


Except for the dependencies noted, all of these things across the various
arches can proceed in parallel.  I hope that these patches will have some
appeal to arch maintainers as pure cleanup and consolidation.  They reduce
the duplication between ptrace and core-dump support, for which arch code
now has to provide separate functions and easy-to-miss magic macros.  They
provide a single coherent place for an arch maintainer to look for what
they should do to cover all userland needs for debugging and dumps.  The
non-regset cleanups will make it possible to reduce the amount of arch
ptrace code now required and keep non-arch ptrace details out of that
arch code.
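
To illustrate the consolidation, here is a hedged userspace sketch of a
PEEKUSR-style read served through regset accessors by way of a layout
table, in the spirit of the ptrace_layout helpers.  The struct
layout_segment shape, peek_user, the accessor signature and struct task are
all invented for the example and are not the interface from the
utrace-ptrace-compat patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock thread state: general registers plus debug registers. */
struct task {
	unsigned long gpr[4];
	unsigned long dbg[2];
};

/* Assumed accessor: copy `count` bytes at byte offset `pos` of the regset. */
typedef int regset_get_fn(const struct task *t, size_t pos, size_t count,
			  void *buf);

static int gpr_get(const struct task *t, size_t pos, size_t count, void *buf)
{
	if (pos > sizeof t->gpr || count > sizeof t->gpr - pos)
		return -1;
	memcpy(buf, (const char *)t->gpr + pos, count);
	return 0;
}

static int dbg_get(const struct task *t, size_t pos, size_t count, void *buf)
{
	if (pos > sizeof t->dbg || count > sizeof t->dbg - pos)
		return -1;
	memcpy(buf, (const char *)t->dbg + pos, count);
	return 0;
}

/* One segment of the legacy "user area": a byte range and its accessor. */
struct layout_segment {
	size_t start, end;	/* [start, end) in user-area offsets */
	regset_get_fn *get;
};

static const struct layout_segment layout[] = {
	{ 0, 4 * sizeof(unsigned long), gpr_get },
	{ 4 * sizeof(unsigned long), 6 * sizeof(unsigned long), dbg_get },
};

/* PEEKUSR-style read of one word at user-area offset `addr`. */
static int peek_user(const struct task *t, size_t addr, unsigned long *val)
{
	size_t i;

	assert(t && val);
	if (addr % sizeof *val)
		return -1;	/* offset must be word-aligned */
	for (i = 0; i < sizeof layout / sizeof layout[0]; i++)
		if (addr >= layout[i].start && addr < layout[i].end)
			return layout[i].get(t, addr - layout[i].start,
					     sizeof *val, val);
	return -1;		/* offset not covered by any regset */
}
```

The same accessors could feed a core-dump writer, so the arch describes its
registers once and both users go through that one description.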

This work can be merged into each arch tree more or less right now.  That
takes me, and any hairy issues about utrace/ptrace, out of the loop for
upstream review and merging.  I'd still like to stay CC'd on patches
related to this sort of thing.  But now we can really have the arch
maintainers (and the people on this list interested in each arch) "own"
this arch code.  (That was my hope from the beginning, but in practice
they've all been "my baby" and others working on them go through me.)
Once the first arch adopts the work to get the ball rolling, I hope that
for each arch an advocate (other than me) can take the lead in ironing out
these cleanups with the upstream arch maintainers.

I will get started on the groundwork patches and do x86 to set the
example.  I think I can get this going in the next few days (next few
business days anyway, not sure about the holiday weekend right now).

Once upstream arch code has merged all the steps above, there will be no
more arch changes--or very nearly none, anyway--required to work with a
later merging of utrace or something else like it.  I've thought about
ways to be more incremental about the core changes, too.  But if we can
get the ball rolling with the arch changes and get a majority of upstream
arch trees converted over, that will be a first big win.


Thanks,
Roland



