
Re: NPTL Tracing Proposal to support debugging and analysis

Hi all,

As the author of numerous performance tools (PAPI), I'm very glad to see
this proposal. Of the utmost importance to us is the ability to call a
stub routine in the newly created thread's context. Why? Because all
hardware performance analysis takes place on the basis of a kernel
context. The kernel is in charge of saving and restoring performance
counters on context switch. With PAPI/Linux, we use the Perfctr
interface, with which many of you are no doubt familiar. This is similar to
other infrastructures developed on Solaris, AIX, IRIX etc. Obviously,
hardware performance analysis of a threaded application only makes sense
when one uses a bound thread model, i.e. a 1-to-1 mapping of user threads
to kernel threads (enabled on AIX, for example, by setting
AIXTHREAD_SCOPE='S'). Currently on Linux, I have to replace the
pthread_create call with a customized version that uses dlsym() to find
the real pthread_create call. An example of this code can be found in the
'trapper' tool in PAPI. However, this is UGLY and prone to error.
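
For those who have not seen the workaround, here is a minimal sketch of
the interposition technique described above. It is not the 'trapper'
code from PAPI; the names thread_start_stub, stub_calls, trampoline_main
and run_demo are made up for illustration, and the stub only bumps a
counter where a real tool would set up its per-thread counters. It
assumes a glibc-style dlsym() with RTLD_NEXT.

```c
#define _GNU_SOURCE              /* needed for RTLD_NEXT on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef int (*create_fn)(pthread_t *, const pthread_attr_t *,
                         void *(*)(void *), void *);

/* Hypothetical counter standing in for real per-thread counter setup. */
atomic_int stub_calls = 0;

/* Hypothetical stub: runs in the NEW thread, before the user's routine. */
static void thread_start_stub(void) { atomic_fetch_add(&stub_calls, 1); }

/* Carries the user's start routine and argument into the new thread. */
struct trampoline { void *(*start)(void *); void *arg; };

static void *trampoline_main(void *p)
{
    struct trampoline *t = p;
    assert(t != NULL);
    struct trampoline copy = *t;
    free(t);
    thread_start_stub();             /* executes in the new thread's context */
    return copy.start(copy.arg);
}

static create_fn real_create;
static pthread_once_t resolve_once = PTHREAD_ONCE_INIT;

/* Look up the next definition of pthread_create after this object. */
static void resolve_real_create(void)
{
    real_create = (create_fn)dlsym(RTLD_NEXT, "pthread_create");
}

/* Replacement pthread_create: wraps the start routine, then forwards. */
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start)(void *), void *arg)
{
    pthread_once(&resolve_once, resolve_real_create);
    if (real_create == NULL)
        return EAGAIN;

    struct trampoline *t = malloc(sizeof *t);
    if (t == NULL)
        return EAGAIN;
    t->start = start;
    t->arg = arg;

    int rc = real_create(thread, attr, trampoline_main, t);
    if (rc != 0)
        free(t);                     /* thread never started; reclaim */
    return rc;
}

/* Tiny demo: create one thread and report how many times the stub ran. */
static void *demo_worker(void *arg) { return arg; }

int run_demo(void)
{
    pthread_t th;
    int before = atomic_load(&stub_calls);
    if (pthread_create(&th, NULL, demo_worker, NULL) != 0)
        return -1;
    pthread_join(th, NULL);
    return atomic_load(&stub_calls) - before;
}
```

Build with -pthread (and -ldl on older glibc). The same wrapper can be
compiled into a shared object and loaded with LD_PRELOAD. It only
catches threads created through the pthread_create symbol, and it
depends on symbol lookup order, which is part of why I consider this
approach fragile.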

Many vendors do use PAPI and the Perfctr distribution, so this is not
just something academic, but something of utmost importance to the
commercial and government HPC community. This includes Totalview,
VampirTrace, TAU and other tools deemed 'critical' to the HPC
Modernization office and other HPC sites around the world. 

Should you find that such an interface adds too much overhead,
I would be in favor of an additional version of the threading library
with such hooks.

Thanks for your time,


Phil Mucci
