From roland at redhat.com Wed Apr 1 00:35:46 2009
From: roland at redhat.com (Roland McGrath)
Date: Tue, 31 Mar 2009 17:35:46 -0700 (PDT)
Subject: Testing insn.block probe point uncovers possible utrace bug
In-Reply-To: Maynard Johnson's message of Tuesday, 31 March 2009 18:56:34 -0500 <49D2ADB2.3030304@us.ibm.com>
References: <49D2ADB2.3030304@us.ibm.com>
Message-ID: <20090401003546.DD976FC3BE@magilla.sf.frob.com>

The bug is in your module's unconditional use of UTRACE_*STEP. It's documented that it's invalid to use them unless arch_has_*_step() has returned true. (The utrace_resume_action description refers you to utrace_control(), where this is documented.)

It's a bug to use these at all when the corresponding arch_has_*_step() check returns false. utrace_control() helps you out with this a bit more by using WARN_ON, and returns -EOPNOTSUPP if even single-step is not there. The warning spew is there to make sure you know this is a bug in your module and it's just being overly extra nice not to crash on you. (You would be entirely wrong to expect this return value and think it was ever valid to make this call.)

There is no place for post-callback processing to return an error. I've now made it use WARN_ON there consistent with what utrace_control() does. For UTRACE_BLOCKSTEP, you'll get the WARN_ON spew and then it will fall back to treating it as UTRACE_SINGLESTEP. (This is what utrace_control() always did.)

What your module needs to do is check arch_has_*_step() correctly. When the feature you want is not there, gracefully tell your users you can't do it. Your module today is wrong on every architecture, even 32-bit x86 where it happens to work on every chip you are likely to test.

As to the powerpc implementation, that is a subject for the powerpc maintainers (linuxppc-dev at ozlabs.org et al) and is orthogonal to utrace. Take it up with them.
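As a rough illustration of the arch_has_*_step() check described above, here is a minimal userspace sketch. It is not utrace or kernel code: the two macros are stand-ins that mimic an architecture with single-step but no block-step, and enum step_choice, pick_step() and step_choice_name() are hypothetical names invented for this example. The only point is the shape of the logic: decide up front whether the stepping mode exists, and report "unsupported" to the user instead of requesting it anyway.

```c
/*
 * Stand-ins for the arch_has_*_step() checks. Assumption for this sketch:
 * single-step is available, block-step is not.
 */
#define arch_has_single_step() (1)
#define arch_has_block_step()  (0)

/* Hypothetical result type for this sketch; not a utrace type. */
enum step_choice { STEP_UNSUPPORTED, STEP_SINGLE, STEP_BLOCK };

/*
 * Pick the stepping mode a module may legitimately request.
 * A block-step request is refused, not quietly turned into single-step,
 * when the architecture check says block-step is absent.
 */
static enum step_choice pick_step(int want_block)
{
	if (want_block)
		return arch_has_block_step() ? STEP_BLOCK : STEP_UNSUPPORTED;
	return arch_has_single_step() ? STEP_SINGLE : STEP_UNSUPPORTED;
}

/* Text a module could show its users for each outcome. */
static const char *step_choice_name(enum step_choice c)
{
	switch (c) {
	case STEP_SINGLE:
		return "single-step";
	case STEP_BLOCK:
		return "block-step";
	default:
		return "unsupported on this architecture";
	}
}
```

In a real module the STEP_UNSUPPORTED case would presumably become an error returned to whoever asked for the probe, so that UTRACE_BLOCKSTEP is never handed back from a callback on hardware that lacks it.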
I gave them a patch long ago and it got stalled waiting for them to figure out which chips really have the feature so the arch_has_block_step() definition could be made perfect. Thanks, Roland From roland at redhat.com Wed Apr 1 21:59:03 2009 From: roland at redhat.com (Roland McGrath) Date: Wed, 1 Apr 2009 14:59:03 -0700 (PDT) Subject: [PATCH] powerpc ptrace block-step Message-ID: <20090401215903.DE872FC3AB@magilla.sf.frob.com> Maynard asked about user_enable_block_step() support on powerpc. This is the old patch I've posted before. I haven't even tried to compile it lately, but it rebased cleanly. AFAIK the only reason this didn't go in several months ago was waiting for someone to decide what the right arch_has_block_step() condition was, i.e. if it needs to check some cpu_feature or chip identifier bits. I had hoped that I had passed the buck then to ppc folks to figure that out and make it so. But it does not appear to have happened. Note you can drop the #define PTRACE_SINGLEBLOCK if you want to be conservative and not touch the user (ptrace) ABI yet. Then Maynard could beat on it with internal uses (utrace) before you worry about whether userland expects the new ptrace request macro to exist. Thanks, Roland --- >From 2482ed1a0ced9caf964275889ea2315916e84ada Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Thu, 1 May 2008 23:40:58 -0700 Subject: [PATCH] powerpc ptrace block-step This adds block-step support on powerpc, including a PTRACE_SINGLEBLOCK request for ptrace. Signed-off-by: Roland McGrath --- arch/powerpc/include/asm/ptrace.h | 4 ++++ arch/powerpc/kernel/ptrace.c | 19 ++++++++++++++++++- 2 files changed, 22 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h index c9c678f..d7692b8 100644 --- a/arch/powerpc/include/asm/ptrace.h +++ b/arch/powerpc/include/asm/ptrace.h @@ -135,7 +135,9 @@ do { \ * These are defined as per linux/ptrace.h, which see. 
*/ #define arch_has_single_step() (1) +#define arch_has_block_step() (1) extern void user_enable_single_step(struct task_struct *); +extern void user_enable_block_step(struct task_struct *); extern void user_disable_single_step(struct task_struct *); #endif /* __ASSEMBLY__ */ @@ -288,4 +290,6 @@ extern void user_disable_single_step(struct task_struct *); #define PPC_PTRACE_PEEKUSR_3264 0x91 #define PPC_PTRACE_POKEUSR_3264 0x90 +#define PTRACE_SINGLEBLOCK 0x100 /* resume execution until next branch */ + #endif /* _ASM_POWERPC_PTRACE_H */ diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c index 3635be6..656fea2 100644 --- a/arch/powerpc/kernel/ptrace.c +++ b/arch/powerpc/kernel/ptrace.c @@ -707,12 +707,29 @@ void user_enable_single_step(struct task_struct *task) task->thread.dbcr0 |= DBCR0_IDM | DBCR0_IC; regs->msr |= MSR_DE; #else + regs->msr &= ~MSR_BE; regs->msr |= MSR_SE; #endif } set_tsk_thread_flag(task, TIF_SINGLESTEP); } +void user_enable_block_step(struct task_struct *task) +{ + struct pt_regs *regs = task->thread.regs; + + if (regs != NULL) { +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) + task->thread.dbcr0 = DBCR0_IDM | DBCR0_BT; + regs->msr |= MSR_DE; +#else + regs->msr &= ~MSR_SE; + regs->msr |= MSR_BE; +#endif + } + set_tsk_thread_flag(task, TIF_SINGLESTEP); +} + void user_disable_single_step(struct task_struct *task) { struct pt_regs *regs = task->thread.regs; @@ -729,7 +746,7 @@ void user_disable_single_step(struct task_struct *task) task->thread.dbcr0 &= ~(DBCR0_IC | DBCR0_IDM); regs->msr &= ~MSR_DE; #else - regs->msr &= ~MSR_SE; + regs->msr &= ~(MSR_SE | MSR_BE); #endif } clear_tsk_thread_flag(task, TIF_SINGLESTEP); -- 1.6.0.6 From benh at kernel.crashing.org Thu Apr 2 05:26:56 2009 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 02 Apr 2009 16:26:56 +1100 Subject: [PATCH] powerpc ptrace block-step In-Reply-To: <20090401215903.DE872FC3AB@magilla.sf.frob.com> References: 
<20090401215903.DE872FC3AB@magilla.sf.frob.com> Message-ID: <1238650016.17330.193.camel@pasglop> On Wed, 2009-04-01 at 14:59 -0700, Roland McGrath wrote: > diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h > index c9c678f..d7692b8 100644 > --- a/arch/powerpc/include/asm/ptrace.h > +++ b/arch/powerpc/include/asm/ptrace.h > @@ -135,7 +135,9 @@ do { \ > * These are defined as per linux/ptrace.h, which see. > */ > #define arch_has_single_step() (1) > +#define arch_has_block_step() (1) The patch only implements it for "server/classic" processors, not BookE, thus it should probably only advertise it for these :-) Though it wouldn't be too hard to implement it for BookE using DBCR0:BRT (Branch Taken debug event) though it might need some careful fixups such as the one we have for single step regarding hitting exception entry code. Cheers, Ben. > extern void user_enable_single_step(struct task_struct *); > +extern void user_enable_block_step(struct task_struct *); > extern void user_disable_single_step(struct task_struct *); > > #endif /* __ASSEMBLY__ */ > @@ -288,4 +290,6 @@ extern void user_disable_single_step(struct task_struct *); > #define PPC_PTRACE_PEEKUSR_3264 0x91 > #define PPC_PTRACE_POKEUSR_3264 0x90 > > +#define PTRACE_SINGLEBLOCK 0x100 /* resume execution until next branch */ > + > #endif /* _ASM_POWERPC_PTRACE_H */ > diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c > index 3635be6..656fea2 100644 > --- a/arch/powerpc/kernel/ptrace.c > +++ b/arch/powerpc/kernel/ptrace.c > @@ -707,12 +707,29 @@ void user_enable_single_step(struct task_struct *task) > task->thread.dbcr0 |= DBCR0_IDM | DBCR0_IC; > regs->msr |= MSR_DE; > #else > + regs->msr &= ~MSR_BE; > regs->msr |= MSR_SE; > #endif > } > set_tsk_thread_flag(task, TIF_SINGLESTEP); > } > > +void user_enable_block_step(struct task_struct *task) > +{ > + struct pt_regs *regs = task->thread.regs; > + > + if (regs != NULL) { > +#if defined(CONFIG_40x) || 
defined(CONFIG_BOOKE) > + task->thread.dbcr0 = DBCR0_IDM | DBCR0_BT; > + regs->msr |= MSR_DE; > +#else > + regs->msr &= ~MSR_SE; > + regs->msr |= MSR_BE; > +#endif > + } > + set_tsk_thread_flag(task, TIF_SINGLESTEP); > +} > + > void user_disable_single_step(struct task_struct *task) > { > struct pt_regs *regs = task->thread.regs; > @@ -729,7 +746,7 @@ void user_disable_single_step(struct task_struct *task) > task->thread.dbcr0 &= ~(DBCR0_IC | DBCR0_IDM); > regs->msr &= ~MSR_DE; > #else > - regs->msr &= ~MSR_SE; > + regs->msr &= ~(MSR_SE | MSR_BE); > #endif > } > clear_tsk_thread_flag(task, TIF_SINGLESTEP);
From roland at redhat.com Fri Apr 3 00:44:50 2009 From: roland at redhat.com (Roland McGrath) Date: Thu, 2 Apr 2009 17:44:50 -0700 (PDT) Subject: [PATCH] powerpc ptrace block-step In-Reply-To: Benjamin Herrenschmidt's message of Thursday, 2 April 2009 16:26:56 +1100 <1238650016.17330.193.camel@pasglop> References: <20090401215903.DE872FC3AB@magilla.sf.frob.com> <1238650016.17330.193.camel@pasglop> Message-ID: <20090403004450.F2166FC3AB@magilla.sf.frob.com> > The patch only implements it for "server/classic" processors, not BookE, > thus it should probably only advertise it for these :-) > > Though it wouldn't be too hard to implement it for BookE using DBCR0:BRT > (Branch Taken debug event) though it might need some careful fixups such > as the one we have for single step regarding hitting exception entry > code. In that case, this code seems fairly mysterious: > > +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) > > + task->thread.dbcr0 = DBCR0_IDM | DBCR0_BT; > > + regs->msr |= MSR_DE; That doesn't already do whatever it is you described? Can we assume now that you or someone else who knows what all that means will take this up? 
Thanks, Roland From jwboyer at linux.vnet.ibm.com Fri Apr 3 01:13:27 2009 From: jwboyer at linux.vnet.ibm.com (Josh Boyer) Date: Thu, 2 Apr 2009 21:13:27 -0400 Subject: [PATCH] powerpc ptrace block-step In-Reply-To: <20090403004450.F2166FC3AB@magilla.sf.frob.com> References: <20090401215903.DE872FC3AB@magilla.sf.frob.com> <1238650016.17330.193.camel@pasglop> <20090403004450.F2166FC3AB@magilla.sf.frob.com> Message-ID: <20090403011327.GA16881@zod.rchland.ibm.com> On Thu, Apr 02, 2009 at 05:44:50PM -0700, Roland McGrath wrote: >> The patch only implements it for "server/classic" processors, not BookE, >> thus it should probably only advertise it for these :-) >> >> Though it wouldn't be too hard to implement it for BookE using DBCR0:BRT >> (Branch Taken debug event) though it might need some careful fixups such >> as the one we have for single step regarding hitting exception entry >> code. > >In that case, this code seems fairly mysterious: > >> > +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) >> > + task->thread.dbcr0 = DBCR0_IDM | DBCR0_BT; >> > + regs->msr |= MSR_DE; > >That doesn't already do whatever it is you described? > >Can we assume now that you or someone else who knows what all that means >will take this up? I will try and look at the patch a bit more tomorrow, yes. I don't think having it working for BookE is really a requirement before this gets in though. If we can get it working with minimal effort for ppc64, that would help get systemtap and related things functioning correctly there. While I would love to believe that systemtap should work everywhere, I can't really see it running on an embedded board at this point. 
josh From benh at kernel.crashing.org Fri Apr 3 01:43:27 2009 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 03 Apr 2009 12:43:27 +1100 Subject: [PATCH] powerpc ptrace block-step In-Reply-To: <20090403004450.F2166FC3AB@magilla.sf.frob.com> References: <20090401215903.DE872FC3AB@magilla.sf.frob.com> <1238650016.17330.193.camel@pasglop> <20090403004450.F2166FC3AB@magilla.sf.frob.com> Message-ID: <1238723007.10752.19.camel@pasglop> On Thu, 2009-04-02 at 17:44 -0700, Roland McGrath wrote: > > The patch only implements it for "server/classic" processors, not BookE, > > thus it should probably only advertise it for these :-) > > > > Though it wouldn't be too hard to implement it for BookE using DBCR0:BRT > > (Branch Taken debug event) though it might need some careful fixups such > > as the one we have for single step regarding hitting exception entry > > code. > > In that case, this code seems fairly mysterious: > > > > +#if defined(CONFIG_40x) || defined(CONFIG_BOOKE) > > > + task->thread.dbcr0 = DBCR0_IDM | DBCR0_BT; > > > + regs->msr |= MSR_DE; > > That doesn't already do whatever it is you described? It should, I missed that bit. Except for the possible issue with interrupts. > Can we assume now that you or someone else who knows what all that means > will take this up? I can take this up after I'm back from vacation, which will be in about 4 weeks from now, but maybe Josh can give it a go in the meantime. Basically, the "issue" with BookE is that the debug interrupts aren't masked by the fact of taking an exception. So for example, if you have single step enabled and take a TLB miss on a userland load, you'll take a single step exception on the first (or rather the second but that's a detail) instruction of the TLB miss exception vector. 
The code for our BookE debug interrupts has a workaround that detects that case and returns to the TLB miss vector with MSR:DE cleared, but I think that code will not properly catch similar things happening due to block step. Though it should be easy to fix.

Cheers,
Ben.

From roland at redhat.com Fri Apr 3 01:59:56 2009
From: roland at redhat.com (Roland McGrath)
Date: Thu, 2 Apr 2009 18:59:56 -0700 (PDT)
Subject: [PATCH] powerpc ptrace block-step
In-Reply-To: Josh Boyer's message of Thursday, 2 April 2009 21:13:27 -0400 <20090403011327.GA16881@zod.rchland.ibm.com>
References: <20090401215903.DE872FC3AB@magilla.sf.frob.com> <1238650016.17330.193.camel@pasglop> <20090403004450.F2166FC3AB@magilla.sf.frob.com> <20090403011327.GA16881@zod.rchland.ibm.com>
Message-ID: <20090403015956.A2244FC3AB@magilla.sf.frob.com>

> I don't think having it working for BookE is really a requirement before this
> gets in though. If we can get it working with minimal effort for ppc64, that
> would help get systemtap and related things functioning correctly there.

Sure, just conditionalize arch_has_block_step() however is correct for that and put in what already works earlier rather than later, I'd say. Then users of the excluded chips can come forward when they care, and you can worry about it then if you don't happen to get to it first.

Thanks,
Roland

From fche at redhat.com Fri Apr 3 12:10:54 2009
From: fche at redhat.com (Frank Ch.
Eigler) Date: Fri, 03 Apr 2009 08:10:54 -0400 Subject: [PATCH] powerpc ptrace block-step In-Reply-To: <20090403011327.GA16881@zod.rchland.ibm.com> (Josh Boyer's message of "Thu, 2 Apr 2009 21:13:27 -0400") References: <20090401215903.DE872FC3AB@magilla.sf.frob.com> <1238650016.17330.193.camel@pasglop> <20090403004450.F2166FC3AB@magilla.sf.frob.com> <20090403011327.GA16881@zod.rchland.ibm.com> Message-ID: Josh Boyer writes: > [...] While I would love to believe that systemtap should work > everywhere, I can't really see it running on an embedded board at > this point. FWIW, we have had people running (not compiling) systemtap probe modules on embedded systems. See "stap -p4" or "stap-server". - FChE From jkenisto at us.ibm.com Fri Apr 3 23:23:41 2009 From: jkenisto at us.ibm.com (Jim Keniston) Date: Fri, 03 Apr 2009 16:23:41 -0700 Subject: x86 instruction classification Message-ID: <1238801021.3568.61.camel@dyn9047018139.beaverton.ibm.com> As promised on yesterday's SystemTap call, here's inat.c. It consists of a couple of tables that capture more information about the x86 instruction sets. 
For example, you could use this code to determine whether an opcode/instruction is invalid, privileged, a floating-point op, or in some other way of possible interest to uprobes and/or kprobes.

Also included is cmp.c, a user program that provides an example of the tables' use and also serves as a check against the tables currently in use by x86 ubp/uprobes. (There are a few differences, which reflect either corrections in inat.c or differences in how the tables are used.)

The intention is eventually to provide this as an enhancement to our x86 instruction-analysis code. inat.c uses the x86 kvm approach of one bitmap for each opcode.

Comments welcome.
Jim

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cmp.c
Type: text/x-csrc
Size: 5672 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inat.c
Type: text/x-csrc
Size: 5262 bytes
Desc: not available
URL:

From fche at elastic.org Sun Apr 5 14:17:53 2009
From: fche at elastic.org (Frank Ch. Eigler)
Date: Sun, 5 Apr 2009 10:17:53 -0400
Subject: [PATCH 1/2] make arch_init_ftrace_syscalls multiply callable
In-Reply-To: <1238941074-27424-1-git-send-email-fche@elastic.org>
References: <1238941074-27424-1-git-send-email-fche@elastic.org>
Message-ID: <1238941074-27424-2-git-send-email-fche@elastic.org>

Since both the utrace-based ftrace engine and the original syscall-specific ftrace engine use the syscall pretty-printer, this initialization function needs to be callable from each of them.

Signed-off-by: Frank Ch. Eigler
---
 arch/x86/kernel/ftrace.c | 3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 1d0d7f4..1d99d3d 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -498,6 +498,9 @@ void arch_init_ftrace_syscalls(void)
 	if (atomic_inc_return(&refs) != 1)
 		goto end;
 
+	if (syscalls_metadata)
+		return;
+
 	syscalls_metadata = kzalloc(sizeof(*syscalls_metadata) *
 					FTRACE_SYSCALL_MAX, GFP_KERNEL);
 	if (!syscalls_metadata) {
--
1.6.0.6

From fche at elastic.org Sun Apr 5 14:17:54 2009
From: fche at elastic.org (Frank Ch.
Eigler)
Date: Sun, 5 Apr 2009 10:17:54 -0400
Subject: [PATCH 2/2] utrace-based ftrace "process" engine, v3
In-Reply-To: <1238941074-27424-2-git-send-email-fche@elastic.org>
References: <1238941074-27424-1-git-send-email-fche@elastic.org> <1238941074-27424-2-git-send-email-fche@elastic.org>
Message-ID: <1238941074-27424-3-git-send-email-fche@elastic.org>

This is the v3 utrace-ftrace interface, based on Roland McGrath's utrace API, which provides programmatic hooks to the in-tree tracehook layer. This patch interfaces those events to ftrace, as configured by a small number of debugfs controls, and includes system-call pretty-printing using code from the ftrace syscall prototype by Frederic Weisbecker.

Here's the /debugfs/tracing/process_trace_README:

process event tracer mini-HOWTO

1. Select process hierarchy to monitor. Other processes will be
   completely unaffected. Leave at 0 for system-wide tracing.
%  echo NNN > process_follow_pid

2. Determine which process event traces are potentially desired.
   syscall and signal tracing slow down monitored processes.
%  echo 0 > process_trace_{syscalls,signals,lifecycle}

3. Add any final uid- or taskcomm-based filtering. Non-matching
   processes will skip trace messages, but will still be slowed.
%  echo NNN > process_trace_uid_filter # -1: unrestricted
%  echo ls > process_trace_taskcomm_filter # empty: unrestricted

4. Start tracing.
%  echo process > current_tracer

5. Examine trace.
%  cat trace

6. Stop tracing.
%  echo nop > current_tracer

Signed-off-by: Frank Ch.
Eigler --- include/linux/processtrace.h | 51 ++++ kernel/trace/Kconfig | 8 + kernel/trace/Makefile | 1 + kernel/trace/trace.h | 9 + kernel/trace/trace_process.c | 642 ++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 711 insertions(+), 0 deletions(-) create mode 100644 include/linux/processtrace.h create mode 100644 kernel/trace/trace_process.c diff --git a/include/linux/processtrace.h b/include/linux/processtrace.h new file mode 100644 index 0000000..74d031e --- /dev/null +++ b/include/linux/processtrace.h @@ -0,0 +1,51 @@ +#ifndef PROCESSTRACE_H +#define PROCESSTRACE_H + +#include +#include + +struct process_trace_entry { + unsigned char opcode; /* one of _UTRACE_EVENT_* */ + union { + struct { + pid_t child; + unsigned long flags; + } trace_clone; + struct { + int type; + int notify; + } trace_jctl; + struct { + long code; + } trace_exit; + struct { + /* Selected fields from linux_binprm */ + int argc; + /* We need to copy the file name, because by + the time we format the trace record for + display, the task may be gone. */ +#define PROCESS_TRACE_FILENAME_LENGTH 64 + char filename[PROCESS_TRACE_FILENAME_LENGTH]; + } trace_exec; + struct { + int si_signo; + int si_errno; + int si_code; + } trace_signal; + struct { + long callno; + unsigned long args[6]; + } trace_syscall_entry; + struct { + long rc; + long error; + } trace_syscall_exit; + }; +}; + +/* in kernel/trace/trace_process.c */ + +extern void enable_process_trace(void); +extern void disable_process_trace(void); + +#endif /* PROCESSTRACE_H */ diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index b0a46f8..226cb60 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -186,6 +186,14 @@ config FTRACE_SYSCALLS help Basic tracer to catch the syscall entry and exit events. 
+config PROCESS_TRACER + bool "Trace process events via utrace" + select TRACING + select UTRACE + help + This tracer records process events that may be hooked by utrace: + thread lifecycle, system calls, signals, and job control. + config BOOT_TRACER bool "Trace boot initcalls" select TRACING diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index c3feea0..880080a 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -44,5 +44,6 @@ obj-$(CONFIG_EVENT_TRACER) += trace_events.o obj-$(CONFIG_EVENT_TRACER) += events.o obj-$(CONFIG_EVENT_TRACER) += trace_export.o obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o +obj-$(CONFIG_PROCESS_TRACER) += trace_process.o libftrace-y := ftrace.o diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index f561628..c27d2ba 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -37,6 +38,7 @@ enum trace_type { TRACE_KMEM_FREE, TRACE_POWER, TRACE_BLK, + TRACE_PROCESS, __TRACE_LAST_TYPE, }; @@ -214,6 +216,12 @@ struct syscall_trace_exit { unsigned long ret; }; +struct trace_process { + struct trace_entry ent; + struct process_trace_entry event; +}; + + /* * trace_flag_type is an enumeration that holds different @@ -332,6 +340,7 @@ extern void __ftrace_bad_type(void); TRACE_SYSCALL_ENTER); \ IF_ASSIGN(var, ent, struct syscall_trace_exit, \ TRACE_SYSCALL_EXIT); \ + IF_ASSIGN(var, ent, struct trace_process, TRACE_PROCESS); \ __ftrace_bad_type(); \ } while (0) diff --git a/kernel/trace/trace_process.c b/kernel/trace/trace_process.c new file mode 100644 index 0000000..b98ab28 --- /dev/null +++ b/kernel/trace/trace_process.c @@ -0,0 +1,642 @@ +/* + * utrace-based process event tracing + * Copyright (C) 2009 Red Hat Inc. + * By Frank Ch. 
Eigler + * + * Based on mmio ftrace engine by Pekka Paalanen + * and utrace-syscall-tracing prototype by Ananth Mavinakayanahalli + * and ftrace-syscall prototype by Frederic Weisbecker + */ + +/* #define DEBUG 1 */ + +#include +#include +#include +#include +#include +#include + +#include "trace.h" +#include "trace_output.h" + +/* A process must match these filters in order to be traced. */ +static char trace_taskcomm_filter[TASK_COMM_LEN]; /* \0: unrestricted */ +static u32 trace_taskuid_filter = -1; /* -1: unrestricted */ +static u32 trace_lifecycle_p = 1; +static u32 trace_syscalls_p = 1; +static u32 trace_signals_p = 1; + +/* A process must be a direct child of given pid in order to be + followed. */ +static u32 process_follow_pid; /* 0: unrestricted/systemwide */ + +/* XXX: lock the above? */ + + +/* trace data collection */ + +static struct trace_array *process_trace_array; + +static void process_reset_data(struct trace_array *tr) +{ + pr_debug("in %s\n", __func__); + tracing_reset_online_cpus(tr); +} + +static int process_trace_init(struct trace_array *tr) +{ + pr_debug("in %s\n", __func__); + process_trace_array = tr; + process_reset_data(tr); + enable_process_trace(); + return 0; +} + +static void process_trace_reset(struct trace_array *tr) +{ + pr_debug("in %s\n", __func__); + disable_process_trace(); + process_reset_data(tr); + process_trace_array = NULL; +} + +static void process_trace_start(struct trace_array *tr) +{ + pr_debug("in %s\n", __func__); + process_reset_data(tr); +} + +static void __trace_processtrace(struct trace_array *tr, + struct trace_array_cpu *data, + struct process_trace_entry *ent) +{ + struct ring_buffer_event *event; + struct trace_process *entry; + + event = ring_buffer_lock_reserve(tr->buffer, sizeof(*entry)); + if (!event) + return; + entry = ring_buffer_event_data(event); + tracing_generic_entry_update(&entry->ent, 0, preempt_count()); + entry->ent.type = TRACE_PROCESS; + entry->event = *ent; + 
ring_buffer_unlock_commit(tr->buffer, event); + + trace_wake_up(); +} + +void process_trace(struct process_trace_entry *ent) +{ + struct trace_array *tr = process_trace_array; + struct trace_array_cpu *data; + + preempt_disable(); + data = tr->data[smp_processor_id()]; + __trace_processtrace(tr, data, ent); + preempt_enable(); +} + + +/* trace data rendering */ + +static void process_pipe_open(struct trace_iterator *iter) +{ + pr_debug("in %s\n", __func__); +} + +static void process_close(struct trace_iterator *iter) +{ + iter->private = NULL; +} + +static ssize_t process_read(struct trace_iterator *iter, struct file *filp, + char __user *ubuf, size_t cnt, loff_t *ppos) +{ + ssize_t ret; + struct trace_seq *s = &iter->seq; + ret = trace_seq_to_user(s, ubuf, cnt); + return (ret == -EBUSY) ? 0 : ret; +} + +static enum print_line_t process_print(struct trace_iterator *iter) +{ + struct trace_entry *entry = iter->ent; + struct trace_process *field; + struct process_trace_entry *pte; + struct trace_seq *s = &iter->seq; + int ret = 1; + struct syscall_metadata *syscall; + int i; + + trace_assign_type(field, entry); + pte = &field->event; + + if (!trace_print_context(iter)) + return TRACE_TYPE_PARTIAL_LINE; + + switch (pte->opcode) { + case _UTRACE_EVENT_CLONE: + ret = trace_seq_printf(s, "fork %d flags 0x%lx\n", + pte->trace_clone.child, + pte->trace_clone.flags); + break; + case _UTRACE_EVENT_EXEC: + ret = trace_seq_printf(s, "exec '%s' (args %d)\n", + pte->trace_exec.filename, + pte->trace_exec.argc); + break; + case _UTRACE_EVENT_EXIT: + ret = trace_seq_printf(s, "exit %ld\n", + pte->trace_exit.code); + break; + case _UTRACE_EVENT_JCTL: + ret = trace_seq_printf(s, "jctl %d %d\n", + pte->trace_jctl.type, + pte->trace_jctl.notify); + break; + case _UTRACE_EVENT_SIGNAL: + ret = trace_seq_printf(s, "signal %d errno %d code 0x%x\n", + pte->trace_signal.si_signo, + pte->trace_signal.si_errno, + pte->trace_signal.si_code); + break; + case _UTRACE_EVENT_SYSCALL_ENTRY: + 
syscall = syscall_nr_to_meta (pte->trace_syscall_entry.callno); + if (!syscall) { + /* Metadata is incomplete. Simply hex dump. */ + ret = trace_seq_printf(s, "syscall %ld [0x%lx 0x%lx" + " 0x%lx 0x%lx 0x%lx 0x%lx]\n", + pte->trace_syscall_entry.callno, + pte->trace_syscall_entry.args[0], + pte->trace_syscall_entry.args[1], + pte->trace_syscall_entry.args[2], + pte->trace_syscall_entry.args[3], + pte->trace_syscall_entry.args[4], + pte->trace_syscall_entry.args[5]); + break; + } + ret = trace_seq_printf(s, "%s(", syscall->name); + if (!ret) + break; + for (i = 0; i < syscall->nb_args; i++) { + ret = trace_seq_printf(s, "%s: 0x%lx%s", syscall->args[i], + pte->trace_syscall_entry.args[i], + i == syscall->nb_args - 1 ? ")\n" : ", "); + if (!ret) + break; + } + break; + case _UTRACE_EVENT_SYSCALL_EXIT: + /* utrace doesn't preserve the syscall number. */ + ret = trace_seq_printf(s, "syscall rc %ld error %ld\n", + pte->trace_syscall_exit.rc, + pte->trace_syscall_exit.error); + break; + default: + ret = trace_seq_printf(s, "process event code %d?\n", + pte->opcode); + break; + } + if (!ret) + return TRACE_TYPE_PARTIAL_LINE; + return TRACE_TYPE_HANDLED; +} + + +static enum print_line_t process_print_line(struct trace_iterator *iter) +{ + switch (iter->ent->type) { + case TRACE_PROCESS: + return process_print(iter); + default: + return TRACE_TYPE_HANDLED; /* ignore unknown entries */ + } +} + +static struct tracer process_tracer = { + .name = "process", + .init = process_trace_init, + .reset = process_trace_reset, + .start = process_trace_start, + .pipe_open = process_pipe_open, + .close = process_close, + .read = process_read, + .print_line = process_print_line, +}; + + + +/* utrace backend */ + +/* Should tracing apply to given task? Compare against filter + values. 
*/ +static int trace_test(struct task_struct *tsk) +{ + if (trace_taskcomm_filter[0] + && strncmp(trace_taskcomm_filter, tsk->comm, TASK_COMM_LEN)) + return 0; + + if (trace_taskuid_filter != (u32)-1 + && trace_taskuid_filter != task_uid(tsk)) + return 0; + + return 1; +} + + +static const struct utrace_engine_ops process_trace_ops; + +static void process_trace_tryattach(struct task_struct *tsk) +{ + struct utrace_engine *engine; + + pr_debug("in %s\n", __func__); + tracing_record_cmdline (tsk); + engine = utrace_attach_task(tsk, + UTRACE_ATTACH_CREATE | + UTRACE_ATTACH_EXCLUSIVE, + &process_trace_ops, NULL); + if (IS_ERR(engine) || (engine == NULL)) { + pr_warning("utrace_attach_task %d (rc %p)\n", + tsk->pid, engine); + } else { + int rc; + + /* We always hook cost-free events. */ + unsigned long events = + UTRACE_EVENT(CLONE) | + UTRACE_EVENT(EXEC) | + UTRACE_EVENT(JCTL) | + UTRACE_EVENT(EXIT); + + /* Penalizing events are individually controlled, so that + utrace doesn't even take the monitored threads off their + fast paths, nor bother call our callbacks. 
*/ + if (trace_syscalls_p) + events |= UTRACE_EVENT_SYSCALL; + if (trace_signals_p) + events |= UTRACE_EVENT_SIGNAL_ALL; + + rc = utrace_set_events(tsk, engine, events); + if (rc == -EINPROGRESS) + rc = utrace_barrier(tsk, engine); + if (rc) + pr_warning("utrace_set_events/barrier rc %d\n", rc); + + utrace_engine_put(engine); + pr_debug("attached in %s to %s(%d)\n", __func__, + tsk->comm, tsk->pid); + } +} + + +u32 process_trace_report_clone(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *parent, + unsigned long clone_flags, + struct task_struct *child) +{ + if (trace_lifecycle_p && trace_test(parent)) { + struct process_trace_entry ent; + ent.opcode = _UTRACE_EVENT_CLONE; + ent.trace_clone.child = child->pid; + ent.trace_clone.flags = clone_flags; + process_trace(&ent); + } + + process_trace_tryattach(child); + + return UTRACE_RESUME; +} + + +u32 process_trace_report_jctl(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *task, + int type, int notify) +{ + struct process_trace_entry ent; + ent.opcode = _UTRACE_EVENT_JCTL; + ent.trace_jctl.type = type; + ent.trace_jctl.notify = notify; + process_trace(&ent); + + return UTRACE_RESUME; +} + + +u32 process_trace_report_syscall_entry(u32 action, + struct utrace_engine *engine, + struct task_struct *task, + struct pt_regs *regs) +{ + if (trace_syscalls_p && trace_test(task)) { + struct process_trace_entry ent; + ent.opcode = _UTRACE_EVENT_SYSCALL_ENTRY; + ent.trace_syscall_entry.callno = syscall_get_nr(task, regs); + syscall_get_arguments(task, regs, 0, 6, + ent.trace_syscall_entry.args); + process_trace(&ent); + } + + return UTRACE_RESUME; +} + + +u32 process_trace_report_syscall_exit(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *task, + struct pt_regs *regs) +{ + if (trace_syscalls_p && trace_test(task)) { + struct process_trace_entry ent; + ent.opcode = _UTRACE_EVENT_SYSCALL_EXIT; + 
ent.trace_syscall_exit.rc = + syscall_get_return_value(task, regs); + ent.trace_syscall_exit.error = syscall_get_error(task, regs); + process_trace(&ent); + } + + return UTRACE_RESUME; +} + + +u32 process_trace_report_exec(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *task, + const struct linux_binfmt *fmt, + const struct linux_binprm *bprm, + struct pt_regs *regs) +{ + if (trace_lifecycle_p && trace_test(task)) { + struct process_trace_entry ent; + ent.opcode = _UTRACE_EVENT_EXEC; + ent.trace_exec.argc = bprm->argc; + strlcpy (ent.trace_exec.filename, bprm->filename, + sizeof(ent.trace_exec.filename)); + process_trace(&ent); + } + + tracing_record_cmdline (task); + + /* We're already attached; no need for a new tryattach. */ + + return UTRACE_RESUME; +} + + +u32 process_trace_report_signal(u32 action, + struct utrace_engine *engine, + struct task_struct *task, + struct pt_regs *regs, + siginfo_t *info, + const struct k_sigaction *orig_ka, + struct k_sigaction *return_ka) +{ + if (trace_signals_p && trace_test(task)) { + struct process_trace_entry ent; + ent.opcode = _UTRACE_EVENT_SIGNAL; + ent.trace_signal.si_signo = info->si_signo; + ent.trace_signal.si_errno = info->si_errno; + ent.trace_signal.si_code = info->si_code; + process_trace(&ent); + } + + /* We're already attached, so no need for a new tryattach. */ + + return UTRACE_RESUME | utrace_signal_action(action); +} + + +u32 process_trace_report_exit(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *task, + long orig_code, long *code) +{ + if (trace_lifecycle_p && trace_test(task)) { + struct process_trace_entry ent; + ent.opcode = _UTRACE_EVENT_EXIT; + ent.trace_exit.code = orig_code; + process_trace(&ent); + } + + /* There is no need to explicitly attach or detach here. 
*/ + + return UTRACE_RESUME; +} + + +void enable_process_trace() +{ + struct task_struct *grp, *tsk; + + pr_debug("in %s\n", __func__); + rcu_read_lock(); + do_each_thread(grp, tsk) { + /* Skip over kernel threads. */ + if (tsk->flags & PF_KTHREAD) + continue; + + if (process_follow_pid) { + if (tsk->tgid == process_follow_pid || + tsk->parent->tgid == process_follow_pid) + process_trace_tryattach(tsk); + } else { + process_trace_tryattach(tsk); + } + } while_each_thread(grp, tsk); + rcu_read_unlock(); +} + +void disable_process_trace() +{ + struct utrace_engine *engine; + struct task_struct *grp, *tsk; + int rc; + + pr_debug("in %s\n", __func__); + rcu_read_lock(); + do_each_thread(grp, tsk) { + /* Find matching engine, if any. Returns -ENOENT for + unattached threads. */ + engine = utrace_attach_task(tsk, UTRACE_ATTACH_MATCH_OPS, + &process_trace_ops, 0); + if (IS_ERR(engine)) { + if (PTR_ERR(engine) != -ENOENT) + pr_warning("utrace_attach_task %d (rc %ld)\n", + tsk->pid, -PTR_ERR(engine)); + } else if (engine == NULL) { + pr_warning("utrace_attach_task %d (null engine)\n", + tsk->pid); + } else { + /* Found one of our own engines. Detach. 
*/ + rc = utrace_control(tsk, engine, UTRACE_DETACH); + switch (rc) { + case 0: /* success */ + break; + case -ESRCH: /* REAP callback already begun */ + case -EALREADY: /* DEATH callback already begun */ + break; + default: + rc = -rc; + pr_warning("utrace_detach %d (rc %d)\n", + tsk->pid, rc); + break; + } + utrace_engine_put(engine); + pr_debug("detached in %s from %s(%d)\n", __func__, + tsk->comm, tsk->pid); + } + } while_each_thread(grp, tsk); + rcu_read_unlock(); +} + + +static const struct utrace_engine_ops process_trace_ops = { + .report_clone = process_trace_report_clone, + .report_exec = process_trace_report_exec, + .report_exit = process_trace_report_exit, + .report_jctl = process_trace_report_jctl, + .report_signal = process_trace_report_signal, + .report_syscall_entry = process_trace_report_syscall_entry, + .report_syscall_exit = process_trace_report_syscall_exit, +}; + + + +/* control interfaces */ + + +static ssize_t +trace_taskcomm_filter_read(struct file *filp, char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + return simple_read_from_buffer(ubuf, cnt, ppos, + trace_taskcomm_filter, TASK_COMM_LEN); +} + + +static ssize_t +trace_taskcomm_filter_write(struct file *filp, const char __user *ubuf, + size_t cnt, loff_t *fpos) +{ + char *end; + + if (cnt > TASK_COMM_LEN) + cnt = TASK_COMM_LEN; + + if (copy_from_user(trace_taskcomm_filter, ubuf, cnt)) + return -EFAULT; + + /* Cut from the first nil or newline. */ + trace_taskcomm_filter[cnt] = '\0'; + end = strchr(trace_taskcomm_filter, '\n'); + if (end) + *end = '\0'; + + *fpos += cnt; + return cnt; +} + + +static const struct file_operations trace_taskcomm_filter_fops = { + .open = tracing_open_generic, + .read = trace_taskcomm_filter_read, + .write = trace_taskcomm_filter_write, +}; + + + +static char README_text[] = + "process event tracer mini-HOWTO\n" + "\n" + "1. Select process hierarchy to monitor. Other processes will be\n" + " completely unaffected. 
Leave at 0 for system-wide tracing.\n" + "# echo NNN > process_follow_pid\n" + "\n" + "2. Determine which process event traces are potentially desired.\n" + " syscall and signal tracing slow down monitored processes.\n" + "# echo 0 > process_trace_{syscalls,signals,lifecycle}\n" + "\n" + "3. Add any final uid- or taskcomm-based filtering. Non-matching\n" + " processes will skip trace messages, but will still be slowed.\n" + "# echo NNN > process_trace_uid_filter # -1: unrestricted \n" + "# echo ls > process_trace_taskcomm_filter # empty: unrestricted\n" + "\n" + "4. Start tracing.\n" + "# echo process > current_tracer\n" + "\n" + "5. Examine trace.\n" + "# cat trace\n" + "\n" + "6. Stop tracing.\n" + "# echo nop > current_tracer\n" + ; + +static struct debugfs_blob_wrapper README_blob = { + .data = README_text, + .size = sizeof(README_text), +}; + + +static __init int init_process_trace(void) +{ + struct dentry *d_tracer; + struct dentry *entry; + + d_tracer = tracing_init_dentry(); + + arch_init_ftrace_syscalls (); + + entry = debugfs_create_blob("process_trace_README", 0444, d_tracer, + &README_blob); + if (!entry) + pr_warning("Could not create debugfs " + "'process_trace_README' entry\n"); + + /* Control for scoping process following. */ + entry = debugfs_create_u32("process_follow_pid", 0644, d_tracer, + &process_follow_pid); + if (!entry) + pr_warning("Could not create debugfs " + "'process_follow_pid' entry\n"); + + /* Process-level filters */ + entry = debugfs_create_file("process_trace_taskcomm_filter", 0644, + d_tracer, NULL, + &trace_taskcomm_filter_fops); + /* XXX: it'd be nice to have a read/write debugfs_create_blob. */ + if (!entry) + pr_warning("Could not create debugfs " + "'process_trace_taskcomm_filter' entry\n"); + + entry = debugfs_create_u32("process_trace_uid_filter", 0644, d_tracer, + &trace_taskuid_filter); + if (!entry) + pr_warning("Could not create debugfs " + "'process_trace_uid_filter' entry\n"); + + /* Event-level filters. 
*/ + entry = debugfs_create_u32("process_trace_lifecycle", 0644, d_tracer, + &trace_lifecycle_p); + if (!entry) + pr_warning("Could not create debugfs " + "'process_trace_lifecycle' entry\n"); + + entry = debugfs_create_u32("process_trace_syscalls", 0644, d_tracer, + &trace_syscalls_p); + if (!entry) + pr_warning("Could not create debugfs " + "'process_trace_syscalls' entry\n"); + + entry = debugfs_create_u32("process_trace_signals", 0644, d_tracer, + &trace_signals_p); + if (!entry) + pr_warning("Could not create debugfs " + "'process_trace_signals' entry\n"); + + return register_tracer(&process_tracer); +} + +device_initcall(init_process_trace); -- 1.6.0.6 From fche at elastic.org Sun Apr 5 14:17:52 2009 From: fche at elastic.org (Frank Ch. Eigler) Date: Sun, 5 Apr 2009 10:17:52 -0400 Subject: [PATCH tip 0/2] utrace-ftrace engine v3 Message-ID: <1238941074-27424-1-git-send-email-fche@elastic.org> This version of the utrace-ftrace engine represents bringing v2 (as posted along with utrace two weeks ago) up to parity with the tip/tracing/syscalls one in terms of using the system call pretty-printing tables. This patch applies to the tip/tracing/syscalls tree, except that it assumes that utrace patches are also applied beforehand. Next up could be adding an option to use the filterable TRACE_EVENT tracepoint engine instead of (or in addition to?) the hand-written trace record management & formatting. Next up could also be removal of the syscall-tracing prototype, if people are so inclined. Frank Ch. 
Eigler (2):
  make arch_init_ftrace_syscalls multiply callable
  utrace-based ftrace "process" engine, v3

 arch/x86/kernel/ftrace.c     |    3 +
 include/linux/processtrace.h |   51 ++++
 kernel/trace/Kconfig         |    8 +
 kernel/trace/Makefile        |    1 +
 kernel/trace/trace.h         |    9 +
 kernel/trace/trace_process.c |  642 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 714 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/processtrace.h
 create mode 100644 kernel/trace/trace_process.c

From fedor.sakharov at gmail.com Mon Apr 6 21:06:58 2009
From: fedor.sakharov at gmail.com (Fedor Sakharov)
Date: Tue, 07 Apr 2009 01:06:58 +0400
Subject: breakpointing user threads
Message-ID: <49DA6EF2.5040309@gmail.com>

Hello!
Is it possible to use utrace to breakpoint user-space tasks as it is in
the case of ptrace? And if it is, what is the possible way to do it?

Fedor Sakharov

From jkenisto at us.ibm.com Mon Apr 6 21:45:29 2009
From: jkenisto at us.ibm.com (Jim Keniston)
Date: Mon, 06 Apr 2009 14:45:29 -0700
Subject: breakpointing user threads
In-Reply-To: <49DA6EF2.5040309@gmail.com>
References: <49DA6EF2.5040309@gmail.com>
Message-ID: <1239054329.5212.10.camel@localhost.localdomain>

On Tue, 2009-04-07 at 01:06 +0400, Fedor Sakharov wrote:
> Hello!
> Is it possible to use utrace to breakpoint user-space tasks as it is in
> the case of ptrace? And if it is, what is the possible way to do it?
>
> Fedor Sakharov
>

Uprobes does exactly that. It's been around for a couple of years, and
this past summer SystemTap was enhanced to exploit uprobes to do
user-space tracing. SystemTap is the easiest way to use uprobes, but if
you're into bare-knuckles instrumentation modules, the API is:
	[un]register_u[ret]probe(struct u[ret]probe*)

Uprobes is a client of utrace. It's been tucked into the SystemTap
runtime (runtime/uprobes[2]) since October 2007, but for various reasons
(LONG story) it hasn't debuted on LKML yet.

The uprobes API is documented in the SystemTap source:
runtime/uprobes/uprobes.txt.

Jim Keniston

From jkenisto at us.ibm.com Tue Apr 7 23:27:33 2009
From: jkenisto at us.ibm.com (Jim Keniston)
Date: Tue, 07 Apr 2009 16:27:33 -0700
Subject: breakpointing user threads
In-Reply-To: <49DBC4EB.5040609@gmail.com>
References: <49DA6EF2.5040309@gmail.com> <1239054329.5212.10.camel@localhost.localdomain> <49DBC4EB.5040609@gmail.com>
Message-ID: <1239146853.3865.28.camel@localhost.localdomain>

On Wed, 2009-04-08 at 01:26 +0400, Fedor Sakharov wrote:
...
> >
> Thanks for the explanation. One more question. Is it possible to compile
> uprobes into the kernel, not like the .ko?
>

Yes. The enclosed patch set -- which has been tested against
2.6.29-rc8, at least -- adds uprobes for x86 to your kernel source.
It reflects some work we're currently doing to slice uprobes up into
components (e.g., instruction analysis, user breakpoint support) that
can be used by other subsystems. (So it's quite a bit different from
the monolithic uprobes code in SystemTap's runtime/uprobes2/.)

series:
exports-for-ubp-uprobes.patch*
insn-x86.patch
ubp-base.patch
ubp-x86.patch
uprobes-base.patch
uprobes-x86.patch

* You need that first (trivial) patch only if you're configuring ubp
and/or uprobes to build as modules.

You enable ubp ("User-space breakpoint assistance") and uprobes
("User-space probes") in the "General setup" menu, down at the end.

SystemTap, if you're using it, will automatically notice that you have
an in-kernel version of uprobes, and use that.

Ananth, Srikar, this is the same patch set I sent you a week ago. We
haven't yet tried moving the INAT_* instruction analysis into insn.{c,h}
so we can simplify the ubp-x86 patch. Also, this puts insn.c in
arch/x86/kernel instead of arch/x86/lib, like Masami's "kprobes-based
event tracer" patch set. And as previously discussed, it probably makes
sense to move the SSOL-area management from uprobes into ubp, so all ubp
clients for a particular process can use the same area.

Jim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: exports-for-ubp-uprobes.patch
Type: text/x-patch
Size: 498 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: insn-x86.patch
Type: text/x-patch
Size: 26655 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ubp-base.patch
Type: text/x-patch
Size: 31984 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ubp-x86.patch
Type: text/x-patch
Size: 22772 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: uprobes-base.patch Type: text/x-patch Size: 120999 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: uprobes-x86.patch Type: text/x-patch Size: 2742 bytes Desc: not available URL: From dvlasenk at redhat.com Wed Apr 8 12:33:41 2009 From: dvlasenk at redhat.com (Denys Vlasenko) Date: Wed, 08 Apr 2009 14:33:41 +0200 Subject: x86 instruction classification In-Reply-To: <1238801021.3568.61.camel@dyn9047018139.beaverton.ibm.com> References: <1238801021.3568.61.camel@dyn9047018139.beaverton.ibm.com> Message-ID: <1239194021.3703.3.camel@localhost> On Fri, 2009-04-03 at 16:23 -0700, Jim Keniston wrote: > As promised on yesterday's SystemTap call, here's inat.c. It consists > of a couple of tables that capture more information about the x86 > instruction sets. For example, you could use this code to determine > whether an opcode/instruction is invalid, privileged, a floating-point > op, or in some other way of possible interest to uprobes and/or kprobes. > > Also included is cmp.c, a user program that provides an example of the > tables' use and also serves as a check against the tables currently in > use by x86 ubp/uprobes. (There are a few differences, which reflect > either corrections in inat.c or differences in how the tables are used.) > > The intention is eventually provide this as an enhancement to our x86 > instruction-analysis code. > > inat.c uses the x86 kvm approach of one bitmap for each opcode. > > Comments welcome. 

Nitpick:

static inline int test_bit(int nr, const volatile unsigned long *addr)
{
	return ((1UL << (nr % BITS_PER_LONG)) &
		(((unsigned long *)addr)[nr / BITS_PER_LONG])) != 0;
}

Since nr is *signed* int, the nr / BITS_PER_LONG will be computed
like this:

	leal	31(%rdi), %eax
	testl	%edi, %edi
	cmovs	%eax, %edi
	sarl	$5, %edi

With unsigned, it will be:

	shrl	$5, %edi

--
vda

From oleg at redhat.com Wed Apr 8 20:39:54 2009
From: oleg at redhat.com (Oleg Nesterov)
Date: Wed, 8 Apr 2009 22:39:54 +0200
Subject: ptrace cleanup tasks
In-Reply-To: <20090330185146.D525AFC3AB@magilla.sf.frob.com>
References: <20090330185146.D525AFC3AB@magilla.sf.frob.com>
Message-ID: <20090408203954.GA26816@redhat.com>

(add utrace-devel)

On 03/30, Roland McGrath wrote:
>
> * ptracer (parent) data structures cleanup
> ** move ptraced list into ptracer sub-struct
> ** add mutex locking ptracer
> *** locks ptraced list, not tasklist_lock
> *** locks ptrace_do_wait vs SIGCHLD/wakeup
> *** locks tracees' ptrace flags, not task_lock/tasklist_lock

I tried to think about the first steps in ptrace-cleanup, and I
need your help.

To simplify, let's forget about task->ptrace/etc, let's say we
just want to add the new ->ptrace_lock which protects ->ptraced
list (instead of tasklist_lock).

How should it nest with tasklist_lock? I don't think we should
take ->ptrace_lock under tasklist. Instead, tasklist_lock should
nest inside ->ptrace_lock, this means it could be ->ptrace_mutex.
Otherwise, for example, it is not clear how can exit_ptrace()
change tracee->exit_state.

But, looking at do_wait(), I can't understand what can we do right
now. Looks like we have to move ptrace_do_wait() to another
loop. But this doesn't look as a cleanup, this will complicate
the code even more.

Perhaps, we can simplify things if we add ->ptrace_mutex to
signal_struct, not to task_struct. (actually, I think ->ptraced
and ->children should go to signal_struct too).

But even in this case I don't see the cleanup, without additional
changes ptrace_do_wait() should take tasklist unconditionally.
And this adds races. Unless we just add lock/unlock ->ptrace_mutex
above the whole "do while_each_thread()" block. Not good!

Thoughts?

Oleg.

From roland at redhat.com Tue Apr 14 02:58:20 2009
From: roland at redhat.com (Roland McGrath)
Date: Mon, 13 Apr 2009 19:58:20 -0700 (PDT)
Subject: ptrace cleanup tasks
In-Reply-To: Oleg Nesterov's message of Wednesday, 8 April 2009 22:39:54 +0200 <20090408203954.GA26816@redhat.com>
References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com>
Message-ID: <20090414025820.D5548FC299@magilla.sf.frob.com>

> I tried to think about the first steps in ptrace-cleanup, and I
> need your help.

I think I said that list was "not necessarily in this order" and I
certainly meant it so. I hope you haven't been slowed in proceeding on any
of the pieces that are more straightforward and can be done mostly
orthogonal to the sticky ones. The locking change is the real hairy one
that I expected we'd have to spend a while hashing out.

> How should it nest with tasklist_lock? I don't think we should
> take ->ptrace_lock under tasklist. Instead, tasklist_lock should
> nest inside ->ptrace_lock, this means it could be ->ptrace_mutex.
> Otherwise, for example, it is not clear how can exit_ptrace()
> change tracee->exit_state.

We want the new lock to be a mutex. I don't think we want to hold both
locks, though it won't hurt to have tasklist_lock taken (appropriately
briefly) while holding the mutex.

> But, looking at do_wait(), I can't understand what can we do right
> now. Looks like we have to move ptrace_do_wait() to another
> loop. But this doesn't look as a cleanup, this will complicate
> the code even more.

I don't think we have to do that. But I want to point out here that the
big-picture cleanup we are getting here is having ptrace machinations never
take tasklist_lock. That is not something you can see as an immediate good
by reading diffstats or whatever such micro-level view of "cleanup".
But with a larger view, it is a huge boon for the desirable characteristics in
the kernel overall, in performance scaling, proper separation of concerns,
etc. So this large good is what to consider against the smaller costs of
some repetition or complexity in the code at the micro-level. (Of course,
we always want to make it as clean and concise as we can at every level.)

> Perhaps, we can simplify things if we add ->ptrace_mutex to
> signal_struct, not to task_struct. (actually, I think ->ptraced
> and ->children should go to signal_struct too).
[...]

This opens large new cans of worms that I don't think we want to get into.
(Moving ->ptraced implies a change to ptrace semantics that is its own big
can of worms. Moving ->children is a wholly non-ptrace issue that we can
consider quite separately and we should not conflate it with this work.)

> Thoughts?

The tasklist_lock for our purposes here is only taken to keep the
next_thread() loop valid. (For non-ptrace do_wait_thread() it serves other
purposes too, but leave that aside for now.) In ptrace_do_wait(), we just
need to make sure that loop (in its caller) does not go awry. What I think
we can do is some version of:

	get_task_struct(tsk);
	read_unlock(&tasklist_lock);
	mutex_lock(&tsk->ptrace_mutex);
	...
	mutex_unlock(&tsk->ptrace_mutex);
	read_lock(&tasklist_lock);
	... check for tsk->thread_group having been invalidated ...
	put_task_struct(tsk);
	...

There are a few angles to cover in this approach. I haven't thought these
all through, but I suspect we can iron them all out.

1. We really want to short-circuit and not do any lock fiddling in the
   common case of list_empty(&tsk->ptraced). It's an unlocked racy check
   since only the mutex will lock that list. But it is probably not too
   hard to work out a way to do it where we are confident that any
   false-positive case (i.e. we read as empty when racing with tsk or its
   traceme child adding to the list) is tolerable.
   That is, that any time we short-circuited, we can be sure a
   wait_chldexit wakeup will follow to get us another iteration if it
   matters. (Or that it just doesn't matter, because the attach/traceme
   can be said to be "after" the wait call for this race.)

2. After retaking tasklist_lock, we need to detect when release_task(tsk)
   has made next_thread(tsk) potentially invalid. This should be the rare
   race case, so it needs to be robust, but not real fast nor rule out all
   the false-positives (where next_thread() would really have been fine).
   Probably good enough to do:

	if (!ret) {
		read_lock(&tasklist_lock);
		if (unlikely(tsk->exit_state == EXIT_DEAD)) {
			read_unlock(&tasklist_lock);
			ret = -ERESTARTNOINTR;
			set_thread_flag(TIF_SIGPENDING);
		}
	}
	put_task_struct(tsk);
	return ret;

   Given that it dying would early get to ptrace_exit() and take the mutex
   there, this might even be somehow moot. That is, it couldn't have
   exited while we held the mutex, so maybe:

	read_lock(&tasklist_lock);
	mutex_unlock(&tsk->ptrace_mutex);

   just does it, if that pattern doesn't freak out lockdep or anything.
   (Probably it's not really that simple.)

3. I thought there was at least a third one, but I'm not seeing it now.
   You'll probably find more holes in the scheme. :-)


Thanks,
Roland

That that nice girl, gladys holmes, was miss lavinia talking to ladies round the fire. At seven dinner continued in deep consultation to a late hour. From the desk beside her, she flung it at him.. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mldireto at tudoemoferta.com.br Wed Apr 15 06:54:07 2009 From: mldireto at tudoemoferta.com.br (Englobe Sistemas e E-Commerce) Date: Wed, 15 Apr 2009 03:54:07 -0300 Subject: Oportunidade para se tornar um grande empresario Message-ID: An HTML attachment was scrubbed... URL: From oleg at redhat.com Wed Apr 15 17:36:23 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Wed, 15 Apr 2009 19:36:23 +0200 Subject: ptrace cleanup tasks In-Reply-To: <20090414025820.D5548FC299@magilla.sf.frob.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> Message-ID: <20090415173623.GA22108@redhat.com> On 04/13, Roland McGrath wrote: > > > I tried to think about the first steps in ptrace-cleanup, and I > > need your help. > > I think I said that list was "not necessarily in this order" and I > certainly meant it so. Yes sure. But I think this part is most important and most hard, and other changes may depend on locking very much. > We want the new lock to be a mutex. I don't think we want to hold both > locks, though it won't hurt to have tasklist_lock taken (appropriately > briefly) while holding the mutex. Agreed. > But I want to point out here that the > big-picture cleanup we are getting here is having ptrace machinations never > take tasklist_lock. Yes. > That is not something you can see as an immediate good > by reading diffstats or whatever such micro-level view of "cleanup". But > with a larger view, it is a huge boon for the desireable characteristics in > the kernel overall, in performance scaling, proper separation of concerns, > etc. 
> So this large good is what to consider against the smaller costs of
> some repetition or complexity in the code at the micro-level.  (Of course,
> we always want to make it as clean and concise as we can at every level.)

Sure! The problem is I still can't see how we can do this without
complicating the current code "too much". I spent several hours doing
nothing except thinking about these changes, but all I can say now is:
I need to think more ;) And of course I appreciate your ideas.

Just for example, it would be really nice to avoid taking current->ptrace_mutex
in exit_ptrace() when there are no tracees, but I don't see how we can do this
without races with ptrace_traceme(). But this is a minor detail.

> The tasklist_lock for our purposes here is only taken to keep the
> next_thread() loop valid.  (For non-ptrace do_wait_thread() it serves other
> purposes too, but leave that aside for now.)  In ptrace_do_wait(), we just
> need to make sure that loop (in its caller) does not go awry.  What I think
> we can do is some version of:
>
> 	get_task_struct(tsk);
> 	read_unlock(&tasklist_lock);
> 	mutex_lock(&tsk->ptrace_mutex);
> 	...
> 	mutex_unlock(&tsk->ptrace_mutex);
> 	read_lock(&tasklist_lock);
> 	... check for tsk->thread_group having been invalidated ...
> 	put_task_struct(tsk);

Yes we need ptrace_mutex in ptrace_do_wait(). And I also thought about
dropping tasklist_lock in ptrace_do_wait(). But this doesn't work.

Consider 2 threads, T1 and T2. T2 forks the child C.

T1 calls do_wait(). It scans T->children and finds nothing.
Then it calls ptrace_do_wait() which temporarily drops tasklist.

T2 exits, reparents C to T1.

T1 finds nothing interesting in ->ptraced (not that it matters, but
it is possible the list was not empty when we enter ptrace_do_wait,
but it is empty when we take ->ptrace_mutex).

T1 takes tasklist again. We check tsk == T1 has not exited (or we
do some other check) and continue.

do_wait() returns -ECHILD.

We need something more clever here.

> 1.
> We really want to short-circuit and not do any lock fiddling in the
>    common case of list_empty(&tsk->ptraced).

Completely agreed.

>    (Or that it just doesn't
>    matter, because the attach/traceme can be said to be "after" the wait
>    call for this race.)

Yes... but we can have some nasty corner cases which are not bugs, but
still not good.

Two threads T1 and T2. T1 has a TASK_STOPPED child C. T2 is in the middle
of sys_ptrace(PTRACE_ATTACH, C).

T1 does do_wait(WSTOPPED). It is possible that we already see C->ptrace
(so do_wait_thread(T1->children) just clears *notask_error), but we
don't see C in T2->ptraced list.

Oh. Will try to think more ;)

Oleg.

From roland at redhat.com Thu Apr 16 08:38:16 2009
From: roland at redhat.com (Roland McGrath)
Date: Thu, 16 Apr 2009 01:38:16 -0700 (PDT)
Subject: ptrace cleanup tasks
In-Reply-To: Oleg Nesterov's message of Wednesday, 15 April 2009 19:36:23 +0200 <20090415173623.GA22108@redhat.com>
References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com>
Message-ID: <20090416083816.7AA49FC3C6@magilla.sf.frob.com>

Sorry for the delay in replying today.  I fell into a deep pit of DWARF
arcana all day and let your message sit (and not even the regularly
scheduled DWARF arcana I had intended to work on this week!).

> Yes sure. But I think this part is the most important and the hardest, and
> other changes may depend on locking very much.

True enough.  Off hand I don't think most of the individual data structure
cleanup items I suggested differ much from one kind of locking to another.
But if you are actively busy cogitating on the locking, no reason not to
keep at that first.  (I just didn't want anything easier to get blocked
while waiting for feedback from me.)

> Sure! The problem is I still can't see how we can do this without
> complicating the current code "too much".
> I spent several hours doing
> nothing except thinking about these changes, but all I can say now is:
> I need to think more ;) And of course I appreciate your ideas.

I figured it might be tough, but more or less all the big hopes depend on
it being possible to make it cleaner by going this way.

> Just for example, it would be really nice to avoid taking current->ptrace_mutex
> in exit_ptrace() when there are no tracees, but I don't see how we can do this
> without races with ptrace_traceme(). But this is a minor detail.

Right.

> Consider 2 threads, T1 and T2. T2 forks the child C.
>
> T1 calls do_wait(). It scans T->children and finds nothing.

Scans which?  T1->children scan will see nothing.
So say it has __WNOTHREAD, so then T1 does not also scan T2->children later.

> Then it calls ptrace_do_wait() which temporarily drops tasklist.
>
> T2 exits, reparents C to T1.

If C is zombie that does do_notify_parent->wake_up_parent.
If C is stopped or running, no wakeup.

> T1 finds nothing interesting in ->ptraced (not that it matters, but
> it is possible the list was not empty when we enter ptrace_do_wait,
> but it is empty when we take ->ptrace_mutex).

Right, fine.  BTW, we need to remember to __set_current_state back to
TASK_INTERRUPTIBLE before taking the tasklist_lock again (after the mutex
et al presumably changed ->state).

> T1 takes tasklist again. We check tsk == T1 has not exited (or we
> do some other check) and continue.
>
> do_wait() returns -ECHILD.

With __WNOTHREAD, yes.  And that's fine.  It was true when the wait call
was made, and it didn't block.  From the user's perspective, the whole wait
call happened "before" T2 finished exiting.

Without __WNOTHREAD, then T1 scanned T2->children (and T2->ptraced) after
the "T1 takes tasklist again" step in your scenario, but C was already gone.
If C was already zombie when reparented, T2 did wake_up_parent while T1 had
released tasklist_lock.
But if there were no other eligible children, then it returns -ECHILD
before even trying to block.

> We need something more clever here.

Yes.  I think we can do it with two hacks, which means no more than four or
five actual hacks.

First, make reparenting always do a wake_up_parent, not just for zombies.

Second, upon retaking tasklist_lock, ensure that if we've been woken, we
reset *notask_error = 0 so we'll schedule() and not really block, and then
restart; or if WNOHANG, just return -ERESTARTNOINTR (or a chosen retval to
goto repeat without actual syscall return/restart, e.g. -EAGAIN and do_wait
checks for that so we don't set TIF_SIGPENDING).

@@ -1611,6 +1611,13 @@ repeat:
 	} while (tsk != current);
 	read_unlock(&tasklist_lock);
 
+	if (retval == -EAGAIN) {
+		if (signal_pending(current))
+			retval = -ERESTARTSYS;
+		else
+			goto repeat;
+	}
+
 	if (!retval && !(options & WNOHANG)) {
 		retval = -ERESTARTSYS;
 		if (!signal_pending(current)) {

For "if we've been woken", it might make sense to roll this in with dusting
off and finishing up the patch that optimizes do_wait() with __WNOTHREAD or
pid constraints.  It uses a custom wake_queue.func function so that a
blocked do_wait() uninterested in the dying/stopping waker does not get
woken up just to spin around the list and block.

The custom wake function could also do something like use a
container_of(wait_queue,,) to find the notask_error ("retval" in do_wait)
inside a larger struct on the do_wait stack, and poke it to -EAGAIN.
That happens under the wait_queue_head lock, so it's synchronized finished
after remove_wait_queue().
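
The container_of() idea above can be modelled in plain userspace C. This is only a rough sketch under stated assumptions, not kernel code: struct wait_entry, struct wait_opts, child_wait_callback(), scan_children() and do_wait_model() are all hypothetical names, there is no real locking or sleeping, and the "reparenting" is simulated by calling the wake function directly from inside the scan.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature of a wait-queue entry with a custom wake function. */
struct wait_entry {
	int (*func)(struct wait_entry *entry);
};

/* Hypothetical per-call state that would live on the do_wait() stack. */
struct wait_opts {
	int notask_error;
	struct wait_entry wait;
};

/*
 * Custom wake function: find the enclosing wait_opts and poke
 * notask_error to -EAGAIN so that a dry scan gets repeated.
 */
static int child_wait_callback(struct wait_entry *entry)
{
	struct wait_opts *wo;

	assert(entry != NULL);
	wo = container_of(entry, struct wait_opts, wait);
	wo->notask_error = -EAGAIN;
	return 1;
}

/* Set once the simulated reparenting has made a child visible. */
static int child_visible;

/* One pass over the (simulated) children lists; returns a pid or 0. */
static int scan_children(struct wait_opts *wo)
{
	if (child_visible)
		return 42;		/* found an eligible child */

	/* Simulate a reparenting that races with this dry scan. */
	child_visible = 1;
	wo->wait.func(&wo->wait);	/* the wake_up_parent() analogue */
	return 0;
}

/* Repeat the scan whenever a wakeup arrived during a dry pass. */
static int do_wait_model(struct wait_opts *wo)
{
	int retval;

	for (;;) {
		wo->notask_error = -ECHILD;
		retval = scan_children(wo);
		if (retval)
			return retval;
		if (wo->notask_error != -EAGAIN)
			return wo->notask_error;
		/* Woken during the scan: go around again, no -ECHILD. */
	}
}
```

In this model the first pass finds nothing but is marked -EAGAIN by the wake function, so the second pass runs and reports the child instead of returning -ECHILD. A real version would also have to check signal_pending() before repeating, as in the hunk above.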
Maybe also needs:

@@ -1622,6 +1629,11 @@ repeat:
 end:
 	current->state = TASK_RUNNING;
 	remove_wait_queue(&current->signal->wait_chldexit,&wait);
+	if ((!retval || retval == -ECHILD) && wait.notask_error == -EAGAIN) {
+		if (signal_pending(current))
+			return -ERESTARTSYS;
+		goto start_wait_queue;
+	}
 	if (infop) {
 		if (retval > 0)
 			retval = 0;

Then every wake_up_parent, including the new one for all reparentings,
either makes schedule() return, or makes a dry scan after retaking
tasklist_lock do a repeat from the top.

> Yes... but we can have some nasty corner cases which are not bugs, but
> still not good.

There's probably only "acceptable" and "wrong" for these corners.
(If it's really "not good" so as to care about, it's close enough
to call it a bug.)

> Two threads T1 and T2. T1 has a TASK_STOPPED child C. T2 is in the middle
> of sys_ptrace(PTRACE_ATTACH, C).

You are a bad, bad man.

> T1 does do_wait(WSTOPPED). It is possible that we already see C->ptrace
> (so do_wait_thread(T1->children) just clears *notask_error), but we
> don't see C in T2->ptraced list.

"In the middle" of PTRACE_ATTACH means T2 holds T2->ptrace_mutex.
Before it takes the mutex, C->ptrace is clear (unless racing just after
D does PTRACE_DETACH to make T2's PTRACE_ATTACH possible).

T1 takes T2->ptrace_mutex to examine T2->ptraced.  If T2 has just set
C->ptrace, it both did so and put C on T2->ptraced while holding the mutex.
T1 will see C there.

Before that, say C->ptrace was set from prior attach by D.  T1 saw C->ptrace
in do_wait_thread and did not report C.  Now, D does
sys_ptrace(PTRACE_DETACH, C), so clears C->ptrace while holding
D->ptrace_mutex.  Next T1 gets T2->ptrace_mutex before T2's racing
PTRACE_ATTACH to C.  It does not see C there.  Now T1 blocks.

This feels similar to the first scenario above.  wake_up_parent in
PTRACE_DETACH might do it similar to the scheme above.  Then T1 restarts
(even for WNOHANG).
On the second pass, either it sees !C->ptrace in do_wait_thread and reports it stopped as natural parent (correct for it winning the race with PTRACE_ATTACH), or it sees C->ptrace already set again by T2 and then finds C on T2->ptraced. > Oh. Will try to think more ;) I keep tryin' to think but nothin' happens! The original reorganization of do_wait into do_wait_thread/ptrace_do_wait was motivated by eventually doing this conversion. If there is a different way to wholly reorganize it again that makes this cleaner, we can consider that too. Thanks, Roland From rostedt at goodmis.org Thu Apr 16 14:50:02 2009 From: rostedt at goodmis.org (Steven Rostedt) Date: Thu, 16 Apr 2009 10:50:02 -0400 (EDT) Subject: [PATCH 2/2] utrace-based ftrace "process" engine, v3 In-Reply-To: <1238941074-27424-3-git-send-email-fche@elastic.org> References: <1238941074-27424-1-git-send-email-fche@elastic.org> <1238941074-27424-2-git-send-email-fche@elastic.org> <1238941074-27424-3-git-send-email-fche@elastic.org> Message-ID: Hi Frank, I've finally got some time to look into this patch. On Sun, 5 Apr 2009, Frank Ch. Eigler wrote: > This is the v3 utrace-ftrace interface. based on Roland McGrath's > utrace API, which provides programmatic hooks to the in-tree tracehook > layer. This patch interfaces those events to ftrace, as configured by > a small number of debugfs controls, and includes system-call > pretty-printing using code from the ftrace syscall prototype by > Frederic Weisbecker. Here's the > /debugfs/tracing/process_trace_README: > > process event tracer mini-HOWTO > > 1. Select process hierarchy to monitor. Other processes will be > completely unaffected. Leave at 0 for system-wide tracing. > % echo NNN > process_follow_pid > > 2. Determine which process event traces are potentially desired. > syscall and signal tracing slow down monitored processes. > % echo 0 > process_trace_{syscalls,signals,lifecycle} > > 3. Add any final uid- or taskcomm-based filtering. 
Non-matching > processes will skip trace messages, but will still be slowed. > % echo NNN > process_trace_uid_filter # -1: unrestricted > % echo ls > process_trace_taskcomm_filter # empty: unrestricted > > 4. Start tracing. > % echo process > current_tracer Note, it would be better to make a process directory, instead of cluttering the debug/tracing one. ie. debug/tracing/process/follow_pid debug/tracing/process/trace/syscalls debug/tracing/process/trace/signals debug/tracing/process/trace/lifecycle debug/tracing/process/trace/tracecomm_filter etc. > > 5. Examine trace. > % cat trace > > 6. Stop tracing. > % echo nop > current_tracer > > Signed-off-by: Frank Ch. Eigler > --- > include/linux/processtrace.h | 51 ++++ > kernel/trace/Kconfig | 8 + > kernel/trace/Makefile | 1 + > kernel/trace/trace.h | 9 + > kernel/trace/trace_process.c | 642 ++++++++++++++++++++++++++++++++++++++++++ > 5 files changed, 711 insertions(+), 0 deletions(-) > create mode 100644 include/linux/processtrace.h > create mode 100644 kernel/trace/trace_process.c > > diff --git a/include/linux/processtrace.h b/include/linux/processtrace.h > new file mode 100644 > index 0000000..74d031e > --- /dev/null > +++ b/include/linux/processtrace.h > @@ -0,0 +1,51 @@ > +#ifndef PROCESSTRACE_H > +#define PROCESSTRACE_H > + > +#include > +#include > + > +struct process_trace_entry { > + unsigned char opcode; /* one of _UTRACE_EVENT_* */ > + union { > + struct { > + pid_t child; > + unsigned long flags; > + } trace_clone; > + struct { > + int type; > + int notify; > + } trace_jctl; > + struct { > + long code; > + } trace_exit; > + struct { > + /* Selected fields from linux_binprm */ > + int argc; > + /* We need to copy the file name, because by > + the time we format the trace record for > + display, the task may be gone. 
*/ > +#define PROCESS_TRACE_FILENAME_LENGTH 64 > + char filename[PROCESS_TRACE_FILENAME_LENGTH]; > + } trace_exec; > + struct { > + int si_signo; > + int si_errno; > + int si_code; > + } trace_signal; > + struct { > + long callno; > + unsigned long args[6]; > + } trace_syscall_entry; > + struct { > + long rc; > + long error; > + } trace_syscall_exit; > + }; > +}; > + > +/* in kernel/trace/trace_process.c */ > + > +extern void enable_process_trace(void); > +extern void disable_process_trace(void); > + > +#endif /* PROCESSTRACE_H */ > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig > index b0a46f8..226cb60 100644 > --- a/kernel/trace/Kconfig > +++ b/kernel/trace/Kconfig > @@ -186,6 +186,14 @@ config FTRACE_SYSCALLS > help > Basic tracer to catch the syscall entry and exit events. > > +config PROCESS_TRACER > + bool "Trace process events via utrace" > + select TRACING > + select UTRACE > + help > + This tracer records process events that may be hooked by utrace: > + thread lifecycle, system calls, signals, and job control. 
> + > config BOOT_TRACER > bool "Trace boot initcalls" > select TRACING > diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile > index c3feea0..880080a 100644 > --- a/kernel/trace/Makefile > +++ b/kernel/trace/Makefile > @@ -44,5 +44,6 @@ obj-$(CONFIG_EVENT_TRACER) += trace_events.o > obj-$(CONFIG_EVENT_TRACER) += events.o > obj-$(CONFIG_EVENT_TRACER) += trace_export.o > obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o > +obj-$(CONFIG_PROCESS_TRACER) += trace_process.o > > libftrace-y := ftrace.o > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h > index f561628..c27d2ba 100644 > --- a/kernel/trace/trace.h > +++ b/kernel/trace/trace.h > @@ -8,6 +8,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -37,6 +38,7 @@ enum trace_type { > TRACE_KMEM_FREE, > TRACE_POWER, > TRACE_BLK, > + TRACE_PROCESS, > > __TRACE_LAST_TYPE, > }; > @@ -214,6 +216,12 @@ struct syscall_trace_exit { > unsigned long ret; > }; > > +struct trace_process { > + struct trace_entry ent; > + struct process_trace_entry event; > +}; > + > + > > /* > * trace_flag_type is an enumeration that holds different > @@ -332,6 +340,7 @@ extern void __ftrace_bad_type(void); > TRACE_SYSCALL_ENTER); \ > IF_ASSIGN(var, ent, struct syscall_trace_exit, \ > TRACE_SYSCALL_EXIT); \ > + IF_ASSIGN(var, ent, struct trace_process, TRACE_PROCESS); \ > __ftrace_bad_type(); \ > } while (0) > > diff --git a/kernel/trace/trace_process.c b/kernel/trace/trace_process.c > new file mode 100644 > index 0000000..b98ab28 > --- /dev/null > +++ b/kernel/trace/trace_process.c > @@ -0,0 +1,642 @@ > +/* > + * utrace-based process event tracing > + * Copyright (C) 2009 Red Hat Inc. > + * By Frank Ch. 
Eigler > + * > + * Based on mmio ftrace engine by Pekka Paalanen > + * and utrace-syscall-tracing prototype by Ananth Mavinakayanahalli > + * and ftrace-syscall prototype by Frederic Weisbecker > + */ > + > +/* #define DEBUG 1 */ > + > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "trace.h" > +#include "trace_output.h" > + > +/* A process must match these filters in order to be traced. */ > +static char trace_taskcomm_filter[TASK_COMM_LEN]; /* \0: unrestricted */ > +static u32 trace_taskuid_filter = -1; /* -1: unrestricted */ > +static u32 trace_lifecycle_p = 1; > +static u32 trace_syscalls_p = 1; > +static u32 trace_signals_p = 1; > + > +/* A process must be a direct child of given pid in order to be > + followed. */ > +static u32 process_follow_pid; /* 0: unrestricted/systemwide */ > + > +/* XXX: lock the above? */ > + > + > +/* trace data collection */ > + > +static struct trace_array *process_trace_array; > + > +static void process_reset_data(struct trace_array *tr) > +{ > + pr_debug("in %s\n", __func__); > + tracing_reset_online_cpus(tr); > +} > + > +static int process_trace_init(struct trace_array *tr) > +{ > + pr_debug("in %s\n", __func__); > + process_trace_array = tr; > + process_reset_data(tr); > + enable_process_trace(); > + return 0; > +} > + > +static void process_trace_reset(struct trace_array *tr) > +{ > + pr_debug("in %s\n", __func__); > + disable_process_trace(); > + process_reset_data(tr); > + process_trace_array = NULL; > +} > + > +static void process_trace_start(struct trace_array *tr) > +{ > + pr_debug("in %s\n", __func__); > + process_reset_data(tr); > +} > + > +static void __trace_processtrace(struct trace_array *tr, > + struct trace_array_cpu *data, > + struct process_trace_entry *ent) > +{ > + struct ring_buffer_event *event; > + struct trace_process *entry; > + > + event = ring_buffer_lock_reserve(tr->buffer, sizeof(*entry)); It is better to use the new trace_buffer_lock_reserve, because it does the 
generic updates for you.

> +	if (!event)
> +		return;
> +	entry = ring_buffer_event_data(event);
> +	tracing_generic_entry_update(&entry->ent, 0, preempt_count());
> +	entry->ent.type = TRACE_PROCESS;
> +	entry->event = *ent;
> +	ring_buffer_unlock_commit(tr->buffer, event);
> +
> +	trace_wake_up();

And the trace_buffer_unlock_commit does the wakeup too.

> +}
> +
> +void process_trace(struct process_trace_entry *ent)
> +{
> +	struct trace_array *tr = process_trace_array;
> +	struct trace_array_cpu *data;
> +
> +	preempt_disable();
> +	data = tr->data[smp_processor_id()];
> +	__trace_processtrace(tr, data, ent);
> +	preempt_enable();

Unless you want to trace the preempt disabled here too, I'd suggest using:

	preempt_disable_notrace()
	preempt_enable_notrace()

And if this can ever be called within the scheduler, then you need to do:

	int resched;

	resched = ftrace_preempt_disable();
	[...]
	ftrace_preempt_enable(resched);

> +}
> +
> +
> +/* trace data rendering */
> +
> +static void process_pipe_open(struct trace_iterator *iter)
> +{
> +	pr_debug("in %s\n", __func__);
> +}
> +
> +static void process_close(struct trace_iterator *iter)
> +{
> +	iter->private = NULL;
> +}
> +
> +static ssize_t process_read(struct trace_iterator *iter, struct file *filp,
> +				char __user *ubuf, size_t cnt, loff_t *ppos)
> +{
> +	ssize_t ret;
> +	struct trace_seq *s = &iter->seq;

Space needed here.

Also please do an upside down christmas tree type of declarations:

	struct trace_seq *s = &iter->seq;
	ssize_t ret;

Thus, the first declarations are longer than the following ones. This is
easier on the eyes and makes reviewing code nicer.

> +	ret = trace_seq_to_user(s, ubuf, cnt);
> +	return (ret == -EBUSY) ? 0 : ret;
> +}
> +
> +static enum print_line_t process_print(struct trace_iterator *iter)
> +{
> +	struct trace_entry *entry = iter->ent;
> +	struct trace_process *field;
> +	struct process_trace_entry *pte;

Whitespace issue here.
> + struct trace_seq *s = &iter->seq; > + int ret = 1; > + struct syscall_metadata *syscall; > + int i; Again, sort the declarations by size of string. > + > + trace_assign_type(field, entry); > + pte = &field->event; More whitespace issues. You might want to run scripts/checkpatch.pl on patches before submitting. > + > + if (!trace_print_context(iter)) > + return TRACE_TYPE_PARTIAL_LINE; > + > + switch (pte->opcode) { > + case _UTRACE_EVENT_CLONE: > + ret = trace_seq_printf(s, "fork %d flags 0x%lx\n", > + pte->trace_clone.child, > + pte->trace_clone.flags); > + break; > + case _UTRACE_EVENT_EXEC: > + ret = trace_seq_printf(s, "exec '%s' (args %d)\n", > + pte->trace_exec.filename, > + pte->trace_exec.argc); > + break; > + case _UTRACE_EVENT_EXIT: > + ret = trace_seq_printf(s, "exit %ld\n", > + pte->trace_exit.code); > + break; > + case _UTRACE_EVENT_JCTL: > + ret = trace_seq_printf(s, "jctl %d %d\n", > + pte->trace_jctl.type, > + pte->trace_jctl.notify); > + break; > + case _UTRACE_EVENT_SIGNAL: > + ret = trace_seq_printf(s, "signal %d errno %d code 0x%x\n", > + pte->trace_signal.si_signo, > + pte->trace_signal.si_errno, > + pte->trace_signal.si_code); > + break; > + case _UTRACE_EVENT_SYSCALL_ENTRY: > + syscall = syscall_nr_to_meta (pte->trace_syscall_entry.callno); > + if (!syscall) { > + /* Metadata is incomplete. Simply hex dump. */ > + ret = trace_seq_printf(s, "syscall %ld [0x%lx 0x%lx" > + " 0x%lx 0x%lx 0x%lx 0x%lx]\n", > + pte->trace_syscall_entry.callno, > + pte->trace_syscall_entry.args[0], > + pte->trace_syscall_entry.args[1], > + pte->trace_syscall_entry.args[2], > + pte->trace_syscall_entry.args[3], > + pte->trace_syscall_entry.args[4], > + pte->trace_syscall_entry.args[5]); > + break; > + } > + ret = trace_seq_printf(s, "%s(", syscall->name); > + if (!ret) > + break; > + for (i = 0; i < syscall->nb_args; i++) { > + ret = trace_seq_printf(s, "%s: 0x%lx%s", syscall->args[i], > + pte->trace_syscall_entry.args[i], > + i == syscall->nb_args - 1 ? 
")\n" : ", "); > + if (!ret) > + break; > + } > + break; > + case _UTRACE_EVENT_SYSCALL_EXIT: > + /* utrace doesn't preserve the syscall number. */ > + ret = trace_seq_printf(s, "syscall rc %ld error %ld\n", > + pte->trace_syscall_exit.rc, > + pte->trace_syscall_exit.error); > + break; > + default: > + ret = trace_seq_printf(s, "process event code %d?\n", > + pte->opcode); > + break; > + } > + if (!ret) > + return TRACE_TYPE_PARTIAL_LINE; > + return TRACE_TYPE_HANDLED; > +} > + > + > +static enum print_line_t process_print_line(struct trace_iterator *iter) > +{ > + switch (iter->ent->type) { > + case TRACE_PROCESS: > + return process_print(iter); > + default: > + return TRACE_TYPE_HANDLED; /* ignore unknown entries */ > + } > +} > + > +static struct tracer process_tracer = { > + .name = "process", > + .init = process_trace_init, > + .reset = process_trace_reset, > + .start = process_trace_start, > + .pipe_open = process_pipe_open, > + .close = process_close, > + .read = process_read, > + .print_line = process_print_line, > +}; > + > + > + > +/* utrace backend */ > + > +/* Should tracing apply to given task? Compare against filter > + values. */ Also, comments style is: /* * Make multi line comments like this. * Next line here. 
*/ > +static int trace_test(struct task_struct *tsk) > +{ > + if (trace_taskcomm_filter[0] > + && strncmp(trace_taskcomm_filter, tsk->comm, TASK_COMM_LEN)) > + return 0; > + > + if (trace_taskuid_filter != (u32)-1 > + && trace_taskuid_filter != task_uid(tsk)) > + return 0; > + > + return 1; > +} > + > + > +static const struct utrace_engine_ops process_trace_ops; > + > +static void process_trace_tryattach(struct task_struct *tsk) > +{ > + struct utrace_engine *engine; > + > + pr_debug("in %s\n", __func__); > + tracing_record_cmdline (tsk); > + engine = utrace_attach_task(tsk, > + UTRACE_ATTACH_CREATE | > + UTRACE_ATTACH_EXCLUSIVE, > + &process_trace_ops, NULL); > + if (IS_ERR(engine) || (engine == NULL)) { > + pr_warning("utrace_attach_task %d (rc %p)\n", > + tsk->pid, engine); > + } else { > + int rc; > + > + /* We always hook cost-free events. */ > + unsigned long events = > + UTRACE_EVENT(CLONE) | > + UTRACE_EVENT(EXEC) | > + UTRACE_EVENT(JCTL) | > + UTRACE_EVENT(EXIT); > + > + /* Penalizing events are individually controlled, so that > + utrace doesn't even take the monitored threads off their > + fast paths, nor bother call our callbacks. 
*/ > + if (trace_syscalls_p) > + events |= UTRACE_EVENT_SYSCALL; > + if (trace_signals_p) > + events |= UTRACE_EVENT_SIGNAL_ALL; > + > + rc = utrace_set_events(tsk, engine, events); > + if (rc == -EINPROGRESS) > + rc = utrace_barrier(tsk, engine); > + if (rc) > + pr_warning("utrace_set_events/barrier rc %d\n", rc); > + > + utrace_engine_put(engine); > + pr_debug("attached in %s to %s(%d)\n", __func__, > + tsk->comm, tsk->pid); > + } > +} > + > + > +u32 process_trace_report_clone(enum utrace_resume_action action, > + struct utrace_engine *engine, > + struct task_struct *parent, > + unsigned long clone_flags, > + struct task_struct *child) > +{ > + if (trace_lifecycle_p && trace_test(parent)) { > + struct process_trace_entry ent; > + ent.opcode = _UTRACE_EVENT_CLONE; > + ent.trace_clone.child = child->pid; > + ent.trace_clone.flags = clone_flags; > + process_trace(&ent); > + } > + > + process_trace_tryattach(child); > + > + return UTRACE_RESUME; > +} > + > + > +u32 process_trace_report_jctl(enum utrace_resume_action action, > + struct utrace_engine *engine, > + struct task_struct *task, > + int type, int notify) > +{ > + struct process_trace_entry ent; > + ent.opcode = _UTRACE_EVENT_JCTL; > + ent.trace_jctl.type = type; > + ent.trace_jctl.notify = notify; > + process_trace(&ent); > + > + return UTRACE_RESUME; > +} > + > + > +u32 process_trace_report_syscall_entry(u32 action, > + struct utrace_engine *engine, > + struct task_struct *task, > + struct pt_regs *regs) > +{ > + if (trace_syscalls_p && trace_test(task)) { > + struct process_trace_entry ent; > + ent.opcode = _UTRACE_EVENT_SYSCALL_ENTRY; > + ent.trace_syscall_entry.callno = syscall_get_nr(task, regs); > + syscall_get_arguments(task, regs, 0, 6, > + ent.trace_syscall_entry.args); > + process_trace(&ent); > + } > + > + return UTRACE_RESUME; > +} > + > + > +u32 process_trace_report_syscall_exit(enum utrace_resume_action action, > + struct utrace_engine *engine, > + struct task_struct *task, > + struct pt_regs 
*regs) > +{ > + if (trace_syscalls_p && trace_test(task)) { > + struct process_trace_entry ent; > + ent.opcode = _UTRACE_EVENT_SYSCALL_EXIT; > + ent.trace_syscall_exit.rc = > + syscall_get_return_value(task, regs); > + ent.trace_syscall_exit.error = syscall_get_error(task, regs); > + process_trace(&ent); > + } > + > + return UTRACE_RESUME; > +} > + > + > +u32 process_trace_report_exec(enum utrace_resume_action action, > + struct utrace_engine *engine, > + struct task_struct *task, > + const struct linux_binfmt *fmt, > + const struct linux_binprm *bprm, > + struct pt_regs *regs) > +{ > + if (trace_lifecycle_p && trace_test(task)) { > + struct process_trace_entry ent; > + ent.opcode = _UTRACE_EVENT_EXEC; > + ent.trace_exec.argc = bprm->argc; > + strlcpy (ent.trace_exec.filename, bprm->filename, > + sizeof(ent.trace_exec.filename)); > + process_trace(&ent); > + } > + > + tracing_record_cmdline (task); > + > + /* We're already attached; no need for a new tryattach. */ > + > + return UTRACE_RESUME; > +} > + > + > +u32 process_trace_report_signal(u32 action, > + struct utrace_engine *engine, > + struct task_struct *task, > + struct pt_regs *regs, > + siginfo_t *info, > + const struct k_sigaction *orig_ka, > + struct k_sigaction *return_ka) > +{ > + if (trace_signals_p && trace_test(task)) { > + struct process_trace_entry ent; > + ent.opcode = _UTRACE_EVENT_SIGNAL; > + ent.trace_signal.si_signo = info->si_signo; > + ent.trace_signal.si_errno = info->si_errno; > + ent.trace_signal.si_code = info->si_code; > + process_trace(&ent); > + } > + > + /* We're already attached, so no need for a new tryattach. 
*/ > + > + return UTRACE_RESUME | utrace_signal_action(action); > +} > + > + > +u32 process_trace_report_exit(enum utrace_resume_action action, > + struct utrace_engine *engine, > + struct task_struct *task, > + long orig_code, long *code) > +{ > + if (trace_lifecycle_p && trace_test(task)) { > + struct process_trace_entry ent; > + ent.opcode = _UTRACE_EVENT_EXIT; > + ent.trace_exit.code = orig_code; > + process_trace(&ent); > + } > + > + /* There is no need to explicitly attach or detach here. */ > + > + return UTRACE_RESUME; > +} > + > + > +void enable_process_trace() > +{ > + struct task_struct *grp, *tsk; > + > + pr_debug("in %s\n", __func__); > + rcu_read_lock(); > + do_each_thread(grp, tsk) { > + /* Skip over kernel threads. */ > + if (tsk->flags & PF_KTHREAD) > + continue; > + > + if (process_follow_pid) { > + if (tsk->tgid == process_follow_pid || > + tsk->parent->tgid == process_follow_pid) > + process_trace_tryattach(tsk); > + } else { > + process_trace_tryattach(tsk); > + } > + } while_each_thread(grp, tsk); > + rcu_read_unlock(); > +} > + > +void disable_process_trace() > +{ > + struct utrace_engine *engine; > + struct task_struct *grp, *tsk; > + int rc; > + > + pr_debug("in %s\n", __func__); > + rcu_read_lock(); > + do_each_thread(grp, tsk) { > + /* Find matching engine, if any. Returns -ENOENT for > + unattached threads. */ > + engine = utrace_attach_task(tsk, UTRACE_ATTACH_MATCH_OPS, > + &process_trace_ops, 0); > + if (IS_ERR(engine)) { > + if (PTR_ERR(engine) != -ENOENT) > + pr_warning("utrace_attach_task %d (rc %ld)\n", > + tsk->pid, -PTR_ERR(engine)); > + } else if (engine == NULL) { > + pr_warning("utrace_attach_task %d (null engine)\n", > + tsk->pid); > + } else { > + /* Found one of our own engines. Detach. 
*/ > + rc = utrace_control(tsk, engine, UTRACE_DETACH); > + switch (rc) { > + case 0: /* success */ > + break; > + case -ESRCH: /* REAP callback already begun */ > + case -EALREADY: /* DEATH callback already begun */ > + break; > + default: > + rc = -rc; > + pr_warning("utrace_detach %d (rc %d)\n", > + tsk->pid, rc); > + break; > + } > + utrace_engine_put(engine); > + pr_debug("detached in %s from %s(%d)\n", __func__, > + tsk->comm, tsk->pid); > + } > + } while_each_thread(grp, tsk); > + rcu_read_unlock(); > +} > + > + > +static const struct utrace_engine_ops process_trace_ops = { > + .report_clone = process_trace_report_clone, > + .report_exec = process_trace_report_exec, > + .report_exit = process_trace_report_exit, > + .report_jctl = process_trace_report_jctl, > + .report_signal = process_trace_report_signal, > + .report_syscall_entry = process_trace_report_syscall_entry, > + .report_syscall_exit = process_trace_report_syscall_exit, > +}; > + > + > + > +/* control interfaces */ > + > + > +static ssize_t > +trace_taskcomm_filter_read(struct file *filp, char __user *ubuf, > + size_t cnt, loff_t *ppos) > +{ > + return simple_read_from_buffer(ubuf, cnt, ppos, > + trace_taskcomm_filter, TASK_COMM_LEN); > +} > + > + > +static ssize_t > +trace_taskcomm_filter_write(struct file *filp, const char __user *ubuf, > + size_t cnt, loff_t *fpos) > +{ > + char *end; > + > + if (cnt > TASK_COMM_LEN) > + cnt = TASK_COMM_LEN; > + > + if (copy_from_user(trace_taskcomm_filter, ubuf, cnt)) > + return -EFAULT; > + > + /* Cut from the first nil or newline. 
*/ > + trace_taskcomm_filter[cnt] = '\0'; > + end = strchr(trace_taskcomm_filter, '\n'); > + if (end) > + *end = '\0'; > + > + *fpos += cnt; > + return cnt; > +} > + > + > +static const struct file_operations trace_taskcomm_filter_fops = { > + .open = tracing_open_generic, > + .read = trace_taskcomm_filter_read, > + .write = trace_taskcomm_filter_write, > +}; > + > + > + > +static char README_text[] = > + "process event tracer mini-HOWTO\n" > + "\n" > + "1. Select process hierarchy to monitor. Other processes will be\n" > + " completely unaffected. Leave at 0 for system-wide tracing.\n" > + "# echo NNN > process_follow_pid\n" > + "\n" > + "2. Determine which process event traces are potentially desired.\n" > + " syscall and signal tracing slow down monitored processes.\n" > + "# echo 0 > process_trace_{syscalls,signals,lifecycle}\n" > + "\n" > + "3. Add any final uid- or taskcomm-based filtering. Non-matching\n" > + " processes will skip trace messages, but will still be slowed.\n" > + "# echo NNN > process_trace_uid_filter # -1: unrestricted \n" > + "# echo ls > process_trace_taskcomm_filter # empty: unrestricted\n" > + "\n" > + "4. Start tracing.\n" > + "# echo process > current_tracer\n" > + "\n" > + "5. Examine trace.\n" > + "# cat trace\n" > + "\n" > + "6. Stop tracing.\n" > + "# echo nop > current_tracer\n" > + ; > + > +static struct debugfs_blob_wrapper README_blob = { > + .data = README_text, > + .size = sizeof(README_text), > +}; > + > + > +static __init int init_process_trace(void) > +{ > + struct dentry *d_tracer; > + struct dentry *entry; > + > + d_tracer = tracing_init_dentry(); > + > + arch_init_ftrace_syscalls (); > + > + entry = debugfs_create_blob("process_trace_README", 0444, d_tracer, > + &README_blob); > + if (!entry) > + pr_warning("Could not create debugfs " > + "'process_trace_README' entry\n"); We also now have a trace_create_file that does the warning for you. > + > + /* Control for scoping process following. 
*/ > + entry = debugfs_create_u32("process_follow_pid", 0644, d_tracer, > + &process_follow_pid); > + if (!entry) > + pr_warning("Could not create debugfs " > + "'process_follow_pid' entry\n"); > + > + /* Process-level filters */ > + entry = debugfs_create_file("process_trace_taskcomm_filter", 0644, > + d_tracer, NULL, > + &trace_taskcomm_filter_fops); > + /* XXX: it'd be nice to have a read/write debugfs_create_blob. */ > + if (!entry) > + pr_warning("Could not create debugfs " > + "'process_trace_taskcomm_filter' entry\n"); > + > + entry = debugfs_create_u32("process_trace_uid_filter", 0644, d_tracer, > + &trace_taskuid_filter); > + if (!entry) > + pr_warning("Could not create debugfs " > + "'process_trace_uid_filter' entry\n"); > + > + /* Event-level filters. */ > + entry = debugfs_create_u32("process_trace_lifecycle", 0644, d_tracer, > + &trace_lifecycle_p); > + if (!entry) > + pr_warning("Could not create debugfs " > + "'process_trace_lifecycle' entry\n"); > + > + entry = debugfs_create_u32("process_trace_syscalls", 0644, d_tracer, > + &trace_syscalls_p); > + if (!entry) > + pr_warning("Could not create debugfs " > + "'process_trace_syscalls' entry\n"); > + > + entry = debugfs_create_u32("process_trace_signals", 0644, d_tracer, > + &trace_signals_p); > + if (!entry) > + pr_warning("Could not create debugfs " > + "'process_trace_signals' entry\n"); > + > + return register_tracer(&process_tracer); > +} > + > +device_initcall(init_process_trace); > -- > 1.6.0.6 Other than my minor comments, I see nothing wrong with this patch. I'd like to try it out. 
I would just need to apply the utrace changes first ;-) Thanks, -- Steve From oleg at redhat.com Thu Apr 16 18:17:41 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Thu, 16 Apr 2009 20:17:41 +0200 Subject: ptrace cleanup tasks In-Reply-To: <20090416083816.7AA49FC3C6@magilla.sf.frob.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> Message-ID: <20090416181741.GA21907@redhat.com> On 04/16, Roland McGrath wrote: > > (I just didn't want anything easier to get blocked > while waiting for feedback from me.) I have a couple of questions about other cleanups... Will write another email. > Yes. I think we can do it with two hacks, which means no more than four or > five actual hacks. > > [... snip ...] I _seem_ to understand what you suggest, and I _think_ this can work. But I need to re-read this again (probably more than 10 times ;) to convince myself I really understand. But see below. > > Two threads T1 and T2. T1 has a TASK_STOPPED child C. T2 in the middle > > of sys_ptrace(PTRACE_ATTACH, C). > > You are a bad, bad man. > > > T1 does do_wait(WSTOPPED). It is possible that we already see C->ptrace > > (so do_wait_thread(T1->children) just clears *notask_error), but we > > don't see C in T2->ptraced list. > > "In the middle" of PTRACE_ATTACH means T2 holds T2->ptrace_mutex. Before > it takes the mutex, C->ptrace is clear (unless racing just after D does > PTRACE_DETACH to make T2's PTRACE_ATTACH possible). > > T1 takes T2->ptrace_mutex to examine T2->ptraced. In that case we are fine. But we want to not take the lock when list_empty(&tsk->ptraced), and we may see list_empty() == T. > Before that, say C->ptrace was set from prior attach by D. T1 saw > C->ptrace in do_wait_thread and did not report C. Now, D does > sys_ptrace(PTRACE_DETACH, C). Hmm, yes, another corner case... OK. 
For the moment, please forget about these changes. Let's recall do_wait() has the ancient bug. wait_task_zombie() sets EXIT_DEAD unconditionally, then drops tasklist. If we are not the real_parent and the child was traced, we may restore ->exit_state = EXIT_ZOMBIE later. But, if the ->real_parent calls do_wait() in between, it can see the child in EXIT_DEAD state and return -ECHILD. (and we have other minor problems with EXIT_DEAD tasks on ->children).

Perhaps we can start with this change,

	wait_task_zombie:

		get_task_struct(p);
		read_unlock(&tasklist_lock);

		write_lock_irq(&tasklist_lock);
		if (p->exit_state != EXIT_ZOMBIE) {
			write_unlock_irq(&tasklist_lock);
			// This is extremely unlikely case
			// return something which forces
			// "goto repeat" in do_wait().
			return -E_GOTO_REPEAT;
		}

		if (ptrace_reparented(p)) {
			ptrace_unlink(p);
			....
		}

		p->exit_state = EXIT_DEAD;
		list_del_init(&p->sibling); // not really needed but good
		...

Now return to the topic. Perhaps we can do something "hybrid" for the start? To avoid the discussed races with do_wait(), PTRACE_ATTACH/PTRACE_DETACH can temporarily take tasklist just for setting PT_PTRACED and adding to ->ptraced list. Or, _perhaps_ we can just add a barrier, so that if do_wait() sees PT_PTRACED it must see !list_empty(->ptraced). In that case, ptrace_do_wait() does not need ->ptrace_mutex, it can iterate over ->ptraced list lockless (but perhaps we should use list_for_each_rcu). We only need ->ptrace_mutex to untrace the task, so we just modify wait_task_zombie() further:

		get_task_struct(p);
		read_unlock(&tasklist_lock);

		// since we dropped tasklist, we should return either
		// success or -E_GOTO_REPEAT;

		if (p->ptrace) {
			lock_parents_ptrace_mutex(p);
			if (p is not traced) {
				mutex_unlock();
				return -E_GOTO_REPEAT;
			}
			untrace();
			mutex_unlock();
		}

		write_lock_irq(&tasklist_lock);
		... try to reap ...

Now. With these changes, nobody except de_thread() can call release_task(p) when p is ptraced.
So the "untrace" part above can go into the separate helper, de_thread() calls it before release_task(leader). tracehook_finish_release_task() can be killed. Of course, this all is very vague, just for discussion. Oleg. From oleg at redhat.com Thu Apr 16 20:40:04 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Thu, 16 Apr 2009 22:40:04 +0200 Subject: ptracee data structures cleanup In-Reply-To: <20090408203954.GA26816@redhat.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> Message-ID: <20090416204004.GA28013@redhat.com> (change subject) So. We are going to make a separate, dynamically allocated structure for tracees.
Say, we add "struct ptrace_child *ptrace_child" into task_struct. attach/attachme do kmalloc() and use task_lock() to avoid races. (with the current locking write_lock(tasklist) alone is enough). Eventually, we should move all ptrace-related fields from task_struct to ptrace_child (ptrace_message/last_siginfo/etc), but let's forget about them for now. So, for the start we have (of course, I use the random naming).

	struct ptrace_child {
		// was task_struct->ptrace
		unsigned int ptrace_flags;
		// moved from task_struct
		struct list_head ptrace_entry;

		// new, points back to the child.
		// at least, the tracer needs it when it does
		// list_for_each(current->ptraced).
		struct task_struct *self;
	};

The first question, should we free it after detach? I think yes. Otherwise, for example, instead of "if (p->ptrace)" we should do "if (p->ptrace_child && p->ptrace_child->ptrace_flags)", not good. So ->ptrace_child != NULL means it's traced (and we can even kill PT_PTRACED). Looks like we don't even need task_lock() to reset ->ptrace_child... But. Until we rework the locking (afaics, ->ptrace_mutex should be held throughout the whole sys_ptrace() call), even ptracer can't safely access child->ptrace_child->ptrace_flags. Once sys_ptrace() drops tasklist, child's sub-thread can do exec, and de_thread() can untrace the child and free ->ptrace_child. That is why I think ->ptrace_mutex should come first, but it is very possible I missed something... Of course, we can add a lot of task_lock's, but this a) complicates the code and b) ptrace_release_task() should take task_lock() too under write_lock(tasklist). And of course, we should also move task_struct->parent into ptrace_child,

		// was task_struct->parent
		struct task_struct *ptrace_parent;

And now we have the new minor problem. Say, tracehook_tracer_task(). How can it safely read ->ptrace_child->ptrace_parent ? Looks like we should make ->ptrace_child rcu-safe.
Actually, I'd prefer to rely on task_lock(), but again, afaics this means we should do the locking first. Anything else I missed? Oleg. From roland at redhat.com Thu Apr 16 21:50:07 2009 From: roland at redhat.com (Roland McGrath) Date: Thu, 16 Apr 2009 14:50:07 -0700 (PDT) Subject: ptrace cleanup tasks In-Reply-To: Oleg Nesterov's message of Thursday, 16 April 2009 20:17:41 +0200 <20090416181741.GA21907@redhat.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> Message-ID: <20090416215007.12DE5FC3C6@magilla.sf.frob.com> > > T1 takes T2->ptrace_mutex to examine T2->ptraced. > > In that case we are fine. But we want to not take the lock when > list_empty(&tsk->ptraced), and we may see list_empty() == T. Yes. I still haven't thought through how to make that fast-path check right. > OK. For the moment, please forget about these changes. Let's recall > do_wait() has the ancient bug. wait_task_zombie() sets EXIT_DEAD > unconditionally, then drops tasklist. If we are not the real_parent > and the child was traced, we may restore ->exit_state = EXIT_ZOMBIE > later. But, if the ->real_parent calls do_wait() in between, it can > see the child in EXIT_DEAD state and return -ECHILD. Really? Won't that do_wait hit (wait_consider_task):

	if (likely(!ptrace) && unlikely(p->ptrace)) {
		/*
		 * This child is hidden by ptrace.
		 * We aren't allowed to see it now, but eventually we will.
		 */
		*notask_error = 0;
		return 0;
	}

? p->ptrace is still set until the tracer gets write_lock_irq(&tasklist_lock). > (and we have other minor problems with EXIT_DEAD tasks on ->children). Which? (We can make this a separate thread if it's not really apropos just to ptrace.) > Perhaps we can start with this change, > > wait_task_zombie: I don't think I followed the intent of this change.
Using write_lock_irq at all in the non-ptrace reap case seems bad. > Now return to the topic. Perhaps we can do something "hybrid" for the start? > To avoid the discussed races with do_wait(), PTRACE_ATTACH/PTRACE_DETACH can > temporarily take tasklist just for setting PT_PTRACED and adding to ->ptraced > list. write_lock_irq inside ptrace_mutex should be safe. But does that leave us still with the current ugly tasklist_lock+task_lock dance? Moreover, we want to find the path that gets us (eventually) to not using tasklist_lock at all for ptrace. > Or, _perhaps_ we can just add a barrier, so that if do_wait() sees > PT_PTRACED it must see !list_empty(->ptraced). That's nice if it works. Another wrinkle to consider is that in the long run we probably want ->ptrace to go away entirely. (If ptrace becomes purely utrace-based, we don't need any tracee state details in task_struct directly except to make the do_wait exclusion work.) So eventually we should consider different ways to store the "stolen by ptrace" bit for real_parent's wait to track, since that bit will be the only state left for "core" code. > In that case, ptrace_do_wait() does not need ->ptrace_mutex, it can iterate > over ->ptraced list lockless (but perhaps we should use list_for_each_rcu). There is no inherent reason to avoid ptrace_mutex or try real hard to be lockless for its own sake. It's only the common case of no ptrace use where we'd like to short-circuit. Note that another angle on this could be to exploit the idea I'd intended to be a later optimization: mutex, ptraced in a struct allocated on demand. Then the short-circuit test is simply if (!tsk->ptrace_info). On the first use of ptrace that allocates ptrace_info, it can do some extra futzing (that needn't be very efficient) to avoid the do_wait race cases we've been coming up with. (Of course, this all is very vague, just for discussion.
;-) > We only need ->ptrace_mutex to untrace the task, so we just modify > wait_task_zombie() further: I don't think I can grok this before I've figured out what I missed about the idea above. > Of course, this all is very vague, just for discussion. That's the way I like it! ;-) Thanks, Roland From roland at redhat.com Thu Apr 16 23:24:30 2009 From: roland at redhat.com (Roland McGrath) Date: Thu, 16 Apr 2009 16:24:30 -0700 (PDT) Subject: ptracee data structures cleanup In-Reply-To: Oleg Nesterov's message of Thursday, 16 April 2009 22:40:04 +0200 <20090416204004.GA28013@redhat.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090416204004.GA28013@redhat.com> Message-ID: <20090416232430.4DAE4FC3C6@magilla.sf.frob.com> > So. We are going to make a separate, dynamically allocated structure > for tracees. Say, we add "struct ptrace_child *ptrace_child" into > task_struct. Right. > attach/attachme do kmalloc() and use task_lock() to avoid races. > (with the current locking write_lock(tasklist) alone is enough). Sure. Just task_lock() is preferable (no global contention), that's what I'd figured it would be. I think having ptrace_exit() early in the exit path now might already make it a little easier to get out from under the tasklist_lock entanglements. > Eventually, we should move all ptrace-related fields from task_struct > to ptrace_child (ptrace_message/last_siginfo/etc), but let's forget > about them for now. Right. > So, for the start we have (of course, I use the random naming). Looks fine to start. (BTW, my preference in all the new code is to stick to "task" or "tracee" and avoid "child", also "tracer" instead of "parent" in ptrace contexts. I think it already tends to confuse readers of the code about the nature of the arcanely overlapping--but mostly unrelated--parent:child and tracer:tracee relationships.) > The first question, should we free it after detach? 
I don't think this needs to be a priority if it complicates everything. The ultimate goal will be to hang this off utrace_engine.data and let the utrace layer serialize our nasty races for us. For the intermediate steps before using utrace, the only performance standard I think we want to try real hard on is not to degrade the most common cases: a task that has never used ptrace and a task that has never been traced by ptrace. > for example, instead of "if (p->ptrace)" we should do > "if (p->ptrace_child && p->ptrace_child->ptrace_flags)", not good. I think it is just fine if ->ptrace_child != NULL means only "goes off the fast path". e.g.

	static inline int task_ptrace(struct task_struct *task)
	{
		int ptrace = 0;
		if (unlikely(task->ptrace_child)) {
			struct ptrace_child *p;
			rcu_read_lock();
			p = rcu_dereference(task->ptrace_child);
			ptrace = p ? p->ptrace_flags : 0;
			rcu_read_unlock();
		}
		return ptrace;
	}

But even that is a lot of hair for the incremental patches in the first several stages, I think. So just never deallocate it, and:

	static inline int task_ptrace(struct task_struct *task)
	{
		return unlikely(task->ptrace_child) ? task->ptrace_child->ptrace_flags : 0;
	}

> But. Until we rework the locking (afaics, ->ptrace_mutex should be held > throughout the whole sys_ptrace() call), I'm not sure that will be necessary. I am not comfortable with holding it around access_process_vm and user_regset calls (arch_ptrace) and all. I think we'll find it OK just to take it (again) inside ptrace_resume() et al, places where the ptrace bookkeeping is touched. > even ptracer can't safely access > child->ptrace_child->ptrace_flags. Once sys_ptrace() drops tasklist, > child's sub-thread can do exec, and de_thread() can untrace the child > and free ->ptrace_child. This kind of trouble is why going to dynamic allocation should start with never-deallocate (i.e. until release_task or something). We can get a lot of clean-up and see how it looks before dealing with the troubles that would cause.
Perhaps we don't ever bother with them until after figuring out utrace-based plans. (That's what I'd intended for putting all the tracer-side stuff in one struct and switching that to dynamic allocation too--without deallocation races, it's just a cheap and easy way to save memory/cache footprint for the 99.44% of tasks that never call ptrace.) > And of course, we should also move task_struct->parent into ptrace_child, > > // was task_struct->parent > struct task_struct *ptrace_parent; struct ptrace_task.tracer, please. :-) > And now we have the new minor problem. Say, tracehook_tracer_task(). > How can it safely read ->ptrace_child->ptrace_parent ? Looks like > we should make ->ptrace_child rcu-safe. Actually, I'd prefer to rely > on task_lock(), but again, afaics this means we should do the locking > first. tracehook_tracer_task() is one of the oddest angles all around. It's most unlike every other path, because it can be by any third party. But it is used very little and does not need to be fast. So there are many options. But this too is trivially punted until later if we do not deallocate--I guess with this use, it would have to be until __put_task_struct to have no locking in tracehook_tracer_task(). Thanks, Roland From roland at redhat.com Fri Apr 17 06:42:47 2009 From: roland at redhat.com (Roland McGrath) Date: Thu, 16 Apr 2009 23:42:47 -0700 (PDT) Subject: syscall_entry callback order reversed Message-ID: <20090417064248.031E2FC3C6@magilla.sf.frob.com> After talking with Renzo Davoli we agreed that it makes most sense for SYSCALL_ENTRY events to have engines' callbacks made in the reverse of the normal order (FIFO vs LIFO). This ordering makes it sensical to have "nesting" mutators and examiners as different utrace engines all watching syscall entry on a task. This new DocBook paragraph explains the rationale: The UTRACE_EVENT(SYSCALL_ENTRY) event is a special case. 
While other events happen in the kernel when it will return to user mode soon, this event happens when entering the kernel before it will proceed with the work requested from user mode. Because of this difference, the report_syscall_entry callback is special in two ways. For this event, engines are called in reverse of the normal order (this includes the report_quiesce call that precedes a report_syscall_entry call). This preserves the semantics that the last engine to attach is called "closest to user mode"--the engine that is first to see a thread's user state when it enters the kernel is also the last to see that state when the thread returns to user mode. For the same reason, if these callbacks use UTRACE_STOP (see the next section), the thread stops immediately after callbacks rather than only when it's ready to return to user mode; when allowed to resume, it will actually attempt the system call indicated by the register values at that time. I've made this change in the current code. (It was indeed trivial.) I will follow up later on the nontrivial issues about stopping and resuming at syscall-entry. Renzo and I spoke in more detail about this and I have a sense of what we want to do in the API, but it merits more discussion. The ordering change was the very low-hanging fruit that I had time to do now. I'll post again later (may be a few days) about the stopping issue. Thanks, Roland
From oleg at redhat.com Fri Apr 17 19:12:06 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Fri, 17 Apr 2009 21:12:06 +0200 Subject: wait_task_zombie() && EXIT_DEAD problems In-Reply-To: <20090416215007.12DE5FC3C6@magilla.sf.frob.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> <20090416215007.12DE5FC3C6@magilla.sf.frob.com> Message-ID: <20090417191206.GA22436@redhat.com> On 04/16, Roland McGrath wrote: > > > OK. For the moment, please forget about these changes. Let's recall > > do_wait() has the ancient bug. wait_task_zombie() sets EXIT_DEAD > > unconditionally, then drops tasklist. If we are not the real_parent > > and the child was traced, we may restore ->exit_state = EXIT_ZOMBIE > > later. But, if the ->real_parent calls do_wait() in between, it can > > see the child in EXIT_DEAD state and return -ECHILD. > > Really? Won't that do_wait hit (wait_consider_task): > > if (likely(!ptrace) && unlikely(p->ptrace)) { Hmm, yes, it's not that simple. Let me try again. Two threads T1 and T2, and some process X (not the child of T1/T2). T1 ptraces X. X exits and becomes zombie. T2 calls do_wait(WEXITED) and then wait_task_zombie(), it sets EXIT_DEAD and drops tasklist. T1 exits, calls exit_ptrace(). __ptrace_detach() does __ptrace_unlink() and nothing more. From now X->ptrace == 0. X->real_parent calls do_wait() and gets -ECHILD, because X is EXIT_DEAD and not traced. Of course, we can add the really nasty hacks to __ptrace_unlink() path, but I think we should never set EXIT_DEAD unless we know for sure the child will be released. > > (and we have other minor problems with EXIT_DEAD tasks on ->children). > > Which? Minor, but note the comment in find_new_reaper() about EXIT_DEAD tasks.
I thought I could give more arguments against EXIT_DEAD task on ->children, but either I forgot, or they don't really exist ;) Anyway. I think it would be cleaner if, every time we set EXIT_DEAD, we also remove the task from list. The only problem is do_wait(), we need write_lock(tasklist) for this, and yes, write_lock() is not very nice. But as I said, this all is minor. > (We can make this a separate thread if it's not really apropos just > to ptrace.) Yes, I think it would be better to keep emails small. Change the subject. Oleg. From oleg at redhat.com Fri Apr 17 19:17:45 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Fri, 17 Apr 2009 21:17:45 +0200 Subject: ptrace cleanup tasks In-Reply-To: <20090416215007.12DE5FC3C6@magilla.sf.frob.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> <20090416215007.12DE5FC3C6@magilla.sf.frob.com> Message-ID: <20090417191745.GB22436@redhat.com> On 04/16, Roland McGrath wrote: > > Using write_lock_irq at all in the non-ptrace reap case seems bad. Agreed. But note that, once we change wait_task_zombie() to untrace the task first (under ->ptrace_mutex), we can switch to read_lock() again. > > Now return to the topic. Perhaps we can do something "hybrid" for the start? > > To avoid the discussed races with do_wait(), PTRACE_ATTACH/PTRACE_DETACH can > > temporarily take tasklist just for setting PT_PTRACED and adding to ->ptraced > > list. > > write_lock_irq inside ptrace_mutex should be safe. But does that leave us > still with the current ugly tasklist_lock+task_lock dance? Oh, this dance should be killed anyway. In fact I think currently we don't need task_lock() at all, but I am not sure, I forgot the results of my previous grepping... will do again.
> Moreover, we > want to find the path that gets us (eventually) to not using tasklist_lock > at all for ptrace. Yes. But only DETACH/ATTACH needs tasklist. Perhaps this can be "good enough" for the start... Not that I am sure it will be easy to improve things later, though. > > In that case, ptrace_do_wait() does not need ->ptrace_mutex, it can iterate > > over ->ptraced list lockless (but perhaps we should use list_for_each_rcu). > > There is no inherent reason to avoid ptrace_mutex or try real hard to be > lockless for its own sake. It's only the common case of no ptrace use > where we'd like to short-circuit. Agreed, but still it is better to delay unlock(tasklist)+mutex_lock(ptrace) until we know this is really needed. Only wait_task_zombie's path needs this. See also below. > Note that another angle on this could be to exploit the idea I'd intended > to be a later optimization: mutex, ptraced in a struct allocated on demand. Sure, I think this should be covered by "ptracer (parent) data structures cleanup" we didn't discuss yet. > Then the short-circuit test is simply if (!tsk->ptrace_info). But, from the correctness pov, this doesn't differ from checking !list_empty(->ptraced) ? I mean, the condition can be changed under us in both cases, and we seem to have the same corner cases. But of course I agree, this is better. I didn't try to pay much attention to these changes just because I thought (perhaps wrongly) we can ignore them in this discussion. > I don't think I can grok this before I've figured out what I missed about > the idea above. I think I see the very "strategic" reason to fix wait_task_zombie first ;) It would be nice if the locking changes do not uglify the code from the friendly lkml reviewers' pov. But surely, this unlock/lock/recheck/repeat dance (which we can't avoid) doesn't look very good. But what if we have a reason for this dance? Say, the old bug in do_wait.
Then we can introduce this complication to fix the bug, and only then send these changes on top. Perhaps I should try to make some draft patches, just to see how it can look. But of course the discussion is still in progress ;) Or I should start with "ptrace child" cleanup first... (will reply to your email about the child cleanups later). Oleg. From roland at redhat.com Sat Apr 18 02:37:53 2009 From: roland at redhat.com (Roland McGrath) Date: Fri, 17 Apr 2009 19:37:53 -0700 (PDT) Subject: utrace-kmview contract In-Reply-To: Renzo Davoli's message of Tuesday, 24 March 2009 00:59:24 +0100 <20090323235924.GD23807@cs.unibo.it> References: <20090323235924.GD23807@cs.unibo.it> Message-ID: <20090418023753.2A966FC35F@magilla.sf.frob.com> Renzo and I met at the conference in San Francisco last week and spoke more about his use case. He showed me a fine demo live of what his cool hacks can do, and a clear demonstration of the "nesting" issue. Rather than responding point by point, I'll follow up momentarily on the API change I'm contemplating after our discussion. For everybody doing tracing engines, it's a great test of engine interaction issues to try out some of Renzo's stuff and make sure your engines play nicely intermingled with his, and that they see good syscall details when parameter modifications are being done. Thanks, Roland From roland at redhat.com Sat Apr 18 04:27:22 2009 From: roland at redhat.com (Roland McGrath) Date: Fri, 17 Apr 2009 21:27:22 -0700 (PDT) Subject: resuming after stop at syscall_entry Message-ID: <20090418042722.5B584FC35F@magilla.sf.frob.com> The way UTRACE_STOP works for event callbacks is that it does not always cause an immediate stop after the reporting pass where someone used UTRACE_STOP. The various events all happen such that nothing else with user-visible side effects is going to happen before getting to user mode (which is to say, the signals-check path before user mode). 
So when you ask for UTRACE_STOP, you actually might still get more callbacks up through report_quiesce(0) or report_signal(UTRACE_SIGNAL_REPORT), and only actually stop if you still want to in that callback (or if some engine that used UTRACE_STOP isn't taking that callback at all). The logic of this is that utrace is about user state, and nothing user-mode can see happens in between, so there is no need to stop in between. Moreover, in cases like CLONE, stopping right there means it's before the clone/fork syscall's return value has reached the task_pt_regs(), i.e. you do not see the thread's "after the event" user state. At the syscall-exit tracing point, and at the final report, you can see the proper register values. So this is all well and good for all such events. But there are some events where something else is going to happen, so we do stop before getting to the return-to-user path. Those are a CLONE in case of CLONE_VFORK, EXIT, and SYSCALL_ENTRY. After the CLONE event for a vfork, the thread will go into "vfork wait", an uninterruptible block until its new child dies or execs. It won't "do" anything else before getting to user mode. But in "vfork wait", it won't be considered safe to examine (so you can use ptrace, e.g.). So in this case, UTRACE_STOP does cause a proper stop that is right after the CLONE event callback returned. When you resume from that stop, you go into "vfork wait" (or don't, if the child is already gone by then). When the child goes away and that wait is done, you can hit the SYSCALL_EXIT event if you care, and final or signal reports, and as normal nothing else user visible is happening before the final report. In EXIT, obviously you are never going back to user mode again. But you can stop here to keep the thread's state alive, which lets you pause and examine its fds and so forth before you let it die. Your notification after all UTRACE_STOPs are lifted is the DEATH event. 
(I've just glazed over re-reading my own documentation, and I can't tell if any of this is adequately explained anywhere at all. If it's not, please suggest where in kerneldoc comments or utrace.tmpl would be a good place to add a paragraph about it.) So those two are fine. They stop immediately after the callback loop if someone wants UTRACE_STOP, and there are other events to catch after they resume if you need to notice when other engines have resumed. SYSCALL_ENTRY is unlike all other events. Right after this callback loop is when the important user-visible stuff happens (the system call). So we stop immediately there as for the other two. But, if another engine used UTRACE_STOP and maybe did something asynchronously, like modifying the syscall argument registers, you get no opportunity to see what happened. Once all engines lift UTRACE_STOP, the system call runs. Enter Renzo Davoli. He uses a syscall_entry callback whose purpose is to change the syscall arguments. But, it doesn't do it immediately in the callback. It wants to wake up its user-mode component to decide what to do about this particular call. So it uses UTRACE_STOP, that's why we have it. The other components cooperating with the engine contemplate the situation and kibitz, then the engine modifies the registers and calls utrace_control(UTRACE_RESUME). The system call goes ahead. Enter Renzo Davoli. He uses a syscall_entry callback whose purpose is tracing. This engine was attached first, so its normal event callbacks come before the other Renzo's callbacks. And now, the other Renzo's syscall_entry callback comes before this Renzo's, since the order is reversed from normal. So, the other Renzo's callback returns UTRACE_STOP. Then this Renzo's callback gets its SYSCALL_ENTRY. The registers it sees have not been changed yet, so they are not reflective of the actual call that will be made. 
This callback can look at utrace_resume_action(action) and see UTRACE_STOP, so it knows some previous engine wanted a stop and might change the state before resume. But there is nothing it can do. (So this Renzo packs up Renzo Jr. and flies to San Francisco for a weekend on the town. ;-) So, we have the "nested Renzos" problem. Renzo proposed the solution of having UTRACE_STOP cause an immediate stop after just your own callback. Then you'd have to resume before the next engine gets any callback at all, so when it gets one, it can see whatever you've done. This approach flies in the face of one of the key ideas of well-behaved engine coexistence in utrace. Your engine does not get to delay prompt notification of my engine about the state of the user thread. You get to delay the user-visible behavior of the thread itself arbitrarily with UTRACE_STOP. But that doesn't keep another engine from keeping track of things. This idea is of a piece with the notion that your callback should not go and block forever. You are all parasites on the same user thread, and while it's OK to paralyze the host and dig in, to be a good citizen of verminkind, you can't monopolize the scene at the expense of all your fellow parasites. So I have an alternate proposal. This approach keeps the norms of engine interaction and noninterference of callbacks while covering this special case of syscall entry stops. As explained above, the norm of interacting with other engines and their use of UTRACE_STOP is to use the final report. When your callback's action argument includes UTRACE_STOP, you know an earlier engine might be fiddling before the thread resumes. So, your callback can decide to return UTRACE_REPORT. That ensures that some report_quiesce (or report_signal/UTRACE_SIGNAL_REPORT) callback will be made after the other engine lifts its UTRACE_STOP and before user mode. At that point, you can see what user register values it might have installed, etc. 
In all events but syscall entry, a final report_quiesce(0) serves this need. My proposal is to extend this "resume report" approach to the syscall entry case. That is, after when some report_syscall_entry returned UTRACE_STOP so we've stopped, allow for a second reporting pass after we've been resumed, before running the system call. You'd get this pass if someone used UTRACE_REPORT. That is, in the first callback loop, one engine used UTRACE_STOP and another used UTRACE_REPORT. Then when the first engine used utrace_control() to resume, there would be a second reporting pass because of the second engine's earlier request. Or, even if there was just one engine, but it used UTRACE_STOP and then used utrace_control(UTRACE_REPORT) to resume, then it would get the second reporting pass. If someone uses UTRACE_STOP+UTRACE_REPORT in that pass, there would be a third pass, etc. What I have in mind is that the second (and however many more) pass would just be another report_syscall_entry callback to everyone with UTRACE_EVENT(SYSCALL_ENTRY) set. A flag bit in the action argument says this is a repeat notification. I think this strikes a decent balance of not adding more callbacks and more arguments to bloat the API in general, while imposing a fairly simple burden on engines to avoid getting confused by multiple calls. A tracing-only engine that just wants to see the syscall that is going to be done can just do: if (utrace_resume_action(action) == UTRACE_STOP) return UTRACE_REPORT; at the top of report_syscall_entry, so it just doesn't think about it until it thinks the call will go now through. Say an engine has a different agenda, just to see what syscall argument values came in from user mode before someone else changes them. It does: if (action & UTRACE_SYSCALL_RESUMED) return UTRACE_RESUME; to ignore the additional callbacks that might come after somebody decided to stop and report. It just does its work on the first one. Here comes Renzo again! 
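The repeat-pass behavior proposed above can be tried out in a small user-space model. What follows is only a sketch under stated assumptions, not utrace code: every model_* name, the numeric action values, and the while_stopped hook are invented for illustration, and engines are simply called in array order. It runs three toy engines: one that stops and rewrites an argument while the task is stopped, one that follows the "punt while a stop is pending" pattern, and one that follows the "ignore repeat passes" pattern.

```c
#include <stddef.h>

/* Toy stand-ins for the utrace resume actions (lower value = stronger request).
 * Names and values are assumptions for this model, not the kernel's definitions. */
enum model_action { MODEL_STOP = 0, MODEL_REPORT = 1, MODEL_RESUME = 2 };
#define MODEL_ACTION_MASK     0x0fu
#define MODEL_SYSCALL_RESUMED 0x10u	/* models the proposed repeat-pass flag */

struct model_task { long arg0; };	/* a single "syscall argument register" */

struct model_engine {
	/* Models report_syscall_entry: sees the flag bit plus the action so far. */
	enum model_action (*entry)(unsigned int action, struct model_task *t, void *data);
	/* Optional work done asynchronously while the task is stopped. */
	void (*while_stopped)(struct model_task *t, void *data);
	void *data;
	int stopped;
};

/* Run reporting passes until the call may proceed; return the number of passes. */
static int model_syscall_entry(struct model_engine *e, size_t n, struct model_task *t)
{
	unsigned int flag = 0;
	int passes = 0;

	for (;;) {
		enum model_action action = MODEL_RESUME;
		int want_report = 0;
		size_t i;

		passes++;
		for (i = 0; i < n; i++) {
			enum model_action ret = e[i].entry(flag | action, t, e[i].data);

			e[i].stopped = (ret == MODEL_STOP);
			if (ret == MODEL_REPORT)
				want_report = 1;
			if (ret < action)
				action = ret;
		}
		if (action != MODEL_STOP)
			return passes;		/* nobody stopped: the call runs */
		for (i = 0; i < n; i++)		/* "stopped": stoppers do their work */
			if (e[i].stopped && e[i].while_stopped)
				e[i].while_stopped(t, e[i].data);
		if (!want_report)
			return passes;		/* resumed with no report requested */
		flag = MODEL_SYSCALL_RESUMED;	/* someone asked to look again */
	}
}

/* Stops on the first pass, rewrites arg0 while stopped, ignores repeat passes. */
static enum model_action modifier_entry(unsigned int action, struct model_task *t, void *data)
{
	(void)t; (void)data;
	return (action & MODEL_SYSCALL_RESUMED) ? MODEL_RESUME : MODEL_STOP;
}

static void modifier_while_stopped(struct model_task *t, void *data)
{
	t->arg0 = *(long *)data;
}

/* Wants the argument the call will really use: punts while a stop is pending. */
static enum model_action tracer_entry(unsigned int action, struct model_task *t, void *data)
{
	if ((action & MODEL_ACTION_MASK) == MODEL_STOP)
		return MODEL_REPORT;
	*(long *)data = t->arg0;
	return MODEL_RESUME;
}

/* Wants the argument as it came from user mode: works on the first pass only. */
static enum model_action first_look_entry(unsigned int action, struct model_task *t, void *data)
{
	if (action & MODEL_SYSCALL_RESUMED)
		return MODEL_RESUME;
	*(long *)data = t->arg0;
	return MODEL_RESUME;
}

/* Wire the three engines together; the modifier is called before the tracer. */
static int model_demo(long original, long rewritten, long *final_seen, long *first_seen)
{
	struct model_task task = { original };
	struct model_engine engines[] = {
		{ modifier_entry, modifier_while_stopped, &rewritten, 0 },
		{ tracer_entry, NULL, final_seen, 0 },
		{ first_look_entry, NULL, first_seen, 0 },
	};

	return model_syscall_entry(engines, 3, &task);
}
```

Calling model_demo(7, 42, &final_seen, &first_seen) takes two passes in this model: the tracing engine records the rewritten value on the second pass, while the first-look engine keeps the value it saw on the first. The model restarts every pass from the head of the list, which is the simple behavior described above rather than any pick-up-in-the-middle optimization.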
He wants to have two or three or nineteen layers of the first kind of Renzo engine: each one stops at syscall entry, then resumes after changing some registers. He wants these to "nest", meaning that after the "outermost" one stops, fiddles, and resumes, the "next one in" stops, looks at the register as fiddled by the outermost guy, fiddles in a different way, and resumes, and on and on. Perhaps the first model (if last guy is stopping, punt to look again at resume report) works for that. Or perhaps the engine also needs to keep track with its own state flag it sets whenever it does its work, and then resets in exit tracing to prepare for next time. Renzo points out that this is a lot of iterations over the engine list. Each callback loop starts from the beginning again and calls each engine to decide this is a useless repetition for it (or a useless report now because it won't care until the later repetition). He suggests maybe the resume report callbacks could pick up in the middle of the list with the first guy to ask for UTRACE_STOP. (The ones earlier in the list already did what they do and said to let the system call go ahead, so they should not be interested in the second callback now just because of what a later engine did.) I take these points to heart. But I also know about the intricacies of the list management and asynchronous detach while stopped. They quickly make what might sound simple in the abstract not seem so simple in the implementation. I also wonder how much this simple-seeming idea might actually complicate the corners of the API (implementation aside) even beyond these intricacies I'm describing here. I know we have more concrete hashing out of engine interaction to do over time as we have more different engines doing interesting things. It looks easy to see what makes sense for stacking several of Renzo's engines that all do the same thing in utrace terms, exciting semantics aside. 
I think we'll find it more complex with a mix of other engines as they come about. I also know that this level of optimization can wait until we have a realistic plan of seeing more than about three engines on a task. So, even I can't write that much text and still think this interface choice is simple to understand. But I kind of think it's around as simple as it can be for its mandates. I'd appreciate any feedback. The implementation of this would be about like the patch below. Thanks, Roland ====== --- a/kernel/utrace.c +++ b/kernel/utrace.c @@ -1273,6 +1273,8 @@ struct utrace_report { #define INIT_REPORT(var) \ struct utrace_report var = { UTRACE_RESUME, 0, \ false, false, false, false } +#define RESET_REPORT(var) \ + ((var).detaches = (var).reports = (var).takers = (var).killed = false) /* * We are now making the report, so clear the flag saying we need one. @@ -1499,22 +1501,41 @@ bool utrace_report_syscall_entry(struct struct task_struct *task = current; struct utrace *utrace = task_utrace_struct(task); INIT_REPORT(report); + u32 resume_report = 0; +report: start_report(utrace); REPORT_CALLBACKS(_reverse, task, utrace, &report, UTRACE_EVENT(SYSCALL_ENTRY), report_syscall_entry, - report.result | report.action, engine, current, regs); + resume_report | report.result | report.action, + engine, current, regs); finish_report(&report, task, utrace); - if (report.action == UTRACE_STOP && - unlikely(utrace_stop(task, utrace, false))) + if (report.action == UTRACE_STOP) { + if (utrace_stop(task, utrace, report.reports)) + /* + * We are continuing despite UTRACE_STOP because of a + * SIGKILL. Don't let the system call actually proceed. + */ + return true; + /* - * We are continuing despite UTRACE_STOP because of a - * SIGKILL. Don't let the system call actually proceed. + * If we've been asked for another report after our stop, + * go back to report (and maybe stop) again before running + * the system call. 
The second (and later) reports are + * marked with the UTRACE_SYSCALL_RESUMED flag so that + * engines know this is a second report at the same entry. + * This gives them the chance to examine the registers anew + * after they might have been changed while we were stopped. */ - return true; + if (utrace->report) { + RESET_REPORT(report); + resume_report = UTRACE_SYSCALL_RESUMED; + goto report; + } + } - return report.result == UTRACE_SYSCALL_ABORT; + return utrace_syscall_action(report.result) == UTRACE_SYSCALL_ABORT; } /*
From roland at redhat.com Mon Apr 20 06:26:27 2009 From: roland at redhat.com (Roland McGrath) Date: Sun, 19 Apr 2009 23:26:27 -0700 (PDT) Subject: wait_task_zombie() && EXIT_DEAD problems In-Reply-To: Oleg Nesterov's message of Friday, 17 April 2009 21:12:06 +0200 <20090417191206.GA22436@redhat.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> <20090416215007.12DE5FC3C6@magilla.sf.frob.com> <20090417191206.GA22436@redhat.com> Message-ID: <20090420062627.15392FC3C6@magilla.sf.frob.com> > Two threads T1 and T2, and some process X (not the child of T1/T2). > > T1 ptraces X. > > X exits and becomes zombie. > > T2 calls do_wait(WEXITED) and then wait_task_zombie(), it sets EXIT_DEAD > and drops tasklist. > > T1 exits, calls exit_ptrace(). __ptrace_detach() does __ptrace_unlink() > and nothing more. From now X->ptrace == 0. > > X->real_parent calls do_wait() and gets -ECHLD, because X is EXIT_DEAD > and not traced. I see. I don't think we should worry especially about fixing this existing bug (ancient, as you call it) on its own. So let's only consider this in the context of the new locking and data structure plans, if that is simpler. > Of course, we can add the really nasty hacks to __ptrace_unlink() path, > but I think we should never set EXIT_DEAD unless we know for sure the > child will be released. I take your point. But I think we should try very hard to find solutions covering all the ptrace problems without taking write_lock_irq (before release_task will take it). > Minor, but note the comment in find_new_reaper() about EXIT_DEAD tasks.
> I thought I can give more arguments against EXIT_DEAD task on ->children, > but either I forgot, or they don't really exist ;) I am convinced that "unless we know for sure the child will be released" is a nice invariant to have if we can get it. That is, EXIT_DEAD can be on ->children but always means "you want to ignore it". The ptrace not-really-reaping case is the only time that rule is ever violated, right? So, we can just try to close that hole. The stronger "no EXIT_DEAD on ->children" invariant means that the non-ptrace do_wait() has to use a different method than the current xchg() to settle races. I don't see any good reason to perturb that. (At least, let it be a later concern far removed from ptrace work.) So, how can we do the ptrace_reparented() case better in wait_task_zombie? If we just don't touch exit_state, then we don't violate the invariant. Then all we need is a new way to settle the races between ptrace_do_wait() calls, right? We can drop tasklist_lock (maybe need get_task_struct?), take ptrace_mutex, do ptrace detaching/clear ->ptrace, drop ptrace_mutex. Then re-take tasklist_lock (read_lock) for do_notify_parent and do the auto-reap handling. (There's no reason I see that this juggling shouldn't happen right after dropping tasklist_lock, before getrusage/put_user, only release_task has to be after.) Something along these lines feels like the right direction. 
Thanks, Roland From roland at redhat.com Mon Apr 20 07:01:53 2009 From: roland at redhat.com (Roland McGrath) Date: Mon, 20 Apr 2009 00:01:53 -0700 (PDT) Subject: ptrace cleanup tasks In-Reply-To: Oleg Nesterov's message of Friday, 17 April 2009 21:17:45 +0200 <20090417191745.GB22436@redhat.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> <20090416215007.12DE5FC3C6@magilla.sf.frob.com> <20090417191745.GB22436@redhat.com> Message-ID: <20090420070153.9766FFC3C6@magilla.sf.frob.com> > > Then the short-circuit test is simply if (!tsk->ptrace_info). > > But, from the correctness pov, this doesn't differ from checking > !list_empty(->ptraced) ? I mean, the condition can be changed under us > in both cases, and we seem to have the same corner cases. I guess it just seems simpler to me. The only false indication we can get is seeing !tsk->ptrace_info when the tsk is in ptrace_attach() or its child is in ptrace_traceme(). Worrying about which directions of false races with list_*() calls can be is far more hazy to me. I think -ECHILD is just fine in that race. The wait call happened "before" the ptrace attach. Am I overlooking a race scenario that matters? (If there is one, perhaps it might be handily adequately by doing an extra wake_up_parent when allocating ->ptrace_info.) > But of course I agree, this is better. I didn't try to pay much attention > to these changes just because I thought (perhaps wrongly) we can ignore > them in this discussion. I had indeed expected the details to be orthogonal. I only mentioned it because it came to mind as a trick for the short-circuit optimization. > Perhaps I should try to make some draft patches, just to see how it can > look. 
But of course the discussion is still in progress ;) > > Or I should start with "ptrace child" cleanup first... (will reply to your > email about the child cleanups later). My only goal in suggesting ordering of steps or which steps can be done in parallel is to optimize the rate of overall progress. I'd figured that many of the data structure cleanup steps would be simple and mostly orthogonal to the deep stuff. It's easy to bang out a lot of that sort of change and spend time fiddling around with how best to slice and order the incremental patches to make them good reading, without much real thought. For me, it works best to interleave that quasi-automatic stuff that doesn't really tax the mind but can keep the fingers busy a long time, with the stuff that is dense with much deep cogitation for small amounts of eventual code and only really progresses at all when the mind has good focus. Sometimes I also find it gets the juices going to actually hack around and compile some code and commit some draft patches locally, so as to unstick the gears for starting to grind out something concrete to get all that deep cogitation to gel. Take whatever approach optimizes the overall rate of progress for you. (But the more you parallelize the hacking and/or order easy stuff first, the quicker we can cover the style and trivia review.) Thanks, Roland
From oleg at redhat.com Mon Apr 20 17:03:59 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Mon, 20 Apr 2009 19:03:59 +0200 Subject: wait_task_zombie() && EXIT_DEAD problems In-Reply-To: <20090420062627.15392FC3C6@magilla.sf.frob.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> <20090416215007.12DE5FC3C6@magilla.sf.frob.com> <20090417191206.GA22436@redhat.com> <20090420062627.15392FC3C6@magilla.sf.frob.com> Message-ID: <20090420170359.GA32527@redhat.com> On 04/19, Roland McGrath wrote: > > > X->real_parent calls do_wait() and gets -ECHLD, because X is EXIT_DEAD > > and not traced. > > I see. I don't think we should worry especially about fixing this existing > bug (ancient, as you call it) on its own. So let's only consider this in > the context of the new locking and data structure plans, if that is simpler. Agreed. > I am convinced that "unless we know for sure the child will be released" is > a nice invariant to have if we can get it. That is, EXIT_DEAD can be on > ->children but always means "you want to ignore it".
> > The ptrace not-really-reaping case is the only time that rule is ever > violated, right? So, we can just try to close that hole. Afaics, yes. > The stronger "no EXIT_DEAD on ->children" invariant means that the > non-ptrace do_wait() has to use a different method than the current xchg() > to settle races. I don't see any good reason to perturb that. (At least, > let it be a later concern far removed from ptrace work.) OK, agreed. > So, how can we do the ptrace_reparented() case better in wait_task_zombie? > If we just don't touch exit_state, then we don't violate the invariant. > Then all we need is a new way to settle the races between ptrace_do_wait() > calls, right? > > We can drop tasklist_lock (maybe need get_task_struct?), take ptrace_mutex, > do ptrace detaching/clear ->ptrace, drop ptrace_mutex. Then re-take > tasklist_lock (read_lock) for do_notify_parent and do the auto-reap > handling. (There's no reason I see that this juggling shouldn't happen > right after dropping tasklist_lock, before getrusage/put_user, only > release_task has to be after.) Yes, I think this is right. We should untrace first. But, I still think we should do this fix before introducing ->ptrace_mutex. OK, we should avoid taking tasklist for writing. Then we should check ptrace_reparented() first. If it is true get_task_struct, drop taslist, take it for writing, untrace, etc. Then re-take tasklist for reading and continue the reaping. And of course, we should re-check the task every time we take tasklist and return E_GOTO_REPEAT if it was untraced or released. On top of this changes, it would be easier to change the locking. Hmm... looking at the current code in wait_task_zombie() under "if (traced)", shouldn't we check !same_thread_group(p->real_parent, current) before do_notify_parent() ? --- kernel/exit.c +++ kernel/exit.c @@ -1290,7 +1290,8 @@ static int wait_task_zombie(struct task_ * If it's still not detached after that, don't release * it now. 
*/ - if (!task_detached(p)) { + if (!task_detached(p) && + !same_thread_group(p->real_parent, current)) { do_notify_parent(p, p->exit_signal); if (!task_detached(p)) { p->exit_state = EXIT_ZOMBIE; Oleg. From oleg at redhat.com Mon Apr 20 17:26:55 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Mon, 20 Apr 2009 19:26:55 +0200 Subject: ptrace cleanup tasks In-Reply-To: <20090420070153.9766FFC3C6@magilla.sf.frob.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> <20090416215007.12DE5FC3C6@magilla.sf.frob.com> <20090417191745.GB22436@redhat.com> <20090420070153.9766FFC3C6@magilla.sf.frob.com> Message-ID: <20090420172655.GB32527@redhat.com> On 04/20, Roland McGrath wrote: > > > > Then the short-circuit test is simply if (!tsk->ptrace_info). > > > > But, from the correctness pov, this doesn't differ from checking > > !list_empty(->ptraced) ? I mean, the condition can be changed under us > > in both cases, and we seem to have the same corner cases. > > I guess it just seems simpler to me. The only false indication we can get > is seeing !tsk->ptrace_info when the tsk is in ptrace_attach() or its child > is in ptrace_traceme(). Worrying about which directions of false races > with list_*() calls can be is far more hazy to me. > > I think -ECHILD is just fine in that race. The wait call happened "before" > the ptrace attach. Am I overlooking a race scenario that matters? (If > there is one, perhaps it might be handily adequately by doing an extra > wake_up_parent when allocating ->ptrace_info.) Agreed. > > Or I should start with "ptrace child" cleanup first... (will reply to your > > email about the child cleanups later). > > My only goal in suggesting ordering of steps or which steps can be done in > parallel is to optimize the rate of overall progress. 
I'd figured that > many of the data structure cleanup steps would be simple and mostly > orthogonal to the deep stuff. It's easy to bang out a lot of that sort of > change and spend time fiddling around with how best to slice and order the > incremental patches to make them good reading, without much real thought. > For me, it works best to interleave that quasi-automatic stuff that doesn't > really tax the mind but can keep the fingers busy a long time, Yes, you are right. Let's start with ptrace child cleanup. If we don't clear ->ptrace_child until __put_task_struct() as you suggested, then these changes are easy except they need a lot of grepping. Actually, I was going to send you the first patches today, but I have a couple of questions, see another email. In fact these questions are almost off-topic, but still I'd like to understand the code I am going to change. Oleg. From oleg at redhat.com Mon Apr 20 18:37:18 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Mon, 20 Apr 2009 20:37:18 +0200 Subject: ptracee data structures cleanup In-Reply-To: <20090416232430.4DAE4FC3C6@magilla.sf.frob.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090416204004.GA28013@redhat.com> <20090416232430.4DAE4FC3C6@magilla.sf.frob.com> Message-ID: <20090420183718.GC32527@redhat.com> On 04/16, Roland McGrath wrote: > > But even that is a lot of hair for the incremental patches in the first > several stages, I think. So just never deallocate it, and: > > static inline int task_ptrace(struct task_struct *task) > { > return unlikely(task->ptrace_child) ? task->ptrace_child->flags : 0; > } OK, agreed. Except... personally I dislike ->flags, it is not greppable. > > even ptracer can't safely access > > child->ptrace_child->ptrace_flags. Once sys_ptrace() drops tasklist, > > child's sub-thread can do exec, and de_thread() can untrace the child > > and free ->ptrace_child. 
> > This kind of trouble is why going to dynamic allocation should start > with never-deallocate (i.e. until release_task or something). Yes, except release_task() doesn't work. I think it should be freed by __put_task_struct(). Proc can read ->ptrace_child and race with release_task(). > > // was task_struct->parent > > struct task_struct *ptrace_parent; > > struct ptrace_task.tracer, please. :-) OK. I'll send the first patch (or 2 patches) which introduces ptrace_task and moves ->ptrace into it tomorrow. The patch is simple, but intrusive. Now the questions. First of all, what does task_lock() currently mean from the ptrace pov ? Afaics ptrace_attach() needs this lock only to pin ->mm, no other reasons. ptrace_traceme() doesn't need it at all. The comment above tracehook_unsafe_exec() says "Called with task_lock()", this is wrong, check_unsafe_exec() doesn't take task_lock(). The only place which believes task_lock() can help is sys_execve() on some arches, the code clears PT_DTRACE under task_lock(). Why, and what does PT_DTRACE actually mean? I don't understand PT_DTRACE, but looking at the code I assume it can only be set when the task is ptraced. Right? arch/m68k/kernel/traps.c:trap_c() does current->ptrace |= PT_DTRACE. Again, can I assume this can only happen when the task is already traced? Otherwise, we have a problem with the allocation of ->ptrace_task (and this is racy with ptrace_attach of course). Perhaps it makes sense to do a little cleanup first, introduce static inline void ptrace_clear_dtrace(void) { if (unlikely(current->ptrace & PT_DTRACE)) current->ptrace &= ~PT_DTRACE; } This can't race with the tracer, it never changes ->ptrace flags until the tracee is TASK_TRACED. The last question. ptrace_attach() does task->ptrace |= PT_PTRACED; Why can't we use the plain "=" instead of "|=" ? This looks as if ->ptrace can be nonzero even if the task is not traced. But I assume this is not possible?
And any code which does "if (p->ptrace & PT_TRACED)" could just do "if (p->ptrace)", right? Oleg.
From roland at redhat.com Tue Apr 21 00:51:32 2009 From: roland at redhat.com (Roland McGrath) Date: Mon, 20 Apr 2009 17:51:32 -0700 (PDT) Subject: wait_task_zombie() && EXIT_DEAD problems In-Reply-To: Oleg Nesterov's message of Monday, 20 April 2009 19:03:59 +0200 <20090420170359.GA32527@redhat.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> <20090416215007.12DE5FC3C6@magilla.sf.frob.com> <20090417191206.GA22436@redhat.com> <20090420062627.15392FC3C6@magilla.sf.frob.com> <20090420170359.GA32527@redhat.com> Message-ID: <20090421005132.6A337FC3C7@magilla.sf.frob.com> > But, I still think we should do this fix before introducing ->ptrace_mutex. Ok by me if it's in fact (incrementally) simpler that way. > OK, we should avoid taking tasklist for writing. Then we should check > ptrace_reparented() first. If it is true get_task_struct, drop taslist, > take it for writing, untrace, etc. Sounds right. > Then re-take tasklist for reading and continue the reaping. You don't need tasklist_lock again, assuming you did do_notify_parent() while holding it for write (as done now). You are just resuming the normal tail of wait_task_zombie() after it's dropped tasklist_lock. If we are not going to call release_task() (i.e. after untrace + do_notify_parent() it does not then want auto-reap), we just keep the task ref through the getrusage/put_user and do put_task_struct() at the end. > And of course, we should re-check the task every time we take tasklist > and return E_GOTO_REPEAT if it was untraced or released. Right. > On top of this changes, it would be easier to change the locking. Ok. > Hmm...
looking at the current code in wait_task_zombie() under > "if (traced)", shouldn't we check !same_thread_group(p->real_parent, current) > before do_notify_parent() ? It's impossible. ptrace_attach() doesn't allow it. Thanks, Roland From roland at redhat.com Tue Apr 21 01:13:54 2009 From: roland at redhat.com (Roland McGrath) Date: Mon, 20 Apr 2009 18:13:54 -0700 (PDT) Subject: ptracee data structures cleanup In-Reply-To: Oleg Nesterov's message of Monday, 20 April 2009 20:37:18 +0200 <20090420183718.GC32527@redhat.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090416204004.GA28013@redhat.com> <20090416232430.4DAE4FC3C6@magilla.sf.frob.com> <20090420183718.GC32527@redhat.com> Message-ID: <20090421011354.4B19EFC3C7@magilla.sf.frob.com> > Except... personally I dislike ->flags, it is not greppable. Sure. I don't care what the field names are. I just use short names in examples off the cuff when it keeps the example code lines from getting too long. > Yes, except release_task() doesn't work. I think it should be freed > by __put_task_struct(). Proc can read ->ptrace_child and race with > release_task(). Right. > Now the questions. First of all, what does task_lock() currently mean > from the ptrace pov ? It was originally used to synchronize ->ptrace changes. Except where it wasn't. Then the tasklist_lock+task_lock dance was added to synchronize with exit.c code without losing whatever task_lock was presumed to be doing. I think you are right that write_lock on tasklist_lock is the only meaningful lock for ->ptrace at this point. > Afaics ptrace_attach() needs this lock only to pin ->mm, no other other > reasons. ptrace_traceme() doesn't need it at all. I'm pretty sure that ->mm check is only meant to exclude kernel threads. It should check PF_KTHREAD now, and reparent_to_kthreadd() already handles the case of ptrace_attach() getting in before daemonize() runs. 
> The comment above tracehook_unsafe_exec() says "Called with task_lock()", > this is wrong, check_unsafe_exec() doesn't take task_lock(). This changed upstream without keeping tracehook.h up to date. Sigh. Please send a tracehook.h comment fix patch upstream. I believe cred_exec_mutex now serves that function. > The only place which believes task_lock() can help is sys_execve() on > some arches, the code clears PT_DTRACE under task_lock(). Why, and what > PT_DTRACE actually means? It is obsolete. Most uses were long ago replaced with per-arch things like TIF_SINGLESTEP. This flag ought to die entirely. The common occurrences of: current->ptrace &= ~PT_DTRACE; in arch execve code is ancient boilerplate that does nothing useful. AFAICT the only place PT_DTRACE is still used meaningfully at all is in UML. > arch/m68k/kernel/traps.c:trap_c() does current->ptrace |= PT_DTRACE. I see no m68k code that checks the flag. > Perhaps it makes sense to do a little cleanup first, introduce The clean-up should get rid of PT_DTRACE entirely. > The last question. ptrace_attach() does > > task->ptrace |= PT_PTRACED; > > Why can't we use the plain "=" instead of "|=" ? This looks as if ->ptrace > can be nonzero even if the task is not traced. But I assume this is not > possible? I don't think it is possible. The use of |= here matches this check: /* the same process cannot be attached many times */ if (task->ptrace & PT_PTRACED) goto bad; If you make it =, make the check if (task->ptrace) to match. > And any code which does "if (p->ptrace & PT_TRACED)" could just > do "if (p->ptrace)", right? I believe so. It might have been otherwise in the distant past, perhaps when PT_DTRACE was meaningful. 
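To make the "|=" versus "=" point concrete, here is a minimal userspace sketch, not kernel code. `struct mock_task`, `MOCK_PT_PTRACED` and both `mock_attach_*()` functions are made-up stand-ins, and the -EPERM return is just a choice for the mock. It assumes, as discussed above, that ->ptrace is zero whenever the task is not traced; only under that assumption do the two check/set pairs behave the same.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-in only; not the kernel's definition. */
#define MOCK_PT_PTRACED 0x1u

struct mock_task {
	unsigned int ptrace;	/* assumed invariant: 0 means "not traced" */
};

/* Current style: test the one bit, then OR it in. */
static int mock_attach_or(struct mock_task *task)
{
	assert(task);
	if (task->ptrace & MOCK_PT_PTRACED)
		return -EPERM;	/* the same task cannot be attached twice */
	task->ptrace |= MOCK_PT_PTRACED;
	return 0;
}

/* Proposed style: test the whole word, then assign. */
static int mock_attach_assign(struct mock_task *task)
{
	assert(task);
	if (task->ptrace)
		return -EPERM;
	task->ptrace = MOCK_PT_PTRACED;
	return 0;
}
```

If some other bit could be set while the task is untraced (as may have been the case when PT_DTRACE was meaningful), the second variant would refuse an attach that the first one allows, which is why the check and the assignment have to change together.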
Thanks, Roland From oleg at redhat.com Tue Apr 21 21:48:19 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Tue, 21 Apr 2009 23:48:19 +0200 Subject: ptracee data structures cleanup In-Reply-To: <20090421011354.4B19EFC3C7@magilla.sf.frob.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090416204004.GA28013@redhat.com> <20090416232430.4DAE4FC3C6@magilla.sf.frob.com> <20090420183718.GC32527@redhat.com> <20090421011354.4B19EFC3C7@magilla.sf.frob.com> Message-ID: <20090421214819.GA22845@redhat.com> On 04/20, Roland McGrath wrote: > > > Afaics ptrace_attach() needs this lock only to pin ->mm, no other other > > reasons. ptrace_traceme() doesn't need it at all. > > I'm pretty sure that ->mm check is only meant to exclude kernel threads. > It should check PF_KTHREAD now, Yes. But __ptrace_may_access()->get_dumpable(task->mm) is not safe without task_lock(). This is easy. > > The comment above tracehook_unsafe_exec() says "Called with task_lock()", > > this is wrong, check_unsafe_exec() doesn't take task_lock(). > > This changed upstream without keeping tracehook.h up to date. > Sigh.
Please send a tracehook.h comment fix patch upstream. OK. > > Perhaps it makes sense to do a little cleanup first, introduce > > The clean-up should get rid of PT_DTRACE entirely. Agreed. But this needs another patch... Roland, I have to apologize for the delay again. The first patch (move ->ptrace into struct ptrace_task) is not ready. Will do my best to send it tomorrow. I was distracted by other problems, and then I found that this (trivial) patch becomes really huge. There are 116 places which use task->ptrace directly, most in arch/. So I am going to add static inline void ptrace_set_flag(struct task_struct *task, unsigned flag) { task->ptrace |= flag; } static inline void ptrace_clear_flag(struct task_struct *task, unsigned flag) { task->ptrace &= ~flag; } and convert the code to use task_ptrace(), ptrace_set_flag(), ptrace_clear_flag(). This change can be sent upstream right now. In that case the actual change will be very small. Do you agree with these helpers? Perhaps even ptrace_test_flag() makes sense... Oleg. From oleg at redhat.com Tue Apr 21 22:06:21 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Wed, 22 Apr 2009 00:06:21 +0200 Subject: wait_task_zombie() && EXIT_DEAD problems In-Reply-To: <20090421005132.6A337FC3C7@magilla.sf.frob.com> References: <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> <20090416215007.12DE5FC3C6@magilla.sf.frob.com> <20090417191206.GA22436@redhat.com> <20090420062627.15392FC3C6@magilla.sf.frob.com> <20090420170359.GA32527@redhat.com> <20090421005132.6A337FC3C7@magilla.sf.frob.com> Message-ID: <20090421220621.GB22845@redhat.com> On 04/20, Roland McGrath wrote: > > > Then re-take tasklist for reading and continue the reaping. > > You don't need tasklist_lock again, assuming you did do_notify_parent() > while holding it for write (as done now). Yes, probably you are right.
> > Hmm... looking at the current code in wait_task_zombie() under > > "if (traced)", shouldn't we check !same_thread_group(p->real_parent, current) > > before do_notify_parent() ? > > It's impossible. ptrace_attach() doesn't allow it. Yes, we can't trace the sub-thread. But ptrace_reparented() is true when we trace the sub-thread's natural child. IOW, 2 threads T1 and T2. T2 forks the child C. T1 ptraces C. C dies and becomes EXIT_ZOMBIE. It sends the notification to thread-group. Then, any thread does do_wait(). But since ptrace_reparented() = T we don't release C but send the notification again. This doesn't look right. But the patch I sent was not right. I think we should do - traced = ptrace_reparented(p); + traced = !same_thread_group(parent, real_parent); Or, perhaps better, we should change ptrace_reparented(). Another caller is tracehook_notify_death(), perhaps "other than our normal parent" should mean other process, not thread. Oleg. From roland at redhat.com Wed Apr 22 02:58:09 2009 From: roland at redhat.com (Roland McGrath) Date: Tue, 21 Apr 2009 19:58:09 -0700 (PDT) Subject: wait_task_zombie() && EXIT_DEAD problems In-Reply-To: Oleg Nesterov's message of Wednesday, 22 April 2009 00:06:21 +0200 <20090421220621.GB22845@redhat.com> References: <20090408203954.GA26816@redhat.com> <20090414025820.D5548FC299@magilla.sf.frob.com> <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> <20090416215007.12DE5FC3C6@magilla.sf.frob.com> <20090417191206.GA22436@redhat.com> <20090420062627.15392FC3C6@magilla.sf.frob.com> <20090420170359.GA32527@redhat.com> <20090421005132.6A337FC3C7@magilla.sf.frob.com> <20090421220621.GB22845@redhat.com> Message-ID: <20090422025809.6CD3CFC3C7@magilla.sf.frob.com> > IOW, 2 threads T1 and T2. T2 forks the child C. T1 ptraces C. C dies > and becomes EXIT_ZOMBIE. It sends the notification to thread-group. > > Then, any thread does do_wait(). But since ptrace_reparented() = T > we don't release C but send the notification again. This doesn't > look right. Technically, I think this really is "right". It just seems screwy because, well, the whole ptrace+wait interface is indeed screwy. T1 is the ptracer, and is not the natural parent. Consider that T1 runs a piece of code (library, isolated chunk in a giant complex program) that got just got asked to trace C. It doesn't know anything about C, it just knows that PTRACE_ATTACH worked on it. So, it expects the usual behavior when it does waitpid(C) and gets !WIFSTOPPED: automatic detach, notification of the real parent, and the real parent's waits work. Imagine T2 runs another piece of code that forks and waits for that child, and doesn't know anything else, e.g. it called system().
That code is isolated in the function, and all it expects of the rest of the (unknown) code in the process is that any wait calls are waitpid() selecting only a known child (or are in other threads using __WNOTHREAD, etc.), so nobody will steal its child. These two isolated chunks of code have limiting (and perhaps short-sighted) assumptions. But things work out just right for them. (Naturally they have problems if both calls are in the same thread leaving the child alive in between, but imagine some current application that never does it that way.) Now C dies and the sequence is: C dies -> wake_up_parent T1 wakes up, enters wait loop T2 wakes up, enters wait loop T1 sees C in wait_task_zombie() -> will report, about to untrace it T2 sees C in wait_task_zombie() -> task_ptrace(C) still true, skip it T1 untraces C T2 blocks again til 2nd wake_up_parent If we were to omit the second do_notify_parent() as you suggest, then T2 stays blocked forever instead of reaping C. If we were to change ptrace_reparented() as you contemplate, then even after some other wakeup, T2 would get -ECHILD. Either way, the system call ABI compatibility is broken. It's just not an option, merits of interface choices aside. Note for this case it now works right when both use just __WNOTHREAD, which a caller "trying to be smart about it" might reasonably do. T1 is seeing C on its ->ptraced, and T2 is seeing (skipping) C on its ->children list. When everybody uses __WNOTHREAD, I bet they'd think that ptrace_reparented() losing that distinction is pretty counterintuitive. 
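The T1/T2 sequence above can be walked through with a toy model. This is only a sketch of the ordering argument, not the kernel's wait code: `struct toy_child` and the `toy_*()` functions are invented for illustration, and the model collapses the wait queue into a simple counter of wakeups delivered to the natural parent T2.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_child {
	bool zombie;		/* C has exited */
	bool traced;		/* C is still attached to the tracer T1 */
	bool reaped;		/* the natural parent T2 has collected C */
	int parent_wakeups;	/* wakeups delivered to T2 and not yet consumed */
};

/* C dies: one wake_up_parent. */
static void toy_child_exits(struct toy_child *c)
{
	c->zombie = true;
	c->parent_wakeups++;
}

/*
 * One pass of T2's wait loop.  Without a pending wakeup T2 stays blocked.
 * With one, it looks at C and skips it while C is still traced.
 * Returns true once T2 has reaped C.
 */
static bool toy_parent_wait(struct toy_child *c)
{
	if (c->parent_wakeups == 0)
		return false;
	c->parent_wakeups--;
	if (!c->zombie || c->traced)
		return false;
	c->reaped = true;
	return true;
}

/*
 * T1's wait: report C's death and untrace it.  notify_again models the
 * second do_notify_parent() that the thread is debating.
 */
static void toy_tracer_wait(struct toy_child *c, bool notify_again)
{
	assert(c->zombie && c->traced);
	c->traced = false;
	if (notify_again)
		c->parent_wakeups++;
}
```

Running the sequence with notify_again set lets T2 reap C on its second pass. With it cleared, T2 has already consumed its only wakeup while C was still traced, so in this model it never gets another look at C.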
Thanks, Roland From roland at redhat.com Wed Apr 22 03:22:05 2009 From: roland at redhat.com (Roland McGrath) Date: Tue, 21 Apr 2009 20:22:05 -0700 (PDT) Subject: ptracee data structures cleanup In-Reply-To: Oleg Nesterov's message of Tuesday, 21 April 2009 23:48:19 +0200 <20090421214819.GA22845@redhat.com> References: <20090330185146.D525AFC3AB@magilla.sf.frob.com> <20090408203954.GA26816@redhat.com> <20090416204004.GA28013@redhat.com> <20090416232430.4DAE4FC3C6@magilla.sf.frob.com> <20090420183718.GC32527@redhat.com> <20090421011354.4B19EFC3C7@magilla.sf.frob.com> <20090421214819.GA22845@redhat.com> Message-ID: <20090422032205.B8D39FC3C7@magilla.sf.frob.com> [We have been on fine details here that are quite purely ptrace innards for a while now. I think discussion at this level of detail about this stuff quite far from utrace per se belongs on LKML.] > But __ptrace_may_access()->get_dumpable(task->mm) is not safe without > task_lock(). This is easy. Yes, probably just use ptrace_may_access() and fold them back together. > > The clean-up should get rid of PT_DTRACE entirely. > > Agreed. But this needs another patch... Yes, or several. It always gets fiddly when to get lots of little arch changes merged. The 90% that are just one-liner removal of wholly unused PT_DTRACE can probably go in as a single patch to Linus instead of tiny ones through each arch tree. > Roland, I have to apologize for delay again. The first patch (move ->ptrace > in struct ->ptrace_task) is not ready. Will do my best to send tomorrow. No worries. > I was distracted by other problems, and then I found that this (trivial) patch > becomes really huge. There are 116 places which use task->ptrace directly, > most in arch/. Two things about this. First, I expected ->ptrace would be the most annoying to touch, though maybe I failed to point this out. 
I had suspected it might be easiest to move it last among the cleanups, do the rest of the data structure fields incrementally first (or at least submit it that way). Second, this indicates need for more cleanup rather than just mechanically converting the crufty code cosmetically. Basically, any direct use of ->ptrace in arch code is suspect. This is what tracehook et al are for. We need to look at each of these arch-specific uses and see what they really need. Many are pure cruft like PT_DTRACE that can just be removed (the arch maintainers will say they don't even know why it's there). Others are crufty old boilerplate like: if (!test_thread_flag(TIF_SYSCALL_TRACE)) return; if (!(current->ptrace & PT_PTRACED)) return; Those ->ptrace checks should just be removed. The generic code handles that case (and this cruft doesn't handle the races in it, anyway). In fact, those are examples where the arch just needs to get with it and convert to tracehook_report_syscall_* instead of the old code altogether. But you can leave figuring that out to the arch people, and just remind them of it. Meanwhile, they (or Linus/akpm) can merge the removal of the ->ptrace checks where the ptrace maintainers are saying it's useless. > Do you agree with these helpers? Perhaps even ptrace_test_flag() makes sense... Nope. When we're done with all the clean-up, there should be no place outside of ptrace innards (i.e. ptrace.[ch], exit.c wait code, tracehook.h) that even knows that ->ptrace exists. All the arch cruft makes that a bit of a tall order, and is why I expect the actual move of the ->ptrace field will be one of the very last pieces to get merged in. (Another reason to do all the other clean-up in tiny incremental pieces that we can parallelize and reorder for submission.) 
Thanks, Roland From dsmith at redhat.com Wed Apr 22 20:44:26 2009 From: dsmith at redhat.com (David Smith) Date: Wed, 22 Apr 2009 15:44:26 -0500 Subject: resuming after stop at syscall_entry In-Reply-To: <20090418042722.5B584FC35F@magilla.sf.frob.com> References: <20090418042722.5B584FC35F@magilla.sf.frob.com> Message-ID: <49EF81AA.3060306@redhat.com> Roland McGrath wrote: This processing makes sense I think. It is a bit complicated of course, but not unnecessarily so. I'd like to ask you how this stuff would relate to systemtap (so I've added the systemtap mailing list). I've interspersed a few comments/questions below. ... stuff deleted ... > SYSCALL_ENTRY is unlike all other events. Right after this callback > loop is when the important user-visible stuff happens (the system call). > So we stop immediately there as for the other two. But, if another > engine used UTRACE_STOP and maybe did something asynchronously, like > modifying the syscall argument registers, you get no opportunity to see > what happened. Once all engines lift UTRACE_STOP, the system call runs. ... stuff deleted ... > As explained above, the norm of interacting with other engines and their > use of UTRACE_STOP is to use the final report. When your callback's > action argument includes UTRACE_STOP, you know an earlier engine might > be fiddling before the thread resumes. So, your callback can decide to > return UTRACE_REPORT. That ensures that some report_quiesce (or > report_signal/UTRACE_SIGNAL_REPORT) callback will be made after the > other engine lifts its UTRACE_STOP and before user mode. At that point, > you can see what user register values it might have installed, etc.
In > all events but syscall entry, a final report_quiesce(0) serves this need. > > My proposal is to extend this "resume report" approach to the syscall > entry case. That is, after when some report_syscall_entry returned > UTRACE_STOP so we've stopped, allow for a second reporting pass after > we've been resumed, before running the system call. You'd get this pass > if someone used UTRACE_REPORT. That is, in the first callback loop, one > engine used UTRACE_STOP and another used UTRACE_REPORT. Then when the > first engine used utrace_control() to resume, there would be a second > reporting pass because of the second engine's earlier request. Or, even > if there was just one engine, but it used UTRACE_STOP and then used > utrace_control(UTRACE_REPORT) to resume, then it would get the second > reporting pass. If someone uses UTRACE_STOP+UTRACE_REPORT in that pass, > there would be a third pass, etc. > > What I have in mind is that the second (and however many more) pass > would just be another report_syscall_entry callback to everyone with > UTRACE_EVENT(SYSCALL_ENTRY) set. A flag bit in the action argument says > this is a repeat notification. > > I think this strikes a decent balance of not adding more callbacks and > more arguments to bloat the API in general, while imposing a fairly > simple burden on engines to avoid getting confused by multiple calls. > > A tracing-only engine that just wants to see the syscall that is going > to be done can just do: > > if (utrace_resume_action(action) == UTRACE_STOP) > return UTRACE_REPORT; > > at the top of report_syscall_entry, so it just doesn't think about it > until it thinks the call will go now through. Systemtap currently doesn't support changing syscall arguments, if it does, obviously a few things would need to change. But, I think systemtap would probably fall here - only see the syscall that is actually going to be done. 
So systemtap could possibly get multiple callbacks for the same syscall, but only pay attention to the last one, correct? > Say an engine has a different agenda, just to see what syscall argument > values came in from user mode before someone else changes them. It does: > > if (action & UTRACE_SYSCALL_RESUMED) > return UTRACE_RESUME; > > to ignore the additional callbacks that might come after somebody > decided to stop and report. It just does its work on the first one. > > Here comes Renzo again! He wants to have two or three or nineteen > layers of the first kind of Renzo engine: each one stops at syscall > entry, then resumes after changing some registers. He wants these to > "nest", meaning that after the "outermost" one stops, fiddles, and > resumes, the "next one in" stops, looks at the register as fiddled by > the outermost guy, fiddles in a different way, and resumes, and on and > on. Perhaps the first model (if last guy is stopping, punt to look > again at resume report) works for that. Or perhaps the engine also > needs to keep track with its own state flag it sets whenever it does its > work, and then resets in exit tracing to prepare for next time. ... stuff deleted ... > So, even I can't write that much text and still think this interface > choice is simple to understand. But I kind of think it's around as > simple as it can be for its mandates. I'd appreciate any feedback. This is understandable, but does hurt my head a *little* bit. I think if you put the above full text somewhere and provided some examples this would make sense to people. 
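As one possible example of the two engine styles quoted above, here is a userspace mock-up. It is a sketch only: the `MOCK_*` constants, `mock_resume_action()` and `struct mock_engine` are invented stand-ins whose values are not the real utrace ones, and the "resumed" flag bit is the proposal under discussion, not an existing interface. The callbacks mirror the two quoted snippets: one engine defers until no earlier engine is stopping, the other acts only on the first pass.

```c
#include <assert.h>

/* Mock values for illustration; the real utrace constants differ. */
enum mock_resume {
	MOCK_UTRACE_STOP,
	MOCK_UTRACE_REPORT,
	MOCK_UTRACE_RESUME,
};
#define MOCK_RESUME_MASK	0x0fu
#define MOCK_SYSCALL_RESUMED	0x10u	/* hypothetical "repeat pass" flag bit */

/* Stand-in for utrace_resume_action(): extract the resume action bits. */
static enum mock_resume mock_resume_action(unsigned int action)
{
	return (enum mock_resume)(action & MOCK_RESUME_MASK);
}

struct mock_engine {
	int syscalls_seen;	/* how many times this engine did its work */
};

/* Tracing-only engine: wait until nobody is stopping, then look. */
static unsigned int final_view_entry(struct mock_engine *e, unsigned int action)
{
	assert(e);
	if (mock_resume_action(action) == MOCK_UTRACE_STOP)
		return MOCK_UTRACE_REPORT;	/* ask for another pass after resume */
	e->syscalls_seen++;			/* the call is about to go through */
	return MOCK_UTRACE_RESUME;
}

/* Engine that only wants the values as they came in from user mode. */
static unsigned int first_view_entry(struct mock_engine *e, unsigned int action)
{
	assert(e);
	if (action & MOCK_SYSCALL_RESUMED)
		return MOCK_UTRACE_RESUME;	/* ignore repeat passes */
	e->syscalls_seen++;
	return MOCK_UTRACE_RESUME;
}
```

In this mock-up, a first pass with a stop requested followed by a resumed pass makes each engine do its work exactly once: the first-view engine on the first pass, the final-view engine on the second. That matches the reading that a systemtap-style tracer would only pay attention to the last callback for a given syscall.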
-- David Smith dsmith at redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) From oleg at redhat.com Wed Apr 22 22:34:45 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Thu, 23 Apr 2009 00:34:45 +0200 Subject: wait_task_zombie() && EXIT_DEAD problems In-Reply-To: <20090422025809.6CD3CFC3C7@magilla.sf.frob.com> References: <20090415173623.GA22108@redhat.com> <20090416083816.7AA49FC3C6@magilla.sf.frob.com> <20090416181741.GA21907@redhat.com> <20090416215007.12DE5FC3C6@magilla.sf.frob.com> <20090417191206.GA22436@redhat.com> <20090420062627.15392FC3C6@magilla.sf.frob.com> <20090420170359.GA32527@redhat.com> <20090421005132.6A337FC3C7@magilla.sf.frob.com> <20090421220621.GB22845@redhat.com> <20090422025809.6CD3CFC3C7@magilla.sf.frob.com> Message-ID: <20090422223445.GA24014@redhat.com> On 04/21, Roland McGrath wrote: > > > IOW, 2 threads T1 and T2. T2 forks the child C. T1 ptraces C. C dies > > and becomes EXIT_ZOMBIE. It sends the notification to thread-group. > > > > Then, any thread does do_wait(). But since ptrace_reparented() = T > > we don't release C but send the notification again. This doesn't > > look right. > > Technically, I think this really is "right". It just seems screwy because, > well, the whole ptrace+wait interface is indeed screwy. > > T1 is the ptracer, and is not the natural parent.
Consider that T1 runs a > piece of code (library, isolated chunk in a giant complex program) that got > just got asked to trace C. It doesn't know anything about C, it just knows > that PTRACE_ATTACH worked on it. So, it expects the usual behavior when it > does waitpid(C) and gets !WIFSTOPPED: automatic detach, notification of the > real parent, and the real parent's waits work. > > Imagine T2 runs another piece of code that forks and waits for that child, > and doesn't know anything else, e.g. it called system(). That code is > isolated in the function, and all it expects of the rest of the (unknown) > code in the process is that any wait calls are waitpid() selecting only a > known child (or are in other threads using __WNOTHREAD, etc.), so nobody > will steal its child. > > These two isolated chunks of code have limiting (and perhaps short-sighted) > assumptions. But things work out just right for them. (Naturally they > have problems if both calls are in the same thread leaving the child alive > in between, but imagine some current application that never does it that way.) > > Now C dies and the sequence is: > > C dies -> wake_up_parent > T1 wakes up, enters wait loop > T2 wakes up, enters wait loop > T1 sees C in wait_task_zombie() -> will report, about to untrace it > T2 sees C in wait_task_zombie() -> task_ptrace(C) still true, skip it > T1 untraces C > T2 blocks again til 2nd wake_up_parent > > If we were to omit the second do_notify_parent() as you suggest, then T2 > stays blocked forever instead of reaping C. > > If we were to change ptrace_reparented() as you contemplate, then even > after some other wakeup, T2 would get -ECHILD. > > Either way, the system call ABI compatibility is broken. > It's just not an option, merits of interface choices aside. > > Note for this case it now works right when both use just __WNOTHREAD, which > a caller "trying to be smart about it" might reasonably do. 
T1 is seeing C > on its ->ptraced, and T2 is seeing (skipping) C on its ->children list. > When everybody uses __WNOTHREAD, I bet they'd think that ptrace_reparented() > losing that distinction is pretty counterintuitive. OK, I see. Thanks! Oleg.
From roland at redhat.com Sat Apr 25 00:58:37 2009 From: roland at redhat.com (Roland McGrath) Date: Fri, 24 Apr 2009 17:58:37 -0700 (PDT) Subject: resuming after stop at syscall_entry In-Reply-To: David Smith's message of Wednesday, 22 April 2009 15:44:26 -0500 <49EF81AA.3060306@redhat.com> References: <20090418042722.5B584FC35F@magilla.sf.frob.com> <49EF81AA.3060306@redhat.com> Message-ID: <20090425005837.44FB8FC262@magilla.sf.frob.com> > This processing makes sense I think. It is a bit complicated of course, > but not unnecessarily so. Glad to hear it!
> > A tracing-only engine that just wants to see the syscall that is going > > to be done can just do: > > > > if (utrace_resume_action(action) == UTRACE_STOP) > > return UTRACE_REPORT; > > > > at the top of report_syscall_entry, so it just doesn't think about it > > until it thinks the call will go now through. > > Systemtap currently doesn't support changing syscall arguments, if it > does, obviously a few things would need to change. > > But, I think systemtap would probably fall here - only see the syscall > that is actually going to be done. So systemtap could possibly get > multiple callbacks for the same syscall, but only pay attention to the > last one, correct? Correct. The advice quoted above is what its callbacks would do to ignore the callbacks before the last one. Note that you'll only be sure you're seeing "actually going to be done" state if yours is the "first" engine attached. (Thus, by the new special case calling order, its will be the last report_syscall_entry callback to run.) This is just the general "engine priority" thing, not anything new. In cases like ptrace and kmview (Renzo's thing), even if these engines are first (i.e. called after yours), you will still be seeing the "final" state because they did their changes asynchronously before resuming. But some other engine might do its changes directly in its own callback instead (whether it used UTRACE_STOP and got a repeat callback, or just on the first time through without stopping), so those changes would happen only after your "last" callback. In the same vein, "earlier" engines (i.e. here called after yours) might use UTRACE_STOP after your first callback had every reason to believe it was the "last" one (i.e. that if did not hit). In that case, you will get a repeat call (with UTRACE_SYSCALL_RESUMED flag). On that call, you need to cope with the fact that you already did your entry tracing work before (but now things may have changed). 
If the theory is that you want to respect your place in the engine order, whatever that is (i.e., if your tracing just reported a lie, it was the lie you were supposed to believe), then "coping" just means ignoring the repeat. (This is no different in kind from an "earlier" engine/later callback changing the registers after your callback and never stopping.) For that you need to keep track of whether you already handled it or not. (Depending on your relative order and the actions of the other engines, you might get either UTRACE_STOP or UTRACE_SYSCALL_RESUMED either before or after "you handled it". So you can't use those alone.) You can do this in two ways. One is to use your own per-thread state (engine->data, etc.). The other is to disable the SYSCALL_ENTRY event when you've handled it, so you won't get more callbacks. Then you can re-enable the event in your report_syscall_exit callback (or report_quiesce/report_signal, or whatever is most convenient to be sure you'll run before it goes back to user mode). i.e., use utrace_set_events() from the callbacks. > This is understandable, but does hurt my head a *little* bit. I think > if you put the above full text somewhere and provided some examples this > would make sense to people. The utrace-syscall-resumed branch puts this in the kerneldoc text for struct utrace_engine_ops (where callback return values and common arguments are described): * When %UTRACE_STOP is used in @report_syscall_entry, then @task + * stops before attempting the system call. In this case, another + * @report_syscall_entry callback follows after @task resumes; in a + * second or later callback, %UTRACE_SYSCALL_RESUMED is set in the + * @action argument to indicate a repeat callback still waiting to + * attempt the same system call invocation. This repeat callback + * gives each engine an opportunity to reexamine registers another + * engine might have changed while @task was held in %UTRACE_STOP. 
+ * + * In other cases, the resume action does not take effect until @task + * is ready to check for signals and return to user mode. If there + * are more callbacks to be made, the last round of calls determines + * the final action. A @report_quiesce callback with @event zero, or + * a @report_signal callback, will always be the last one made before + * @task resumes. Only %UTRACE_STOP is "sticky"--if @engine returned + * %UTRACE_STOP then @task stays stopped unless @engine returns + * different from a following callback. I don't know where the longer explanation and/or examples belong. Perhaps in a new section in utrace.tmpl? We could start with putting together some text on the wiki. Another idea is to add a few example modules in samples/utrace/. Those can illustrate things with good comments, and also could be built verbatim to load multiple ones/instances in different orders and demonstrate what happens, etc. It would be nice to have folks like you and Renzo work up this text and/or examples. What's needed is stuff that makes sense to you guys as users of the API, rather than what makes sense to me who has thought too much already about all this stuff. Thanks, Roland
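[Editorial sketch, not part of the original message.] The bookkeeping described in the message above (hold off while a callback earlier in the round has asked for a stop, trace the syscall once, then ignore repeat callbacks until syscall exit) can be modelled in plain C. This is a user-space model under stated assumptions, not utrace code: every name (model_syscall_entry, model_syscall_exit, MODEL_STOP, MODEL_SYSCALL_RESUMED, struct model_thread_state) is hypothetical and the constant values are arbitrary. In a real module the state would presumably hang off engine->data.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the utrace resume actions; values are arbitrary. */
enum model_resume { MODEL_STOP = 0, MODEL_REPORT = 1, MODEL_RESUME = 2 };

#define MODEL_RESUME_MASK     0x0fu  /* low bits carry the resume action */
#define MODEL_SYSCALL_RESUMED 0x10u  /* flag: this is a repeat callback */

/* Per-thread state the engine keeps for itself (assumed layout). */
struct model_thread_state {
    bool handled;     /* entry tracing already done for this syscall */
    unsigned traced;  /* number of syscalls traced so far */
};

static unsigned model_resume_action(unsigned action)
{
    return action & MODEL_RESUME_MASK;
}

/* Model of a report_syscall_entry callback for a tracing-only engine. */
unsigned model_syscall_entry(unsigned action, struct model_thread_state *st)
{
    if (st->handled)
        return MODEL_RESUME;   /* repeat callback: work already done */

    if (model_resume_action(action) == MODEL_STOP)
        return MODEL_REPORT;   /* someone wants a stop: look again later */

    st->handled = true;        /* first real look at this syscall */
    st->traced++;
    return MODEL_RESUME;
}

/* Model of a report_syscall_exit callback: re-arm for the next syscall. */
void model_syscall_exit(struct model_thread_state *st)
{
    st->handled = false;
}
```

The flag alone is not consulted, which matches the point that UTRACE_STOP and UTRACE_SYSCALL_RESUMED cannot by themselves tell an engine whether it has already handled the call. The other option mentioned, disabling the SYSCALL_ENTRY event and re-enabling it at exit, would replace the handled flag with calls to utrace_set_events().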
From renzo at cs.unibo.it Sat Apr 25 12:16:36 2009 From: renzo at cs.unibo.it (Renzo Davoli) Date: Sat, 25 Apr 2009 14:16:36 +0200 Subject: resuming after stop at syscall_entry In-Reply-To: <20090418042722.5B584FC35F@magilla.sf.frob.com> References: <20090418042722.5B584FC35F@magilla.sf.frob.com> Message-ID: <20090425121636.GC18153@cs.unibo.it> > Enter Renzo Davoli. Here I am! I have spent my time testing the latest version and trying to figure out how to implement "nested Renzo's engines" with the support you propose. Comments on the latest version of utrace: ------------------------------------- 1- syscall_entry report reversed. wonderful, thank you. Now kmview.ko runs on vanilla utrace provided KMVIEW_NEWSTOP is defined. KMVIEW_NEWSTOP stops the process inside the syscall report function so it is a undesirable workaround, not a solution.
Anyway this can be used as a proof-of-concept: the problem related to the order of callbacks for syscall_entry is solved.
-------------------------------------
2- utrace_control(.., UTRACE_RESUME) can arrive too early, before ENGINE_STOP is set (in engine->flags by mark_engine_wants_stop).
Let us name p the traced process and vm the tracer.
t=10: p reports a system call.
      during the report function, p communicates with vm
      the report function returns UTRACE_STOP
      utrace is unlocked during the report function.
t=20: p records its need to stop:
      (lock) engine->flags |= ENGINE_STOP; (unlock)
later (time t' > 10) vm calls utrace_control(p, engine, ENGINE_RESUME):
if t' < 20 the request gets lost! In fact:
t=15: utrace_control gets the lock
      resume=utrace->stopped IS ZERO!
      clear_engine_wants_stopped clears ENGINE_STOP which has not been set yet
at t=20 ENGINE_STOP is set and the task is blocked.
There are two "clean", "non-baroque" approaches to solve this problem:
2A- interface approach: a long time ago utrace had a utrace_set_flags call to set the ENGINE_STOP flag before p communicates with vm. In this way ENGINE_STOP will always be cleared after it has been set.
2B- implementation approach: use two bits, ENGINE_STOP and ENGINE_RESUME. Before t=10 ENGINE_STOP and ENGINE_RESUME are unset. utrace_control(p, engine, UTRACE_RESUME) must set ENGINE_RESUME and clear ENGINE_STOP. At t=20 p can check if there has been a fast resume request. In this case ENGINE_STOP is not set.
It is possible to create other workarounds, barriers, fake reports, busy wait loops... If we want something effective, we must implement solutions, not workarounds. If an engine says UTRACE_STOP and later UTRACE_RESUME, the task must be resumed. The simpler, the better. My patch in: http://view-os.svn.sourceforge.net/viewvc/view-os/trunk/kmview-kernel-module/kernel_patches/linux-2.6.29-patch1?revision=637&view=markup implements 2B and works with the latest utrace implementation.
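[Editorial sketch, not part of the original message.] The lost-resume ordering and the two-bit idea (2B) described above can be replayed in a small single-threaded C model. It is only a model under stated assumptions: the function names, the flag values, and the choice to reset both bits at the start of each report are all hypothetical, the locking is left out, and none of this is taken from the linked patch or from utrace itself.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the values are arbitrary. */
#define MODEL_ENGINE_STOP   0x1u
#define MODEL_ENGINE_RESUME 0x2u

/* One-bit scheme: a resume request can only clear the stop bit. */
void onebit_resume(unsigned *flags)
{
    *flags &= ~MODEL_ENGINE_STOP;
}

/* One-bit scheme: the traced task records its stop; returns true if it blocks. */
bool onebit_apply_stop(unsigned *flags)
{
    *flags |= MODEL_ENGINE_STOP;
    return true;
}

/* Two-bit scheme: both bits are clear when a report starts (assumed reset point). */
void twobit_begin_report(unsigned *flags)
{
    *flags &= ~(MODEL_ENGINE_STOP | MODEL_ENGINE_RESUME);
}

/* Two-bit scheme: a resume request is remembered even if it arrives early. */
void twobit_resume(unsigned *flags)
{
    *flags |= MODEL_ENGINE_RESUME;
    *flags &= ~MODEL_ENGINE_STOP;
}

/* Two-bit scheme: the task checks for an early resume before it blocks. */
bool twobit_apply_stop(unsigned *flags)
{
    if (*flags & MODEL_ENGINE_RESUME) {
        *flags &= ~MODEL_ENGINE_RESUME;  /* consume the early request */
        return false;                    /* do not block */
    }
    *flags |= MODEL_ENGINE_STOP;
    return true;
}
```

Calling onebit_resume() before onebit_apply_stop() reproduces the t=15/t=20 ordering: the task still blocks. With the two-bit variant the same ordering leaves the task running, and the ordinary stop-then-resume order still works.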
---------------------- Comments on the proposal.
Roland, let me say frankly that the repeated report scan for system calls is just a step towards a solution, but I do not like it so much.
Problem #1: when each engine receives the same syscall_entry report several times, each engine must discover whether:
- a previous engine has already stopped this task (utrace_resume_action(action) == UTRACE_STOP)
- this is a repeated scan and the current engine has already processed this report (there is a risk of processing it twice)
- this is a real new report
Maybe I can keep the address of the engine which stopped the task somewhere (say in a task-private variable stopengine). During the repeated scan:
- if stopengine is NULL, this is a fresh call.
- else (stopengine != NULL) the current engine has already processed this report
- if stopengine == this engine then set stopengine to NULL.
A more portable approach follows (*): each engine records if it stopped the task. During the repeated scan:
- if ! (action & UTRACE_SYSCALL_RESUMED) this is a fresh call
- else the current engine has already processed this report
- if this engine stopped the task then clear UTRACE_SYSCALL_RESUMED in the action returned.
This is not a nice solution: this "protocol" must be consistently applied by all the modules using utrace, otherwise they cannot interoperate. If a report_syscall_entry does not behave in the same way it may receive repeated reports or force other engines to skip some reports. All the programmers of utrace modules would always have to agree on these details: not a good interface for long-term interoperability.
Problem #2: syscall exit may need to modify the return value/errno. The need for stop&go at each engine applies not only to syscall_entry. I really do not understand why it is so unacceptable to have a UTRACE_STOP_NOW tag to stop a process *before* reporting to the next engine. The interface would be clean, and interoperability between tracing and virtualizing guaranteed.
It is not a matter of performance. If your engine needs to see the system call that is going to be done by the kernel, as you say:
if (utrace_resume_action(action) == UTRACE_STOP) return UTRACE_REPORT;
it has to wait for all the virtualizers to have done their job anyway. On the other hand, this code cannot be used if you want to test which system call appears to be done after the third virtualizer and before the fourth. If you want to see the syscall arguments before someone else changes the syscall argument values, you propose:
if (action & UTRACE_SYSCALL_RESUMED) return UTRACE_RESUME;
This simply does not work: either this is the last engine inserted, or some other engine may have already changed the arguments, or may be changing them concurrently; the result is unpredictable in this latter case. The code is also fooled by the reset of UTRACE_SYSCALL_RESUMED as in (*) above. If you want to see the syscall arguments before someone else changes them, simply insert the engine as the last one. UTRACE_STOP_NOW is a general approach: it is possible to see the syscall arguments before the fourth virtualizer changes them but after the third virtualizer has already done its work. Roland, in my opinion you are so concerned with solving the problem of supporting the very first and very last engine that you are not seeing the problem of supporting the fifth or fiftieth. If hundreds of debuggers run concurrently, they return UTRACE_STOP; if a virtualizer must be certain of what happens before and after it, it uses UTRACE_STOP_NOW. In this way utrace provides support for interoperability between different modules, and there is no need for programmers to share the same protocol for dealing with nested engines. ciao.
renzo From v.simonian at comcast.net Sat Apr 25 18:25:21 2009 From: v.simonian at comcast.net (Roni Simonian) Date: Sat, 25 Apr 2009 11:25:21 -0700 Subject: question - inconsistent results from ptrace Message-ID: <49F35591.2010400@comcast.net> Hello everybody, I noticed a very strange phenomenon - not sure if it's caused by ptrace, but I cannot understand why is it happening... Here is a simple example (the code is attached below): trace a very simple process (even "Hello World!" will do) with PTRACE_SINGLESTEP and count # of steps. The number of steps varies from run to run! For example, when tracing "echo foo": $ ./test echo foo foo instr = 138191 $ ./test echo foo foo instr = 138191 $ ./test echo foo foo instr = 138184 $ ./test echo foo foo instr = 138188 $ ./test echo foo foo instr = 138184 and so on... It seems to happen only when tracee is 32 bit and is dynamically linked. For 64 bit and/or statically linked tracees the results are always consistent. It also seems that the variation occurs at the early stage, prior to the first tracee's system call. Apparently, it is system-dependent: I see it on my both Fedora systems (2.6.27.21-78.2.41.fc9.x86_64 and 2.6.27.12-170.2.5.fc10.i686) but not on Ubuntu (Linux 2.6.24-23-generic). Turning off address space randomization did not help. Is this an expected behavior? Thank you in advance! Best, Roni. -------------- next part -------------- A non-text attachment was scrubbed... Name: test.cc Type: text/x-c++src Size: 877 bytes Desc: not available URL: From cocanlucian at gmail.com Sat Apr 25 23:53:31 2009 From: cocanlucian at gmail.com (Cocan Lucian) Date: Sun, 26 Apr 2009 02:53:31 +0300 Subject: Questions regarding utrace Message-ID: <99b6f10904251653j25ec75d5u5cc1aecfc22d5e0c@mail.gmail.com> Hello everyone! I am currently working on a project for school that records and replays a process in linux and I am using utrace to achieve this (I am using the 2.6.29.1 kernel with the utrace patch from 18 April). 
There will be two main parts: the recording part and the replaying part. One of the recording parts consists in capturing all the system calls that the process that needs to be "replayed" does. This works ok (I am attaching a tracing engine to the thread, make it sensible to the sys_call_entry and sys_call_exit events, register the two handler functions and log all the data that I am interested in). When I do the replaying, I don't want to execute some of the system calls, I will give to the process the values that I have stored while doing the recording. The problem is that I don't really know how to do this using utrace.The scenario that I want to implement is like this: the traced process makes a syscall -> the syscall_entry callback is hit -> read the syscall_nr -> while in syscall_entry callback decide if I want to run or abort this syscall (how can I abort while in this callback the syscall? is it possible? can UTRACE_SYSCALL_ABORT be used in order to achieve this? if yes, what's the right way of doing it?)-> if aborted, then go to syscall_exit -> read the value from the log -> set the return value of that syscall. I am now stuck here and any help would be appreciated because this is the only place where I can ask for help regarding utrace. A wiki page (as Roland suggested) with more documentation and some more examples would be great to have. Have a nice day everyone! Best regards, Lucian.
From hch at lst.de Sun Apr 26 09:43:14 2009 From: hch at lst.de (Christoph Hellwig) Date: Sun, 26 Apr 2009 11:43:14 +0200 Subject: LF Collab Summit tracing round table action item In-Reply-To: <49E37307.10108@oracle.com> References: <20090413151521.GA16288@lst.de> <49E37307.10108@oracle.com> Message-ID: <20090426094314.GA10194@lst.de> These are the notes and action items for the Tracing roundtable at the Linux Foundation Collaboration summit, April 8-10 in San Francisco. Attendes: Renzo Davoli, Mathieu Desnoyers, Jake Edge, Frank Ch.
Eigler, Christoph Hellwig, Masami Hiramatsu, Jim Keniston, Roland McGrath, Steven Rostedt, Josh Stone, Elena Zannoni Kernel Tracing items: - make DEFINE_TRACE work in modules (Steve) - investigate markers removal (Christoph, Matthew) - the 25 magic google tracpoints (Matthew) - make the two major tracepoint implementations interchangeable at the API level (Matthew, Steve) - get djprobes and the instruction decoder upstream (Masami) Utrace and userspace probing: - get arm and mips converted to regsets and uprobes, set a cut off date for others (Christoph, Roland) - more ptrace cleanups to prepare for utrace (Oleg) - in-kernel gdb server for debugging userspace (Frank) - get uprobes upstream piecemail, including backing the gdbserver (Jim) From roland at redhat.com Sun Apr 26 20:59:45 2009 From: roland at redhat.com (Roland McGrath) Date: Sun, 26 Apr 2009 13:59:45 -0700 (PDT) Subject: UTRACE_RESUME vs UTRACE_STOP races In-Reply-To: Renzo Davoli's message of Saturday, 25 April 2009 14:16:36 +0200 <20090425121636.GC18153@cs.unibo.it> References: <20090418042722.5B584FC35F@magilla.sf.frob.com> <20090425121636.GC18153@cs.unibo.it> Message-ID: <20090426205945.8872FFC3B6@magilla.sf.frob.com> [I'm separating different issues into different reply threads.] > 2- utrace_control(.., UTRACE_RESUME) can arrive too early, before > ENGINE_STOP is set (in engine->flags by mark_engine_wants_stop).
I believe I already addressed this. Again, I've edited your explanation to talk about the semantics of the API and not intermix this with utrace internal implementation details. > Let us name p the traced process and vm the tracer. > t=10: p reports a system call. > during the report function, p communicates with vm > the report function returns UTRACE_STOP [implementation detail removed] > later (time t' > 10) vm calls utrace_control(p, engine, ENGINE_RESUME): The only way your code is meaningful is if the "vm" thread of control does something to establish its ordering with respect to the "p" thread's actions, in particular the utrace_control() call vs the "return UTRACE_STOP". There is a step (not shown) that is vm's action that completes its end of "p communicates with vm". I think you mean to imply that this happens before "vm" calls utrace_control(). Do this sequence in "vm": communicate with "p" ("p" will start towards "return UTRACE_STOP") utrace_barrier(p, engine) utrace_control(p, engine, UTRACE_RESUME) The utrace_barrier() call establishes that the "p" callback, its return, and the application of UTRACE_STOP to the engine state, are ordered before the "vm" call to utrace_control(). "Baroque" is in the eye of the beholder. Your API changes do not fare well in my eye. Does this method solve your problem in practice, or not? (Use current utrace, not past experiences which precede some relevant fixes.) Thanks, Roland From roland at redhat.com Sun Apr 26 21:24:04 2009 From: roland at redhat.com (Roland McGrath) Date: Sun, 26 Apr 2009 14:24:04 -0700 (PDT) Subject: question - inconsistent results from ptrace In-Reply-To: Roni Simonian's message of Saturday, 25 April 2009 11:25:21 -0700 <49F35591.2010400@comcast.net> References: <49F35591.2010400@comcast.net> Message-ID: <20090426212404.91970FC3B6@magilla.sf.frob.com> I can't reproduce any such variation using 2.6.27.21-170.2.56.fc10.x86_64 myself.
Off hand, it seems more likely there is some authentic variation between runs for whatever reason than that this has something to do with ptrace. Have you tried making your program look at the tracee's PC every time (use PTRACE_GETREGS or PTRACE_PEEKUSR) and emit it so you can compare two differing runs a little more meaningfully? Thanks, Roland From roland at redhat.com Sun Apr 26 21:38:44 2009 From: roland at redhat.com (Roland McGrath) Date: Sun, 26 Apr 2009 14:38:44 -0700 (PDT) Subject: Questions regarding utrace In-Reply-To: Cocan Lucian's message of Sunday, 26 April 2009 02:53:31 +0300 <99b6f10904251653j25ec75d5u5cc1aecfc22d5e0c@mail.gmail.com> References: <99b6f10904251653j25ec75d5u5cc1aecfc22d5e0c@mail.gmail.com> Message-ID: <20090426213844.0BFC4FC3B6@magilla.sf.frob.com> I'm very glad to hear about your project using utrace! Please feel free to start some new pages on the utrace wiki about your project or showing examples you are interested in seeing. If you have not been already, you may want to follow the discussions on this list about changes to the report_syscall_entry protocol. (But those are finer details than the basic model of use that you are asking about, so don't let them be a distraction. You can get your prototype functioning before you worry about those arcane corners.) What you want to do should be relatively straightforward. All you have to do to skip the system call is make your report_syscall_entry callback return UTRACE_RESUME | UTRACE_SYSCALL_ABORT. You can record in your own data structures (whatever you hang off engine->data) that you've made this decision, if that's useful. Then you will (almost immediately) get the report_syscall_exit callback after the system call has been skipped. In this function, you can use syscall_set_error() from (see the DocBook documentation). You can also set user memory as necessary, or change registers directly to recorded values, whatever you need to do to "replay" a call's effects. 
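[Editorial sketch, not part of the original message.] The skip-and-replay flow described above (decide at syscall entry to abort the call, then supply the recorded result at syscall exit) can be modelled in user-space C. This is a sketch under stated assumptions: struct replay_entry, struct replay_state, the two functions, and the constant values are all hypothetical, and a plain long stands in for the register that a real module would update through the syscall accessor helpers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the utrace return bits; values are arbitrary. */
#define MODEL_RESUME        0x02u
#define MODEL_SYSCALL_ABORT 0x20u

/* One recorded system call: its number and the value it returned. */
struct replay_entry {
    long nr;
    long retval;
};

/* Per-thread replay state (what might hang off engine->data). */
struct replay_state {
    const struct replay_entry *log;  /* recorded calls, in order */
    size_t len;                      /* number of entries in log */
    size_t pos;                      /* next entry to replay */
    bool aborted;                    /* current call was skipped at entry */
};

/* Entry side: skip the call if it matches the next recorded one. */
unsigned replay_syscall_entry(struct replay_state *st, long nr)
{
    if (st->pos < st->len && st->log[st->pos].nr == nr) {
        st->aborted = true;
        return MODEL_RESUME | MODEL_SYSCALL_ABORT;
    }
    return MODEL_RESUME;             /* not in the log: let it run */
}

/* Exit side: for a skipped call, install the recorded return value. */
void replay_syscall_exit(struct replay_state *st, long *retval_reg)
{
    if (!st->aborted)
        return;                      /* the call really ran; leave it alone */
    *retval_reg = st->log[st->pos].retval;
    st->pos++;
    st->aborted = false;
}
```

Matching only on the syscall number is a simplification; a real recorder would likely compare arguments too and also replay any user memory the call wrote.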
Thanks, Roland From v.simonian at comcast.net Sun Apr 26 21:49:05 2009 From: v.simonian at comcast.net (Roni Simonian) Date: Sun, 26 Apr 2009 14:49:05 -0700 Subject: question - inconsistent results from ptrace In-Reply-To: <20090426212404.91970FC3B6@magilla.sf.frob.com> References: <49F35591.2010400@comcast.net> <20090426212404.91970FC3B6@magilla.sf.frob.com> Message-ID: <49F4D6D1.8020404@comcast.net> Thank you for the response, Roland! > Off hand, it seems more likely there is some authentic variation > between runs for whatever reason than that this has something to do with > ptrace. > I agree, it most likely has to do with the process itself. I am puzzled because something as simple as "hello world!" does not have any asynchronous events and should be totally reproducible. > Have you tried making your program look at the tracee's PC every time (use > PTRACE_GETREGS or PTRACE_PEEKUSR) and emit it so you can compare two differing > runs a little more meaningfully? > I'll do as you suggest and will let you know the results. Talk to you soon, Roni.
From v.simonian at comcast.net Mon Apr 27 05:59:54 2009 From: v.simonian at comcast.net (Roni Simonian) Date: Sun, 26 Apr 2009 22:59:54 -0700 Subject: question - inconsistent results from ptrace In-Reply-To: <49F4D6D1.8020404@comcast.net> References: <49F35591.2010400@comcast.net> <20090426212404.91970FC3B6@magilla.sf.frob.com> <49F4D6D1.8020404@comcast.net> Message-ID: <49F549DA.2090802@comcast.net> Hi Roland, I followed your advice and looked at the registers. Here is what I found: The function that behaves inconsistently is _dl_start in ld-2.8.so. Most of the time the first variation in the flow occurs as early as 296 instructions down the road, namely at the jump 30b3b0: 0f 86 d7 fd ff ff jbe 30b18d <_dl_start+0x22d> but sometimes later. ( I am attaching disassembled _dl_start). And yes, the registers do differ at this point, but so they should, considering all these "rdtsc" - or am I missing something? Thanks! Roni. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: _dl_start URL: From roland at redhat.com Mon Apr 27 17:38:22 2009 From: roland at redhat.com (Roland McGrath) Date: Mon, 27 Apr 2009 10:38:22 -0700 (PDT) Subject: question - inconsistent results from ptrace In-Reply-To: Roni Simonian's message of Sunday, 26 April 2009 22:59:54 -0700 <49F549DA.2090802@comcast.net> References: <49F35591.2010400@comcast.net> <20090426212404.91970FC3B6@magilla.sf.frob.com> <49F4D6D1.8020404@comcast.net> <49F549DA.2090802@comcast.net> Message-ID: <20090427173822.3AFA3FC3BF@magilla.sf.frob.com> > I followed your advice and looked at the registers. Here is what I found: > > The function that behaves inconsistently is _dl_start in ld-2.8.so. Most > of the time the first variation in the flow occurs as early as 296 > instructions down the road, namely at the jump > > 30b3b0: 0f 86 d7 fd ff ff jbe 30b18d <_dl_start+0x22d> > > but sometimes later. ( I am attaching disassembled _dl_start).
And yes, > the registers do differ at this point, but so they should, considering > all these "rdtsc" - or am I missing something? Yes, that is some normal variation. That is a loop of 5 iterations that does "rdtsc" twice in a row to see the difference between the counts. It uses a "min" calculation to find the smallest of those five differences. The < test in that calculation is what varies in your runs. It varies because the count of cycles between two "rdtsc" instructions varies. (That's why there is this loop of 5 iterations to sample it.) Indeed, this has absolutely nothing to do with ptrace (let alone utrace). You lucked out in that there happens to be someone on this list who knows all about glibc (me). But this is really not the place to discuss arcane implementation choices in glibc. Thanks, Roland From v.simonian at comcast.net Mon Apr 27 18:11:03 2009 From: v.simonian at comcast.net (Roni Simonian) Date: Mon, 27 Apr 2009 11:11:03 -0700 Subject: question - inconsistent results from ptrace In-Reply-To: <20090427173822.3AFA3FC3BF@magilla.sf.frob.com> References: <49F35591.2010400@comcast.net> <20090426212404.91970FC3B6@magilla.sf.frob.com> <49F4D6D1.8020404@comcast.net> <49F549DA.2090802@comcast.net> <20090427173822.3AFA3FC3BF@magilla.sf.frob.com> Message-ID: <49F5F537.5080008@comcast.net> Thank you so much, Roland! > But this is really not the place to discuss arcane > implementation choices in glibc. > Sorry for posting the question in the wrong place... I thought that the problem could have something to do with ptrace or with me using it incorrectly. And I could not find any other ptrace-related group. I really appreciate your help. Thank you and best regards, Roni.
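The sampling loop Roland describes in this thread (read the counter twice back to back, five times, and keep the smallest difference) can be sketched in C as follows. This is a minimal sketch of the technique, not glibc's code: the counter is passed in as a function pointer so the example does not depend on rdtsc, and counter_fn, min_back_to_back_delta, and fake_counter are hypothetical names with scripted readings chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical counter source; on x86 this role is played by rdtsc. */
typedef uint64_t (*counter_fn)(void);

/* Read the counter twice in a row, `rounds` times, and return the
 * smallest difference observed. */
static uint64_t min_back_to_back_delta(counter_fn read_counter, int rounds)
{
    uint64_t best = UINT64_MAX;

    for (int i = 0; i < rounds; i++) {
        uint64_t t1 = read_counter();
        uint64_t t2 = read_counter();
        uint64_t delta = t2 - t1;

        /* This comparison is the data-dependent branch: whether it is
         * taken depends on how many cycles passed between the reads. */
        if (delta < best)
            best = delta;
    }
    return best;
}

/* Deterministic fake counter for demonstration: scripted readings whose
 * pairwise differences are 30, 22, 41, 25 and 28. */
static uint64_t fake_counter(void)
{
    static const uint64_t readings[] = {
        100, 130, 200, 222, 300, 341, 400, 425, 500, 528
    };
    static unsigned next;

    return readings[next++ % 10];
}
```

With a real cycle counter the differences change from run to run, so the branch inside the loop is taken on different iterations. That is enough to make instruction-by-instruction traces of two runs diverge without ptrace being involved.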
From ananth at in.ibm.com Tue Apr 28 05:39:57 2009 From: ananth at in.ibm.com (Ananth N Mavinakayanahalli) Date: Tue, 28 Apr 2009 11:09:57 +0530 Subject: [PATCH] Clarify UTRACE_ATTACH_EXCLUSIVE a bit more Message-ID: <20090428053957.GA1531@in.ibm.com> More than one user has hit the -EEXIST problem when using utrace_attach_task and UTRACE_ATTACH_EXCLUSIVE without UTRACE_ATTACH_MATCH_DATA|_OPS. Document that a bit more. Signed-off-by: Ananth N Mavinakayanahalli --- kernel/utrace.c | 4 ++++ 1 file changed, 4 insertions(+) Index: utrace-28apr/kernel/utrace.c =================================================================== --- utrace-28apr.orig/kernel/utrace.c +++ utrace-28apr/kernel/utrace.c @@ -214,6 +214,10 @@ static int utrace_add_engine(struct task * * UTRACE_ATTACH_MATCH_OPS: Only consider engines matching @ops. * UTRACE_ATTACH_MATCH_DATA: Only consider engines matching @data. + * + * Using UTRACE_ATTACH_EXCLUSIVE without either of UTRACE_ATTACH_MATCH_OPS + * or UTRACE_ATTACH_MATCH_DATA will result in a match for any existing + * engine for the task, causing an -%EEXIST return. */ struct utrace_engine *utrace_attach_task( struct task_struct *target, int flags, From dsmith at redhat.com Tue Apr 28 15:17:58 2009 From: dsmith at redhat.com (David Smith) Date: Tue, 28 Apr 2009 10:17:58 -0500 Subject: resuming after stop at syscall_entry In-Reply-To: <20090425005837.44FB8FC262@magilla.sf.frob.com> References: <20090418042722.5B584FC35F@magilla.sf.frob.com> <49EF81AA.3060306@redhat.com> <20090425005837.44FB8FC262@magilla.sf.frob.com> Message-ID: <49F71E26.7000303@redhat.com> Roland McGrath wrote: >> This processing makes sense I think. It is a bit complicated of course, >> but not unnecessarily so. > > Glad to hear it! 
> >>> A tracing-only engine that just wants to see the syscall that is going >>> to be done can just do: >>> >>> if (utrace_resume_action(action) == UTRACE_STOP) >>> return UTRACE_REPORT; >>> >>> at the top of report_syscall_entry, so it just doesn't think about it >>> until it thinks the call will go now through. >> Systemtap currently doesn't support changing syscall arguments, if it >> does, obviously a few things would need to change. >> >> But, I think systemtap would probably fall here - only see the syscall >> that is actually going to be done. So systemtap could possibly get >> multiple callbacks for the same syscall, but only pay attention to the >> last one, correct? > > Correct. The advice quoted above is what its callbacks would do to ignore > the callbacks before the last one. > > Note that you'll only be sure you're seeing "actually going to be done" > state if yours is the "first" engine attached. (Thus, by the new special > case calling order, its will be the last report_syscall_entry callback to > run.) This is just the general "engine priority" thing, not anything new. > > In cases like ptrace and kmview (Renzo's thing), even if these engines are > first (i.e. called after yours), you will still be seeing the "final" state > because they did their changes asynchronously before resuming. But some > other engine might do its changes directly in its own callback instead > (whether it used UTRACE_STOP and got a repeat callback, or just on the > first time through without stopping), so those changes would happen only > after your "last" callback. > > In the same vein, "earlier" engines (i.e. here called after yours) might > use UTRACE_STOP after your first callback had every reason to believe it > was the "last" one (i.e. that if did not hit). In that case, you will get > a repeat call (with UTRACE_SYSCALL_RESUMED flag). 
On that call, you need > to cope with the fact that you already did your entry tracing work before > (but now things may have changed). > > If the theory is that you want to respect your place in the engine order, > whatever that is (i.e., if your tracing just reported a lie, it was the lie > you were supposed to believe), then "coping" just means ignoring the > repeat. (This is no different in kind from an "earlier" engine/later > callback changing the registers after your callback and never stopping.) > > For that you need to keep track of whether you already handled it or not. > (Depending on your relative order and the actions of the other engines, you > might get either UTRACE_STOP or UTRACE_SYSCALL_RESUMED either before or > after "you handled it". So you can't use those alone.) You can do this in > two ways. One is to use your own per-thread state (engine->data, etc.). > The other is to disable the SYSCALL_ENTRY event when you've handled it, so > you won't get more callbacks. Then you can re-enable the event in your > report_syscall_exit callback (or report_quiesce/report_signal, or whatever > is most convenient to be sure you'll run before it goes back to user mode). > i.e., use utrace_set_events() from the callbacks. It sounds like disabling SYSCALL_ENTRY then re-enabling it in the report_syscall_exit() callback is a reasonable way to go. >> This is understandable, but does hurt my head a *little* bit. I think >> if you put the above full text somewhere and provided some examples this >> would make sense to people. > > The utrace-syscall-resumed branch puts this in the kerneldoc text for > struct utrace_engine_ops (where callback return values and common arguments > are described): > > * When %UTRACE_STOP is used in @report_syscall_entry, then @task > + * stops before attempting the system call. 
In this case, another > + * @report_syscall_entry callback follows after @task resumes; in a > + * second or later callback, %UTRACE_SYSCALL_RESUMED is set in the > + * @action argument to indicate a repeat callback still waiting to > + * attempt the same system call invocation. This repeat callback > + * gives each engine an opportunity to reexamine registers another > + * engine might have changed while @task was held in %UTRACE_STOP. > + * > + * In other cases, the resume action does not take effect until @task > + * is ready to check for signals and return to user mode. If there > + * are more callbacks to be made, the last round of calls determines > + * the final action. A @report_quiesce callback with @event zero, or > + * a @report_signal callback, will always be the last one made before > + * @task resumes. Only %UTRACE_STOP is "sticky"--if @engine returned > + * %UTRACE_STOP then @task stays stopped unless @engine returns > + * different from a following callback. > > I don't know where the longer explanation and/or examples belong. > Perhaps in a new section in utrace.tmpl? We could start with putting > together some text on the wiki. Another idea is to add a few example > modules in samples/utrace/. Those can illustrate things with good > comments, and also could be built verbatim to load multiple > ones/instances in different orders and demonstrate what happens, etc. The wiki would be fine - just somewhere that people could see this stuff. > It would be nice to have folks like you and Renzo work up this text > and/or examples. What's needed is stuff that makes sense to you guys > as users of the API, rather than what makes sense to me who has > thought too much already about all this stuff. We should probably just dump your email into the wiki. -- David Smith dsmith at redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) From fche at redhat.com Tue Apr 28 17:01:17 2009 From: fche at redhat.com (Frank Ch. 
Eigler) Date: Tue, 28 Apr 2009 13:01:17 -0400 Subject: syscall tracing overheads: utrace vs. kprobes Message-ID: <20090428170117.GA30001@redhat.com> Hi - In a few contexts, it comes up as to whether it is faster to probe process syscalls with kprobes or with something higher level such as utrace. (There are other hypothetical options too (per-syscall tracepoints) that could be measured this way in the future.) It was time to check the intuitions about the overheads. So, choosing a syscall that won't get short-circuited via vdso:
% cat foo.c
#include <unistd.h>
int main ()
{
  unsigned c;
  for (c=0; c<10000000; c++)
    (void) close (1000);
}
% gcc foo.c
Now we compare these scenarios: # stap -e 'probe never {}' -t --vp 00001 -c a.out Here, no actual probing occurs so we get a measurement of the plain uninstrumented run time of ten million close(2)s. # stap -e 'probe process.syscall {}' -t --vp 00001 -c a.out Here, we intercept sys_close with a kprobe. If the system is not too busy, we should pick up only the close(2)s coming from a.out, though a few close(2)'s executed by other processes may show up. # stap -e 'probe syscall.close {}' -t --vp 00001 -c a.out Here, we intercept all a.out's syscalls with utrace. Other processes are not affected at all, but other syscalls by a.out would be -- though in our test, there are hardly any of those. Some typical results on my 2.66GHz 2*Xeon5150 machine running Fedora 9 - 2.6.27.12: never: Pass 5: run completed in 740usr/3310sys/4155real ms. kprobe: probe syscall.close (:1:1), hits: 10000028, cycles: 176min/202avg/3632max Pass 5: run completed in 750usr/9320sys/10193real ms. utrace: probe process.syscall (:1:1), hits: 10000025, cycles: 176min/209avg/184392max Pass 5: run completed in 1670usr/6860sys/8645real ms. So utrace added 4.5 seconds, and kprobes added 6.0 seconds to the uninstrumented 4.1 second run time. But wait: we should subtract the time taken by the probe handler itself: 200ish cycles at 2.66 GHz, which is about 0.75 seconds.
So the overheads are approximately: never: n/a kprobe: 5.2 seconds => 0.52 us per hit utrace: 3.6 seconds => 0.36 us per hit Note that these are microbenchmarks that represent an ideal case compared to a larger run, since they probably fit comfily inside caches. They probably also undercount the probe handler's run time. - FChE From roland at redhat.com Tue Apr 28 17:46:04 2009 From: roland at redhat.com (Roland McGrath) Date: Tue, 28 Apr 2009 10:46:04 -0700 (PDT) Subject: [PATCH] Clarify UTRACE_ATTACH_EXCLUSIVE a bit more In-Reply-To: Ananth N Mavinakayanahalli's message of Tuesday, 28 April 2009 11:09:57 +0530 <20090428053957.GA1531@in.ibm.com> References: <20090428053957.GA1531@in.ibm.com> Message-ID: <20090428174604.3982DFC3C6@magilla.sf.frob.com> Thanks. I put that in but rewrote the new paragraph. Roland From roland at redhat.com Tue Apr 28 18:10:10 2009 From: roland at redhat.com (Roland McGrath) Date: Tue, 28 Apr 2009 11:10:10 -0700 (PDT) Subject: syscall tracing overheads: utrace vs. kprobes In-Reply-To: Frank Ch. Eigler's message of Tuesday, 28 April 2009 13:01:17 -0400 <20090428170117.GA30001@redhat.com> References: <20090428170117.GA30001@redhat.com> Message-ID: <20090428181010.5BCC1FC3C6@magilla.sf.frob.com> Btw, it is probably wise to use the syscall() function just so you are always sure you are testing system call details rather than libc details. The standard microbenchmark is syscall(__NR_getpid). That is the minimal system call, vs. close that takes locks and so forth (so it's getting more issues into the test than the one you are looking at). The microbenchmark makes that seem like more of a sensical comparison than it really is. They are really apples and oranges. The TIF_SYSCALL_TRACE types (process.syscall) add some overhead to every system call. The probe types (kprobe/tracepoint/marker) add overhead only to the probed call. In real situations, there will be many different syscalls made. 
In tracing scenarios where you are only probing a few individual ones (especially if they are not the cheapest or most frequent), the distribution of overheads will be quite different. Thanks, Roland From fche at redhat.com Tue Apr 28 18:16:39 2009 From: fche at redhat.com (Frank Ch. Eigler) Date: Tue, 28 Apr 2009 14:16:39 -0400 Subject: syscall tracing overheads: utrace vs. kprobes In-Reply-To: <20090428181010.5BCC1FC3C6@magilla.sf.frob.com> References: <20090428170117.GA30001@redhat.com> <20090428181010.5BCC1FC3C6@magilla.sf.frob.com> Message-ID: <20090428181639.GD13893@redhat.com> Hi - On Tue, Apr 28, 2009 at 11:10:10AM -0700, Roland McGrath wrote: > [...] > The microbenchmark makes that seem like more of a sensical comparison than > it really is. They are really apples and oranges. The TIF_SYSCALL_TRACE > types (process.syscall) add some overhead to every system call. The probe > types (kprobe/tracepoint/marker) add overhead only to the probed call. Certainly, in general. But in this specific test, only the under-test system calls occurred in essentially the whole system, so the overhead measurements were in a way the bare minimums imposed by the kprobes vs. utrace callback infrastructure itself. > In real situations [...] the distribution of overheads will be > quite different. Or rather, the basic overhead quanta measured above may be multiplied along several different axes. - FChE From roland at redhat.com Tue Apr 28 18:18:42 2009 From: roland at redhat.com (Roland McGrath) Date: Tue, 28 Apr 2009 11:18:42 -0700 (PDT) Subject: syscall tracing overheads: utrace vs. kprobes In-Reply-To: Frank Ch. Eigler's message of Tuesday, 28 April 2009 14:16:39 -0400 <20090428181639.GD13893@redhat.com> References: <20090428170117.GA30001@redhat.com> <20090428181010.5BCC1FC3C6@magilla.sf.frob.com> <20090428181639.GD13893@redhat.com> Message-ID: <20090428181842.93694FC3C6@magilla.sf.frob.com> > Certainly, in general.
But in this specific test, only the under-test > system calls occurred in essentially the whole system, so the overhead > measurements were in a way the bare minimums imposed by the kprobes > vs. utrace callback infrastructure itself. Yes. That's why I meant to explain how these numbers are true but not necessarily the numbers that matter. > > In real situations [...] the distribution of overheads will be > > quite different. > > Or rather, the basic overhead quanta measured above may be multiplied > along several different axes. Yes. From dsmith at redhat.com Tue Apr 28 18:19:47 2009 From: dsmith at redhat.com (David Smith) Date: Tue, 28 Apr 2009 13:19:47 -0500 Subject: syscall tracing overheads: utrace vs. kprobes In-Reply-To: <20090428170117.GA30001@redhat.com> References: <20090428170117.GA30001@redhat.com> Message-ID: <49F748C3.10307@redhat.com> Frank Ch. Eigler wrote: > Hi - > > In a few contexts, it comes up as to whether it is faster to probe > process syscalls with kprobes or with something higher level such as > utrace. (There are other hypothetical options too (per-syscall > tracepoints) that could be measured this way in the future.) These scenarios are a bit wrong: > Now we compare these scenarios: > > # stap -e 'probe never {}' -t --vp 00001 -c a.out > > Here, no actual probing occurs so we get a measurement of the plain > uninstrumented run time of ten million close(2)s. The above one is fine. > # stap -e 'probe process.syscall {}' -t --vp 00001 -c a.out > > Here, we intercept sys_close with a kprobe. If the system is not too > busy, we should pick up only the close(2)s coming from a.out, though a > few close(2)'s executed by other processes may show up. > > # stap -e 'probe syscall.close {}' -t --vp 00001 -c a.out > > Here, we intercept all a.out's syscalls with utrace. Other processes > are not affected at all, but other syscalls by a.out would be -- > though in our test, there are hardly any of those.
These 2 are swapped: the 'process.syscall' probe is a utrace-based probe and the 'syscall.close' probe is a kprobe-based probe. Note that in the results, the description and probe types matched correctly. > Some typical results on my 2.66GHz 2*Xeon5150 machine runnin Fedora 9 - > 2.6.27.12: > > never: > Pass 5: run completed in 740usr/3310sys/4155real ms. > > kprobe: > probe syscall.close (:1:1), hits: 10000028, cycles: 176min/202avg/3632max > Pass 5: run completed in 750usr/9320sys/10193real ms. > > utrace: > probe process.syscall (:1:1), hits: 10000025, cycles: 176min/209avg/184392max > Pass 5: run completed in 1670usr/6860sys/8645real ms. > > So utrace added 4.5 seconds, and kprobes added 6.0 seconds to the > uninstrumented 4.1 second run time. But wait: we should subtract the > time taken by the probe handler itself: 200ish cycles at 2.66 GHz, > which is about 0.75 seconds. So the overheads are approximately: > > never: n/a > kprobe: 5.2 seconds => 0.52 us per hit > utrace: 3.6 seconds => 0.36 us per hit > > > Note that these are microbenchmarks that represent an ideal case > compared to a larger run, since they probably fit comfily inside > caches. They probably also undercount the probe handler's run time. -- David Smith dsmith at redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax)
From oleg at redhat.com Sun May 3 18:55:37 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Sun, 3 May 2009 20:55:37 +0200 Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach Message-ID: <20090503185537.GA17071@redhat.com> Shouldn't be applied without Roland's ack. I just don't know how to merge the second patch properly.
I think it would be really nice to cleanup ptrace attach before "ptracee data structures cleanup", but this depends on utrace-core.patch which adds exclude_ptrace(). With the first patch, the second one (and hopefully the further cleanups) does not depend on utrace. Or, we can start with something like the patch below. Or, instead we can move the "already traced" check into exclude_ptrace, this way only this helper will depend on utrace changes. But personally I'd prefer to remove exclude_ptrace() for now and add the correct checks later. Roland, what do you think? Oleg. --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -190,6 +190,9 @@ int ptrace_attach(struct task_struct *ta audit_ptrace(task); + if (exclude_ptrace(task)) + return -EBUSY; + retval = -EPERM; if (same_thread_group(task, current)) goto out; @@ -221,11 +224,6 @@ repeat: goto repeat; } - if (exclude_ptrace(task)) { - retval = -EBUSY; - goto bad; - } - if (!task->mm) goto bad; /* the same process cannot be attached many times */ @@ -597,14 +595,14 @@ int ptrace_traceme(void) { int ret = -EPERM; + if (exclude_ptrace(current)) + return -EBUSY; /* * Are we already being traced? */ repeat: task_lock(current); - if (exclude_ptrace(current)) { - ret = -EBUSY; - } else if (!(current->ptrace & PT_PTRACED)) { + if (!(current->ptrace & PT_PTRACED)) { /* * See ptrace_attach() comments about the locking here. */ From oleg at redhat.com Sun May 3 18:55:45 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Sun, 3 May 2009 20:55:45 +0200 Subject: [PATCH 1/2] utrace-core-kill-exclude_xtrace-logic Message-ID: <20090503185545.GA17080@redhat.com> (on top of utrace-core.patch) exclude_utrace() has no callers. exclude_ptrace() is called under tasklist_lock + task_lock() but needs utrace->lock. Remove this logic for now. We will either add utrace-ptrace or rework this mutual exclusion later. 
Signed-off-by: Oleg Nesterov --- kernel/ptrace.c | 18 +----------------- kernel/utrace.c | 8 -------- 2 files changed, 1 insertion(+), 25 deletions(-) --- PTRACE/kernel/ptrace.c~1_EXCLUDE 2009-05-03 19:28:47.000000000 +0200 +++ PTRACE/kernel/ptrace.c 2009-05-03 19:30:15.000000000 +0200 @@ -16,7 +16,6 @@ #include #include #include -#include #include #include #include @@ -175,14 +174,6 @@ bool ptrace_may_access(struct task_struc return !err; } -/* - * For experimental use of utrace, exclude ptrace on the same task. - */ -static inline bool exclude_ptrace(struct task_struct *task) -{ - return unlikely(!!task_utrace_flags(task)); -} - int ptrace_attach(struct task_struct *task) { int retval; @@ -221,11 +212,6 @@ repeat: goto repeat; } - if (exclude_ptrace(task)) { - retval = -EBUSY; - goto bad; - } - if (!task->mm) goto bad; /* the same process cannot be attached many times */ @@ -602,9 +588,7 @@ int ptrace_traceme(void) */ repeat: task_lock(current); - if (exclude_ptrace(current)) { - ret = -EBUSY; - } else if (!(current->ptrace & PT_PTRACED)) { + if (!(current->ptrace & PT_PTRACED)) { /* * See ptrace_attach() comments about the locking here. */ --- PTRACE/kernel/utrace.c~1_EXCLUDE 2009-04-29 14:26:52.000000000 +0200 +++ PTRACE/kernel/utrace.c 2009-05-03 19:29:56.000000000 +0200 @@ -108,14 +108,6 @@ static struct utrace_engine *matching_en } /* - * For experimental use, utrace attach is mutually exclusive with ptrace. - */ -static inline bool exclude_utrace(struct task_struct *task) -{ - return unlikely(!!task->ptrace); -} - -/* * Called without locks, when we might be the first utrace engine to attach. * If this is a newborn thread and we are not the creator, we have to wait * for it. The creator gets the first chance to attach. 
The PF_STARTING From oleg at redhat.com Sun May 3 18:55:49 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Sun, 3 May 2009 20:55:49 +0200 Subject: [PATCH 2/2] ptrace: do not use task_lock() for attach Message-ID: <20090503185549.GA17087@redhat.com> Remove the "Nasty, nasty" lock dance in ptrace_attach()/ptrace_traceme(). From now on, task_lock() has nothing to do with ptrace at all. With the recent changes nobody uses task_lock() to serialize with ptrace, but in fact it was never needed and it was never used consistently. However ptrace_attach() calls __ptrace_may_access() and needs task_lock() to pin task->mm for get_dumpable(). But we can call __ptrace_may_access() before we take tasklist_lock, ->cred_exec_mutex protects us against do_execve() path which can change creds and MMF_DUMP* flags. (ugly, but we can't use ptrace_may_access() because it hides the error code, so we have to take task_lock() and use __ptrace_may_access()). Also, kill "if (!task->mm)" check. It buys nothing, we can attach to the task right before it does exit_mm(). Instead, add PF_KTHREAD check to prevent attaching to the kernel thread with a borrowed ->mm. What we need is to make sure we can't attach after exit_notify(), check task->exit_state. And finally, move ptrace_traceme() up near ptrace_attach() to keep them close to each other.
Signed-off-by: Oleg Nesterov --- kernel/ptrace.c | 127 ++++++++++++++++++++------------------------------------ 1 file changed, 47 insertions(+), 80 deletions(-) --- PTRACE/kernel/ptrace.c~2_ATTACH 2009-05-03 19:30:15.000000000 +0200 +++ PTRACE/kernel/ptrace.c 2009-05-03 19:57:11.000000000 +0200 @@ -177,66 +177,79 @@ bool ptrace_may_access(struct task_struc int ptrace_attach(struct task_struct *task) { int retval; - unsigned long flags; audit_ptrace(task); retval = -EPERM; + if (unlikely(task->flags & PF_KTHREAD)) + goto out; if (same_thread_group(task, current)) goto out; - - /* Protect exec's credential calculations against our interference; + /* + * Protect exec's credential calculations against our interference; * SUID, SGID and LSM creds get determined differently under ptrace. */ retval = mutex_lock_interruptible(&task->cred_exec_mutex); - if (retval < 0) + if (retval < 0) goto out; - retval = -EPERM; -repeat: - /* - * Nasty, nasty. - * - * We want to hold both the task-lock and the - * tasklist_lock for writing at the same time. - * But that's against the rules (tasklist_lock - * is taken for reading by interrupts on other - * cpu's that may have task_lock). 
- */ task_lock(task); - if (!write_trylock_irqsave(&tasklist_lock, flags)) { - task_unlock(task); - do { - cpu_relax(); - } while (!write_can_lock(&tasklist_lock)); - goto repeat; - } - - if (!task->mm) - goto bad; - /* the same process cannot be attached many times */ - if (task->ptrace & PT_PTRACED) - goto bad; retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH); + task_unlock(task); if (retval) - goto bad; + goto unlock_creds; - /* Go */ - task->ptrace |= PT_PTRACED; + write_lock_irq(&tasklist_lock); + retval = -EPERM; + if (unlikely(task->exit_state)) + goto unlock_tasklist; + if (task->ptrace) + goto unlock_tasklist; + + task->ptrace = PT_PTRACED; if (capable(CAP_SYS_PTRACE)) task->ptrace |= PT_PTRACE_CAP; __ptrace_link(task, current); send_sig_info(SIGSTOP, SEND_SIG_FORCED, task); -bad: - write_unlock_irqrestore(&tasklist_lock, flags); - task_unlock(task); +unlock_tasklist: + write_unlock_irq(&tasklist_lock); +unlock_creds: mutex_unlock(&task->cred_exec_mutex); out: return retval; } +/** + * ptrace_traceme -- helper for PTRACE_TRACEME + * + * Performs checks and sets PT_PTRACED. + * Should be used by all ptrace implementations for PTRACE_TRACEME. + */ +int ptrace_traceme(void) +{ + int ret = -EPERM; + + write_lock_irq(&tasklist_lock); + /* Are we already being traced? */ + if (!current->ptrace) { + ret = security_ptrace_traceme(current->parent); + /* + * Check PF_EXITING to ensure ->real_parent has not passed + * exit_ptrace(). Otherwise we don't report the error but + * pretend ->real_parent untraces us right after return. + */ + if (!ret && !(current->real_parent->flags & PF_EXITING)) { + current->ptrace = PT_PTRACED; + __ptrace_link(current, current->real_parent); + } + } + write_unlock_irq(&tasklist_lock); + + return ret; +} + /* * Called with irqs disabled, returns true if childs should reap themselves. 
*/ @@ -574,52 +587,6 @@ int ptrace_request(struct task_struct *c } /** - * ptrace_traceme -- helper for PTRACE_TRACEME - * - * Performs checks and sets PT_PTRACED. - * Should be used by all ptrace implementations for PTRACE_TRACEME. - */ -int ptrace_traceme(void) -{ - int ret = -EPERM; - - /* - * Are we already being traced? - */ -repeat: - task_lock(current); - if (!(current->ptrace & PT_PTRACED)) { - /* - * See ptrace_attach() comments about the locking here. - */ - unsigned long flags; - if (!write_trylock_irqsave(&tasklist_lock, flags)) { - task_unlock(current); - do { - cpu_relax(); - } while (!write_can_lock(&tasklist_lock)); - goto repeat; - } - - ret = security_ptrace_traceme(current->parent); - - /* - * Check PF_EXITING to ensure ->real_parent has not passed - * exit_ptrace(). Otherwise we don't report the error but - * pretend ->real_parent untraces us right after return. - */ - if (!ret && !(current->real_parent->flags & PF_EXITING)) { - current->ptrace |= PT_PTRACED; - __ptrace_link(current, current->real_parent); - } - - write_unlock_irqrestore(&tasklist_lock, flags); - } - task_unlock(current); - return ret; -} - -/** * ptrace_get_task_struct -- grab a task struct reference for ptrace * @pid: process id to grab a task_struct reference of * From oleg at redhat.com Sun May 3 18:56:59 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Sun, 3 May 2009 20:56:59 +0200 Subject: [PATCH 2/2] ptrace: do not use task_lock() for attach In-Reply-To: <20090503185549.GA17087@redhat.com> References: <20090503185549.GA17087@redhat.com> Message-ID: <20090503185659.GA17099@redhat.com> On 05/03, Oleg Nesterov wrote: > > Remove the "Nasty, nasty" lock dance in ptrace_attach()/ptrace_traceme(). > From now task_lock() has nothing to do with ptrace at all. > > With the recent changes nobody uses task_lock() to serialize with ptrace, > but in fact it was never needed and it was never used consistently. 
arch/um still uses task_lock() to clear PT_DTRACE after exec, but this should be fixed anyway. UML shouldn't use PT_DTRACE at all, and nobody except ptrace should change ptrace flags. arch/um/kernel/exec.c:execve1() is just buggy. For example, it can race with exit_ptrace()->__ptrace_unlink() and leak PT_ flags on untraced task. Jeff, what do you think about the patch I sent you a week ago? > kernel/ptrace.c | 127 ++++++++++++++++++++------------------------------------ > 1 file changed, 47 insertions(+), 80 deletions(-) To simplify the review I am attaching the code with this patch applied, int ptrace_attach(struct task_struct *task) { int retval; audit_ptrace(task); retval = -EPERM; if (unlikely(task->flags & PF_KTHREAD)) goto out; if (same_thread_group(task, current)) goto out; /* * Protect exec's credential calculations against our interference; * SUID, SGID and LSM creds get determined differently under ptrace. */ retval = mutex_lock_interruptible(&task->cred_exec_mutex); if (retval < 0) goto out; task_lock(task); retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH); task_unlock(task); if (retval) goto unlock_creds; write_lock_irq(&tasklist_lock); retval = -EPERM; if (unlikely(task->exit_state)) goto unlock_tasklist; if (task->ptrace) goto unlock_tasklist; task->ptrace = PT_PTRACED; if (capable(CAP_SYS_PTRACE)) task->ptrace |= PT_PTRACE_CAP; __ptrace_link(task, current); send_sig_info(SIGSTOP, SEND_SIG_FORCED, task); unlock_tasklist: write_unlock_irq(&tasklist_lock); unlock_creds: mutex_unlock(&task->cred_exec_mutex); out: return retval; } int ptrace_traceme(void) { int ret = -EPERM; write_lock_irq(&tasklist_lock); /* Are we already being traced? */ if (!current->ptrace) { ret = security_ptrace_traceme(current->parent); /* * Check PF_EXITING to ensure ->real_parent has not passed * exit_ptrace(). Otherwise we don't report the error but * pretend ->real_parent untraces us right after return. 
*/ if (!ret && !(current->real_parent->flags & PF_EXITING)) { current->ptrace = PT_PTRACED; __ptrace_link(current, current->real_parent); } } write_unlock_irq(&tasklist_lock); return ret; } Oleg. From roland at redhat.com Mon May 4 18:49:51 2009 From: roland at redhat.com (Roland McGrath) Date: Mon, 4 May 2009 11:49:51 -0700 (PDT) Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: Oleg Nesterov's message of Sunday, 3 May 2009 20:55:37 +0200 <20090503185537.GA17071@redhat.com> References: <20090503185537.GA17071@redhat.com> Message-ID: <20090504184951.623CEFC32F@magilla.sf.frob.com> I guess I'm slightly confused. We want to merge all of the "pure" ptrace cleanup patches before any utrace patch. When those are on their way, we'll update the utrace patches not to conflict. I don't think it makes sense to include utrace.patch's little ptrace.c change in the baseline tree for your ptrace cleanup patches.
Thanks, Roland From roland at redhat.com Mon May 4 19:09:35 2009 From: roland at redhat.com (Roland McGrath) Date: Mon, 4 May 2009 12:09:35 -0700 (PDT) Subject: [PATCH 2/2] ptrace: do not use task_lock() for attach In-Reply-To: Oleg Nesterov's message of Sunday, 3 May 2009 20:55:49 +0200 <20090503185549.GA17087@redhat.com> References: <20090503185549.GA17087@redhat.com> Message-ID: <20090504190935.A0FD4FC32F@magilla.sf.frob.com> This looks good to me overall. It might be worth slicing it into two or more patches, just for bisect paranoia. (e.g. PF_KTHREAD; task_lock in ptrace_attach; task_lock in ptrace_traceme.) I think it merits a comment that the PF_KTHREAD check does not need any interlock because daemonize() will detach ptrace via reparent_to_kthreadd() after it sets PF_KTHREAD. (vs the old ->mm check under task_lock.) It is worth noting that this changes the security_ptrace_traceme() call so it's no longer under task_lock(). I can't see any way the LSM hooks care, but it is a change. You also didn't mention the s/|=/=/ changes. Those are correct, we've already agreed, but the commit log should mention that this subtle change was intentional. Thanks, Roland From oleg at redhat.com Mon May 4 19:30:16 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Mon, 4 May 2009 21:30:16 +0200 Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: <20090504184951.623CEFC32F@magilla.sf.frob.com> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> Message-ID: <20090504193016.GA17076@redhat.com> On 05/04, Roland McGrath wrote: > > I guess I'm slightly confused. Me too ;) > We want to merge all of the "pure" ptrace > cleanup patches before any utrace patch. Yes, exactly! The second patch "ptrace: do not use task_lock() for attach" has nothing to do with utrace, and it is really pure ptrace cleanup. 
But it can't be applied to the -mm tree, because it (textually) conflicts with utrace changes in ptrace_attach(). > When those are on their way, > we'll update the utrace patches not to conflict. I don't think it makes > sense to include utrace.patch's little ptrace.c change in the baseline tree > for your ptrace cleanup patches. Yes, but in this case, how can we push it before utrace-core.patch? The first patch is only for -mm, to avoid the painful dependencies. Since you seem to mostly agree with the second patch, what should I do? Oleg. From oleg at redhat.com Mon May 4 19:36:24 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Mon, 4 May 2009 21:36:24 +0200 Subject: [PATCH 2/2] ptrace: do not use task_lock() for attach In-Reply-To: <20090504190935.A0FD4FC32F@magilla.sf.frob.com> References: <20090503185549.GA17087@redhat.com> <20090504190935.A0FD4FC32F@magilla.sf.frob.com> Message-ID: <20090504193624.GB17076@redhat.com> On 05/04, Roland McGrath wrote: > > This looks good to me overall. It might be worth slicing it into two or > more patches, just for bisect paranoia. (e.g. PF_KTHREAD; task_lock in > ptrace_attach; task_lock in ptrace_traceme.) OK. > I think it merits a comment that the PF_KTHREAD check does not need any > interlock because daemonize() will detach ptrace via reparent_to_kthreadd() > after it sets PF_KTHREAD. (vs the old ->mm check under task_lock.) Agreed, but actually the patch doesn't make a difference wrt daemonize(). Currently ptrace_attach() can take task_lock() just before daemonize() calls exit_mm(). > It is worth noting that this changes the security_ptrace_traceme() call so > it's no longer under task_lock(). I can't see any way the LSM hooks care, > but it is a change. Yes, good point. > You also didn't mention the s/|=/=/ changes. Those are correct, we've > already agreed, but the commit log should mention that this subtle change > was intentional. Yes! Forgot to mention, thanks. Oleg.
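A minimal userspace sketch of the s/|=/=/ point discussed above, assuming illustrative flag values and a hypothetical struct fake_task (none of this is kernel code, and it is only one reading of why the change is safe). The pre-patch guard tested only PT_PTRACED, so a leftover bit could survive the |=; the post-patch guard refuses the attach whenever any bit in ->ptrace is set, so on the success path = and |= store the same value.

```c
#include <errno.h>

/* Illustrative flag values: assumptions for this sketch, not taken from kernel headers. */
#define PT_PTRACED 0x1u
#define PT_STALE   0x2u  /* hypothetical leftover bit, e.g. one leaked by another code path */

/* Hypothetical stand-in for the single task_struct field that matters here. */
struct fake_task {
    unsigned int ptrace;
};

/* Models the pre-patch logic: test only PT_PTRACED, then OR the flag in. */
static int attach_old(struct fake_task *t)
{
    if (t->ptrace & PT_PTRACED)
        return -EPERM;
    t->ptrace |= PT_PTRACED;
    return 0;
}

/* Models the post-patch logic: refuse if any bit is set, then assign. */
static int attach_new(struct fake_task *t)
{
    if (t->ptrace)
        return -EPERM;
    t->ptrace = PT_PTRACED;
    return 0;
}
```

With a clean task the two variants behave identically. They differ only when a bit other than PT_PTRACED is already set: the old variant attaches on top of it, while the new one returns -EPERM and leaves the field untouched.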
From roland at redhat.com Mon May 4 19:43:48 2009 From: roland at redhat.com (Roland McGrath) Date: Mon, 4 May 2009 12:43:48 -0700 (PDT) Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: Oleg Nesterov's message of Monday, 4 May 2009 21:30:16 +0200 <20090504193016.GA17076@redhat.com> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> <20090504193016.GA17076@redhat.com> Message-ID: <20090504194348.BC0EBFC32F@magilla.sf.frob.com> > The second patch "ptrace: do not use task_lock() for attach" has nothing > to do with utrace, and it is really pure ptrace cleanup. Indeed. > But it can't be applied to -mm tree, because it (textually) conficts with > utrace changes in ptrace_attach(). Oh, -mm. I had not thought about the -mm patch merge order. I just look at the whole ptrace-related series from you as an independent series on top of Linus -current, preceding anything else related. > > When those are on their way, > > we'll update the utrace patches not to conflict. I don't think it makes > > sense to include utrace.patch's little ptrace.c change in the baseline tree > > for your ptrace cleanup patches. > > Yes, but in this case, how can we push it before utrace-core.patch ? > > The first patch is only for -mm, to avoid the painful dependencies. I guess we should take Andrew's advice on this. To me, it makes most sense just to order the -mm patches so utrace comes later, and replace the utrace patch as necessary with a compatible version. Perhaps things would be simpler if we made a separate standalone series or git tree (tip/ptrace?) for ptrace cleanups. 
Thanks, Roland From akpm at linux-foundation.org Mon May 4 23:31:54 2009 From: akpm at linux-foundation.org (Andrew Morton) Date: Mon, 4 May 2009 16:31:54 -0700 Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: <20090504194348.BC0EBFC32F@magilla.sf.frob.com> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> <20090504193016.GA17076@redhat.com> <20090504194348.BC0EBFC32F@magilla.sf.frob.com> Message-ID: <20090504163154.f3672a83.akpm@linux-foundation.org> On Mon, 4 May 2009 12:43:48 -0700 (PDT) Roland McGrath wrote: > > > When those are on their way, > > > we'll update the utrace patches not to conflict. I don't think it makes > > > sense to include utrace.patch's little ptrace.c change in the baseline tree > > > for your ptrace cleanup patches. > > > > Yes, but in this case, how can we push it before utrace-core.patch ? > > > > The first patch is only for -mm, to avoid the painful dependencies. > > I guess we should take Andrew's advice on this. To me, it makes most sense > just to order the -mm patches so utrace comes later, and replace the utrace > patch as necessary with a compatible version. Perhaps things would be > simpler if we made a separate standalone series or git tree (tip/ptrace?) > for ptrace cleanups. Staging the utrace patch at end-of-series would make sense if utrace is not on track for a 2.6.31 merge. And afaict, this is indeed the case - things seem to have gone a bit quiet on the utrace front lately. 
From roland at redhat.com Tue May 5 01:12:44 2009 From: roland at redhat.com (Roland McGrath) Date: Mon, 4 May 2009 18:12:44 -0700 (PDT) Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: Andrew Morton's message of Monday, 4 May 2009 16:31:54 -0700 <20090504163154.f3672a83.akpm@linux-foundation.org> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> <20090504193016.GA17076@redhat.com> <20090504194348.BC0EBFC32F@magilla.sf.frob.com> <20090504163154.f3672a83.akpm@linux-foundation.org> Message-ID: <20090505011244.2BAE1FC2BD@magilla.sf.frob.com> > Staging the utrace patch at end-of-series would make sense if utrace is > not on track for a 2.6.31 merge. > > And afaict, this is indeed the case - things seem to have gone a bit > quiet on the utrace front lately. I don't think that is really accurate. There has been a lack of any reviewer comments on the actual content of the utrace patch (aside from Oleg's own), which is indeed quieter on that front than I had expected. The comments we did get, e.g. from hch, were that a compelling user of the API should go in, such as converting ptrace. Oleg's current ptrace revamp work will culminate in replacing its innards with utrace calls. It's my hope that all this work will be ready in time for 2.6.31. The reason the utrace patch should appear later in the series is that the bulk of the ptrace cleanup series (including all patches done so far) will not depend on utrace at all and will be mergeable independent of the fate of utrace or that of any later utrace-dependent ptrace patches. We expect the utrace patch will get more updates that we hash out in the course of the ptrace work. That being so, it makes more sense (to me) to plan to replace it later before merge time rather than include the old patch earlier in the series and have other patches (including ones unrelated to it) need to do incremental updates relative to it. 
Thanks, Roland From oleg at redhat.com Tue May 5 23:06:42 2009 From: oleg at redhat.com (Oleg Nesterov) Date: Wed, 6 May 2009 01:06:42 +0200 Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: <20090504163154.f3672a83.akpm@linux-foundation.org> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> <20090504193016.GA17076@redhat.com> <20090504194348.BC0EBFC32F@magilla.sf.frob.com> <20090504163154.f3672a83.akpm@linux-foundation.org> Message-ID: <20090505230642.GA980@redhat.com> On 05/04, Andrew Morton wrote: > > On Mon, 4 May 2009 12:43:48 -0700 (PDT) > Roland McGrath wrote: > > > I guess we should take Andrew's advice on this. To me, it makes most sense > > just to order the -mm patches so utrace comes later, and replace the utrace > > patch as necessary with a compatible version. Perhaps things would be > > simpler if we made a separate standalone series or git tree (tip/ptrace?) > > for ptrace cleanups. > > Staging the utrace patch at end-of-series would make sense if utrace is > not on track for a 2.6.31 merge. > > And afaict, this is indeed the case - things seem to have gone a bit > quiet on the utrace front lately. The only goal of current ptrace cleanups is to simplify the "ptrace over utrace" change (hopefully they make sense by themselves though). I am obviously biased, but imho the only real problem with utrace-ptrace.patch is the current ptrace code which needs cleanups. Oleg. 
From mingo at elte.hu Wed May 6 08:12:25 2009 From: mingo at elte.hu (Ingo Molnar) Date: Wed, 6 May 2009 10:12:25 +0200 Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: <20090505230642.GA980@redhat.com> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> <20090504193016.GA17076@redhat.com> <20090504194348.BC0EBFC32F@magilla.sf.frob.com> <20090504163154.f3672a83.akpm@linux-foundation.org> <20090505230642.GA980@redhat.com> Message-ID: <20090506081225.GD8098@elte.hu> * Oleg Nesterov wrote: > On 05/04, Andrew Morton wrote: > > > > On Mon, 4 May 2009 12:43:48 -0700 (PDT) > > Roland McGrath wrote: > > > > > I guess we should take Andrew's advice on this. To me, it > > > makes most sense just to order the -mm patches so utrace comes > > > later, and replace the utrace patch as necessary with a > > > compatible version. Perhaps things would be simpler if we > > > made a separate standalone series or git tree (tip/ptrace?) > > > for ptrace cleanups. > > > > Staging the utrace patch at end-of-series would make sense if > > utrace is not on track for a 2.6.31 merge. > > > > And afaict, this is indeed the case - things seem to have gone a > > bit quiet on the utrace front lately. > > The only goal of current ptrace cleanups is to simplify the > "ptrace over utrace" change (hopefully they make sense by > themselves though). > > I am obviously biased, but imho the only real problem with > utrace-ptrace.patch is the current ptrace code which needs > cleanups. Yes. But realize the fundamental reason for that: _without_ ptrace-over-utrace the utrace core code is a big chunk of dead code only used on the fringes. I see and agree with all the future uses of utrace, but it's easy to be problem-free if a facility is not used by anything significant. So a clean ptrace-over-utrace plugin is absolutely needed for utrace to go upstream in v2.6.31. The ftrace plugin alone does not justify it. 
The real prize here is a (much!) cleaner ptrace code. Once ptrace is driven via utrace and it works, its value (and trust level) will skyrocket. Ingo From hch at infradead.org Wed May 6 08:23:00 2009 From: hch at infradead.org (Christoph Hellwig) Date: Wed, 6 May 2009 04:23:00 -0400 Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: <20090506081225.GD8098@elte.hu> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> <20090504193016.GA17076@redhat.com> <20090504194348.BC0EBFC32F@magilla.sf.frob.com> <20090504163154.f3672a83.akpm@linux-foundation.org> <20090505230642.GA980@redhat.com> <20090506081225.GD8098@elte.hu> Message-ID: <20090506082300.GA16989@infradead.org> On Wed, May 06, 2009 at 10:12:25AM +0200, Ingo Molnar wrote: > Yes. But realize the fundamental reason for that: _without_ > ptrace-over-utrace the utrace core code is a big chunk of dead code > only used on the fringes. I see and agree with all the future uses > of utrace, but it's easy to be problem-free if a facility is not > used by anything significant. The ptrace cleanups might be required for utrace, but they by themselves don't make utrace any more useful without another user. > So a clean ptrace-over-utrace plugin is absolutely needed for utrace > to go upstream in v2.6.31. The ftrace plugin alone does not justify > it. The real prize here is a (much!) cleaner ptrace code. Once > ptrace is driven via utrace and it works, its value (and trust > level) will skyrocket. There are two blockers for utrace: - first all architectures need to be converted to the ptrace world order with regsets, tracehooks and so on. I hope we are on track to get this done now after I've pinged all arch maintainers. - we actually need a useful user of the utrace abstraction. And just converting ptrace to make it slightly more complicated by using another abstraction just isn't it. 
One useful bit that is in the queue is a in-kernel gdbstub for user process which would allow to get out of the ptrace and re-parenting mess for basic use cases. But a really convincing user would be even better. I don't think 2.6.31 is a very realistic target. While a lot of arch maintainers are working on their ptrace code 2.6.31 is just a too short deadline, and I'm also not sure we'll have the ptrace code in shape by then. 2.6.32 is much more realistic. From mingo at elte.hu Wed May 6 09:05:12 2009 From: mingo at elte.hu (Ingo Molnar) Date: Wed, 6 May 2009 11:05:12 +0200 Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: <20090506082300.GA16989@infradead.org> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> <20090504193016.GA17076@redhat.com> <20090504194348.BC0EBFC32F@magilla.sf.frob.com> <20090504163154.f3672a83.akpm@linux-foundation.org> <20090505230642.GA980@redhat.com> <20090506081225.GD8098@elte.hu> <20090506082300.GA16989@infradead.org> Message-ID: <20090506090512.GB24692@elte.hu> * Christoph Hellwig wrote: > On Wed, May 06, 2009 at 10:12:25AM +0200, Ingo Molnar wrote: > > Yes. But realize the fundamental reason for that: _without_ > > ptrace-over-utrace the utrace core code is a big chunk of dead code > > only used on the fringes. I see and agree with all the future uses > > of utrace, but it's easy to be problem-free if a facility is not > > used by anything significant. > > The ptrace cleanups might be required for utrace, but they by > themselves don't make utrace any more useful without another > user. > > > So a clean ptrace-over-utrace plugin is absolutely needed for utrace > > to go upstream in v2.6.31. The ftrace plugin alone does not justify > > it. The real prize here is a (much!) cleaner ptrace code. Once > > ptrace is driven via utrace and it works, its value (and trust > > level) will skyrocket. 
> > There are two blockers for utrace: > > - first all architectures need to be converted to the ptrace world > order with regsets, tracehooks and so on. I hope we are on track > to get this done now after I've pinged all arch maintainers. It might be more effective if you also wrote patches and if you would shop for maintainer Acks, instead of just "pinging" people? ;-) We've already got enough would-be-managers on lkml really. > - we actually need a useful user of the utrace abstraction. And just > converting ptrace to make it slightly more complicated by using > another abstraction just isn't it. One useful bit that is in the > queue is a in-kernel gdbstub for user process which would allow > to get out of the ptrace and re-parenting mess for basic use > cases. But a really convincing user would be even better. > > I don't think 2.6.31 is a very realistic target. While a lot of > arch maintainers are working on their ptrace code 2.6.31 is just a > too short deadline, and I'm also not sure we'll have the ptrace > code in shape by then. 2.6.32 is much more realistic. Really, the above isn't a blocker list, it's your personal wish-list for the future. Cleaning up ptrace itself is already an upstream advantage worth having - for years ptrace was barely maintained. It interfaces to enough critical projects (gdb, strace, UML, etc.) to be a reliable (and testable) basis for utrace. The new features you are suggesting look potentially interesting, but they have zero usage right now and they will take time to develop. So they should be decoupled, otherwise we will just have a huge and problematic change-the-world merge down the line, instead of a more manageable gradual approach.
Ingo From hch at infradead.org Wed May 6 09:11:16 2009 From: hch at infradead.org (Christoph Hellwig) Date: Wed, 6 May 2009 05:11:16 -0400 Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: <20090506090512.GB24692@elte.hu> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> <20090504193016.GA17076@redhat.com> <20090504194348.BC0EBFC32F@magilla.sf.frob.com> <20090504163154.f3672a83.akpm@linux-foundation.org> <20090505230642.GA980@redhat.com> <20090506081225.GD8098@elte.hu> <20090506082300.GA16989@infradead.org> <20090506090512.GB24692@elte.hu> Message-ID: <20090506091115.GA24332@infradead.org> On Wed, May 06, 2009 at 11:05:12AM +0200, Ingo Molnar wrote: > It might be more effective if you also wrote patches and if you > would shop for maintainer Acks, instead of just "pinging" people? > ;-) We've already got enough would-be-managers on lkml really. I have no interest touching tons of architectures where the maintainers are much better of looking at those lowlevel bits. See the case where Roland tried to do ARM but still hasn't gotten any feedback as a negative example. > Really, the above isnt a blocker list, it's your personal wish-list > for the future. Cleaning up ptrace itself is already an upstream > advantage worth having - for years ptrace was barely maintained. It > interfaces to enough critical projects (gdb, strace, UML, etc.) to > be a realiable (and testable) basis for utrace. The cleanups aren't there for cleanup purposes, but to actually allow the utrace-based ptrace being used unconditionally. There is really no point in merging a second conditional ptrace implementation that has to be maintained while we add another one that doesn't add a single new feature. 
From mingo at elte.hu Wed May 6 09:37:40 2009 From: mingo at elte.hu (Ingo Molnar) Date: Wed, 6 May 2009 11:37:40 +0200 Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: <20090506091115.GA24332@infradead.org> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> <20090504193016.GA17076@redhat.com> <20090504194348.BC0EBFC32F@magilla.sf.frob.com> <20090504163154.f3672a83.akpm@linux-foundation.org> <20090505230642.GA980@redhat.com> <20090506081225.GD8098@elte.hu> <20090506082300.GA16989@infradead.org> <20090506090512.GB24692@elte.hu> <20090506091115.GA24332@infradead.org> Message-ID: <20090506093740.GA7156@elte.hu> * Christoph Hellwig wrote: > On Wed, May 06, 2009 at 11:05:12AM +0200, Ingo Molnar wrote: > > It might be more effective if you also wrote patches and if you > > would shop for maintainer Acks, instead of just "pinging" people? > > ;-) We've already got enough would-be-managers on lkml really. > > I have no interest touching tons of architectures where the > maintainers are much better of looking at those lowlevel bits. > [...] That's a somewhat naive expectation. Currently ptrace has a low mindshare and an even lower know-how share, even amongst architecture maintainers. Much of the ptrace code has been many years ago and often it has been copied over from other architectures and has been hacked to work sort-of. There's positive exceptions for sure, but generally ptrace know-how is extremely limited and there's a lot of architectures with little proactivity. It is far more efficient if Roland, Oleg (or you, if you are interested in this stuff - which you seem to be) did RFC patches and asked for maintainer acks, than to depend on maintainers to do it. We have about a dozen core kernel features that still have not propagated to all architectures: irqflags-tracking (for lockdep), genirq, stacktrace support, latencytop support, and more. 
We are just getting around to make GENERIC_TIME the only option [maybe..] - after years of migration. We've got 22 architectures and they tend to slow down certain types of core kernel changes significantly. > [...] See the case where Roland tried to do ARM but still hasn't > gotten any feedback as a negative example. That really reinforces my point: arch maintainers are even less inclined to do it proactively. > > Really, the above isnt a blocker list, it's your personal > > wish-list for the future. Cleaning up ptrace itself is already > > an upstream advantage worth having - for years ptrace was barely > > maintained. It interfaces to enough critical projects (gdb, > > strace, UML, etc.) to be a realiable (and testable) basis for > > utrace. > > The cleanups aren't there for cleanup purposes, but to actually > allow the utrace-based ptrace being used unconditionally. There > is really no point in merging a second conditional ptrace > implementation that has to be maintained while we add another one > that doesn't add a single new feature. I'm well aware of what these patches are trying to achieve. We've got the main mass of architectures covered: arch/ia64/Kconfig: select HAVE_ARCH_TRACEHOOK arch/powerpc/Kconfig: select HAVE_ARCH_TRACEHOOK arch/s390/Kconfig: select HAVE_ARCH_TRACEHOOK arch/sh/Kconfig: select HAVE_ARCH_TRACEHOOK arch/sparc/Kconfig: select HAVE_ARCH_TRACEHOOK arch/x86/Kconfig: select HAVE_ARCH_TRACEHOOK I'd expect the remaining arch conversion to tracehooks to be deterministically finished if done by the ptrace folks - i.e. Roland and Oleg. It will take forever if all that happens is a 'ping' from you ;-) Ingo
From roland at redhat.com Thu May 7 06:13:43 2009 From: roland at redhat.com (Roland McGrath) Date: Wed, 6 May 2009 23:13:43 -0700 (PDT) Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach In-Reply-To: Ingo Molnar's message of Wednesday, 6 May 2009 11:37:40 +0200 <20090506093740.GA7156@elte.hu> References: <20090503185537.GA17071@redhat.com> <20090504184951.623CEFC32F@magilla.sf.frob.com> <20090504193016.GA17076@redhat.com> <20090504194348.BC0EBFC32F@magilla.sf.frob.com> <20090504163154.f3672a83.akpm@linux-foundation.org> <20090505230642.GA980@redhat.com> <20090506081225.GD8098@elte.hu> <20090506082300.GA16989@infradead.org> <20090506090512.GB24692@elte.hu> <20090506091115.GA24332@infradead.org> <20090506093740.GA7156@elte.hu> Message-ID: <20090507061343.9A986FC39E@magilla.sf.frob.com> > It is far more efficient if Roland, Oleg (or you, if you are > interested in this stuff - which you seem to be) did RFC patches and > asked for maintainer acks, than to depend on maintainers to do it. This has been on offer since the first user_regset stuff went into 2.6.25, and I think I reiterated that on linux-arch when CONFIG_HAVE_ARCH_TRACEHOOK went in. What it does require is some arch person to at least show interest in seeing the patches, test-build them and/or point to usable cross compiler setups, etc. It doesn't have to be arch maintainers, but someone at all who uses the arch and is prepared to build kernels for it. In the case of arm, the fine Fedora/ARM folks had already made it easy enough for me to do two web searches and find the cross compilers, qemu settings, and system images I could get going lickety-split without even asking anyone for pointers. But as hch noted, even doing 95% of the work myself up front (built and tested!) hasn't yet helped get any feedback.
For any arch where there is anyone out there but the crickets, it's easy for me to help with the actual code. I just need a little direction on arch build setups and maybe some specific arch details questions, and a little feedback. But where the only people you can find who've heard of an arch say, "We haven't looked what's upstream since 2.6.22 or so," I don't want to waste my time on untried patches that will just go stale without ever being compiled. Thanks, Roland From heap at clearclips.com Thu May 7 13:17:11 2009 From: heap at clearclips.com (Dedecker Salus) Date: Thu, 07 May 2009 13:17:11 +0000 Subject: 3 Secrets For Making Love to a Woman You Should Never Ignore - Make Her Climax Extremmely Quickly Message-ID: <76cd20090507131602@clearclips.com> A non-text attachment was scrubbed... Name: Dedecker.png Type: image/png Size: 10752 bytes Desc: not available URL: From platipi at kittsdesign.com Fri May 8 00:16:36 2009 From: platipi at kittsdesign.com (Bari Pono) Date: Fri, 08 May 2009 00:16:36 +0000 Subject: How to Hvae Phone sex Message-ID: A non-text attachment was scrubbed... 

From mingo at elte.hu Fri May 8 15:08:41 2009
From: mingo at elte.hu (Ingo Molnar)
Date: Fri, 8 May 2009 17:08:41 +0200
Subject: [RFC, PATCH 0/2] utrace/ptrace: simplify/cleanup ptrace attach
In-Reply-To: <20090507061343.9A986FC39E@magilla.sf.frob.com>
References: <20090504193016.GA17076@redhat.com> <20090504194348.BC0EBFC32F@magilla.sf.frob.com> <20090504163154.f3672a83.akpm@linux-foundation.org> <20090505230642.GA980@redhat.com> <20090506081225.GD8098@elte.hu> <20090506082300.GA16989@infradead.org> <20090506090512.GB24692@elte.hu> <20090506091115.GA24332@infradead.org> <20090506093740.GA7156@elte.hu> <20090507061343.9A986FC39E@magilla.sf.frob.com>
Message-ID: <20090508150841.GB29974@elte.hu>

* Roland McGrath wrote:

> > It is far more efficient if Roland, Oleg (or you, if you are
> > interested in this stuff - which you seem to be) did RFC patches and
> > asked for maintainer acks, than to depend on maintainers to do it.
>
> This has been on offer since the first user_regset stuff went into
> 2.6.25, and I think I reiterated that on linux-arch when
> CONFIG_HAVE_ARCH_TRACEHOOK went in.
>
> What it does require is some arch person to at least show interest
> in seeing the patches, test-build them and/or point to usable
> cross compiler setups, etc. It doesn't have to be arch
> maintainers, but someone at all who uses the arch and is prepared
> to build kernels for it.
>
> In the case of arm, the fine Fedora/ARM folks had already made it
> easy enough for me to do two web searches and find the cross
> compilers, qemu settings, and system images I could get going
> lickety-split without even asking anyone for pointers. But as hch
> noted, even doing 95% of the work myself up front (built and
> tested!) hasn't yet helped get any feedback.
>
> For any arch where there is anyone out there but the crickets,
> it's easy for me to help with the actual code. I just need a
> little direction on arch build setups and maybe some specific arch
> details questions, and a little feedback. But where the only
> people you can find who've heard of an arch say, "We haven't
> looked what's upstream since 2.6.22 or so," I don't want to waste
> my time on untried patches that will just go stale without ever
> being compiled.

That's OK. If you went so far, if you were proactive and did due
diligence, and nobody bothered, just push the changes into linux-next
and there's no valid basis for future objections against those patches.

Ingo

From benh at kernel.crashing.org Fri May 29 05:03:49 2009
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 29 May 2009 15:03:49 +1000
Subject: [PATCH] powerpc ptrace block-step
In-Reply-To: <20090401215903.DE872FC3AB@magilla.sf.frob.com>
References: <20090401215903.DE872FC3AB@magilla.sf.frob.com>
Message-ID: <1243573429.17903.30.camel@pasglop>

On Wed, 2009-04-01 at 14:59 -0700, Roland McGrath wrote:
> Maynard asked about user_enable_block_step() support on powerpc.
> This is the old patch I've posted before. I haven't even tried
> to compile it lately, but it rebased cleanly.
>
> AFAIK the only reason this didn't go in several months ago was waiting
> for someone to decide what the right arch_has_block_step() condition was,
> i.e. if it needs to check some cpu_feature or chip identifier bits.
>
> I had hoped that I had passed the buck then to ppc folks to figure that out
> and make it so. But it does not appear to have happened.
>
> Note you can drop the #define PTRACE_SINGLEBLOCK if you want to be
> conservative and not touch the user (ptrace) ABI yet. Then Maynard
> could beat on it with internal uses (utrace) before you worry about
> whether userland expects the new ptrace request macro to exist.

So the patch had some issues, such as missing clearing of DBCR0 bits,
missing changes to code in traps.c to properly identify the new cause
of debug interrupts, etc...

I've spun a new version; I'll post it as soon as I get to do some quick
tests. It will then go into the next merge window, hopefully.

Note: I've verified that blockstep seems to be implemented by all the
core variants -except- the old 601.

Cheers,
Ben.

From roland at redhat.com Fri May 29 07:32:13 2009
From: roland at redhat.com (Roland McGrath)
Date: Fri, 29 May 2009 00:32:13 -0700 (PDT)
Subject: [PATCH] powerpc ptrace block-step
In-Reply-To: Benjamin Herrenschmidt's message of Friday, 29 May 2009 15:03:49 +1000 <1243573429.17903.30.camel@pasglop>
References: <20090401215903.DE872FC3AB@magilla.sf.frob.com> <1243573429.17903.30.camel@pasglop>
Message-ID: <20090529073213.B7227FC2BD@magilla.sf.frob.com>

Thanks! I'm very glad to finally see this ironed out by someone who
actually knows about powerpc innards.

Thanks,
Roland

From benh at kernel.crashing.org Fri May 29 07:39:21 2009
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 29 May 2009 17:39:21 +1000
Subject: [PATCH] powerpc ptrace block-step
In-Reply-To: <20090529073213.B7227FC2BD@magilla.sf.frob.com>
References: <20090401215903.DE872FC3AB@magilla.sf.frob.com> <1243573429.17903.30.camel@pasglop> <20090529073213.B7227FC2BD@magilla.sf.frob.com>
Message-ID: <1243582761.17903.38.camel@pasglop>

On Fri, 2009-05-29 at 00:32 -0700, Roland McGrath wrote:
> Thanks! I'm very glad to finally see this ironed out by someone who
> actually knows about powerpc innards.

Yeah, it's been on my todo list for some time... decided that it had
stayed rotting for too long. We also did a little test program to
exercise it, which is how I discovered the subtle difference between
BookE and server.

Any comment about my approach of making BookE "look like" server by
sticking a single step in there? I.e., is the semantic of stopping on
the -target- of the branch what userspace expects?

Cheers,
Ben.
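
The question about making BookE "look like" server can be pictured with a toy model. The sketch below is not kernel code: the instruction encoding, the function names (block_step_server, block_step_booke_raw, block_step_booke_emulated) and the sample program are all hypothetical. It also assumes, reading between the lines of the message above, that the raw BookE event is reported with the branch not yet executed, while server reports the stop once the branch has been taken.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy instruction set: every branch is unconditional and taken. */
enum insn_kind { OP_PLAIN, OP_BRANCH, OP_HALT };

struct insn {
    enum insn_kind kind;
    size_t target;          /* only meaningful for OP_BRANCH */
};

/* Bounds-checked fetch, so a malformed program trips an assert. */
const struct insn *insn_at(const struct insn *prog, size_t n, size_t pc)
{
    assert(pc < n);
    return &prog[pc];
}

/* Execute exactly one instruction and return the next pc. */
size_t single_step(const struct insn *prog, size_t n, size_t pc)
{
    const struct insn *i = insn_at(prog, n, pc);

    if (i->kind == OP_BRANCH)
        return i->target;
    if (i->kind == OP_HALT)
        return pc;          /* a halted program stays where it is */
    return pc + 1;
}

/* Run straight-line code until pc sits on a branch (or a halt). */
size_t run_to_branch(const struct insn *prog, size_t n, size_t pc)
{
    while (insn_at(prog, n, pc)->kind == OP_PLAIN)
        pc = single_step(prog, n, pc);
    return pc;
}

/* Server-style block step: stop is reported at the branch target. */
size_t block_step_server(const struct insn *prog, size_t n, size_t pc)
{
    return single_step(prog, n, run_to_branch(prog, n, pc));
}

/* Assumed raw BookE behaviour: stop is reported on the branch itself. */
size_t block_step_booke_raw(const struct insn *prog, size_t n, size_t pc)
{
    return run_to_branch(prog, n, pc);
}

/* The approach described above: add one single step after the raw event. */
size_t block_step_booke_emulated(const struct insn *prog, size_t n, size_t pc)
{
    return single_step(prog, n, block_step_booke_raw(prog, n, pc));
}

/* Hypothetical sample program used only for illustration. */
const struct insn sample_prog[] = {
    { OP_PLAIN, 0 },        /* 0 */
    { OP_PLAIN, 0 },        /* 1 */
    { OP_BRANCH, 5 },       /* 2: branch to 5 */
    { OP_PLAIN, 0 },        /* 3 */
    { OP_HALT, 0 },         /* 4 */
    { OP_PLAIN, 0 },        /* 5 */
    { OP_BRANCH, 3 },       /* 6: branch to 3 */
};
#define SAMPLE_LEN (sizeof sample_prog / sizeof sample_prog[0])
```

Starting from index 0 of the sample program, the server-style step and the emulated BookE step both report the branch target, while the raw BookE step reports the branch instruction itself. Whether real hardware matches this model is something only the powerpc documentation and the test program mentioned above can confirm.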
From srikar at linux.vnet.ibm.com Thu Jun 11 16:05:39 2009 From: srikar at linux.vnet.ibm.com (Srikar Dronamraju) Date: Thu, 11 Jun 2009 21:35:39 +0530 Subject: [RESEND] [PATCH 0/7] Ubp, Ssol and Uprobes Message-ID: <20090611160539.GA20668@linux.vnet.ibm.com> I am resending this patchset as people reported that they didn't receive my earlier mail. Hi, This patchset implements uprobes over utrace. Please review the patchset and provide your valuable comments. These patches have been tested on the current utrace tree (commit id cf890ad46816982f3b8b5064d2f2bb91968ded43). This patchset requires Roland's utrace implementation + Jim and Masami's x86 instruction analysis layer. Patches 1 to 6 implement uprobes. Patch 7 implements an ftrace plugin. Based on your feedback, I am looking at posting these patches to LKML some time next week. [PATCH 0/7] Ubp, Ssol and Uprobes [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP) [PATCH 2/7] x86 support for UBP [PATCH 3/7] Execution out of line (XOL) [PATCH 4/7] Uprobes Implementation [PATCH 5/7] x86 support for Uprobes [PATCH 6/7] Uprobes documentation. [PATCH 7/7] Ftrace plugin for Uprobes. -- Thanks and Regards Srikar From srikar at linux.vnet.ibm.com Thu Jun 11 16:09:10 2009 From: srikar at linux.vnet.ibm.com (Srikar Dronamraju) Date: Thu, 11 Jun 2009 21:39:10 +0530 Subject: [RESEND] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP) In-Reply-To: <20090611160539.GA20668@linux.vnet.ibm.com> References: <20090611160539.GA20668@linux.vnet.ibm.com> Message-ID: <20090611160910.GA21218@linux.vnet.ibm.com> User space breakpointing infrastructure (UBP) The user space breakpointing infrastructure provides kernel subsystems with an architecture-independent interface to establish breakpoints in user applications. This patch provides the core implementation of ubp and also wrappers for architecture-dependent methods. UBP currently supports both single-stepping inline and execution out of line strategies. 
Two different probepoints in the same process can have two different strategies. You need to follow this up with the UBP patch for your architecture. Signed-off-by: Jim Keniston Signed-off-by: Srikar Dronamraju --- arch/Kconfig | 12 + include/linux/ubp.h | 282 ++++++++++++++++++++++++++++++ kernel/Makefile | 1 kernel/ubp_core.c | 479 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 774 insertions(+) Index: uprobes.git/arch/Kconfig =================================================================== --- uprobes.git.orig/arch/Kconfig +++ uprobes.git/arch/Kconfig @@ -44,6 +44,15 @@ config KPROBES for kernel debugging, non-intrusive instrumentation and testing. If in doubt, say "N". +config UBP + bool "User-space breakpoint assistance (EXPERIMENTAL)" + depends on MODULES + depends on HAVE_UBP + help + Ubp enables kernel subsystems to establish breakpoints + in user applications. This service is used by components + such as uprobes. If in doubt, say "N". + config HAVE_EFFICIENT_UNALIGNED_ACCESS bool help @@ -70,6 +79,9 @@ config KRETPROBES def_bool y depends on KPROBES && HAVE_KRETPROBES +config HAVE_UBP + def_bool n + config HAVE_IOREMAP_PROT bool Index: uprobes.git/include/linux/ubp.h =================================================================== --- /dev/null +++ uprobes.git/include/linux/ubp.h @@ -0,0 +1,282 @@ +#ifndef _LINUX_UBP_H +#define _LINUX_UBP_H +/* + * User-space BreakPoint support (ubp) + * include/linux/ubp.h + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2008, 2009 + */ + +#include +struct task_struct; +struct pt_regs; + +/** + * Strategy hints: + * + * %UBP_HNT_INLINE: Specifies that the instruction must + * be single-stepped inline. Can be set by the caller of + * @arch->analyze_insn() -- e.g., if caller is out of XOL slots -- + * or by @arch->analyze_insn() if there's no viable XOL strategy + * for that instruction. Set in arch->strategies if the architecture + * doesn't implement XOL. + * + * %UBP_HNT_PERMSL: Specifies that the instruction slot whose + * address is @ubp->xol_vaddr is assigned to @ubp for the life of + * the process. Can be used by @arch->analyze_insn() to simplify + * XOL in some cases. Ignored in @arch->strategies. + * + * %UBP_HNT_TSKINFO: Set in @arch->strategies if the architecture's + * XOL handling requires the preservation of special + * task-specific info between the calls to @arch->pre_xol() + * and @arch->post_xol(). (E.g., XOL of x86_64 rip-relative + * instructions uses a scratch register, whose value is saved + * by pre_xol() and restored by post_xol().) The caller + * of @arch->analyze_insn() should set %UBP_HNT_TSKINFO in + * @ubp->strategy if it's set in @arch->strategies and the caller + * can maintain a @ubp_task_arch_info object for each probed task. + * @arch->analyze_insn() should leave this flag set in @ubp->strategy + * if it needs to use the per-task @ubp_task_arch_info object. + */ +#define UBP_HNT_INLINE 0x1 /* Single-step this insn inline. 
*/ +#define UBP_HNT_TSKINFO 0x2 /* XOL requires ubp_task_arch_info */ +#define UBP_HNT_PERMSL 0x4 /* XOL slot assignment is permanent */ + +#define UBP_HNT_MASK 0x7 + +/** + * struct ubp_bkpt - user-space breakpoint/probepoint + * + * @vaddr: virtual address of probepoint + * @xol_vaddr: virtual address of XOL slot assigned to this probepoint + * @opcode: copy of opcode at @vaddr + * @insn: typically a copy of the instruction at @vaddr. More + * precisely, this is the instruction (stream) that will be + * executed in place of the original instruction. + * @strategy: hints about how this instruction will be executed + * @fixups: set of fixups to be executed by @arch->post_xol() + * @arch_info: architecture-specific info about this probepoint + */ +struct ubp_bkpt { + unsigned long vaddr; + unsigned long xol_vaddr; + ubp_opcode_t opcode; + u8 insn[UBP_XOL_SLOT_BYTES]; + u16 strategy; + u16 fixups; + struct ubp_bkpt_arch_info arch_info; +}; + +/* Post-execution fixups. Some architectures may define others. 
*/ +#define UPB_FIX_NONE 0x0 /* No fixup needed */ +#define UBP_FIX_IP 0x1 /* Adjust IP back to vicinity of actual insn */ +#define UBP_FIX_CALL 0x2 /* Adjust the return address of a call insn */ + +#ifndef UPB_FIX_DEFAULT +#define UPB_FIX_DEFAULT UBP_FIX_IP +#endif + +#if defined(CONFIG_UBP) +extern int ubp_init(u16 *strategies); +extern int ubp_insert_bkpt(struct task_struct *tsk, struct ubp_bkpt *ubp); +extern unsigned long ubp_get_bkpt_addr(struct pt_regs *regs); +extern int ubp_pre_sstep(struct task_struct *tsk, struct ubp_bkpt *ubp, + struct ubp_task_arch_info *tskinfo, struct pt_regs *regs); +extern int ubp_post_sstep(struct task_struct *tsk, struct ubp_bkpt *ubp, + struct ubp_task_arch_info *tskinfo, struct pt_regs *regs); +extern int ubp_cancel_xol(struct task_struct *tsk, struct ubp_bkpt *ubp); +extern int ubp_remove_bkpt(struct task_struct *tsk, struct ubp_bkpt *ubp); +extern int ubp_validate_insn_addr(struct task_struct *tsk, + unsigned long vaddr); +extern void ubp_set_ip(struct pt_regs *regs, unsigned long vaddr); +#else /* CONFIG_UBP */ +static inline int ubp_init(u16 *strategies) +{ + return -ENOSYS; +} +static inline int ubp_insert_bkpt(struct task_struct *tsk, + struct ubp_bkpt *ubp) +{ + return -ENOSYS; +} +static inline unsigned long ubp_get_bkpt_addr(struct pt_regs *regs) +{ + return -ENOSYS; +} +static inline int ubp_pre_sstep(struct task_struct *tsk, + struct ubp_bkpt *ubp, struct ubp_task_arch_info *tskinfo, + struct pt_regs *regs) +{ + return -ENOSYS; +} +static inline int ubp_post_sstep(struct task_struct *tsk, + struct ubp_bkpt *ubp, struct ubp_task_arch_info *tskinfo, + struct pt_regs *regs) +{ + return -ENOSYS; +} +static inline int ubp_cancel_xol(struct task_struct *tsk, + struct ubp_bkpt *ubp) +{ + return -ENOSYS; +} +static inline int ubp_remove_bkpt(struct task_struct *tsk, + struct ubp_bkpt *ubp) +{ + return -ENOSYS; +} +static inline int ubp_validate_insn_addr(struct task_struct *tsk, + unsigned long vaddr) +{ + return -ENOSYS; +} 
+static inline void ubp_set_ip(struct pt_regs *regs, unsigned long vaddr) +{ +} +#endif /* CONFIG_UBP */ + +#ifdef UBP_IMPLEMENTATION +/** + * struct ubp_arch_info - architecture-specific parameters and functions + * + * Most architectures can use the default versions of @read_opcode(), + * @set_bkpt(), @set_orig_insn(), and @is_bkpt_insn(); ia64 is an + * exception. All functions (including @validate_address()) can assume + * that the caller has verified that the probepoint's virtual address + * resides in an executable VM area. + * + * @bkpt_insn: + * The architecture's breakpoint instruction. This is used by + * the default versions of @set_bkpt(), @set_orig_insn(), and + * @is_bkpt_insn(). + * @ip_advancement_by_bkpt_insn: + * The number of bytes the instruction pointer is advanced by + * this architecture's breakpoint instruction. For example, after + * the powerpc trap instruction executes, the ip still points to the + * breakpoint instruction (ip_advancement_by_bkpt_insn = 0); but the + * x86 int3 instruction (1 byte) advances the ip past the int3 + * (ip_advancement_by_bkpt_insn = 1). + * @max_insn_bytes: + * The maximum length, in bytes, of an instruction in this + * architecture. This must be <= UBP_XOL_SLOT_BYTES; + * @strategies: + * Bit-map of %UBP_HNT_* values recognized by this architecture. + * Include %UBP_HNT_INLINE iff this architecture doesn't support + * execution out of line. Include %UBP_HNT_TSKINFO if + * XOL of at least some instructions requires communication of + * per-task state between @pre_xol() and @post_xol(). + * @set_ip: + * Set the instruction pointer in @regs to @vaddr. + * @validate_address: + * Return 0 if @vaddr is a valid instruction address, or a negative + * errno (typically -%EINVAL) otherwise. If you don't provide + * @validate_address(), any address will be accepted. Caller + * guarantees that @vaddr is in an executable VM area. This + * function typically just enforces arch-specific instruction + * alignment. 
+ * @read_opcode: + * For task @tsk, read the opcode at @vaddr and store it in + * @opcode. Return 0 (success) or a negative errno. Defaults to + * @ubp_read_opcode(). + * @set_bkpt: + * For task @tsk, store @bkpt_insn at @ubp->vaddr. Return 0 + * (success) or a negative errno. Defaults to @ubp_set_bkpt(). + * @set_orig_insn: + * For task @tsk, restore the original opcode (@ubp->opcode) at + * @ubp->vaddr. If @check is true, first verify that there's + * actually a breakpoint instruction there. Return 0 (success) or + * a negative errno. Defaults to @ubp_set_orig_insn(). + * @is_bkpt_insn: + * Return %true if @ubp->opcode is @bkpt_insn. Defaults to + * @ubp_is_bkpt_insn(), which just tests (ubp->opcode == + * arch->bkpt_insn). + * @analyze_insn: + * Analyze @ubp->insn. Return 0 if @ubp->insn is an instruction + * you can probe, or a negative errno (typically -%EPERM) + * otherwise. The caller sets @ubp->strategy to %UBP_HNT_INLINE + * to suppress XOL for this instruction (e.g., because we're + * out of XOL slots). If the instruction can be probed but + * can't be executed out of line, set @ubp->strategy to + * %UBP_HNT_INLINE. Otherwise, determine what sort of XOL-related + * fixups @post_xol() (and possibly @pre_xol()) will need + * to do for this instruction, and annotate @ubp accordingly. + * You may modify @ubp->insn (e.g., the x86_64 port does this + * for rip-relative instructions), but if you do so, you should + * retain a copy in @ubp->arch_info in case you have to revert + * to single-stepping inline (see @cancel_xol()). + * @pre_xol: + * Called just before executing the instruction associated + * with @ubp out of line. @ubp->xol_vaddr is the address in + * @tsk's virtual address space where @ubp->insn has been copied. + * @pre_xol() should at least set the instruction pointer in + * @regs to @ubp->xol_vaddr -- which is what the default, + * @ubp_pre_xol(), does. 
If @ubp->strategy includes the + * %UBP_HNT_TSKINFO flag, then @tskinfo points to a per-task + * copy of struct ubp_task_arch_info. + * @post_xol: + * Called after executing the instruction associated with + * @ubp out of line. @post_xol() should perform the fixups + * specified in @ubp->fixups, which includes ensuring that the + * instruction pointer in @regs points at the next instruction in + * the probed instruction stream. @tskinfo is as for @pre_xol(). + * You must provide this function. + * @cancel_xol: + * The instruction associated with @ubp cannot be executed + * out of line after all. (This can happen when XOL slots + * are lazily assigned, and we run out of slots before we + * hit this breakpoint. This function should never be called + * if @analyze_insn() was previously called for @ubp with a + * non-zero value of @ubp->xol_vaddr and with %UBP_HNT_PERMSL + * set in @ubp->strategy.) Adjust @ubp as needed so it can be + * single-stepped inline. Omit this function if you don't need it. 
+ */ + +struct ubp_arch_info { + ubp_opcode_t bkpt_insn; + u8 ip_advancement_by_bkpt_insn; + u8 max_insn_bytes; + u16 strategies; + void (*set_ip)(struct pt_regs *regs, unsigned long vaddr); + int (*validate_address)(struct task_struct *tsk, unsigned long vaddr); + int (*read_opcode)(struct task_struct *tsk, unsigned long vaddr, + ubp_opcode_t *opcode); + int (*set_bkpt)(struct task_struct *tsk, struct ubp_bkpt *ubp); + int (*set_orig_insn)(struct task_struct *tsk, + struct ubp_bkpt *ubp, bool check); + bool (*is_bkpt_insn)(struct ubp_bkpt *ubp); + int (*analyze_insn)(struct task_struct *tsk, struct ubp_bkpt *ubp); + int (*pre_xol)(struct task_struct *tsk, struct ubp_bkpt *ubp, + struct ubp_task_arch_info *tskinfo, + struct pt_regs *regs); + int (*post_xol)(struct task_struct *tsk, struct ubp_bkpt *ubp, + struct ubp_task_arch_info *tskinfo, + struct pt_regs *regs); + void (*cancel_xol)(struct task_struct *tsk, struct ubp_bkpt *ubp); +}; + +/* Unexported functions & macros for use by arch-specific code */ +#define ubp_opcode_sz ((unsigned int)(sizeof(ubp_opcode_t))) +extern int ubp_read_vm(struct task_struct *tsk, unsigned long vaddr, + void *kbuf, int nbytes); +extern int ubp_write_data(struct task_struct *tsk, unsigned long vaddr, + const void *kbuf, int nbytes); + +extern struct ubp_arch_info ubp_arch_info; + +#endif /* UBP_IMPLEMENTATION */ + +#endif /* _LINUX_UBP_H */ Index: uprobes.git/kernel/ubp_core.c =================================================================== --- /dev/null +++ uprobes.git/kernel/ubp_core.c @@ -0,0 +1,479 @@ +/* + * User-space BreakPoint support (ubp) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2008, 2009 + */ + +#define UBP_IMPLEMENTATION 1 + +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * TODO: Resolve verbosity. ubp_insert_bkpt() is the only function + * that reports failures via printk. + */ + +static struct ubp_arch_info *arch = &ubp_arch_info; + +static bool ubp_uses_xol(u16 strategy) +{ + return !(strategy & UBP_HNT_INLINE); +} + +static bool validate_strategy(u16 strategy, u16 valid_bits) +{ + return ((strategy & (~valid_bits)) == 0); +} + +/** + * ubp_init - initialize the ubp data structures + * @strategies indicates which breakpoint-related strategies are + * supported by the client: + * %UBP_HNT_INLINE: Client supports only single-stepping inline. + * Otherwise client must provide an instruction slot + * (UBP_XOL_SLOT_BYTES bytes) in the probed process's address + * space for each instruction to be executed out of line. + * %UBP_HNT_TSKINFO: Client can provide and maintain one + * @ubp_task_arch_info object for each probed task. (Failure to + * support this will prevent XOL of rip-relative instructions on + * x86_64, at least.) + * Upon return, @strategies is updated to reflect those strategies + * required by this particular architecture's implementation of ubp: + * %UBP_HNT_INLINE: Architecture or client supports only + * single-stepping inline. 
+ * %UBP_HNT_TSKINFO: Architecture uses @ubp_task_arch_info, and will + * expect it to be passed to @ubp_pre_sstep() and @ubp_post_sstep() + * as needed (see @ubp_insert_bkpt()). + * Possible errors: + * -%ENOSYS: ubp not supported for this architecture. + * -%EINVAL: unrecognized flags in @strategies + */ +int ubp_init(u16 *strategies) +{ + u16 inline_bit, tskinfo_bit; + u16 client_strategies = *strategies; + + if (!validate_strategy(client_strategies, + UBP_HNT_INLINE | UBP_HNT_TSKINFO)) + return -EINVAL; + + inline_bit = (client_strategies | arch->strategies) & UBP_HNT_INLINE; + tskinfo_bit = (client_strategies & arch->strategies) & UBP_HNT_TSKINFO; + *strategies = (inline_bit | tskinfo_bit); + return 0; +} + +/* + * Read @nbytes at @vaddr from @tsk into @kbuf. Return number of bytes read. + * Not exported, but available for use by arch-specific ubp code. + */ +int ubp_read_vm(struct task_struct *tsk, unsigned long vaddr, + void *kbuf, int nbytes) +{ + if (tsk == current) { + int nleft = copy_from_user(kbuf, (void __user *) vaddr, + nbytes); + return nbytes - nleft; + } else + return access_process_vm(tsk, vaddr, kbuf, nbytes, 0); +} + +/* + * Write @nbytes from @kbuf at @vaddr in @tsk. Return number of bytes written. + * Can be used to write to stack or data VM areas, but not instructions. + * Not exported, but available for use by arch-specific ubp code. + */ +int ubp_write_data(struct task_struct *tsk, unsigned long vaddr, + const void *kbuf, int nbytes) +{ + int nleft; + + if (tsk == current) { + nleft = copy_to_user((void __user *) vaddr, kbuf, nbytes); + return nbytes - nleft; + } else + return access_process_vm(tsk, vaddr, (void *) kbuf, + nbytes, 1); +} + +static int ubp_write_opcode(struct task_struct *tsk, unsigned long vaddr, + ubp_opcode_t opcode) +{ + int result; + + result = access_process_vm(tsk, vaddr, &opcode, ubp_opcode_sz, 1); + return (result == ubp_opcode_sz ? 
0 : -EFAULT); +} + +/* Default implementation of arch->read_opcode */ +static int ubp_read_opcode(struct task_struct *tsk, unsigned long vaddr, + ubp_opcode_t *opcode) +{ + int bytes_read; + + bytes_read = ubp_read_vm(tsk, vaddr, opcode, ubp_opcode_sz); + return (bytes_read == ubp_opcode_sz ? 0 : -EFAULT); +} + +/* Default implementation of arch->set_bkpt */ +static int ubp_set_bkpt(struct task_struct *tsk, struct ubp_bkpt *ubp) +{ + return ubp_write_opcode(tsk, ubp->vaddr, arch->bkpt_insn); +} + +/* Default implementation of arch->set_orig_insn */ +static int ubp_set_orig_insn(struct task_struct *tsk, struct ubp_bkpt *ubp, + bool check) +{ + if (check) { + ubp_opcode_t opcode; + int result = arch->read_opcode(tsk, ubp->vaddr, &opcode); + if (result) + return result; + if (opcode != arch->bkpt_insn) + return -EINVAL; + } + return ubp_write_opcode(tsk, ubp->vaddr, ubp->opcode); +} + +/* Return 0 if vaddr is in an executable VM area, or -EINVAL otherwise. */ +static inline int ubp_check_vma(struct task_struct *tsk, unsigned long vaddr) +{ + struct vm_area_struct *vma; + struct mm_struct *mm; + int ret = -EINVAL; + + mm = get_task_mm(tsk); + if (!mm) + return -EINVAL; + down_read(&mm->mmap_sem); + vma = find_vma(mm, vaddr); + if (vma && vaddr >= vma->vm_start && (vma->vm_flags & VM_EXEC)) + ret = 0; + up_read(&mm->mmap_sem); + mmput(mm); + return ret; +} + +/** + * ubp_validate_insn_addr - Validate that the instruction address + * lies in an executable vma. + * Returns 0 if the vaddr is a valid instruction address. + * @tsk: the probed task + * @vaddr: virtual address of the instruction to be verified. + * + * Possible errors: + * -%EINVAL: @vaddr is not a valid instruction address. 
+ */ +int ubp_validate_insn_addr(struct task_struct *tsk, unsigned long vaddr) +{ + int result; + + result = ubp_check_vma(tsk, vaddr); + if (result != 0) + return result; + if (arch->validate_address) + result = arch->validate_address(tsk, vaddr); + return result; +} + +static void ubp_bkpt_insertion_failed(struct task_struct *tsk, + struct ubp_bkpt *ubp, const char *why) +{ + printk(KERN_ERR "Can't place breakpoint at pid %d vaddr %#lx: %s\n", + tsk->pid, ubp->vaddr, why); +} + +/** + * ubp_insert_bkpt - insert breakpoint + * Insert a breakpoint into the process that includes @tsk, at the + * virtual address @ubp->vaddr. + * + * @ubp->strategy affects how this breakpoint will be handled: + * %UBP_HNT_INLINE: Probed instruction will be single-stepped inline. + * %UBP_HNT_TSKINFO: As above. + * %UBP_HNT_PERMSL: An XOL instruction slot in the probed process's + * address space has been allocated to this probepoint, and will + * remain so allocated as long as it's needed. @ubp->xol_vaddr is + * its address. (This slot can be reallocated if + * @ubp_insert_bkpt() fails.) The client is NOT required to + * allocate an instruction slot before calling @ubp_insert_bkpt(). + * @ubp_insert_bkpt() updates @ubp->strategy as needed: + * %UBP_HNT_INLINE: Architecture or client cannot do XOL for this + * probepoint. + * %UBP_HNT_TSKINFO: @ubp_task_arch_info will be used for this + * probepoint. + * + * All threads of the probed process must be stopped while + * @ubp_insert_bkpt() runs. + * + * Possible errors: + * -%ENOSYS: ubp not supported for this architecture + * -%EINVAL: unrecognized/invalid strategy flags + * -%EINVAL: invalid instruction address + * -%EEXIST: breakpoint instruction already exists at that address + * -%EPERM: cannot probe this instruction + * -%EFAULT: failed to insert breakpoint instruction + * [TBD: Validate xol_vaddr?] 
+ */ +int ubp_insert_bkpt(struct task_struct *tsk, struct ubp_bkpt *ubp) +{ + int result, len; + + BUG_ON(!tsk || !ubp); + if (!validate_strategy(ubp->strategy, UBP_HNT_MASK)) + return -EINVAL; + + result = ubp_validate_insn_addr(tsk, ubp->vaddr); + if (result != 0) + return result; + + /* + * If ubp_read_vm() transfers fewer bytes than the maximum + * instruction size, assume that the probed instruction is smaller + * than the max and near the end of the last page of instructions. + * But there must be room at least for a breakpoint-size instruction. + */ + len = ubp_read_vm(tsk, ubp->vaddr, ubp->insn, arch->max_insn_bytes); + if (len < ubp_opcode_sz) { + ubp_bkpt_insertion_failed(tsk, ubp, + "error reading original instruction"); + return -EFAULT; + } + memcpy(&ubp->opcode, ubp->insn, ubp_opcode_sz); + if (arch->is_bkpt_insn(ubp)) { + ubp_bkpt_insertion_failed(tsk, ubp, + "bkpt already exists at that addr"); + return -EEXIST; + } + + result = arch->analyze_insn(tsk, ubp); + if (result < 0) { + ubp_bkpt_insertion_failed(tsk, ubp, + "instruction type cannot be probed"); + return result; + } + + result = arch->set_bkpt(tsk, ubp); + if (result < 0) { + ubp_bkpt_insertion_failed(tsk, ubp, + "failed to insert bkpt instruction"); + return result; + } + return 0; +} + +/** + * ubp_pre_sstep - prepare to single-step the probed instruction + * @tsk: the probed task + * @ubp: the probepoint information, as returned by @ubp_insert_bkpt(). + * Unless the %UBP_HNT_INLINE flag is set in @ubp->strategy, + * @ubp->xol_vaddr must be the address of an XOL instruction slot + * that is allocated to this probepoint at least until after the + * completion of @ubp_post_sstep(), and populated with the contents + * of @ubp->insn. [Need to be more precise here to account for + * untimely exit or UBP_HNT_BOOSTED.] + * @tskinfo: points to a @ubp_task_arch_info object for @tsk, if + * the %UBP_HNT_TSKINFO flag is set in @ubp->strategy. + * @regs: reflects the saved user state of @tsk. 
@ubp_pre_sstep() + * adjusts this. In particular, the instruction pointer is set + * to the instruction to be single-stepped. + * Possible errors: + * -%EFAULT: Failed to read or write @tsk's address space as needed. + * + * The client must ensure that the contents of @ubp are not + * changed during the single-step operation -- i.e., between when + * @ubp_pre_sstep() is called and when @ubp_post_sstep() returns. + * Additionally, if single-stepping inline is used for this probepoint, + * the client must serialize the single-step operation (so multiple + * threads don't step on each other while the opcode replacement is + * taking place). + */ +int ubp_pre_sstep(struct task_struct *tsk, struct ubp_bkpt *ubp, + struct ubp_task_arch_info *tskinfo, struct pt_regs *regs) +{ + int result; + + BUG_ON(!tsk || !ubp || !regs); + if (ubp_uses_xol(ubp->strategy)) { + BUG_ON(!ubp->xol_vaddr); + return arch->pre_xol(tsk, ubp, tskinfo, regs); + } + + /* + * Single-step this instruction inline. Replace the breakpoint + * with the original opcode. + */ + result = arch->set_orig_insn(tsk, ubp, false); + if (result == 0) + arch->set_ip(regs, ubp->vaddr); + return result; +} + +/** + * ubp_post_sstep - prepare to resume execution after single-step + * @tsk: the probed task + * @ubp: the probepoint information, as with @ubp_pre_sstep() + * @tskinfo: the @ubp_task_arch_info object, if any, passed to + * @ubp_pre_sstep() + * @regs: reflects the saved state of @tsk after the single-step + * operation. @ubp_post_sstep() adjusts @tsk's state as needed, + * including pointing the instruction pointer at the instruction + * following the probed instruction. + * Possible errors: + * -%EFAULT: Failed to read or write @tsk's address space as needed. 
+ */ +int ubp_post_sstep(struct task_struct *tsk, struct ubp_bkpt *ubp, + struct ubp_task_arch_info *tskinfo, struct pt_regs *regs) +{ + BUG_ON(!tsk || !ubp || !regs); + if (ubp_uses_xol(ubp->strategy)) + return arch->post_xol(tsk, ubp, tskinfo, regs); + + /* + * Single-stepped this instruction inline. Put the breakpoint + * instruction back. + */ + return arch->set_bkpt(tsk, ubp); +} + +/** + * ubp_cancel_xol - cancel XOL for this probepoint + * @tsk: a task in the probed process + * @ubp: the probepoint information + * Switch @ubp's single-stepping strategy from out-of-line to inline. + * If the client employs lazy XOL-slot allocation, it can call + * this function if it determines that it can't provide an XOL + * slot for @ubp. @ubp_cancel_xol() adjusts @ubp appropriately. + * + * @ubp_cancel_xol()'s behavior is undefined if @ubp_pre_sstep() has + * already been called for @ubp. + * + * Possible errors: + * Can't think of any yet. + */ +int ubp_cancel_xol(struct task_struct *tsk, struct ubp_bkpt *ubp) +{ + if (arch->cancel_xol) + arch->cancel_xol(tsk, ubp); + ubp->strategy |= UBP_HNT_INLINE; + return 0; +} + +/** + * ubp_get_bkpt_addr - compute address of bkpt given post-bkpt regs + * @regs: Reflects the saved state of the task after it has hit a breakpoint + * instruction. Return the address of the breakpoint instruction. + */ +unsigned long ubp_get_bkpt_addr(struct pt_regs *regs) +{ + return instruction_pointer(regs) - arch->ip_advancement_by_bkpt_insn; +} + +/** + * ubp_remove_bkpt - remove breakpoint + * For the process that includes @tsk, remove the breakpoint specified + * by @ubp, restoring the original opcode. + * + * Possible errors: + * -%EINVAL: @ubp->vaddr is not a valid instruction address. + * -%ENOENT: There is no breakpoint instruction at @ubp->vaddr. + * -%EFAULT: Failed to read/write @tsk's address space as needed. 
+ */ +int ubp_remove_bkpt(struct task_struct *tsk, struct ubp_bkpt *ubp) +{ + if (ubp_validate_insn_addr(tsk, ubp->vaddr) != 0) + return -EINVAL; + return arch->set_orig_insn(tsk, ubp, true); +} + +void ubp_set_ip(struct pt_regs *regs, unsigned long vaddr) +{ + arch->set_ip(regs, vaddr); +} + +/* Default implementation of arch->is_bkpt_insn */ +static bool ubp_is_bkpt_insn(struct ubp_bkpt *ubp) +{ + return (ubp->opcode == arch->bkpt_insn); +} + +/* Default implementation of arch->pre_xol */ +static int ubp_pre_xol(struct task_struct *tsk, struct ubp_bkpt *ubp, + struct ubp_task_arch_info *tskinfo, struct pt_regs *regs) +{ + arch->set_ip(regs, ubp->xol_vaddr); + return 0; +} + +/* Validate arch-specific info during ubp initialization. */ + +static int ubp_bad_arch_param(const char *param_name, int value) +{ + printk(KERN_ERR "ubp: bad value %d/%#x for parameter %s" + " in ubp_arch_info\n", value, value, param_name); + return -ENOSYS; +} + +static int ubp_missing_arch_func(const char *func_name) +{ + printk(KERN_ERR "ubp: ubp_arch_info lacks required function: %s\n", + func_name); + return -ENOSYS; +} + +static int __init init_ubp(void) +{ + int result = 0; + + /* Accept any value of bkpt_insn. */ + if (arch->max_insn_bytes < 1) + result = ubp_bad_arch_param("max_insn_bytes", + arch->max_insn_bytes); + if (arch->ip_advancement_by_bkpt_insn > arch->max_insn_bytes) + result = ubp_bad_arch_param("ip_advancement_by_bkpt_insn", + arch->ip_advancement_by_bkpt_insn); + /* Accept any value of strategies. */ + if (!arch->set_ip) + result = ubp_missing_arch_func("set_ip"); + /* Null validate_address() is OK. 
*/
+	if (!arch->read_opcode)
+		arch->read_opcode = ubp_read_opcode;
+	if (!arch->set_bkpt)
+		arch->set_bkpt = ubp_set_bkpt;
+	if (!arch->set_orig_insn)
+		arch->set_orig_insn = ubp_set_orig_insn;
+	if (!arch->is_bkpt_insn)
+		arch->is_bkpt_insn = ubp_is_bkpt_insn;
+	if (!arch->analyze_insn)
+		result = ubp_missing_arch_func("analyze_insn");
+	if (!arch->pre_xol)
+		arch->pre_xol = ubp_pre_xol;
+	if (ubp_uses_xol(arch->strategies) && !arch->post_xol)
+		result = ubp_missing_arch_func("post_xol");
+	/* Null cancel_xol() is OK. */
+	return result;
+}
+
+module_init(init_ubp);
Index: uprobes.git/kernel/Makefile
===================================================================
--- uprobes.git.orig/kernel/Makefile
+++ uprobes.git/kernel/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_FUNCTION_TRACER) += trace/
 obj-$(CONFIG_TRACING) += trace/
 obj-$(CONFIG_SMP) += sched_cpupri.o
 obj-$(CONFIG_SLOW_WORK) += slow-work.o
+obj-$(CONFIG_UBP) += ubp_core.o

 ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
 # According to Alan Modra , the -fno-omit-frame-pointer is


From srikar at linux.vnet.ibm.com  Thu Jun 11 16:12:09 2009
From: srikar at linux.vnet.ibm.com (Srikar Dronamraju)
Date: Thu, 11 Jun 2009 21:42:09 +0530
Subject: [RESEND] [PATCH 2/7] x86 support for UBP
In-Reply-To: <20090611160539.GA20668@linux.vnet.ibm.com>
References: <20090611160539.GA20668@linux.vnet.ibm.com>
Message-ID: <20090611161209.GB21218@linux.vnet.ibm.com>

x86 support for the user breakpoint infrastructure

This patch provides the x86-specific implementation of userspace
breakpoint assistance.
This patch requires the "x86: instruction decoder API" patch.
http://lkml.org/lkml/2009/6/1/459 Signed-off-by: Jim Keniston --- arch/x86/Kconfig | 1 arch/x86/include/asm/ubp.h | 40 +++ arch/x86/kernel/Makefile | 2 arch/x86/kernel/ubp_x86.c | 571 +++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 614 insertions(+) Index: uprobes.git/arch/x86/Kconfig =================================================================== --- uprobes.git.orig/arch/x86/Kconfig +++ uprobes.git/arch/x86/Kconfig @@ -46,6 +46,7 @@ config X86 select HAVE_KERNEL_GZIP select HAVE_KERNEL_BZIP2 select HAVE_KERNEL_LZMA + select HAVE_UBP config ARCH_DEFCONFIG string Index: uprobes.git/arch/x86/include/asm/ubp.h =================================================================== --- /dev/null +++ uprobes.git/arch/x86/include/asm/ubp.h @@ -0,0 +1,40 @@ +#ifndef _ASM_UBP_H +#define _ASM_UBP_H +/* + * User-space BreakPoint support (ubp) for x86 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. 
+ * + * Copyright (C) IBM Corporation, 2008, 2009 + */ + +typedef u8 ubp_opcode_t; +#define MAX_UINSN_BYTES 16 +#define UBP_XOL_SLOT_BYTES (MAX_UINSN_BYTES) + +#ifdef CONFIG_X86_64 +struct ubp_bkpt_arch_info { + unsigned long rip_target_address; + u8 orig_insn[MAX_UINSN_BYTES]; +}; +struct ubp_task_arch_info { + unsigned long saved_scratch_register; +}; +#else +struct ubp_bkpt_arch_info {}; +struct ubp_task_arch_info {}; +#endif + +#endif /* _ASM_UBP_H */ Index: uprobes.git/arch/x86/kernel/Makefile =================================================================== --- uprobes.git.orig/arch/x86/kernel/Makefile +++ uprobes.git/arch/x86/kernel/Makefile @@ -109,6 +109,8 @@ obj-$(CONFIG_X86_CHECK_BIOS_CORRUPTION) obj-$(CONFIG_SWIOTLB) += pci-swiotlb.o +obj-$(CONFIG_UBP) += ubp_x86.o + ### # 64 bit specific files ifeq ($(CONFIG_X86_64),y) Index: uprobes.git/arch/x86/kernel/ubp_x86.c =================================================================== --- /dev/null +++ uprobes.git/arch/x86/kernel/ubp_x86.c @@ -0,0 +1,571 @@ +/* + * User-space BreakPoint support (ubp) for x86 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. 
+ * + * Copyright (C) IBM Corporation, 2008, 2009 + */ + +#define UBP_IMPLEMENTATION 1 + +#include +#include +#include +#include +#include + +#ifdef CONFIG_X86_32 +#define is_32bit_app(tsk) 1 +#else +#define is_32bit_app(tsk) (test_tsk_thread_flag(tsk, TIF_IA32)) +#endif + +#define UBP_FIX_RIP_AX 0x8000 +#define UBP_FIX_RIP_CX 0x4000 + +static void set_ip(struct pt_regs *regs, unsigned long vaddr) +{ + regs->ip = vaddr; +} + +#ifdef CONFIG_X86_64 +static bool is_riprel_insn(struct ubp_bkpt *ubp) +{ + return ((ubp->fixups & (UBP_FIX_RIP_AX | UBP_FIX_RIP_CX)) != 0); +} + +static void cancel_xol(struct task_struct *tsk, struct ubp_bkpt *ubp) +{ + if (is_riprel_insn(ubp)) { + /* + * We rewrote ubp->insn to use indirect addressing rather + * than rip-relative addressing for XOL. For + * single-stepping inline, put back the original instruction. + */ + memcpy(ubp->insn, ubp->arch_info.orig_insn, MAX_UINSN_BYTES); + ubp->strategy &= ~UBP_HNT_TSKINFO; + } +} +#endif /* CONFIG_X86_64 */ + +#define W(row, b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, bf)\ + (((b0##UL << 0x0)|(b1##UL << 0x1)|(b2##UL << 0x2)|(b3##UL << 0x3) | \ + (b4##UL << 0x4)|(b5##UL << 0x5)|(b6##UL << 0x6)|(b7##UL << 0x7) | \ + (b8##UL << 0x8)|(b9##UL << 0x9)|(ba##UL << 0xa)|(bb##UL << 0xb) | \ + (bc##UL << 0xc)|(bd##UL << 0xd)|(be##UL << 0xe)|(bf##UL << 0xf)) \ + << (row % 32)) + +static const u32 good_insns_64[256 / 32] = { + /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */ + /* ---------------------------------------------- */ + W(0x00, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0) | /* 00 */ + W(0x10, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0) , /* 10 */ + W(0x20, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0) | /* 20 */ + W(0x30, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0) , /* 30 */ + W(0x40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* 40 */ + W(0x50, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 50 */ + W(0x60, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) | /* 60 */ 
+ W(0x70, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 70 */ + W(0x80, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 80 */ + W(0x90, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 90 */ + W(0xa0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* a0 */ + W(0xb0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* b0 */ + W(0xc0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0) | /* c0 */ + W(0xd0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* d0 */ + W(0xe0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) | /* e0 */ + W(0xf0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1) /* f0 */ + /* ---------------------------------------------- */ + /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */ +}; + +/* Good-instruction tables for 32-bit apps -- copied from i386 uprobes */ + +static const u32 good_insns_32[256 / 32] = { + /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */ + /* ---------------------------------------------- */ + W(0x00, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0) | /* 00 */ + W(0x10, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0) , /* 10 */ + W(0x20, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1) | /* 20 */ + W(0x30, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1) , /* 30 */ + W(0x40, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 40 */ + W(0x50, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 50 */ + W(0x60, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) | /* 60 */ + W(0x70, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 70 */ + W(0x80, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 80 */ + W(0x90, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 90 */ + W(0xa0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* a0 */ + W(0xb0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* b0 */ + W(0xc0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0) | /* c0 */ + W(0xd0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* d0 */ + W(0xe0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) | /* e0 */ + W(0xf0, 0, 0, 1, 1, 0, 
1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1) /* f0 */ + /* ---------------------------------------------- */ + /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */ +}; + +/* Using this for both 64-bit and 32-bit apps */ +static const u32 good_2byte_insns[256 / 32] = { + /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */ + /* ---------------------------------------------- */ + W(0x00, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1) | /* 00 */ + W(0x10, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1) , /* 10 */ + W(0x20, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1) | /* 20 */ + W(0x30, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 30 */ + W(0x40, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 40 */ + W(0x50, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 50 */ + W(0x60, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 60 */ + W(0x70, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1) , /* 70 */ + W(0x80, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 80 */ + W(0x90, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 90 */ + W(0xa0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1) | /* a0 */ + W(0xb0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1) , /* b0 */ + W(0xc0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* c0 */ + W(0xd0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* d0 */ + W(0xe0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* e0 */ + W(0xf0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0) /* f0 */ + /* ---------------------------------------------- */ + /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */ +}; + +/* + * opcodes we'll probably never support: + * 6c-6d, e4-e5, ec-ed - in + * 6e-6f, e6-e7, ee-ef - out + * cc, cd - int3, int + * cf - iret + * d6 - illegal instruction + * f1 - int1/icebp + * f4 - hlt + * fa, fb - cli, sti + * 0f - lar, lsl, syscall, clts, sysret, sysenter, sysexit, invd, wbinvd, ud2 + * + * invalid opcodes in 64-bit mode: + * 06, 0e, 16, 1e, 27, 2f, 37, 3f, 60-62, 82, c4-c5, d4-d5 + * + * 63 - we support this opcode in x86_64 but not 
in i386. + * + * opcodes we may need to refine support for: + * 0f - 2-byte instructions: For many of these instructions, the validity + * depends on the prefix and/or the reg field. On such instructions, we + * just consider the opcode combination valid if it corresponds to any + * valid instruction. + * 8f - Group 1 - only reg = 0 is OK + * c6-c7 - Group 11 - only reg = 0 is OK + * d9-df - fpu insns with some illegal encodings + * f2, f3 - repnz, repz prefixes. These are also the first byte for + * certain floating-point instructions, such as addsd. + * fe - Group 4 - only reg = 0 or 1 is OK + * ff - Group 5 - only reg = 0-6 is OK + * + * others -- Do we need to support these? + * 0f - (floating-point?) prefetch instructions + * 07, 17, 1f - pop es, pop ss, pop ds + * 26, 2e, 36, 3e - es:, cs:, ss:, ds: segment prefixes -- + * but 64 and 65 (fs: and gs:) seem to be used, so we support them + * 67 - addr16 prefix + * ce - into + * f0 - lock prefix + */ + +/* + * TODO: + * - Where necessary, examine the modrm byte and allow only valid instructions + * in the different Groups and fpu instructions. 
+ */
+
+static bool is_prefix_bad(struct insn *insn)
+{
+	int i;
+
+	for (i = 0; i < insn->prefixes.nbytes; i++) {
+		switch (insn->prefixes.bytes[i]) {
+		case 0x26:	/* INAT_PFX_ES */
+		case 0x2E:	/* INAT_PFX_CS */
+		case 0x36:	/* INAT_PFX_SS */
+		case 0x3E:	/* INAT_PFX_DS */
+		case 0xF0:	/* INAT_PFX_LOCK */
+			return true;
+		}
+	}
+	return false;
+}
+
+static void report_bad_prefix(void)
+{
+	printk(KERN_ERR "ubp does not currently support probing "
+		"instructions with any of the following prefixes: "
+		"cs:, ds:, es:, ss:, lock:\n");
+}
+
+static void report_bad_1byte_opcode(int mode, ubp_opcode_t op)
+{
+	printk(KERN_ERR "In %d-bit apps, "
+		"ubp does not currently support probing "
+		"instructions whose first byte is 0x%2.2x\n", mode, op);
+}
+
+static void report_bad_2byte_opcode(ubp_opcode_t op)
+{
+	printk(KERN_ERR "ubp does not currently support probing "
+		"instructions with the 2-byte opcode 0x0f 0x%2.2x\n", op);
+}
+
+static int validate_insn_32bits(struct ubp_bkpt *ubp, struct insn *insn)
+{
+	insn_init(insn, ubp->insn, false);
+
+	/* Skip good instruction prefixes; reject "bad" ones. */
+	insn_get_opcode(insn);
+	if (is_prefix_bad(insn)) {
+		report_bad_prefix();
+		return -EPERM;
+	}
+	if (test_bit(OPCODE1(insn), (unsigned long *) good_insns_32))
+		return 0;
+	if (insn->opcode.nbytes == 2) {
+		if (test_bit(OPCODE2(insn),
+					(unsigned long *) good_2byte_insns))
+			return 0;
+		report_bad_2byte_opcode(OPCODE2(insn));
+	} else
+		report_bad_1byte_opcode(32, OPCODE1(insn));
+	return -EPERM;
+}
+
+static int validate_insn_64bits(struct ubp_bkpt *ubp, struct insn *insn)
+{
+	insn_init(insn, ubp->insn, true);
+
+	/* Skip good instruction prefixes; reject "bad" ones.
*/ + insn_get_opcode(insn); + if (is_prefix_bad(insn)) { + report_bad_prefix(); + return -EPERM; + } + if (test_bit(OPCODE1(insn), (unsigned long *) good_insns_64)) + return 0; + if (insn->opcode.nbytes == 2) { + if (test_bit(OPCODE2(insn), + (unsigned long *) good_2byte_insns)) + return 0; + report_bad_2byte_opcode(OPCODE2(insn)); + } else + report_bad_1byte_opcode(64, OPCODE1(insn)); + return -EPERM; +} + +/* + * Figure out which fixups post_xol() will need to perform, and annotate + * ubp->fixups accordingly. To start with, ubp->fixups is either zero or + * it reflects rip-related fixups. + */ +static void prepare_fixups(struct ubp_bkpt *ubp, struct insn *insn) +{ + bool fix_ip = true, fix_call = false; /* defaults */ + insn_get_opcode(insn); /* should be a nop */ + + switch (OPCODE1(insn)) { + case 0xc3: /* ret/lret */ + case 0xcb: + case 0xc2: + case 0xca: + /* ip is correct */ + fix_ip = false; + break; + case 0xe8: /* call relative - Fix return addr */ + fix_call = true; + break; + case 0x9a: /* call absolute - Fix return addr, not ip */ + fix_call = true; + fix_ip = false; + break; + case 0xff: + { + int reg; + insn_get_modrm(insn); + reg = MODRM_REG(insn); + if (reg == 2 || reg == 3) { + /* call or lcall, indirect */ + /* Fix return addr; ip is correct. */ + fix_call = true; + fix_ip = false; + } else if (reg == 4 || reg == 5) { + /* jmp or ljmp, indirect */ + /* ip is correct. 
*/ + fix_ip = false; + } + break; + } + case 0xea: /* jmp absolute -- ip is correct */ + fix_ip = false; + break; + default: + break; + } + if (fix_ip) + ubp->fixups |= UBP_FIX_IP; + if (fix_call) + ubp->fixups |= UBP_FIX_CALL; +} + +#ifdef CONFIG_X86_64 +static int handle_riprel_insn(struct ubp_bkpt *ubp, struct insn *insn); +#endif + +static int analyze_insn(struct task_struct *tsk, struct ubp_bkpt *ubp) +{ + int ret; + struct insn insn; + + ubp->fixups = 0; +#ifdef CONFIG_X86_64 + ubp->arch_info.rip_target_address = 0x0; +#endif + + if (is_32bit_app(tsk)) { + ret = validate_insn_32bits(ubp, &insn); + if (ret != 0) + return ret; + } else { + ret = validate_insn_64bits(ubp, &insn); + if (ret != 0) + return ret; + } + if (ubp->strategy & UBP_HNT_INLINE) + return 0; +#ifdef CONFIG_X86_64 + ret = handle_riprel_insn(ubp, &insn); + if (ret == -1) + /* rip-relative; can't XOL */ + return 0; + else if (ret == 0) + /* not rip-relative */ + ubp->strategy &= ~UBP_HNT_TSKINFO; +#endif + prepare_fixups(ubp, &insn); + return 0; +} + +#ifdef CONFIG_X86_64 +/* + * If ubp->insn doesn't use rip-relative addressing, return 0. Otherwise, + * rewrite the instruction so that it accesses its memory operand + * indirectly through a scratch register. Set ubp->fixups and + * ubp->arch_info.rip_target_address accordingly. (The contents of the + * scratch register will be saved before we single-step the modified + * instruction, and restored afterward.) Return 1. + * + * (... except if the client doesn't support our UBP_HNT_TSKINFO strategy, + * we must suppress XOL for rip-relative instructions: return -1.) + * + * We do this because a rip-relative instruction can access only a + * relatively small area (+/- 2 GB from the instruction), and the XOL + * area typically lies beyond that area. At least for instructions + * that store to memory, we can't execute the original instruction + * and "fix things up" later, because the misdirected store could be + * disastrous. 
+ * + * Some useful facts about rip-relative instructions: + * - There's always a modrm byte. + * - There's never a SIB byte. + * - The displacement is always 4 bytes. + */ +static int handle_riprel_insn(struct ubp_bkpt *ubp, struct insn *insn) +{ + u8 *cursor; + u8 reg; + + if (!insn_rip_relative(insn)) + return 0; + + /* + * We have a rip-relative instruction. To allow this instruction + * to be single-stepped out of line, the client must provide us + * with a per-task ubp_task_arch_info object. + */ + if (!(ubp->strategy & UBP_HNT_TSKINFO)) { + ubp->strategy |= UBP_HNT_INLINE; + return -1; + } + memcpy(ubp->arch_info.orig_insn, ubp->insn, MAX_UINSN_BYTES); + + /* + * Point cursor at the modrm byte. The next 4 bytes are the + * displacement. Beyond the displacement, for some instructions, + * is the immediate operand. + */ + cursor = ubp->insn + insn->prefixes.nbytes + insn->rex_prefix.nbytes + + insn->opcode.nbytes; + insn_get_length(insn); + + /* + * Convert from rip-relative addressing to indirect addressing + * via a scratch register. Change the r/m field from 0x5 (%rip) + * to 0x0 (%rax) or 0x1 (%rcx), and squeeze out the offset field. + */ + reg = MODRM_REG(insn); + if (reg == 0) { + /* + * The register operand (if any) is either the A register + * (%rax, %eax, etc.) or (if the 0x4 bit is set in the + * REX prefix) %r8. In any case, we know the C register + * is NOT the register operand, so we use %rcx (register + * #1) for the scratch register. + */ + ubp->fixups = UBP_FIX_RIP_CX; + /* Change modrm from 00 000 101 to 00 000 001. */ + *cursor = 0x1; + } else { + /* Use %rax (register #0) for the scratch register. 
*/ + ubp->fixups = UBP_FIX_RIP_AX; + /* Change modrm from 00 xxx 101 to 00 xxx 000 */ + *cursor = (reg << 3); + } + + /* Target address = address of next instruction + (signed) offset */ + ubp->arch_info.rip_target_address = (long) ubp->vaddr + + insn->length + insn->displacement.value; + /* Displacement field is gone; slide immediate field (if any) over. */ + if (insn->immediate.nbytes) { + cursor++; + memmove(cursor, cursor + insn->displacement.nbytes, + insn->immediate.nbytes); + } + return 1; +} + +/* + * If we're emulating a rip-relative instruction, save the contents + * of the scratch register and store the target address in that register. + */ +static int pre_xol(struct task_struct *tsk, struct ubp_bkpt *ubp, + struct ubp_task_arch_info *tskinfo, struct pt_regs *regs) +{ + BUG_ON(!ubp->xol_vaddr); + regs->ip = ubp->xol_vaddr; + if (ubp->fixups & UBP_FIX_RIP_AX) { + tskinfo->saved_scratch_register = regs->ax; + regs->ax = ubp->arch_info.rip_target_address; + } else if (ubp->fixups & UBP_FIX_RIP_CX) { + tskinfo->saved_scratch_register = regs->cx; + regs->cx = ubp->arch_info.rip_target_address; + } + return 0; +} +#endif + +/* + * Called by post_xol() to adjust the return address pushed by a call + * instruction executed out of line. + */ +static int adjust_ret_addr(struct task_struct *tsk, unsigned long sp, + long correction) +{ + int rasize, ncopied; + long ra = 0; + + if (is_32bit_app(tsk)) + rasize = 4; + else + rasize = 8; + ncopied = ubp_read_vm(tsk, sp, &ra, rasize); + if (unlikely(ncopied != rasize)) + goto fail; + ra += correction; + ncopied = ubp_write_data(tsk, sp, &ra, rasize); + if (unlikely(ncopied != rasize)) + goto fail; + return 0; + +fail: + printk(KERN_ERR + "ubp: Failed to adjust return address after" + " single-stepping call instruction;" + " pid=%d, sp=%#lx\n", tsk->pid, sp); + return -EFAULT; +} + +/* + * Called after single-stepping. 
ubp->vaddr is the address of the + * instruction whose first byte has been replaced by the "int3" + * instruction. To avoid the SMP problems that can occur when we + * temporarily put back the original opcode to single-step, we + * single-stepped a copy of the instruction. The address of this + * copy is ubp->xol_vaddr. + * + * This function prepares to resume execution after the single-step. + * We have to fix things up as follows: + * + * Typically, the new ip is relative to the copied instruction. We need + * to make it relative to the original instruction (FIX_IP). Exceptions + * are return instructions and absolute or indirect jump or call instructions. + * + * If the single-stepped instruction was a call, the return address that + * is atop the stack is the address following the copied instruction. We + * need to make it the address following the original instruction (FIX_CALL). + * + * If the original instruction was a rip-relative instruction such as + * "movl %edx,0xnnnn(%rip)", we have instead executed an equivalent + * instruction using a scratch register -- e.g., "movl %edx,(%rax)". + * We need to restore the contents of the scratch register and adjust + * the ip, keeping in mind that the instruction we executed is 4 bytes + * shorter than the original instruction (since we squeezed out the offset + * field). (FIX_RIP_AX or FIX_RIP_CX) + */ +static int post_xol(struct task_struct *tsk, struct ubp_bkpt *ubp, + struct ubp_task_arch_info *tskinfo, struct pt_regs *regs) +{ + /* Typically, the XOL vma is at a high addr, so correction < 0. */ + long correction = (long) (ubp->vaddr - ubp->xol_vaddr); + int result = 0; + +#ifdef CONFIG_X86_64 + if (is_riprel_insn(ubp)) { + if (ubp->fixups & UBP_FIX_RIP_AX) + regs->ax = tskinfo->saved_scratch_register; + else + regs->cx = tskinfo->saved_scratch_register; + /* + * The original instruction includes a displacement, and so + * is 4 bytes longer than what we've just single-stepped. 
+		 * Fall through to handle stuff like "jmpq *...(%rip)" and
+		 * "callq *...(%rip)".
+		 */
+		correction += 4;
+	}
+#endif
+	if (ubp->fixups & UBP_FIX_IP)
+		regs->ip += correction;
+	if (ubp->fixups & UBP_FIX_CALL)
+		result = adjust_ret_addr(tsk, regs->sp, correction);
+	return result;
+}
+
+struct ubp_arch_info ubp_arch_info = {
+	.bkpt_insn = 0xcc,
+	.ip_advancement_by_bkpt_insn = 1,
+	.max_insn_bytes = MAX_UINSN_BYTES,
+#ifdef CONFIG_X86_32
+	.strategies = 0x0,
+#else
+	/* rip-relative instructions require special handling. */
+	.strategies = UBP_HNT_TSKINFO,
+	.pre_xol = pre_xol,
+	.cancel_xol = cancel_xol,
+#endif
+	.set_ip = set_ip,
+	.analyze_insn = analyze_insn,
+	.post_xol = post_xol,
+};


From srikar at linux.vnet.ibm.com  Thu Jun 11 16:13:27 2009
From: srikar at linux.vnet.ibm.com (Srikar Dronamraju)
Date: Thu, 11 Jun 2009 21:43:27 +0530
Subject: [RESEND] [PATCH 3/7] Execution out of line (XOL)
In-Reply-To: <20090611160539.GA20668@linux.vnet.ibm.com>
References: <20090611160539.GA20668@linux.vnet.ibm.com>
Message-ID: <20090611161327.GC21218@linux.vnet.ibm.com>

Slot allocation mechanism for the execution out of line (XOL) strategy
in the user-space breakpoint infrastructure.

This patch provides a slot allocation mechanism for the execution out
of line strategy, for use with the user-space breakpoint infrastructure.
This patch requires utrace support in the kernel.

This patch provides five functions: xol_get_insn_slot(),
xol_free_insn_slot(), xol_put_area(), xol_get_area() and
xol_validate_vaddr().

Current slot allocation mechanism:

1. Allocate one dedicated slot per user breakpoint.
2. If the allocated vma is completely used, expand the current vma.
3. If we can't expand the vma, allocate a new vma.
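
As a rough illustration of the three allocation steps listed above, here
is a small user-space model in C. It is only a sketch: every name in it
(xol_model_get_slot(), SLOTS_PER_PAGE, MAX_PAGES_PER_AREA and so on) is
made up for this example and is not the interface in kernel/ubp_xol.c.
An array of flags stands in for the vma, and a fixed page limit stands
in for a vma expansion that fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sizes, kept tiny so the policy is easy to exercise. */
#define SLOTS_PER_PAGE     4
#define MAX_PAGES_PER_AREA 2	/* stand-in for "the vma cannot be expanded" */
#define AREA_CAPACITY      (SLOTS_PER_PAGE * MAX_PAGES_PER_AREA)

/* One modelled "vma": a growable run of pages holding instruction slots. */
struct xol_model_area {
	struct xol_model_area *next;
	size_t npages;			/* pages currently backing this area */
	bool used[AREA_CAPACITY];	/* one flag per slot */
};

struct xol_model {
	struct xol_model_area *areas;	/* oldest area first */
	size_t nareas;
};

/* Claim the first free slot within the area's current size, or return -1. */
static long area_claim_slot(struct xol_model_area *area)
{
	size_t i, nslots = area->npages * SLOTS_PER_PAGE;

	for (i = 0; i < nslots; i++) {
		if (!area->used[i]) {
			area->used[i] = true;
			return (long)i;
		}
	}
	return -1;
}

/* Return a slot id (area index * AREA_CAPACITY + slot), or -1 if out of memory. */
long xol_model_get_slot(struct xol_model *m)
{
	struct xol_model_area *area, *last = NULL;
	size_t index = 0;

	/* Step 1: hand out a dedicated slot from an existing area. */
	for (area = m->areas; area; last = area, area = area->next, index++) {
		long slot = area_claim_slot(area);

		if (slot >= 0)
			return (long)(index * AREA_CAPACITY) + slot;
	}

	/* Step 2: every area is full, so try to expand the newest one. */
	if (last && last->npages < MAX_PAGES_PER_AREA) {
		last->npages++;
		return (long)((index - 1) * AREA_CAPACITY) + area_claim_slot(last);
	}

	/* Step 3: expansion is not possible, so allocate a new area. */
	area = calloc(1, sizeof(*area));
	if (!area)
		return -1;
	area->npages = 1;
	if (last)
		last->next = area;
	else
		m->areas = area;
	m->nareas++;
	return (long)(index * AREA_CAPACITY) + area_claim_slot(area);
}

/* Release a slot so that a later breakpoint can reuse it. */
void xol_model_free_slot(struct xol_model *m, long id)
{
	struct xol_model_area *area = m->areas;
	size_t i, index;

	assert(id >= 0);
	index = (size_t)id / AREA_CAPACITY;
	for (i = 0; area && i < index; i++)
		area = area->next;
	assert(area != NULL);
	area->used[(size_t)id % AREA_CAPACITY] = false;
}

/* Free every area; the model is empty and reusable afterwards. */
void xol_model_destroy(struct xol_model *m)
{
	while (m->areas) {
		struct xol_model_area *next = m->areas->next;

		free(m->areas);
		m->areas = next;
	}
	m->nareas = 0;
}
```

With the tiny sizes chosen here, the first four requests fill the
initial page, the next four are served after the area grows to two
pages, and the ninth request forces a second area. A freed slot is
handed out again before any further growth.
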
Signed-off-by: Jim Keniston Signed-off-by: Srikar Dronamraju --- arch/Kconfig | 4 include/linux/ubp_xol.h | 56 ++++ kernel/Makefile | 1 kernel/ubp_xol.c | 627 ++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 688 insertions(+) Index: uprobes.git/arch/Kconfig =================================================================== --- uprobes.git.orig/arch/Kconfig +++ uprobes.git/arch/Kconfig @@ -82,6 +82,10 @@ config KRETPROBES config HAVE_UBP def_bool n +config UBP_XOL + def_bool y + depends on UBP && UTRACE + config HAVE_IOREMAP_PROT bool Index: uprobes.git/include/linux/ubp_xol.h =================================================================== --- /dev/null +++ uprobes.git/include/linux/ubp_xol.h @@ -0,0 +1,56 @@ +#ifndef _LINUX_XOL_H +#define _LINUX_XOL_H +/* + * User-space BreakPoint support (ubp) -- Allocation of instruction + * slots for execution out of line (XOL) + * include/linux/ubp_xol.h + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. 
+ * + * Copyright (C) IBM Corporation, 2009 + */ + + +#if defined(CONFIG_UBP_XOL) +extern unsigned long xol_get_insn_slot(struct ubp_bkpt *ubp, void *xol_area); +extern void xol_free_insn_slot(unsigned long, void *xol_area); +extern int xol_validate_vaddr(struct pid *pid, unsigned long vaddr, + void *xol_area); +extern void *xol_get_area(struct pid *pid); +extern void xol_put_area(void *xol_area); +#else /* CONFIG_UBP_XOL */ +static inline unsigned long xol_get_insn_slot(struct ubp_bkpt *ubp, + void *xol_area) +{ + return 0; +} +static inline void xol_free_insn_slot(unsigned long slot_addr, void *xol_area) +{ +} +static inline int xol_validate_vaddr(struct pid *pid, unsigned long vaddr, + void *xol_area) +{ + return -ENOSYS; +} +static inline void *xol_get_area(struct pid *pid) +{ + return NULL; +} +static inline void xol_put_area(void *xol_area) +{ +} +#endif /* CONFIG_UBP_XOL */ + +#endif /* _LINUX_XOL_H */ Index: uprobes.git/kernel/Makefile =================================================================== --- uprobes.git.orig/kernel/Makefile +++ uprobes.git/kernel/Makefile @@ -97,6 +97,7 @@ obj-$(CONFIG_TRACING) += trace/ obj-$(CONFIG_SMP) += sched_cpupri.o obj-$(CONFIG_SLOW_WORK) += slow-work.o obj-$(CONFIG_UBP) += ubp_core.o +obj-$(CONFIG_UBP_XOL) += ubp_xol.o ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y) # According to Alan Modra , the -fno-omit-frame-pointer is Index: uprobes.git/kernel/ubp_xol.c =================================================================== --- /dev/null +++ uprobes.git/kernel/ubp_xol.c @@ -0,0 +1,627 @@ +/* + * User-space BreakPoint support (ubp) -- Allocation of instruction + * slots for execution out of line (XOL) + * kernel/ubp_xol.c + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2009 + */ + +/* + * Every probepoint gets its own slot. Once it's assigned a slot, it + * keeps that slot until the probepoint goes away. If we run out of + * slots in the XOL vma, we try to expand it by one page. If we can't + * expand it, we allocate an additional vma. Only the probed process + * itself can add or expand vmas. + */ +#include +#include +#include +#include +#include +#include +#include +#include + +#define UINSNS_PER_PAGE (PAGE_SIZE/UBP_XOL_SLOT_BYTES) + +struct ubp_xol_vma { + struct list_head list; + unsigned long *bitmap; /* 0 = free slot */ + + /* + * We keep the vma's vm_start rather than a pointer to the vma + * itself. The probed process or a naughty kernel module could make + * the vma go away, and we must handle that reasonably gracefully. + */ + unsigned long vaddr; /* Page(s) of instruction slots */ + int npages; + int nslots; +}; + +struct ubp_xol_area { + struct list_head vmas; + struct mutex mutex; + + /* + * We ref-count threads and clients. The xol_report_* callbacks + * are all about noticing when the last thread goes away. + */ + struct kref kref; + struct ubp_xol_vma *last_vma; + pid_t tgid; + bool can_expand; +}; + +static const struct utrace_engine_ops xol_engine_ops; +static void xol_free_area(struct kref *kref); + +/* + * xol_mutex allows creation of unique ubp_xol_area. 
+ * Critical region for xol_mutex includes creation and initialization + * of ubp_xol_area and attaching an exclusive engine with + * xol_engine_ops for the thread whose pid is thread group id. + */ +static DEFINE_MUTEX(xol_mutex); + +/** + * xol_put_area - release a reference to ubp_xol_area. + * If this happens to be the last reference, free the ubp_xol_area. + * @xol_area: unique per process ubp_xol_area for this process. + */ +void xol_put_area(void *xol_area) +{ + struct ubp_xol_area *area = (struct ubp_xol_area *) xol_area; + + if (unlikely(!area)) + return; + kref_put(&area->kref, xol_free_area); +} + +/* + * Need unique ubp_xol_area. This is achieved by using utrace engines. + * However code using utrace could be avoided if mm_struct / + * mm_context_t had a pointer to ubp_xol_area. + */ + +/* + * xol_create_engine - add a thread to watch + * xol_create_engine can return these values: + * 0: successfully created an engine. + * -EEXIST: don't bother because an engine already exists for this + * thread. + * -ESRCH: Process or thread is exiting; don't need to create an + * engine. + * -ENOMEM: utrace can't allocate memory for the engine + * + * This function is called holding a reference to pid. + */ +static int xol_create_engine(struct pid *pid, struct ubp_xol_area *area) +{ + struct utrace_engine *engine; + int result; + + engine = utrace_attach_pid(pid, UTRACE_ATTACH_CREATE | + UTRACE_ATTACH_EXCLUSIVE | UTRACE_ATTACH_MATCH_OPS, + &xol_engine_ops, area); + if (IS_ERR(engine)) { + put_pid(pid); + return PTR_ERR(engine); + } + result = utrace_set_events_pid(pid, engine, + UTRACE_EVENT(EXEC) | UTRACE_EVENT(CLONE) | UTRACE_EVENT(EXIT)); + /* + * Since this is the first and only time we set events for this + * engine, there shouldn't be any callbacks in progress. 
+ */ + WARN_ON(result == -EINPROGRESS); + kref_get(&area->kref); + put_pid(pid); + return 0; +} + +/* + * If a thread clones while xol_get_area() is running, it's possible + * for xol_create_engine() to be called both from there and from + * here. No problem, since xol_create_engine() refuses to create (or + * ref-count) a second engine for the same task. + */ +static u32 xol_report_clone(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *parent, + unsigned long clone_flags, + struct task_struct *child) +{ + if (clone_flags & CLONE_THREAD) { + struct pid *child_pid = get_pid(task_pid(child)); + + BUG_ON(!child_pid); + (void)xol_create_engine(child_pid, + (struct ubp_xol_area *) engine->data); + } + return UTRACE_RESUME; +} + +/* + * When a multithreaded app execs, the exec-ing thread reports the + * exec, and the other threads report exit. + */ +static u32 xol_report_exec(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *tsk, + const struct linux_binfmt *fmt, + const struct linux_binprm *bprm, + struct pt_regs *regs) +{ + xol_put_area((struct ubp_xol_area *)engine->data); + return UTRACE_DETACH; +} + +static u32 xol_report_exit(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *tsk, long orig_code, long *code) +{ + xol_put_area((struct ubp_xol_area *)engine->data); + return UTRACE_DETACH; +} + +static const struct utrace_engine_ops xol_engine_ops = { + .report_exit = xol_report_exit, + .report_clone = xol_report_clone, + .report_exec = xol_report_exec +}; + +/* + * @start_pid is the pid for a thread in the traced process. + * Creating engines for a hugely multithreaded process can be + * time consuming. Hence engines for other threads are created + * outside the critical region. 
+ */ +static void create_engine_sibling_threads(struct pid *start_pid, + struct ubp_xol_area *area) +{ + struct task_struct *t, *start; + struct utrace_engine *engine; + struct pid *pid = NULL; + + rcu_read_lock(); + t = start = pid_task(start_pid, PIDTYPE_PID); + if (t) { + do { + if (t->exit_state) { + t = next_thread(t); + continue; + } + + /* + * This doesn't sleep, does minimal error checking. + */ + engine = utrace_attach_task(t, + UTRACE_ATTACH_MATCH_OPS, + &xol_engine_ops, NULL); + if (PTR_ERR(engine) == -ENOENT) { + pid = get_pid(task_pid(t)); + (void)xol_create_engine(pid, area); + } else if (!IS_ERR(engine)) + utrace_engine_put(engine); + + t = next_thread(t); + } while (t != start); + } + rcu_read_unlock(); +} + +/** + * xol_get_area - Get a reference to process's ubp_xol_area. + * If an ubp_xol_area doesn't exist for @tg_leader's process, create + * one. In any case, increment its refcount and return a pointer + * to it. + * @tg_leader: pointer to struct pid of a thread whose tid is the + * thread group id + */ +void *xol_get_area(struct pid *tg_leader) +{ + struct ubp_xol_area *area = NULL; + struct utrace_engine *engine; + struct pid *pid; + int ret; + + pid = get_pid(tg_leader); + mutex_lock(&xol_mutex); + engine = utrace_attach_pid(tg_leader, UTRACE_ATTACH_MATCH_OPS, + &xol_engine_ops, NULL); + if (!IS_ERR(engine)) { + area = engine->data; + utrace_engine_put(engine); + mutex_unlock(&xol_mutex); + goto found_area; + } + + area = kzalloc(sizeof(*area), GFP_USER); + if (unlikely(!area)) { + mutex_unlock(&xol_mutex); + return NULL; + } + mutex_init(&area->mutex); + kref_init(&area->kref); + area->last_vma = NULL; + area->can_expand = true; + area->tgid = pid_task(tg_leader, PIDTYPE_PID)->tgid; + INIT_LIST_HEAD(&area->vmas); + ret = xol_create_engine(pid, area); + mutex_unlock(&xol_mutex); + + if (ret != 0) { + kfree(area); + return NULL; + } + create_engine_sibling_threads(pid, area); + +found_area: + if (likely(area)) + kref_get(&area->kref); + 
return (void *) area; +} + +static void xol_free_area(struct kref *kref) +{ + struct ubp_xol_vma *usv, *tmp; + struct ubp_xol_area *area; + + area = container_of(kref, struct ubp_xol_area, kref); + list_for_each_entry_safe(usv, tmp, &area->vmas, list) { + kfree(usv->bitmap); + kfree(usv); + } + kfree(area); +} + +/* + * Allocate a bitmap for a new vma, or expand an existing bitmap. + * if old_bitmap is non-NULL, xol_realloc_bitmap() never returns + * old_bitmap. + */ +static unsigned long *xol_realloc_bitmap(unsigned long *old_bitmap, + int old_nslots, int new_nslots) +{ + unsigned long *new_bitmap; + + BUG_ON(new_nslots < old_nslots); + + new_bitmap = kzalloc(BITS_TO_LONGS(new_nslots) * sizeof(long), + GFP_USER); + if (!new_bitmap) { + printk(KERN_ERR "ubp_xol: cannot %sallocate bitmap for XOL " + "area for pid/tgid %d/%d\n", (old_bitmap ? "re" : ""), + current->pid, current->tgid); + return NULL; + } + if (old_bitmap) + memcpy(new_bitmap, old_bitmap, + BITS_TO_LONGS(old_nslots) * sizeof(long)); + return new_bitmap; +} + +static struct ubp_xol_vma *xol_alloc_vma(void) +{ + struct ubp_xol_vma *usv; + + usv = kzalloc(sizeof(struct ubp_xol_vma), GFP_USER); + if (!usv) { + printk(KERN_ERR "ubp_xol: cannot allocate kmem for XOL vma" + " for pid/tgid %d/%d\n", current->pid, current->tgid); + return NULL; + } + usv->bitmap = xol_realloc_bitmap(NULL, 0, UINSNS_PER_PAGE); + if (!usv->bitmap) { + kfree(usv); + return NULL; + } + return usv; +} + +static inline struct ubp_xol_vma *xol_add_vma(struct ubp_xol_area *area) +{ + struct vm_area_struct *vma; + struct ubp_xol_vma *usv; + struct mm_struct *mm; + unsigned long addr; + + mm = get_task_mm(current); + if (!mm) + return ERR_PTR(-ESRCH); + + usv = xol_alloc_vma(); + if (!usv) { + mmput(mm); + return ERR_PTR(-ENOMEM); + } + + down_write(&mm->mmap_sem); + /* + * Find the end of the top mapping and skip a page. + * If there is no space for PAGE_SIZE above + * that, mmap will ignore our address hint. 
+ */ + vma = rb_entry(rb_last(&mm->mm_rb), struct vm_area_struct, vm_rb); + addr = vma->vm_end + PAGE_SIZE; + addr = do_mmap_pgoff(NULL, addr, PAGE_SIZE, PROT_EXEC, + MAP_PRIVATE|MAP_ANONYMOUS, 0); + if (addr & ~PAGE_MASK) { + up_write(&mm->mmap_sem); + mmput(mm); + printk(KERN_ERR "ubp_xol failed to allocate a vma for" + " pid/tgid %d/%d for single-stepping out of line.\n", + current->pid, current->tgid); + kfree(usv->bitmap); + kfree(usv); + return ERR_PTR(-ENOMEM); + } + + vma = find_vma(mm, addr); + BUG_ON(!vma); + + /* Don't expand vma on mremap(). */ + vma->vm_flags |= VM_DONTEXPAND | VM_DONTCOPY; + usv->vaddr = vma->vm_start; + up_write(&mm->mmap_sem); + mmput(mm); + usv->npages = 1; + usv->nslots = UINSNS_PER_PAGE; + INIT_LIST_HEAD(&usv->list); + list_add_tail(&usv->list, &area->vmas); + area->last_vma = usv; + return usv; +} + +/* Runs with area->mutex locked */ +static long xol_expand_vma(struct ubp_xol_vma *usv) +{ + struct vm_area_struct *vma; + unsigned long *new_bitmap; + struct mm_struct *mm; + unsigned long new_length, result; + int new_nslots; + + new_length = PAGE_SIZE * (usv->npages + 1); + new_nslots = (int) ((usv->npages + 1) * UINSNS_PER_PAGE); + + /* xol_realloc_bitmap() never returns usv->bitmap. 
*/ + new_bitmap = xol_realloc_bitmap(usv->bitmap, usv->nslots, new_nslots); + if (!new_bitmap) + return -ENOMEM; + + mm = get_task_mm(current); + if (!mm) + return -ESRCH; + + down_write(&mm->mmap_sem); + vma = find_vma(mm, usv->vaddr); + if (!vma) { + printk(KERN_ERR "pid/tgid %d/%d: ubp XOL vma at %#lx" + " has disappeared!\n", current->pid, current->tgid, + usv->vaddr); + result = -ENOMEM; + goto fail; + } + if (vma_pages(vma) != usv->npages || vma->vm_start != usv->vaddr) { + printk(KERN_ERR "pid/tgid %d/%d: ubp XOL vma has been" + " altered: %#lx/%ld pages; should be %#lx/%d pages\n", + current->pid, current->tgid, vma->vm_start, + vma_pages(vma), usv->vaddr, usv->npages); + result = -ENOMEM; + goto fail; + } + vma->vm_flags &= ~VM_DONTEXPAND; + result = do_mremap(usv->vaddr, usv->npages*PAGE_SIZE, new_length, 0, 0); + vma->vm_flags |= VM_DONTEXPAND; + if (IS_ERR_VALUE(result)) { + printk(KERN_WARNING "ubp_xol failed to expand the vma " + "for pid/tgid %d/%d for single-stepping out of line.\n", + current->pid, current->tgid); + goto fail; + } + BUG_ON(result != usv->vaddr); + up_write(&mm->mmap_sem); + + kfree(usv->bitmap); + usv->bitmap = new_bitmap; + usv->nslots = new_nslots; + usv->npages++; + return 0; + +fail: + up_write(&mm->mmap_sem); + mmput(mm); + kfree(new_bitmap); + return result; +} + +/* + * Find a slot + * - searching in existing vmas for a free slot. + * - If no free slot in existing vmas, try expanding the last vma. + * - If unable to expand a vma, try adding a new vma. + * + * Runs with area->mutex locked. + */ +static unsigned long xol_take_insn_slot(struct ubp_xol_area *area) +{ + struct ubp_xol_vma *usv; + unsigned long slot_addr; + int slot_nr; + + list_for_each_entry(usv, &area->vmas, list) { + slot_nr = find_first_zero_bit(usv->bitmap, usv->nslots); + if (slot_nr < usv->nslots) { + set_bit(slot_nr, usv->bitmap); + slot_addr = usv->vaddr + + (slot_nr * UBP_XOL_SLOT_BYTES); + return slot_addr; + } + } + + /* + * All out of space. 
Need to allocate a new page. + * Only the probed process itself can add or expand vmas. + */ + if (!area->can_expand || (area->tgid != current->tgid)) + goto fail; + + usv = area->last_vma; + if (usv) { + /* Expand vma, take first of newly added slots. */ + slot_nr = usv->nslots; + if (xol_expand_vma(usv) != 0) { + printk(KERN_WARNING "Allocating additional vma.\n"); + usv = NULL; + } + } + if (!usv) { + slot_nr = 0; + usv = xol_add_vma(area); + if (IS_ERR(usv)) + goto cant_expand; + } + + /* Take first slot of new page. */ + set_bit(slot_nr, usv->bitmap); + slot_addr = usv->vaddr + (slot_nr * UBP_XOL_SLOT_BYTES); + return slot_addr; + +cant_expand: + area->can_expand = false; +fail: + return 0; +} + +/** + * xol_get_insn_slot - If ubp was not allocated a slot, then + * allocate a slot. If ubp_insert_bkpt is already called, (i.e + * ubp.vaddr != 0) then copy the instruction into the slot. + * Allocating a free slot could result in + * - using a free slot in the current vma or + * - expanding the last vma or + * - adding a new vma. + * Returns the allocated slot address or 0. + * @ubp: probepoint information + * @xol_area refers the unique per process ubp_xol_area for + * this process. + */ +unsigned long xol_get_insn_slot(struct ubp_bkpt *ubp, void *xol_area) +{ + struct ubp_xol_area *area = (struct ubp_xol_area *) xol_area; + int len; + + if (unlikely(!area)) + return 0; + mutex_lock(&area->mutex); + if (likely(!ubp->xol_vaddr)) { + ubp->xol_vaddr = xol_take_insn_slot(area); + /* + * Initialize the slot if ubp->vaddr points to valid + * instruction slot. 
+		 */
+		if (likely(ubp->xol_vaddr) && ubp->vaddr) {
+			len = access_process_vm(current, ubp->xol_vaddr,
+				ubp->insn, UBP_XOL_SLOT_BYTES, 1);
+			if (unlikely(len < UBP_XOL_SLOT_BYTES))
+				printk(KERN_ERR "Failed to copy instruction"
+					" at %#lx len = %d\n",
+					ubp->vaddr, len);
+		}
+	}
+	mutex_unlock(&area->mutex);
+	return ubp->xol_vaddr;
+}
+
+/**
+ * xol_free_insn_slot - If slot was earlier allocated by
+ * @xol_get_insn_slot(), make the slot available for
+ * subsequent requests.
+ * @slot_addr: slot address as returned by
+ * @xol_get_insn_slot().
+ * @xol_area refers to the unique per-process ubp_xol_area for
+ * this process.
+ */
+void xol_free_insn_slot(unsigned long slot_addr, void *xol_area)
+{
+	struct ubp_xol_area *area = (struct ubp_xol_area *) xol_area;
+	struct ubp_xol_vma *usv;
+	int found = 0;
+
+	if (unlikely(!slot_addr || IS_ERR_VALUE(slot_addr)))
+		return;
+	if (unlikely(!area))
+		return;
+	mutex_lock(&area->mutex);
+	list_for_each_entry(usv, &area->vmas, list) {
+		unsigned long vma_end = usv->vaddr + usv->npages*PAGE_SIZE;
+		if (usv->vaddr <= slot_addr && slot_addr < vma_end) {
+			int slot_nr;
+			unsigned long offset = slot_addr - usv->vaddr;
+			BUG_ON(offset % UBP_XOL_SLOT_BYTES);
+			slot_nr = offset / UBP_XOL_SLOT_BYTES;
+			BUG_ON(slot_nr >= usv->nslots);
+			clear_bit(slot_nr, usv->bitmap);
+			found = 1;
+		}
+	}
+	mutex_unlock(&area->mutex);
+	if (!found)
+		printk(KERN_ERR "%s: no XOL vma for slot address %#lx\n",
+			__func__, slot_addr);
+}
+
+/**
+ * xol_validate_vaddr - Verify if the specified address is in an
+ * executable vma, but not in an XOL vma.
+ *	- Return 0 if the specified virtual address is in an
+ *	  executable vma, but not in an XOL vma.
+ *	- Return 1 if the specified virtual address is in an
+ *	  XOL vma.
+ *	- Return -EINTR otherwise (i.e., non-executable vma, or
+ *	  not a valid address).
+ * @pid: the probed process
+ * @vaddr: virtual address of the instruction to be validated.
+ * @xol_area refers to the unique per-process ubp_xol_area for
+ * this process.
+ */
+int xol_validate_vaddr(struct pid *pid, unsigned long vaddr, void *xol_area)
+{
+	struct ubp_xol_area *area = (struct ubp_xol_area *) xol_area;
+	struct ubp_xol_vma *usv;
+	struct task_struct *tsk;
+	int result;
+
+	tsk = pid_task(pid, PIDTYPE_PID);
+	result = ubp_validate_insn_addr(tsk, vaddr);
+	if (result != 0)
+		return result;
+
+	if (unlikely(!area))
+		return 0;
+	mutex_lock(&area->mutex);
+	list_for_each_entry(usv, &area->vmas, list) {
+		unsigned long vma_end = usv->vaddr + usv->npages*PAGE_SIZE;
+		if (usv->vaddr <= vaddr && vaddr < vma_end) {
+			result = 1;
+			break;
+		}
+	}
+	mutex_unlock(&area->mutex);
+	return result;
+}

From srikar at linux.vnet.ibm.com Thu Jun 11 16:14:51 2009
From: srikar at linux.vnet.ibm.com (Srikar Dronamraju)
Date: Thu, 11 Jun 2009 21:44:51 +0530
Subject: [RESEND] [PATCH 4/7] Uprobes Implementation
In-Reply-To: <20090611160539.GA20668@linux.vnet.ibm.com>
References: <20090611160539.GA20668@linux.vnet.ibm.com>
Message-ID: <20090611161451.GD21218@linux.vnet.ibm.com>

Uprobes Infrastructure

The uprobes infrastructure enables a user to dynamically establish
probepoints in user applications and collect information by executing
handler functions when the probepoints are hit.
Please refer to Documentation/uprobes.txt for more details.

This patch provides the core implementation of uprobes.
This patch builds on the utrace infrastructure.

You need to follow this up with the uprobes patch for your
architecture.

Signed-off-by: Jim Keniston
Signed-off-by: Srikar Dronamraju
---
 arch/Kconfig            |   12
 include/linux/uprobes.h |  292 +++++++
 kernel/Makefile         |    1
 kernel/uprobes_core.c   | 1890 ++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 2195 insertions(+)

Index: uprobes.git/arch/Kconfig
===================================================================
--- uprobes.git.orig/arch/Kconfig
+++ uprobes.git/arch/Kconfig
@@ -53,6 +53,16 @@ config UBP
 	  in user applications.
This service is used by components such as uprobes. If in doubt, say "N". +config UPROBES + bool "User-space probes (EXPERIMENTAL)" + depends on UTRACE && MODULES && UBP + depends on HAVE_UPROBES + help + Uprobes enables kernel modules to establish probepoints + in user applications and execute handler functions when + the probepoints are hit. For more information, refer to + Documentation/uprobes.txt. If in doubt, say "N". + config HAVE_EFFICIENT_UNALIGNED_ACCESS bool help @@ -95,6 +105,8 @@ config HAVE_KPROBES config HAVE_KRETPROBES bool +config HAVE_UPROBES + def_bool n # # An arch should select this if it provides all these things: # Index: uprobes.git/include/linux/uprobes.h =================================================================== --- /dev/null +++ uprobes.git/include/linux/uprobes.h @@ -0,0 +1,292 @@ +#ifndef _LINUX_UPROBES_H +#define _LINUX_UPROBES_H +/* + * Userspace Probes (UProbes) + * include/linux/uprobes.h + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2006, 2009 + */ +#include +#include + +struct pt_regs; + +/* This is what the user supplies us. */ +struct uprobe { + /* + * The pid of the probed process. Currently, this can be the + * thread ID (task->pid) of any active thread in the process. 
+ */ + pid_t pid; + + /* Location of the probepoint */ + unsigned long vaddr; + + /* Handler to run when the probepoint is hit */ + void (*handler)(struct uprobe*, struct pt_regs*); + + /* + * This function, if non-NULL, will be called upon completion of + * an ASYNCHRONOUS registration (i.e., one initiated by a uprobe + * handler). reg = 1 for register, 0 for unregister. + */ + void (*registration_callback)(struct uprobe *u, int reg, int result); + + /* Reserved for use by uprobes */ + void *kdata; +}; + +#if defined(CONFIG_UPROBES) +extern int register_uprobe(struct uprobe *u); +extern void unregister_uprobe(struct uprobe *u); +#else +static inline int register_uprobe(struct uprobe *u) +{ + return -ENOSYS; +} +static inline void unregister_uprobe(struct uprobe *u) +{ +} +#endif /* CONFIG_UPROBES */ + +#ifdef UPROBES_IMPLEMENTATION + +#include +#include +#include +#include +#include +#include +#include + +struct utrace_engine; +struct task_struct; +struct pid; + +enum uprobe_probept_state { + UPROBE_INSERTING, /* process quiescing prior to insertion */ + UPROBE_BP_SET, /* breakpoint in place */ + UPROBE_REMOVING, /* process quiescing prior to removal */ + UPROBE_DISABLED /* removal completed */ +}; + +enum uprobe_task_state { + UPTASK_QUIESCENT, + UPTASK_SLEEPING, /* See utask_fake_quiesce(). */ + UPTASK_RUNNING, + UPTASK_BP_HIT, + UPTASK_SSTEP +}; + +enum uprobe_ssil_state { + SSIL_DISABLE, + SSIL_CLEAR, + SSIL_SET +}; + +#define UPROBE_HASH_BITS 5 +#define UPROBE_TABLE_SIZE (1 << UPROBE_HASH_BITS) + +/* + * uprobe_process -- not a user-visible struct. + * A uprobe_process represents a probed process. A process can have + * multiple probepoints (each represented by a uprobe_probept) and + * one or more threads (each represented by a uprobe_task). 
+ */ +struct uprobe_process { + /* + * rwsem is write-locked for any change to the uprobe_process's + * graph (including uprobe_tasks, uprobe_probepts, and uprobe_kimgs) -- + * e.g., due to probe [un]registration or special events like exit. + * It's read-locked during the whole time we process a probepoint hit. + */ + struct rw_semaphore rwsem; + + /* Table of uprobe_probepts registered for this process */ + /* TODO: Switch to list_head[] per Ingo. */ + struct hlist_head uprobe_table[UPROBE_TABLE_SIZE]; + + /* List of uprobe_probepts awaiting insertion or removal */ + struct list_head pending_uprobes; + + /* List of uprobe_tasks in this task group */ + struct list_head thread_list; + int nthreads; + int n_quiescent_threads; + + /* this goes on the uproc_table */ + struct hlist_node hlist; + + /* + * All threads (tasks) in a process share the same uprobe_process. + */ + struct pid *tg_leader; + pid_t tgid; + + /* Threads in UTASK_SLEEPING state wait here to be roused. */ + wait_queue_head_t waitq; + + /* + * We won't free the uprobe_process while... + * - any register/unregister operations on it are in progress; or + * - any uprobe_report_* callbacks are running; or + * - uprobe_table[] is not empty; or + * - any tasks are UTASK_SLEEPING in the waitq; + * refcount reflects this. We do NOT ref-count tasks (threads), + * since once the last thread has exited, the rest is academic. + */ + atomic_t refcount; + + /* + * finished = 1 means the process is execing or the last thread + * is exiting, and we're cleaning up the uproc. If the execed + * process is probed, a new uproc will be created. + */ + bool finished; + + /* + * 1 to single-step out of line; 0 for inline. This can drop to + * 0 if we can't set up the XOL area, but never goes from 0 to 1. + */ + bool sstep_out_of_line; + + /* + * Manages slots for instruction-copies to be single-stepped + * out of line. + */ + void *xol_area; +}; + +/* + * uprobe_kimg -- not a user-visible struct. 
+ * Holds implementation-only per-uprobe data.
+ * uprobe->kdata points to this.
+ */
+struct uprobe_kimg {
+	struct uprobe *uprobe;
+	struct uprobe_probept *ppt;
+
+	/*
+	 * -EBUSY while we're waiting for all threads to quiesce so the
+	 * associated breakpoint can be inserted or removed.
+	 * 0 if the insert/remove operation has succeeded, or -errno
+	 * otherwise.
+	 */
+	int status;
+
+	/* on ppt's list */
+	struct list_head list;
+};
+
+/*
+ * uprobe_probept -- not a user-visible struct.
+ * A probepoint, at which several uprobes can be registered.
+ * Guarded by uproc->rwsem.
+ */
+struct uprobe_probept {
+	/* breakpoint/XOL details */
+	struct ubp_bkpt ubp;
+
+	/* The uprobe_kimg(s) associated with this uprobe_probept */
+	struct list_head uprobe_list;
+
+	enum uprobe_probept_state state;
+
+	/* The parent uprobe_process */
+	struct uprobe_process *uproc;
+
+	/*
+	 * ppt goes in the uprobe_process->uprobe_table when registered --
+	 * even before the breakpoint has been inserted.
+	 */
+	struct hlist_node ut_node;
+
+	/*
+	 * ppt sits in the uprobe_process->pending_uprobes queue while
+	 * awaiting insertion or removal of the breakpoint.
+	 */
+	struct list_head pd_node;
+
+	/* [un]register_uprobe() waits 'til bkpt inserted/removed */
+	wait_queue_head_t waitq;
+
+	/*
+	 * ssil_lock, ssilq and ssil_state are used to serialize
+	 * single-stepping inline, so threads don't clobber each other
+	 * swapping the breakpoint instruction in and out. This helps
+	 * prevent crashing the probed app, but it does NOT prevent
+	 * probe misses while the breakpoint is swapped out.
+	 * ssilq - threads wait for their chance to single-step inline.
+	 */
+	spinlock_t ssil_lock;
+	wait_queue_head_t ssilq;
+	enum uprobe_ssil_state ssil_state;
+};
+
+/*
+ * uprobe_task -- not a user-visible struct.
+ * Corresponds to a thread in a probed process.
+ * Guarded by uproc->rwsem.
+ */ +struct uprobe_task { + /* Lives in the global utask_table */ + struct hlist_node hlist; + + /* Lives on the thread_list for the uprobe_process */ + struct list_head list; + + struct task_struct *tsk; + struct pid *pid; + + /* The utrace engine for this task */ + struct utrace_engine *engine; + + /* Back pointer to the associated uprobe_process */ + struct uprobe_process *uproc; + + enum uprobe_task_state state; + + /* + * quiescing = 1 means this task has been asked to quiesce. + * It may not be able to comply immediately if it's hit a bkpt. + */ + bool quiescing; + + /* Set before running handlers; cleared after single-stepping. */ + struct uprobe_probept *active_probe; + + /* Saved address of copied original instruction */ + long singlestep_addr; + + struct ubp_task_arch_info arch_info; + + /* + * Unexpected error in probepoint handling has left task's + * text or stack corrupted. Kill task ASAP. + */ + bool doomed; + + /* [un]registrations initiated by handlers must be asynchronous. */ + struct list_head deferred_registrations; + + /* Delay handler-destined signals 'til after single-step done. */ + struct list_head delayed_signals; +}; + +#endif /* UPROBES_IMPLEMENTATION */ + +#endif /* _LINUX_UPROBES_H */ Index: uprobes.git/kernel/uprobes_core.c =================================================================== --- /dev/null +++ uprobes.git/kernel/uprobes_core.c @@ -0,0 +1,1890 @@ +/* + * Userspace Probes (UProbes) + * kernel/uprobes_core.c + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2006, 2009 + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#define UPROBES_IMPLEMENTATION 1 +#include +#include +#include +#include +#include +#include + +#define UPROBE_SET_FLAGS 1 +#define UPROBE_CLEAR_FLAGS 0 + +#define MAX_XOL_SLOTS 1024 + +static int utask_fake_quiesce(struct uprobe_task *utask); +static int uprobe_post_ssout(struct uprobe_task *utask, + struct uprobe_probept *ppt, struct pt_regs *regs); + +typedef void (*uprobe_handler_t)(struct uprobe*, struct pt_regs*); + +/* + * Table of currently probed processes, hashed by task-group leader's + * struct pid. + */ +static struct hlist_head uproc_table[UPROBE_TABLE_SIZE]; + +/* Protects uproc_table during uprobe (un)registration */ +static DEFINE_MUTEX(uproc_mutex); + +/* Table of uprobe_tasks, hashed by task_struct pointer. */ +static struct hlist_head utask_table[UPROBE_TABLE_SIZE]; +static DEFINE_SPINLOCK(utask_table_lock); + +/* p_uprobe_utrace_ops = &uprobe_utrace_ops. Fwd refs are a pain w/o this. */ +static const struct utrace_engine_ops *p_uprobe_utrace_ops; + +struct deferred_registration { + struct list_head list; + struct uprobe *uprobe; + int regflag; /* 0 - unregister, 1 - register */ +}; + +/* + * Calling a signal handler cancels single-stepping, so uprobes delays + * calling the handler, as necessary, until after single-stepping is completed. 
+ */ +struct delayed_signal { + struct list_head list; + siginfo_t info; +}; + +static u16 ubp_strategies; + +static struct uprobe_task *uprobe_find_utask(struct task_struct *tsk) +{ + struct hlist_head *head; + struct hlist_node *node; + struct uprobe_task *utask; + unsigned long flags; + + head = &utask_table[hash_ptr(tsk, UPROBE_HASH_BITS)]; + spin_lock_irqsave(&utask_table_lock, flags); + hlist_for_each_entry(utask, node, head, hlist) { + if (utask->tsk == tsk) { + spin_unlock_irqrestore(&utask_table_lock, flags); + return utask; + } + } + spin_unlock_irqrestore(&utask_table_lock, flags); + return NULL; +} + +static void uprobe_hash_utask(struct uprobe_task *utask) +{ + struct hlist_head *head; + unsigned long flags; + + INIT_HLIST_NODE(&utask->hlist); + head = &utask_table[hash_ptr(utask->tsk, UPROBE_HASH_BITS)]; + spin_lock_irqsave(&utask_table_lock, flags); + hlist_add_head(&utask->hlist, head); + spin_unlock_irqrestore(&utask_table_lock, flags); +} + +static void uprobe_unhash_utask(struct uprobe_task *utask) +{ + unsigned long flags; + + spin_lock_irqsave(&utask_table_lock, flags); + hlist_del(&utask->hlist); + spin_unlock_irqrestore(&utask_table_lock, flags); +} + +static inline void uprobe_get_process(struct uprobe_process *uproc) +{ + atomic_inc(&uproc->refcount); +} + +/* + * Decrement uproc's refcount in a situation where we "know" it can't + * reach zero. It's OK to call this with uproc locked. Compare with + * uprobe_put_process(). + */ +static inline void uprobe_decref_process(struct uprobe_process *uproc) +{ + if (atomic_dec_and_test(&uproc->refcount)) + BUG(); +} + +/* + * Runs with the uproc_mutex held. Returns with uproc ref-counted and + * write-locked. + * + * Around exec time, briefly, it's possible to have one (finished) uproc + * for the old image and one for the new image. We find the latter. 
+ */ +static struct uprobe_process *uprobe_find_process(struct pid *tg_leader) +{ + struct uprobe_process *uproc; + struct hlist_head *head; + struct hlist_node *node; + + head = &uproc_table[hash_ptr(tg_leader, UPROBE_HASH_BITS)]; + hlist_for_each_entry(uproc, node, head, hlist) { + if (uproc->tg_leader == tg_leader && !uproc->finished) { + uprobe_get_process(uproc); + down_write(&uproc->rwsem); + return uproc; + } + } + return NULL; +} + +/* + * In the given uproc's hash table of probepoints, find the one with the + * specified virtual address. Runs with uproc->rwsem locked. + */ +static struct uprobe_probept *uprobe_find_probept(struct uprobe_process *uproc, + unsigned long vaddr) +{ + struct uprobe_probept *ppt; + struct hlist_node *node; + struct hlist_head *head = &uproc->uprobe_table[hash_long(vaddr, + UPROBE_HASH_BITS)]; + + hlist_for_each_entry(ppt, node, head, ut_node) { + if (ppt->ubp.vaddr == vaddr && ppt->state != UPROBE_DISABLED) + return ppt; + } + return NULL; +} + +/* + * Save a copy of the original instruction (so it can be single-stepped + * out of line), insert the breakpoint instruction, and awake + * register_uprobe(). + */ +static void uprobe_insert_bkpt(struct uprobe_probept *ppt, + struct task_struct *tsk) +{ + struct uprobe_kimg *uk; + int result; + + if (tsk) + result = ubp_insert_bkpt(tsk, &ppt->ubp); + else + /* No surviving tasks associated with ppt->uproc */ + result = -ESRCH; + ppt->state = (result ? UPROBE_DISABLED : UPROBE_BP_SET); + list_for_each_entry(uk, &ppt->uprobe_list, list) + uk->status = result; + wake_up_all(&ppt->waitq); +} + +/* + * ppt's breakpoint has been removed. If any threads are in the middle of + * single-stepping at this probepoint, fix things up so they can proceed. 
+ * Runs with all of ppt->uproc's threads quiesced and ppt->uproc->rwsem + * write-locked + */ +static inline void adjust_ip_active_ppt(struct uprobe_probept *ppt) +{ +#ifdef CONFIG_UBP_XOL + struct uprobe_process *uproc = ppt->uproc; + struct uprobe_task *utask; + struct pt_regs *regs; + + list_for_each_entry(utask, &uproc->thread_list, list) { + if (utask->active_probe != ppt) + continue; + /* + * Current thread cannot have an active breakpoint + * and still request for a breakpoint removal. The + * above case is handled by utask_fake_quiesce(). + */ + BUG_ON(utask->tsk == current); + + regs = task_pt_regs(utask->tsk); + if (instruction_pointer(regs) == ppt->ubp.xol_vaddr) + /* adjust the ip to breakpoint addr. */ + ubp_set_ip(regs, ppt->ubp.vaddr); + else + /* adjust the ip to next instruction. */ + uprobe_post_ssout(utask, ppt, regs); + } +#endif +} + +static void uprobe_remove_bkpt(struct uprobe_probept *ppt, + struct task_struct *tsk) +{ + if (tsk) { + if (ubp_remove_bkpt(tsk, &ppt->ubp) != 0) { + printk(KERN_ERR + "Error removing uprobe at pid %d vaddr %#lx:" + " can't restore original instruction\n", + tsk->tgid, ppt->ubp.vaddr); + /* + * This shouldn't happen, since we were previously + * able to write the breakpoint at that address. + * There's not much we can do besides let the + * process die with a SIGTRAP the next time the + * breakpoint is hit. + */ + } + if (!(ppt->ubp.strategy & UBP_HNT_INLINE)) + adjust_ip_active_ppt(ppt); + else { + unsigned long flags; + spin_lock_irqsave(&ppt->ssil_lock, flags); + ppt->ssil_state = SSIL_DISABLE; + wake_up_all(&ppt->ssilq); + spin_unlock_irqrestore(&ppt->ssil_lock, flags); + } + } + /* Wake up unregister_uprobe(). */ + ppt->state = UPROBE_DISABLED; + wake_up_all(&ppt->waitq); +} + +/* + * Runs with all of uproc's threads quiesced and uproc->rwsem write-locked. + * As specified, insert or remove the breakpoint instruction for each + * uprobe_probept on uproc's pending list. 
+ * tsk = one of the tasks associated with uproc -- NULL if there are + * no surviving threads. + * It's OK for uproc->pending_uprobes to be empty here. It can happen + * if a register and an unregister are requested (by different probers) + * simultaneously for the same pid/vaddr. + */ +static void handle_pending_uprobes(struct uprobe_process *uproc, + struct task_struct *tsk) +{ + struct uprobe_probept *ppt, *tmp; + + list_for_each_entry_safe(ppt, tmp, &uproc->pending_uprobes, pd_node) { + switch (ppt->state) { + case UPROBE_INSERTING: + uprobe_insert_bkpt(ppt, tsk); + break; + case UPROBE_REMOVING: + uprobe_remove_bkpt(ppt, tsk); + break; + default: + BUG(); + } + list_del(&ppt->pd_node); + } +} + +static void utask_adjust_flags(struct uprobe_task *utask, int set, + unsigned long flags) +{ + unsigned long newflags, oldflags; + + newflags = oldflags = utask->engine->flags; + if (set) + newflags |= flags; + else + newflags &= ~flags; + /* + * utrace_barrier[_pid] is not appropriate here. If we're + * adjusting current, it's not needed. And if we're adjusting + * some other task, we're holding utask->uproc->rwsem, which + * could prevent that task from completing the callback we'd + * be waiting on. + */ + if (newflags != oldflags) { + if (utrace_set_events_pid(utask->pid, utask->engine, + newflags) != 0) + /* We don't care. */ + ; + } +} + +static inline void clear_utrace_quiesce(struct uprobe_task *utask, bool resume) +{ + utask_adjust_flags(utask, UPROBE_CLEAR_FLAGS, UTRACE_EVENT(QUIESCE)); + if (resume) { + if (utrace_control_pid(utask->pid, utask->engine, + UTRACE_RESUME) != 0) + /* We don't care. */ + ; + } +} + +/* Opposite of quiesce_all_threads(). Same locking applies. 
*/ +static void rouse_all_threads(struct uprobe_process *uproc) +{ + struct uprobe_task *utask; + + list_for_each_entry(utask, &uproc->thread_list, list) { + if (utask->quiescing) { + utask->quiescing = false; + if (utask->state == UPTASK_QUIESCENT) { + utask->state = UPTASK_RUNNING; + uproc->n_quiescent_threads--; + clear_utrace_quiesce(utask, true); + } + } + } + /* Wake any threads that decided to sleep rather than quiesce. */ + wake_up_all(&uproc->waitq); +} + +/* + * If all of uproc's surviving threads have quiesced, do the necessary + * breakpoint insertions or removals, un-quiesce everybody, and return 1. + * tsk is a surviving thread, or NULL if there is none. Runs with + * uproc->rwsem write-locked. + */ +static int check_uproc_quiesced(struct uprobe_process *uproc, + struct task_struct *tsk) +{ + if (uproc->n_quiescent_threads >= uproc->nthreads) { + handle_pending_uprobes(uproc, tsk); + rouse_all_threads(uproc); + return 1; + } + return 0; +} + +/* Direct the indicated thread to quiesce. */ +static void uprobe_stop_thread(struct uprobe_task *utask) +{ + int result; + + /* + * As with utask_adjust_flags, calling utrace_barrier_pid below + * could deadlock. + */ + BUG_ON(utask->tsk == current); + result = utrace_control_pid(utask->pid, utask->engine, UTRACE_STOP); + if (result == 0) { + /* Already stopped. */ + utask->state = UPTASK_QUIESCENT; + utask->uproc->n_quiescent_threads++; + } else if (result == -EINPROGRESS) { + if (utask->tsk->state & TASK_INTERRUPTIBLE) { + /* + * Task could be in interruptible wait for a long + * time -- e.g., if stopped for I/O. But we know + * it's not going to run user code before all + * threads quiesce, so pretend it's quiesced. + * This avoids terminating a system call via + * UTRACE_INTERRUPT. + */ + utask->state = UPTASK_QUIESCENT; + utask->uproc->n_quiescent_threads++; + } else { + /* + * Task will eventually stop, but it may be a long time. + * Don't wait. 
+ */ + result = utrace_control_pid(utask->pid, utask->engine, + UTRACE_INTERRUPT); + if (result != 0) + /* We don't care. */ + ; + } + } +} + +/* + * Quiesce all threads in the specified process -- e.g., prior to + * breakpoint insertion. Runs with uproc->rwsem write-locked. + * Returns false if all threads have died. + */ +static bool quiesce_all_threads(struct uprobe_process *uproc, + struct uprobe_task **cur_utask_quiescing) +{ + struct uprobe_task *utask; + struct task_struct *survivor = NULL; /* any survivor */ + bool survivors = false; + + *cur_utask_quiescing = NULL; + list_for_each_entry(utask, &uproc->thread_list, list) { + if (!survivors) { + survivor = pid_task(utask->pid, PIDTYPE_PID); + if (survivor) + survivors = true; + } + if (!utask->quiescing) { + /* + * If utask is currently handling a probepoint, it'll + * check utask->quiescing and quiesce when it's done. + */ + utask->quiescing = true; + if (utask->tsk == current) + *cur_utask_quiescing = utask; + else if (utask->state == UPTASK_RUNNING) { + utask_adjust_flags(utask, UPROBE_SET_FLAGS, + UTRACE_EVENT(QUIESCE)); + uprobe_stop_thread(utask); + } + } + } + /* + * If all the (other) threads are already quiesced, it's up to the + * current thread to do the necessary work. + */ + check_uproc_quiesced(uproc, survivor); + return survivors; +} + +/* Called with utask->uproc write-locked. */ +static void uprobe_free_task(struct uprobe_task *utask, bool in_callback) +{ + struct deferred_registration *dr, *d; + struct delayed_signal *ds, *ds2; + int result; + + if (utask->engine && (utask->tsk != current || !in_callback)) { + /* + * No other tasks in this process should be running + * uprobe_report_* callbacks. (If they are, utrace_barrier() + * here could deadlock.) 
+ */ + result = utrace_control_pid(utask->pid, utask->engine, + UTRACE_DETACH); + BUG_ON(result == -EINPROGRESS); + } + put_pid(utask->pid); /* null pid OK */ + + uprobe_unhash_utask(utask); + list_del(&utask->list); + list_for_each_entry_safe(dr, d, &utask->deferred_registrations, list) { + list_del(&dr->list); + kfree(dr); + } + + list_for_each_entry_safe(ds, ds2, &utask->delayed_signals, list) { + list_del(&ds->list); + kfree(ds); + } + + kfree(utask); +} + +/* + * Dismantle uproc and all its remaining uprobe_tasks. + * in_callback = 1 if the caller is a uprobe_report_* callback who will + * handle the UTRACE_DETACH operation. + * Runs with uproc_mutex held; called with uproc->rwsem write-locked. + */ +static void uprobe_free_process(struct uprobe_process *uproc, int in_callback) +{ + struct uprobe_task *utask, *tmp; + + if (!hlist_unhashed(&uproc->hlist)) + hlist_del(&uproc->hlist); + list_for_each_entry_safe(utask, tmp, &uproc->thread_list, list) + uprobe_free_task(utask, in_callback); + put_pid(uproc->tg_leader); + if (uproc->xol_area) + xol_put_area(uproc->xol_area); + up_write(&uproc->rwsem); /* So kfree doesn't complain */ + kfree(uproc); +} + +/* + * Decrement uproc's ref count. If it's zero, free uproc and return + * 1. Else return 0. If uproc is locked, don't call this; use + * uprobe_decref_process(). + */ +static int uprobe_put_process(struct uprobe_process *uproc, bool in_callback) +{ + int freed = 0; + + if (atomic_dec_and_test(&uproc->refcount)) { + mutex_lock(&uproc_mutex); + down_write(&uproc->rwsem); + if (unlikely(atomic_read(&uproc->refcount) != 0)) { + /* + * This works because uproc_mutex is held any + * time the ref count can go from 0 to 1 -- e.g., + * register_uprobe() sneaks in with a new probe.
+ */ + up_write(&uproc->rwsem); + } else { + uprobe_free_process(uproc, in_callback); + freed = 1; + } + mutex_unlock(&uproc_mutex); + } + return freed; +} + +static struct uprobe_kimg *uprobe_mk_kimg(struct uprobe *u) +{ + struct uprobe_kimg *uk = kzalloc(sizeof *uk, + GFP_USER); + + if (unlikely(!uk)) + return ERR_PTR(-ENOMEM); + u->kdata = uk; + uk->uprobe = u; + uk->ppt = NULL; + INIT_LIST_HEAD(&uk->list); + uk->status = -EBUSY; + return uk; +} + +/* + * Allocate a uprobe_task object for p and add it to uproc's list. + * Called with p "got" and uproc->rwsem write-locked. Called in one of + * the following cases: + * - before setting the first uprobe in p's process + * - we're in uprobe_report_clone() and p is the newly added thread + * Returns: + * - pointer to new uprobe_task on success + * - NULL if p's task dies before we can utrace_attach it + * - negative errno otherwise + */ +static struct uprobe_task *uprobe_add_task(struct pid *p, + struct uprobe_process *uproc) +{ + struct uprobe_task *utask; + struct utrace_engine *engine; + struct task_struct *t = pid_task(p, PIDTYPE_PID); + + if (!t) + return NULL; + utask = kzalloc(sizeof *utask, GFP_USER); + if (unlikely(utask == NULL)) + return ERR_PTR(-ENOMEM); + + utask->pid = p; + utask->tsk = t; + utask->state = UPTASK_RUNNING; + utask->quiescing = false; + utask->uproc = uproc; + utask->active_probe = NULL; + utask->doomed = false; + INIT_LIST_HEAD(&utask->deferred_registrations); + INIT_LIST_HEAD(&utask->delayed_signals); + INIT_LIST_HEAD(&utask->list); + list_add_tail(&utask->list, &uproc->thread_list); + uprobe_hash_utask(utask); + + engine = utrace_attach_pid(p, UTRACE_ATTACH_CREATE, + p_uprobe_utrace_ops, utask); + if (IS_ERR(engine)) { + long err = PTR_ERR(engine); + printk("uprobes: utrace_attach_pid failed, returned %ld\n", + err); + uprobe_free_task(utask, 0); + if (err == -ESRCH) + return NULL; + return ERR_PTR(err); + } + utask->engine = engine; + /* + * Always watch for traps, clones, execs and exits.
Caller must + * set any other engine flags. + */ + utask_adjust_flags(utask, UPROBE_SET_FLAGS, + UTRACE_EVENT(SIGNAL) | UTRACE_EVENT(SIGNAL_IGN) | + UTRACE_EVENT(SIGNAL_CORE) | UTRACE_EVENT(EXEC) | + UTRACE_EVENT(CLONE) | UTRACE_EVENT(EXIT)); + /* + * Note that it's OK if t dies just after utrace_attach, because + * with the engine in place, the appropriate report_* callback + * should handle it after we release uproc->rwsem. + */ + return utask; +} + +/* + * start_pid is the pid for a thread in the probed process. Find the + * next thread that doesn't have a corresponding uprobe_task yet. Return + * a ref-counted pid for that task, if any, else NULL. + */ +static struct pid *find_next_thread_to_add(struct uprobe_process *uproc, + struct pid *start_pid) +{ + struct task_struct *t, *start; + struct uprobe_task *utask; + struct pid *pid = NULL; + + rcu_read_lock(); + t = start = pid_task(start_pid, PIDTYPE_PID); + if (t) { + do { + if (unlikely(t->flags & PF_EXITING)) + goto dont_add; + list_for_each_entry(utask, &uproc->thread_list, list) { + if (utask->tsk == t) + /* Already added */ + goto dont_add; + } + /* Found thread/task to add. */ + pid = get_pid(task_pid(t)); + break; +dont_add: + t = next_thread(t); + } while (t != start); + } + rcu_read_unlock(); + return pid; +} + +/* Runs with uproc_mutex held; returns with uproc->rwsem write-locked. 
*/ +static struct uprobe_process *uprobe_mk_process(struct pid *tg_leader) +{ + struct uprobe_process *uproc; + struct uprobe_task *utask; + struct pid *add_me; + int i; + long err; + + uproc = kzalloc(sizeof *uproc, GFP_USER); + if (unlikely(uproc == NULL)) + return ERR_PTR(-ENOMEM); + + /* Initialize fields */ + atomic_set(&uproc->refcount, 1); + init_rwsem(&uproc->rwsem); + down_write(&uproc->rwsem); + init_waitqueue_head(&uproc->waitq); + for (i = 0; i < UPROBE_TABLE_SIZE; i++) + INIT_HLIST_HEAD(&uproc->uprobe_table[i]); + INIT_LIST_HEAD(&uproc->pending_uprobes); + INIT_LIST_HEAD(&uproc->thread_list); + uproc->nthreads = 0; + uproc->n_quiescent_threads = 0; + INIT_HLIST_NODE(&uproc->hlist); + uproc->tg_leader = get_pid(tg_leader); + uproc->tgid = pid_task(tg_leader, PIDTYPE_PID)->tgid; + uproc->finished = false; + +#ifdef CONFIG_UBP_XOL + if (!(ubp_strategies & UBP_HNT_INLINE)) + uproc->sstep_out_of_line = true; + else +#endif + uproc->sstep_out_of_line = false; + + /* + * Create and populate one utask per thread in this process. We + * can't call uprobe_add_task() while holding RCU lock, so we: + * 1. rcu_read_lock() + * 2. Find the next thread, add_me, in this process that's not + * already on uproc's thread_list. + * 3. rcu_read_unlock() + * 4. uprobe_add_task(add_me, uproc) + * Repeat 1-4 'til we have utasks for all threads. + */ + add_me = tg_leader; + while ((add_me = find_next_thread_to_add(uproc, add_me)) != NULL) { + utask = uprobe_add_task(add_me, uproc); + if (IS_ERR(utask)) { + err = PTR_ERR(utask); + goto fail; + } + if (utask) + uproc->nthreads++; + } + + if (uproc->nthreads == 0) { + /* All threads -- even p -- are dead. */ + err = -ESRCH; + goto fail; + } + return uproc; + +fail: + uprobe_free_process(uproc, 0); + return ERR_PTR(err); +} + +/* + * Creates a uprobe_probept and connects it to uk and uproc. Runs with + * uproc->rwsem write-locked. 
+ */ +static struct uprobe_probept *uprobe_add_probept(struct uprobe_kimg *uk, + struct uprobe_process *uproc) +{ + struct uprobe_probept *ppt; + + ppt = kzalloc(sizeof *ppt, GFP_USER); + if (unlikely(ppt == NULL)) + return ERR_PTR(-ENOMEM); + init_waitqueue_head(&ppt->waitq); + init_waitqueue_head(&ppt->ssilq); + spin_lock_init(&ppt->ssil_lock); + ppt->ssil_state = SSIL_CLEAR; + + /* Connect to uk. */ + INIT_LIST_HEAD(&ppt->uprobe_list); + list_add_tail(&uk->list, &ppt->uprobe_list); + uk->ppt = ppt; + uk->status = -EBUSY; + ppt->ubp.vaddr = uk->uprobe->vaddr; + ppt->ubp.xol_vaddr = 0; + + /* Connect to uproc. */ + if (!uproc->sstep_out_of_line) + ppt->ubp.strategy = UBP_HNT_INLINE; + else + ppt->ubp.strategy = ubp_strategies; + ppt->state = UPROBE_INSERTING; + ppt->uproc = uproc; + INIT_LIST_HEAD(&ppt->pd_node); + list_add_tail(&ppt->pd_node, &uproc->pending_uprobes); + INIT_HLIST_NODE(&ppt->ut_node); + hlist_add_head(&ppt->ut_node, + &uproc->uprobe_table[hash_long(ppt->ubp.vaddr, + UPROBE_HASH_BITS)]); + uprobe_get_process(uproc); + return ppt; +} + +/* + * Runs with ppt->uproc write-locked. Frees ppt and decrements the ref + * count on ppt->uproc (but ref count shouldn't hit 0). + */ +static void uprobe_free_probept(struct uprobe_probept *ppt) +{ + struct uprobe_process *uproc = ppt->uproc; + + xol_free_insn_slot(ppt->ubp.xol_vaddr, uproc->xol_area); + hlist_del(&ppt->ut_node); + kfree(ppt); + uprobe_decref_process(uproc); +} + +static void uprobe_free_kimg(struct uprobe_kimg *uk) +{ + uk->uprobe->kdata = NULL; + kfree(uk); +} + +/* + * Runs with uprobe_process write-locked. + * Note that we never free uk->uprobe, because the user owns that. + */ +static void purge_uprobe(struct uprobe_kimg *uk) +{ + struct uprobe_probept *ppt = uk->ppt; + + list_del(&uk->list); + uprobe_free_kimg(uk); + if (list_empty(&ppt->uprobe_list)) + uprobe_free_probept(ppt); +} + +/* Runs with utask->uproc read-locked. Returns -EINPROGRESS on success. 
*/ +static int defer_registration(struct uprobe *u, int regflag, + struct uprobe_task *utask) +{ + struct deferred_registration *dr; + + dr = kmalloc(sizeof(struct deferred_registration), GFP_USER); + if (!dr) + return -ENOMEM; + dr->uprobe = u; + dr->regflag = regflag; + INIT_LIST_HEAD(&dr->list); + list_add_tail(&dr->list, &utask->deferred_registrations); + return -EINPROGRESS; +} + +/* + * Given a numeric thread ID, return a ref-counted struct pid for the + * task-group-leader thread. + */ +static struct pid *uprobe_get_tg_leader(pid_t p) +{ + struct pid *pid = NULL; + + rcu_read_lock(); + if (current->nsproxy) + pid = find_vpid(p); + if (pid) { + struct task_struct *t = pid_task(pid, PIDTYPE_PID); + if (t) + pid = task_tgid(t); + else + pid = NULL; + } + rcu_read_unlock(); + return get_pid(pid); /* null pid OK here */ +} + +/* See Documentation/uprobes.txt. */ +int register_uprobe(struct uprobe *u) +{ + struct uprobe_task *cur_utask, *cur_utask_quiescing = NULL; + struct uprobe_process *uproc; + struct uprobe_probept *ppt; + struct uprobe_kimg *uk; + struct pid *p; + int ret = 0, uproc_is_new = 0; + bool survivors; +#ifndef CONFIG_UBP_XOL + struct task_struct *tsk; +#endif + + if (!u || !u->handler) + return -EINVAL; + + p = uprobe_get_tg_leader(u->pid); + if (!p) + return -ESRCH; + + cur_utask = uprobe_find_utask(current); + if (cur_utask && cur_utask->active_probe) { + /* + * Called from handler; cur_utask->uproc is read-locked. + * Do this registration later. + */ + put_pid(p); + return defer_registration(u, 1, cur_utask); + } + + /* Get the uprobe_process for this pid, or make a new one. */ + mutex_lock(&uproc_mutex); + uproc = uprobe_find_process(p); + + if (uproc) + mutex_unlock(&uproc_mutex); + else { + uproc = uprobe_mk_process(p); + if (IS_ERR(uproc)) { + ret = (int) PTR_ERR(uproc); + mutex_unlock(&uproc_mutex); + goto fail_tsk; + } + /* Hold uproc_mutex until we've added uproc to uproc_table. 
*/ + uproc_is_new = 1; + } + +#ifdef CONFIG_UBP_XOL + ret = xol_validate_vaddr(p, u->vaddr, uproc->xol_area); +#else + tsk = pid_task(p, PIDTYPE_PID); + ret = ubp_validate_insn_addr(tsk, u->vaddr); +#endif + if (ret < 0) + goto fail_uproc; + + if (u->kdata) { + /* + * Probe is already/still registered. This is the only + * place we return -EBUSY to the user. + */ + ret = -EBUSY; + goto fail_uproc; + } + + uk = uprobe_mk_kimg(u); + if (IS_ERR(uk)) { + ret = (int) PTR_ERR(uk); + goto fail_uproc; + } + + /* See if we already have a probepoint at the vaddr. */ + ppt = (uproc_is_new ? NULL : uprobe_find_probept(uproc, u->vaddr)); + if (ppt) { + /* Breakpoint is already in place, or soon will be. */ + uk->ppt = ppt; + list_add_tail(&uk->list, &ppt->uprobe_list); + switch (ppt->state) { + case UPROBE_INSERTING: + uk->status = -EBUSY; /* in progress */ + if (uproc->tg_leader == task_tgid(current)) { + cur_utask_quiescing = cur_utask; + BUG_ON(!cur_utask_quiescing); + } + break; + case UPROBE_REMOVING: + /* Wait! Don't remove that bkpt after all! */ + ppt->state = UPROBE_BP_SET; + /* Remove from pending list. */ + list_del(&ppt->pd_node); + /* Wake unregister_uprobe(). 
*/ + wake_up_all(&ppt->waitq); + /*FALLTHROUGH*/ + case UPROBE_BP_SET: + uk->status = 0; + break; + default: + BUG(); + } + up_write(&uproc->rwsem); + put_pid(p); + if (uk->status == 0) { + uprobe_decref_process(uproc); + return 0; + } + goto await_bkpt_insertion; + } else { + ppt = uprobe_add_probept(uk, uproc); + if (IS_ERR(ppt)) { + ret = (int) PTR_ERR(ppt); + goto fail_uk; + } + } + + if (uproc_is_new) { + hlist_add_head(&uproc->hlist, + &uproc_table[hash_ptr(uproc->tg_leader, + UPROBE_HASH_BITS)]); + mutex_unlock(&uproc_mutex); + } + put_pid(p); + survivors = quiesce_all_threads(uproc, &cur_utask_quiescing); + + if (!survivors) { + purge_uprobe(uk); + up_write(&uproc->rwsem); + uprobe_put_process(uproc, false); + return -ESRCH; + } + up_write(&uproc->rwsem); + +await_bkpt_insertion: + if (cur_utask_quiescing) + /* Current task is probing its own process. */ + (void) utask_fake_quiesce(cur_utask_quiescing); + else + wait_event(ppt->waitq, ppt->state != UPROBE_INSERTING); + ret = uk->status; + if (ret != 0) { + down_write(&uproc->rwsem); + purge_uprobe(uk); + up_write(&uproc->rwsem); + } + uprobe_put_process(uproc, false); + return ret; + +fail_uk: + uprobe_free_kimg(uk); + +fail_uproc: + if (uproc_is_new) { + uprobe_free_process(uproc, 0); + mutex_unlock(&uproc_mutex); + } else { + up_write(&uproc->rwsem); + uprobe_put_process(uproc, false); + } + +fail_tsk: + put_pid(p); + return ret; +} +EXPORT_SYMBOL_GPL(register_uprobe); + +/* See Documentation/uprobes.txt. 
*/ +void unregister_uprobe(struct uprobe *u) +{ + struct pid *p; + struct uprobe_process *uproc; + struct uprobe_kimg *uk; + struct uprobe_probept *ppt; + struct uprobe_task *cur_utask, *cur_utask_quiescing = NULL; + + if (!u) + return; + p = uprobe_get_tg_leader(u->pid); + if (!p) + return; + + cur_utask = uprobe_find_utask(current); + if (cur_utask && cur_utask->active_probe) { + /* Called from handler; uproc is read-locked; do this later */ + put_pid(p); + (void) defer_registration(u, 0, cur_utask); + return; + } + + /* + * Lock uproc before walking the graph, in case the process we're + * probing is exiting. + */ + mutex_lock(&uproc_mutex); + uproc = uprobe_find_process(p); + mutex_unlock(&uproc_mutex); + put_pid(p); + if (!uproc) + return; + + uk = (struct uprobe_kimg *)u->kdata; + if (!uk) + /* + * This probe was never successfully registered, or + * has already been unregistered. + */ + goto done; + if (uk->status == -EBUSY) + /* Looks like register or unregister is already in progress. */ + goto done; + ppt = uk->ppt; + + list_del(&uk->list); + uprobe_free_kimg(uk); + + if (!list_empty(&ppt->uprobe_list)) + goto done; + + /* + * The last uprobe at ppt's probepoint is being unregistered. + * Queue the breakpoint for removal. + */ + ppt->state = UPROBE_REMOVING; + list_add_tail(&ppt->pd_node, &uproc->pending_uprobes); + + (void) quiesce_all_threads(uproc, &cur_utask_quiescing); + up_write(&uproc->rwsem); + if (cur_utask_quiescing) + /* Current task is probing its own process. */ + (void) utask_fake_quiesce(cur_utask_quiescing); + else + wait_event(ppt->waitq, ppt->state != UPROBE_REMOVING); + + if (likely(ppt->state == UPROBE_DISABLED)) { + down_write(&uproc->rwsem); + uprobe_free_probept(ppt); + /* else somebody else's register_uprobe() resurrected ppt. 
*/ + up_write(&uproc->rwsem); + } + uprobe_put_process(uproc, false); + return; + +done: + up_write(&uproc->rwsem); + uprobe_put_process(uproc, false); +} +EXPORT_SYMBOL_GPL(unregister_uprobe); + +/* Find a surviving thread in uproc. Runs with uproc->rwsem locked. */ +static struct task_struct *find_surviving_thread(struct uprobe_process *uproc) +{ + struct uprobe_task *utask; + + list_for_each_entry(utask, &uproc->thread_list, list) { + if (!(utask->tsk->flags & PF_EXITING)) + return utask->tsk; + } + return NULL; +} + +/* + * Run all the deferred_registrations previously queued by the current utask. + * Runs with no locks or mutexes held. The current utask's uprobe_process + * is ref-counted, so it won't disappear as the result of unregister_u*probe() + * called here. + */ +static void uprobe_run_def_regs(struct list_head *drlist) +{ + struct deferred_registration *dr, *d; + + list_for_each_entry_safe(dr, d, drlist, list) { + int result = 0; + struct uprobe *u = dr->uprobe; + + if (dr->regflag) + result = register_uprobe(u); + else + unregister_uprobe(u); + if (u && u->registration_callback) + u->registration_callback(u, dr->regflag, result); + list_del(&dr->list); + kfree(dr); + } +} + +/* + * utrace engine report callbacks + */ + +/* + * We've been asked to quiesce, but aren't in a position to do so. + * This could happen in either of the following cases: + * + * 1) Our own thread is doing a register or unregister operation -- + * e.g., as called from a uprobe handler or a non-uprobes utrace + * callback. We can't wait_event() for ourselves in [un]register_uprobe(). + * + * 2) We've been asked to quiesce, but we hit a probepoint first. Now + * we're in the report_signal callback, having handled the probepoint. + * We'd like to just turn on UTRACE_EVENT(QUIESCE) and coast into + * quiescence. Unfortunately, it's possible to hit a probepoint again + * before we quiesce. 
When processing the SIGTRAP, utrace would call + * uprobe_report_quiesce(), which must decline to take any action so + * as to avoid removing the uprobe just hit. As a result, we could + * keep hitting breakpoints and never quiescing. + * + * So here we do essentially what we'd prefer to do in uprobe_report_quiesce(). + * If we're the last thread to quiesce, handle_pending_uprobes() and + * rouse_all_threads(). Otherwise, pretend we're quiescent and sleep until + * the last quiescent thread handles that stuff and then wakes us. + * + * Called and returns with no mutexes held. Returns 1 if we free utask->uproc, + * else 0. + */ +static int utask_fake_quiesce(struct uprobe_task *utask) +{ + struct uprobe_process *uproc = utask->uproc; + enum uprobe_task_state prev_state = utask->state; + + down_write(&uproc->rwsem); + + /* In case we're somehow set to quiesce for real... */ + clear_utrace_quiesce(utask, false); + + if (uproc->n_quiescent_threads == uproc->nthreads-1) { + /* We're the last thread to "quiesce." */ + handle_pending_uprobes(uproc, utask->tsk); + rouse_all_threads(uproc); + up_write(&uproc->rwsem); + return 0; + } else { + utask->state = UPTASK_SLEEPING; + uproc->n_quiescent_threads++; + up_write(&uproc->rwsem); + /* We ref-count sleepers. */ + uprobe_get_process(uproc); + + wait_event(uproc->waitq, !utask->quiescing); + + down_write(&uproc->rwsem); + utask->state = prev_state; + uproc->n_quiescent_threads--; + up_write(&uproc->rwsem); + + /* + * If uproc's last uprobe has been unregistered, and + * unregister_uprobe() woke up before we did, it's up + * to us to free uproc. + */ + return uprobe_put_process(uproc, false); + } +} + +/* Prepare to single-step ppt's probed instruction inline. 
*/ +static void uprobe_pre_ssin(struct uprobe_task *utask, + struct uprobe_probept *ppt, struct pt_regs *regs) +{ + unsigned long flags; + + if (unlikely(ppt->ssil_state == SSIL_DISABLE)) + return; + spin_lock_irqsave(&ppt->ssil_lock, flags); + while (ppt->ssil_state == SSIL_SET) { + spin_unlock_irqrestore(&ppt->ssil_lock, flags); + up_read(&utask->uproc->rwsem); + wait_event(ppt->ssilq, ppt->ssil_state != SSIL_SET); + down_read(&utask->uproc->rwsem); + spin_lock_irqsave(&ppt->ssil_lock, flags); + } + if (unlikely(ppt->ssil_state == SSIL_DISABLE)) { + /* + * While waiting to single step inline, breakpoint has + * been removed. Thread continues as if nothing happened. + */ + spin_unlock_irqrestore(&ppt->ssil_lock, flags); + return; + } + ppt->ssil_state = SSIL_SET; + spin_unlock_irqrestore(&ppt->ssil_lock, flags); + + if (unlikely(ubp_pre_sstep(utask->tsk, &ppt->ubp, + &utask->arch_info, regs) != 0)) { + printk(KERN_ERR "Failed to temporarily restore original " + "instruction for single-stepping: " + "pid/tgid=%d/%d, vaddr=%#lx\n", + utask->tsk->pid, utask->tsk->tgid, ppt->ubp.vaddr); + utask->doomed = true; + } +} + +/* Prepare to continue execution after single-stepping inline. */ +static void uprobe_post_ssin(struct uprobe_task *utask, + struct uprobe_probept *ppt, struct pt_regs *regs) +{ + unsigned long flags; + + if (unlikely(ubp_post_sstep(utask->tsk, &ppt->ubp, + &utask->arch_info, regs) != 0)) + printk("Couldn't restore bp: pid/tgid=%d/%d, addr=%#lx\n", + utask->tsk->pid, utask->tsk->tgid, ppt->ubp.vaddr); + spin_lock_irqsave(&ppt->ssil_lock, flags); + if (likely(ppt->ssil_state == SSIL_SET)) { + ppt->ssil_state = SSIL_CLEAR; + wake_up(&ppt->ssilq); + } + spin_unlock_irqrestore(&ppt->ssil_lock, flags); +} + +#ifdef CONFIG_UBP_XOL +/* + * This architecture wants to do single-stepping out of line, but now we've + * discovered that it can't -- typically because we couldn't set up the XOL + * vma. Make all probepoints use inline single-stepping. 
+ */ +static void uproc_cancel_xol(struct uprobe_process *uproc) +{ + down_write(&uproc->rwsem); + if (likely(uproc->sstep_out_of_line)) { + /* No other task beat us to it. */ + int i; + struct uprobe_probept *ppt; + struct hlist_node *node; + struct hlist_head *head; + for (i = 0; i < UPROBE_TABLE_SIZE; i++) { + head = &uproc->uprobe_table[i]; + hlist_for_each_entry(ppt, node, head, ut_node) { + if (!(ppt->ubp.strategy & UBP_HNT_INLINE)) + ubp_cancel_xol(current, &ppt->ubp); + } + } + /* Do this last, so other tasks don't proceed too soon. */ + uproc->sstep_out_of_line = false; + } + up_write(&uproc->rwsem); +} + +/* Prepare to single-step ppt's probed instruction out of line. */ +static int uprobe_pre_ssout(struct uprobe_task *utask, + struct uprobe_probept *ppt, struct pt_regs *regs) +{ + if (!ppt->ubp.xol_vaddr) + ppt->ubp.xol_vaddr = xol_get_insn_slot(&ppt->ubp, + ppt->uproc->xol_area); + if (unlikely(!ppt->ubp.xol_vaddr)) { + ubp_cancel_xol(utask->tsk, &ppt->ubp); + return -1; + } + utask->singlestep_addr = ppt->ubp.xol_vaddr; + return ubp_pre_sstep(utask->tsk, &ppt->ubp, &utask->arch_info, regs); +} + +/* Prepare to continue execution after single-stepping out of line. */ +static int uprobe_post_ssout(struct uprobe_task *utask, + struct uprobe_probept *ppt, struct pt_regs *regs) +{ + int ret; + + ret = ubp_post_sstep(utask->tsk, &ppt->ubp, &utask->arch_info, regs); + return ret; +} +#endif + +/* + * If this thread is supposed to be quiescing, mark it quiescent; and + * if it was the last thread to quiesce, do the work we quiesced for. + * Runs with utask->uproc->rwsem write-locked. Returns true if we can + * let this thread resume. 
+ */ +static bool utask_quiesce(struct uprobe_task *utask) +{ + if (utask->quiescing) { + if (utask->state != UPTASK_QUIESCENT) { + utask->state = UPTASK_QUIESCENT; + utask->uproc->n_quiescent_threads++; + } + return check_uproc_quiesced(utask->uproc, current); + } else { + clear_utrace_quiesce(utask, false); + return true; + } +} + +/* + * Delay delivery of the indicated signal until after single-step. + * Otherwise single-stepping will be cancelled as part of calling + * the signal handler. + */ +static void uprobe_delay_signal(struct uprobe_task *utask, siginfo_t *info) +{ + struct delayed_signal *ds; + + ds = kmalloc(sizeof(*ds), GFP_USER); + if (ds) { + ds->info = *info; + INIT_LIST_HEAD(&ds->list); + list_add_tail(&ds->list, &utask->delayed_signals); + } +} + +static void uprobe_inject_delayed_signals(struct list_head *delayed_signals) +{ + struct delayed_signal *ds, *tmp; + + list_for_each_entry_safe(ds, tmp, delayed_signals, list) { + send_sig_info(ds->info.si_signo, &ds->info, current); + list_del(&ds->list); + kfree(ds); + } +} + +/* + * Helper routine for uprobe_report_signal(). + * We get called here with: + * state = UPTASK_RUNNING => we are here due to a breakpoint hit + * - Read-lock the process + * - Figure out which probepoint, based on regs->IP + * - Set state = UPTASK_BP_HIT + * - Invoke handler for each uprobe at this probepoint + * - Reset regs->IP to beginning of the insn, if necessary + * - Start watching for quiesce events, in case another + * engine cancels our UTRACE_SINGLESTEP with a + * UTRACE_STOP. + * - Set singlestep in motion (UTRACE_SINGLESTEP), + * with state = UPTASK_SSTEP + * - Read-unlock the process + * + * state = UPTASK_SSTEP => here after single-stepping + * - Read-lock the process + * - Validate we are here per the state machine + * - Clean up after single-stepping + * - Set state = UPTASK_RUNNING + * - Read-unlock the process + * - If it's time to quiesce, take appropriate action. 
+ * - If the handler(s) we ran called [un]register_uprobe(), + * complete those via uprobe_run_def_regs(). + * + * state = ANY OTHER STATE + * - Not our signal, pass it on (UTRACE_RESUME) + */ +static u32 uprobe_handle_signal(u32 action, + struct uprobe_task *utask, + struct pt_regs *regs, + siginfo_t *info) +{ + struct uprobe_probept *ppt; + struct uprobe_process *uproc; + struct uprobe_kimg *uk; + unsigned long probept; + enum utrace_resume_action resume_action; + enum utrace_signal_action signal_action = utrace_signal_action(action); + + uproc = utask->uproc; + + /* + * We may need to re-assert UTRACE_SINGLESTEP if this signal + * is not associated with the breakpoint. + */ + if (utask->state == UPTASK_SSTEP) + resume_action = UTRACE_SINGLESTEP; + else + resume_action = UTRACE_RESUME; + if (unlikely(signal_action == UTRACE_SIGNAL_REPORT)) { + /* This thread was quiesced using UTRACE_INTERRUPT. */ + bool done_quiescing; + if (utask->active_probe) + /* + * We'll fake quiescence after we're done + * processing the probepoint. + */ + return UTRACE_SIGNAL_IGN | resume_action; + + down_write(&uproc->rwsem); + done_quiescing = utask_quiesce(utask); + up_write(&uproc->rwsem); + if (done_quiescing) + resume_action = UTRACE_RESUME; + else + resume_action = UTRACE_STOP; + return UTRACE_SIGNAL_IGN | resume_action; + } + + /* + * info will be null if we're called with action=UTRACE_SIGNAL_HANDLER, + * which means that single-stepping has been disabled so a signal + * handler can be called in the probed process. That should never + * happen because we intercept and delay handled signals (action = + * UTRACE_RESUME) until after we're done single-stepping. 
+ */ + BUG_ON(!info); + if (signal_action == UTRACE_SIGNAL_DELIVER && utask->active_probe && + info->si_signo != SSTEP_SIGNAL) { + uprobe_delay_signal(utask, info); + return UTRACE_SIGNAL_IGN | UTRACE_SINGLESTEP; + } + + if (info->si_signo != BREAKPOINT_SIGNAL && + info->si_signo != SSTEP_SIGNAL) + goto no_interest; + + switch (utask->state) { + case UPTASK_RUNNING: + if (info->si_signo != BREAKPOINT_SIGNAL) + goto no_interest; + +#ifdef CONFIG_UBP_XOL + /* + * Set up the XOL area if it's not already there. We do this + * here because we have to do it before handling the first + * probepoint hit, the probed process has to do it, and this may + * be the first time our probed process runs uprobes code. + */ + if (uproc->sstep_out_of_line && !uproc->xol_area) { + uproc->xol_area = xol_get_area(uproc->tg_leader); + if (unlikely(uproc->sstep_out_of_line) && + unlikely(!uproc->xol_area)) + uproc_cancel_xol(uproc); + } +#endif + + down_read(&uproc->rwsem); + /* Don't quiesce while running handlers. */ + clear_utrace_quiesce(utask, false); + probept = ubp_get_bkpt_addr(regs); + ppt = uprobe_find_probept(uproc, probept); + if (!ppt) { + up_read(&uproc->rwsem); + goto no_interest; + } + utask->active_probe = ppt; + utask->state = UPTASK_BP_HIT; + + if (likely(ppt->state == UPROBE_BP_SET)) { + list_for_each_entry(uk, &ppt->uprobe_list, list) { + struct uprobe *u = uk->uprobe; + if (u->handler) + u->handler(u, regs); + } + } + +#ifdef CONFIG_UBP_XOL + if ((ppt->ubp.strategy & UBP_HNT_INLINE) || + uprobe_pre_ssout(utask, ppt, regs) != 0) +#endif + uprobe_pre_ssin(utask, ppt, regs); + if (unlikely(utask->doomed)) { + utask->active_probe = NULL; + utask->state = UPTASK_RUNNING; + up_read(&uproc->rwsem); + goto no_interest; + } + utask->state = UPTASK_SSTEP; + /* In case another engine cancels our UTRACE_SINGLESTEP... */ + utask_adjust_flags(utask, UPROBE_SET_FLAGS, + UTRACE_EVENT(QUIESCE)); + /* Don't deliver this signal to the process. 
*/ + resume_action = UTRACE_SINGLESTEP; + signal_action = UTRACE_SIGNAL_IGN; + + up_read(&uproc->rwsem); + break; + + case UPTASK_SSTEP: + if (info->si_signo != SSTEP_SIGNAL) + goto no_interest; + + down_read(&uproc->rwsem); + /* No further need to re-assert UTRACE_SINGLESTEP. */ + clear_utrace_quiesce(utask, false); + ppt = utask->active_probe; + BUG_ON(!ppt); + +#ifdef CONFIG_UBP_XOL + if (!(ppt->ubp.strategy & UBP_HNT_INLINE)) + uprobe_post_ssout(utask, ppt, regs); + else +#endif + uprobe_post_ssin(utask, ppt, regs); + + utask->active_probe = NULL; + utask->state = UPTASK_RUNNING; + if (unlikely(utask->doomed)) { + up_read(&uproc->rwsem); + goto no_interest; + } + + if (utask->quiescing) { + int uproc_freed; + up_read(&uproc->rwsem); + uproc_freed = utask_fake_quiesce(utask); + BUG_ON(uproc_freed); + } else + up_read(&uproc->rwsem); + + /* + * We hold a ref count on uproc, so this should never + * make utask or uproc disappear. + */ + uprobe_run_def_regs(&utask->deferred_registrations); + + uprobe_inject_delayed_signals(&utask->delayed_signals); + + resume_action = UTRACE_RESUME; + signal_action = UTRACE_SIGNAL_IGN; + break; + default: + goto no_interest; + } + +no_interest: + return signal_action | resume_action; +} + +/* + * Signal callback: + */ +static u32 uprobe_report_signal(u32 action, + struct utrace_engine *engine, + struct task_struct *tsk, + struct pt_regs *regs, + siginfo_t *info, + const struct k_sigaction *orig_ka, + struct k_sigaction *return_ka) +{ + struct uprobe_task *utask; + struct uprobe_process *uproc; + bool doomed; + enum utrace_resume_action report_action; + + utask = (struct uprobe_task *)rcu_dereference(engine->data); + BUG_ON(!utask); + uproc = utask->uproc; + + /* Keep uproc intact until just before we return. 
*/ + uprobe_get_process(uproc); + report_action = uprobe_handle_signal(action, utask, regs, info); + doomed = utask->doomed; + + if (uprobe_put_process(uproc, true)) + report_action = utrace_signal_action(report_action) | + UTRACE_DETACH; + if (doomed) + do_exit(SIGSEGV); + return report_action; +} + +/* + * Quiesce callback: The associated process has one or more breakpoint + * insertions or removals pending. If we're the last thread in this + * process to quiesce, do the insertion(s) and/or removal(s). + */ +static u32 uprobe_report_quiesce(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *tsk, + unsigned long event) +{ + struct uprobe_task *utask; + struct uprobe_process *uproc; + bool done_quiescing = false; + + utask = (struct uprobe_task *)rcu_dereference(engine->data); + BUG_ON(!utask); + BUG_ON(tsk != current); /* guaranteed by utrace */ + + if (utask->state == UPTASK_SSTEP) + /* + * We got a breakpoint trap and tried to single-step, + * but somebody else's report_signal callback overrode + * our UTRACE_SINGLESTEP with a UTRACE_STOP. Try again. + */ + return UTRACE_SINGLESTEP; + + BUG_ON(utask->active_probe); + uproc = utask->uproc; + down_write(&uproc->rwsem); + done_quiescing = utask_quiesce(utask); + up_write(&uproc->rwsem); + return done_quiescing ? UTRACE_RESUME : UTRACE_STOP; +} + +/* + * uproc's process is exiting or exec-ing. Runs with uproc->rwsem + * write-locked. Caller must ref-count uproc before calling this + * function, to ensure that uproc doesn't get freed in the middle of + * this. 
+ */ +static void uprobe_cleanup_process(struct uprobe_process *uproc) +{ + struct hlist_node *pnode1, *pnode2; + struct uprobe_kimg *uk, *unode; + struct uprobe_probept *ppt; + struct hlist_head *head; + int i; + + uproc->finished = true; + for (i = 0; i < UPROBE_TABLE_SIZE; i++) { + head = &uproc->uprobe_table[i]; + hlist_for_each_entry_safe(ppt, pnode1, pnode2, head, ut_node) { + if (ppt->state == UPROBE_INSERTING || + ppt->state == UPROBE_REMOVING) { + /* + * This task is (exec/exit)ing with + * a [un]register_uprobe pending. + * [un]register_uprobe will free ppt. + */ + ppt->state = UPROBE_DISABLED; + list_del(&ppt->pd_node); + list_for_each_entry_safe(uk, unode, + &ppt->uprobe_list, list) + uk->status = -ESRCH; + wake_up_all(&ppt->waitq); + } else if (ppt->state == UPROBE_BP_SET) { + list_for_each_entry_safe(uk, unode, + &ppt->uprobe_list, list) { + list_del(&uk->list); + uprobe_free_kimg(uk); + } + uprobe_free_probept(ppt); + /* else */ + /* + * If ppt is UPROBE_DISABLED, assume that + * [un]register_uprobe() has been notified + * and will free it soon. + */ + } + } + } +} + +static u32 uprobe_exec_exit(struct utrace_engine *engine, + struct task_struct *tsk, int exit) +{ + struct uprobe_process *uproc; + struct uprobe_probept *ppt; + struct uprobe_task *utask; + bool utask_quiescing; + + utask = (struct uprobe_task *)rcu_dereference(engine->data); + uproc = utask->uproc; + uprobe_get_process(uproc); + + ppt = utask->active_probe; + if (ppt) { + printk(KERN_WARNING "Task handler called %s while at uprobe" + " probepoint: pid/tgid = %d/%d, probepoint" + " = %#lx\n", (exit ? "exit" : "exec"), + tsk->pid, tsk->tgid, ppt->ubp.vaddr); + /* + * Mutex cleanup depends on where do_execve()/do_exit() was + * called and on ubp strategy (XOL vs. SSIL). 
+ */ + if (ppt->ubp.strategy & UBP_HNT_INLINE) { + switch (utask->state) { + unsigned long flags; + case UPTASK_SSTEP: + spin_lock_irqsave(&ppt->ssil_lock, flags); + ppt->ssil_state = SSIL_CLEAR; + wake_up(&ppt->ssilq); + spin_unlock_irqrestore(&ppt->ssil_lock, flags); + break; + default: + break; + } + } + if (utask->state == UPTASK_BP_HIT) { + /* uprobe handler called do_exit()/do_execve(). */ + up_read(&uproc->rwsem); + uprobe_decref_process(uproc); + } + } + + down_write(&uproc->rwsem); + utask_quiescing = utask->quiescing; + uproc->nthreads--; + if (utrace_set_events_pid(utask->pid, engine, 0)) + /* We don't care. */ + ; + uprobe_free_task(utask, 1); + if (uproc->nthreads) { + /* + * In case other threads are waiting for us to quiesce... + */ + if (utask_quiescing) + (void) check_uproc_quiesced(uproc, + find_surviving_thread(uproc)); + } else + /* + * We were the last remaining thread - clean up the uprobe + * remnants a la unregister_uprobe(). We don't have to + * remove the breakpoints, though. + */ + uprobe_cleanup_process(uproc); + + up_write(&uproc->rwsem); + uprobe_put_process(uproc, true); + return UTRACE_DETACH; +} + +/* + * Exit callback: The associated task/thread is exiting. + */ +static u32 uprobe_report_exit(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *tsk, long orig_code, long *code) +{ + return uprobe_exec_exit(engine, tsk, 1); +} +/* + * Clone callback: The current task has spawned a thread/process. + * Utrace guarantees that parent and child pointers will be valid + * for the duration of this callback. + * + * NOTE: For now, we don't pass on uprobes from the parent to the + * child. We now do the necessary clearing of breakpoints in the + * child's address space. + * + * TODO: + * - Provide option for child to inherit uprobes. 
+ */ +static u32 uprobe_report_clone(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *parent, + unsigned long clone_flags, + struct task_struct *child) +{ + struct uprobe_process *uproc; + struct uprobe_task *ptask, *ctask; + + ptask = (struct uprobe_task *)rcu_dereference(engine->data); + uproc = ptask->uproc; + + /* + * Lock uproc so no new uprobes can be installed 'til all + * report_clone activities are completed. + */ + mutex_lock(&uproc_mutex); + down_write(&uproc->rwsem); + + if (clone_flags & CLONE_THREAD) { + /* New thread in the same process. */ + ctask = uprobe_find_utask(child); + if (unlikely(ctask)) { + /* + * uprobe_mk_process() ran just as this clone + * happened, and has already accounted for the + * new child. + */ + } else { + struct pid *child_pid = get_pid(task_pid(child)); + BUG_ON(!child_pid); + ctask = uprobe_add_task(child_pid, uproc); + BUG_ON(!ctask); + if (IS_ERR(ctask)) + goto done; + uproc->nthreads++; + /* + * FIXME: Handle the case where uproc is quiescing + * (assuming it's possible to clone while quiescing). + */ + } + } else { + /* + * New process spawned by parent. Remove the probepoints + * in the child's text. + * + * It's not necessary to quiesce the child as we are assured + * by utrace that this callback happens *before* the child + * gets to run userspace. + * + * We also hold the uproc->rwsem for the parent - so no + * new uprobes will be registered 'til we return. + */ + int i; + struct uprobe_probept *ppt; + struct hlist_node *node; + struct hlist_head *head; + + for (i = 0; i < UPROBE_TABLE_SIZE; i++) { + head = &uproc->uprobe_table[i]; + hlist_for_each_entry(ppt, node, head, ut_node) { + if (ubp_remove_bkpt(child, &ppt->ubp) != 0) { + /* Ratelimit this?
*/ + printk(KERN_ERR "Pid %d forked %d;" + " failed to remove probepoint" + " at %#lx in child\n", + parent->pid, child->pid, + ppt->ubp.vaddr); + } + } + } + } + +done: + up_write(&uproc->rwsem); + mutex_unlock(&uproc_mutex); + return UTRACE_RESUME; +} + +/* + * Exec callback: The associated process called execve() or friends + * + * The new program is about to start running and so there is no + * possibility of a uprobe from the previous user address space + * to be hit. + * + * NOTE: + * Typically, this process would have passed through the clone + * callback, where the necessary action *should* have been + * taken. However, if we still end up at this callback: + * - We don't have to clear the uprobes - memory image + * will be overlaid. + * - We have to free up uprobe resources associated with + * this process. + */ +static u32 uprobe_report_exec(enum utrace_resume_action action, + struct utrace_engine *engine, + struct task_struct *tsk, + const struct linux_binfmt *fmt, + const struct linux_binprm *bprm, + struct pt_regs *regs) +{ + return uprobe_exec_exit(engine, tsk, 0); +} + +static const struct utrace_engine_ops uprobe_utrace_ops = { + .report_quiesce = uprobe_report_quiesce, + .report_signal = uprobe_report_signal, + .report_exit = uprobe_report_exit, + .report_clone = uprobe_report_clone, + .report_exec = uprobe_report_exec +}; + +static int __init init_uprobes(void) +{ + int ret, i; + + ubp_strategies = UBP_HNT_TSKINFO; + ret = ubp_init(&ubp_strategies); + if (ret != 0) { + printk(KERN_ERR "Can't start uprobes: ubp_init() returned %d\n", + ret); + return ret; + } + for (i = 0; i < UPROBE_TABLE_SIZE; i++) { + INIT_HLIST_HEAD(&uproc_table[i]); + INIT_HLIST_HEAD(&utask_table[i]); + } + + p_uprobe_utrace_ops = &uprobe_utrace_ops; + return 0; +} + +static void __exit exit_uprobes(void) +{ +} + +module_init(init_uprobes); +module_exit(exit_uprobes); Index: uprobes.git/kernel/Makefile =================================================================== --- 
uprobes.git.orig/kernel/Makefile +++ uprobes.git/kernel/Makefile @@ -98,6 +98,7 @@ obj-$(CONFIG_SMP) += sched_cpupri.o obj-$(CONFIG_SLOW_WORK) += slow-work.o obj-$(CONFIG_UBP) += ubp_core.o obj-$(CONFIG_UBP_XOL) += ubp_xol.o +obj-$(CONFIG_UPROBES) += uprobes_core.o ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y) # According to Alan Modra , the -fno-omit-frame-pointer is From srikar at linux.vnet.ibm.com Thu Jun 11 16:16:28 2009 From: srikar at linux.vnet.ibm.com (Srikar Dronamraju) Date: Thu, 11 Jun 2009 21:46:28 +0530 Subject: [RESEND] [PATCH 5/7] x86 support for Uprobes In-Reply-To: <20090611160539.GA20668@linux.vnet.ibm.com> References: <20090611160539.GA20668@linux.vnet.ibm.com> Message-ID: <20090611161628.GE21218@linux.vnet.ibm.com> x86 support for Uprobes Signed-off-by: Jim Keniston --- arch/x86/Kconfig | 1 + arch/x86/include/asm/uprobes.h | 27 +++++++++++++++++++++++++++ 2 files changed, 28 insertions(+) Index: uprobes.git/arch/x86/Kconfig =================================================================== --- uprobes.git.orig/arch/x86/Kconfig +++ uprobes.git/arch/x86/Kconfig @@ -47,6 +47,7 @@ config X86 select HAVE_KERNEL_BZIP2 select HAVE_KERNEL_LZMA select HAVE_UBP + select HAVE_UPROBES config ARCH_DEFCONFIG string Index: uprobes.git/arch/x86/include/asm/uprobes.h =================================================================== --- /dev/null +++ uprobes.git/arch/x86/include/asm/uprobes.h @@ -0,0 +1,27 @@ +#ifndef _ASM_UPROBES_H +#define _ASM_UPROBES_H +/* + * Userspace Probes (UProbes) + * uprobes.h + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2008, 2009 + */ +#include + +#define BREAKPOINT_SIGNAL SIGTRAP +#define SSTEP_SIGNAL SIGTRAP +#endif /* _ASM_UPROBES_H */ From srikar at linux.vnet.ibm.com Thu Jun 11 16:17:50 2009 From: srikar at linux.vnet.ibm.com (Srikar Dronamraju) Date: Thu, 11 Jun 2009 21:47:50 +0530 Subject: [RESEND] [PATCH 6/7] Uprobes documentation. In-Reply-To: <20090611160539.GA20668@linux.vnet.ibm.com> References: <20090611160539.GA20668@linux.vnet.ibm.com> Message-ID: <20090611161750.GF21218@linux.vnet.ibm.com> Uprobes Documentation Signed-off-by: Jim Keniston --- Documentation/uprobes.txt | 460 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 460 insertions(+) Index: uprobes.git/Documentation/uprobes.txt =================================================================== --- /dev/null +++ uprobes.git/Documentation/uprobes.txt @@ -0,0 +1,460 @@ +Title : User-Space Probes (Uprobes) +Author : Jim Keniston + +CONTENTS + +1. Concepts: Uprobes +2. Architectures Supported +3. Configuring Uprobes +4. API Reference +5. Uprobes Features and Limitations +6. Interoperation with Kprobes +7. Interoperation with Utrace +8. Probe Overhead +9. TODO +10. Uprobes Team +11. Uprobes Example + +1. Concepts: Uprobes + +Uprobes enables you to dynamically break into any routine in a +user application and collect debugging and performance information +non-disruptively. You can trap at any code address, specifying a +kernel handler routine to be invoked when the breakpoint is hit. + +A uprobe can be inserted on any instruction in the application's +virtual address space. 
The registration function +register_uprobe() specifies which process is to be probed, where +the probe is to be inserted, and what handler is to be called when +the probe is hit. + +Typically, Uprobes-based instrumentation is packaged as a kernel +module. In the simplest case, the module's init function installs +("registers") one or more probes, and the exit function unregisters +them. However, probes can be registered or unregistered in response +to other events as well. For example: +- A probe handler itself can register and/or unregister probes. +- You can establish Utrace callbacks to register and/or unregister +probes when a particular process forks, clones a thread, +execs, enters a system call, receives a signal, exits, etc. +See the utrace documentation in Documentation/DocBook. + +1.1 How Does a Uprobe Work? + +When a uprobe is registered, Uprobes makes a copy of the probed +instruction, stops the probed application, replaces the first byte(s) +of the probed instruction with a breakpoint instruction (e.g., int3 +on i386 and x86_64), and allows the probed application to continue. +(When inserting the breakpoint, Uprobes uses the same copy-on-write +mechanism that ptrace uses, so that the breakpoint affects only that +process, and not any other process running that program. This is +true even if the probed instruction is in a shared library.) + +When a CPU hits the breakpoint instruction, a trap occurs, the CPU's +user-mode registers are saved, and a SIGTRAP signal is generated. +Uprobes intercepts the SIGTRAP and finds the associated uprobe. +It then executes the handler associated with the uprobe, passing the +handler the addresses of the uprobe struct and the saved registers. +The handler may block, but keep in mind that the probed thread remains +stopped while your handler runs. + +Next, Uprobes single-steps its copy of the probed instruction and +resumes execution of the probed process at the instruction following +the probepoint. 
(It would be simpler to single-step the actual +instruction in place, but then Uprobes would have to temporarily +remove the breakpoint instruction. This would create problems in a +multithreaded application. For example, it would open a time window +when another thread could sail right past the probepoint.) + +Instruction copies to be single-stepped are stored in a per-process +"single-step out of line (XOL) area," which is a little VM area +created by Uprobes in each probed process's address space. + +1.2 The Role of Utrace + +When a probe is registered on a previously unprobed process, +Uprobes establishes a tracing "engine" with Utrace (see +Documentation/utrace.txt) for each thread (task) in the process. +Uprobes uses the Utrace "quiesce" mechanism to stop all the threads +prior to insertion or removal of a breakpoint. Utrace also notifies +Uprobes of breakpoint and single-step traps and of other interesting +events in the lifetime of the probed process, such as fork, clone, +exec, and exit. + +1.3 Multithreaded Applications + +Uprobes supports the probing of multithreaded applications. Uprobes +imposes no limit on the number of threads in a probed application. +All threads in a process use the same text pages, so every probe +in a process affects all threads; of course, each thread hits the +probepoint (and runs the handler) independently. Multiple threads +may run the same handler simultaneously. If you want a particular +thread or set of threads to run a particular handler, your handler +should check current or current->pid to determine which thread has +hit the probepoint. + +When a process clones a new thread, that thread automatically shares +all current and future probes established for that process. + +Keep in mind that when you register or unregister a probe, the +breakpoint is not inserted or removed until Utrace has stopped all +threads in the process. 
The register/unregister function returns +after the breakpoint has been inserted/removed (but see the next +section). + +1.5 Registering Probes within Probe Handlers + +A uprobe handler can call [un]register_uprobe() functions. +A handler can even unregister its own probe. However, when invoked +from a handler, the actual [un]register operations do not take +place immediately. Rather, they are queued up and executed after +all handlers for that probepoint have been run. In the handler, +the [un]register call returns -EINPROGRESS. If you set the +registration_callback field in the uprobe object, that callback will +be called when the [un]register operation completes. + +2. Architectures Supported + +This ubp-based version of Uprobes is implemented on the following +architectures: + +- x86 + +3. Configuring Uprobes + +When configuring the kernel using make menuconfig/xconfig/oldconfig, +ensure that CONFIG_UPROBES is set to "y". Select "Infrastructure for +tracing and debugging user processes" to enable Utrace. Under "General +setup" select "User-space breakpoint assistance" then select +"User-space probes". + +So that you can load and unload Uprobes-based instrumentation modules, +make sure "Loadable module support" (CONFIG_MODULES) and "Module +unloading" (CONFIG_MODULE_UNLOAD) are set to "y". + +4. API Reference + +The Uprobes API includes a "register" function and an "unregister" +function for uprobes. Here are terse, mini-man-page specifications for +these functions and the associated probe handlers that you'll write. +See the latter half of this document for examples. + +4.1 register_uprobe + +#include +int register_uprobe(struct uprobe *u); + +Sets a breakpoint at virtual address u->vaddr in the process whose +pid is u->pid. When the breakpoint is hit, Uprobes calls u->handler. + +register_uprobe() returns 0 on success, -EINPROGRESS if +register_uprobe() was called from a uprobe handler (and therefore +delayed), or a negative errno otherwise. 
+ +Section 4.4, "User's Callback for Delayed Registrations", +explains how to be notified upon completion of a delayed +registration. + +User's handler (u->handler): +#include +#include +void handler(struct uprobe *u, struct pt_regs *regs); + +Called with u pointing to the uprobe associated with the breakpoint, +and regs pointing to the struct containing the registers saved when +the breakpoint was hit. + +4.3 unregister_uprobe + +#include +void unregister_uprobe(struct uprobe *u); + +Removes the specified probe. The unregister function can be called +at any time after the probe has been registered, and can be called +from a uprobe handler. + +4.4 User's Callback for Delayed Registrations + +#include +void registration_callback(struct uprobe *u, int reg, int result); + +As previously mentioned, the functions described in Section 4 can +be called from within a uprobe handler. When that happens, the +[un]registration operation is delayed until all handlers +associated with that handler's probepoint have been run. Upon +completion of the [un]registration operation, Uprobes checks the +registration_callback member of the associated uprobe: +u->registration_callback for [un]register_uprobe. Uprobes calls +that callback function, if any, passing it the following values: + +- u = the address of the uprobe object. + +- reg = 1 for register_uprobe() or 0 for unregister_uprobe(). + +- result = the return value that register_uprobe() would have +returned if this weren't a delayed operation. This is always 0 +for unregister_uprobe(). + +NOTE: Uprobes calls the registration_callback ONLY in the case of a +delayed [un]registration. + +5. Uprobes Features and Limitations + +The user is expected to assign values to the following members +of struct uprobe: pid, vaddr, handler, and (as needed) +registration_callback. Other members are reserved for Uprobes's use.
+Uprobes may produce unexpected results if you: +- assign non-zero values to reserved members of struct uprobe; +- change the contents of a uprobe object while it is registered; or +- attempt to register a uprobe that is already registered. + +Uprobes allows any number of uprobes at a particular address. For +a particular probepoint, handlers are run in the order in which +they were registered. + +Any number of kernel modules may probe a particular process +simultaneously, and a particular module may probe any number of +processes simultaneously. + +Probes are shared by all threads in a process (including newly +created threads). + +If a probed process exits or execs, Uprobes automatically +unregisters all uprobes associated with that process. Subsequent +attempts to unregister these probes will be treated as no-ops. + +On the other hand, if a probed memory area is removed from the +process's virtual memory map (e.g., via dlclose(3) or munmap(2)), +it's currently up to you to unregister the probes first. + +There is no way to specify that probes should be inherited across fork; +Uprobes removes all probepoints in the newly created child process. +See Section 7, "Interoperation with Utrace", for more information on +this topic. + +On at least some architectures, Uprobes makes no attempt to verify +that the probe address you specify actually marks the start of an +instruction. If you get this wrong, chaos may ensue. + +To avoid interfering with interactive debuggers, Uprobes will refuse +to insert a probepoint where a breakpoint instruction already exists, +unless it was Uprobes that put it there. Some architectures may +refuse to insert probes on other types of instructions. + +If you install a probe in an inline-able function, Uprobes makes +no attempt to chase down all inline instances of the function and +install probes there. gcc may inline a function without being asked, +so keep this in mind if you're not seeing the probe hits you expect. 
+ +A probe handler can modify the environment of the probed function +-- e.g., by modifying data structures, or by modifying the +contents of the pt_regs struct (which are restored to the registers +upon return from the breakpoint). So Uprobes can be used, for example, +to install a bug fix or to inject faults for testing. Uprobes, of +course, has no way to distinguish the deliberately injected faults +from the accidental ones. Don't drink and probe. + +When you register the first probe at a probepoint or unregister the +last probe at a probepoint, Uprobes asks Utrace to "quiesce" +the probed process so that Uprobes can insert or remove the breakpoint +instruction. If the process is not already stopped, Utrace stops it. +If the process is entering an interruptible system call at that instant, +this may cause the system call to finish early or fail with EINTR. + +When Uprobes establishes a probepoint on a previously unprobed page +of text, Linux creates a new copy of the page via its copy-on-write +mechanism. When probepoints are removed, Uprobes makes no attempt +to consolidate identical copies of the same page. This could affect +memory availability if you probe many, many pages in many, many +long-running processes. + +6. Interoperation with Kprobes + +Uprobes is intended to interoperate usefully with Kprobes (see +Documentation/kprobes.txt). For example, an instrumentation module +can make calls to both the Kprobes API and the Uprobes API. + +A uprobe handler can register or unregister kprobes, +jprobes, and kretprobes, as well as uprobes. On the +other hand, a kprobe, jprobe, or kretprobe handler must not sleep, and +therefore cannot register or unregister any of these types of probes. +(Ideas for removing this restriction are welcome.) + +Note that the overhead of a uprobe hit is several times that of +a k[ret]probe hit. + +7. Interoperation with Utrace + +As mentioned in Section 1.2, Uprobes is a client of Utrace.
For each +probed thread, Uprobes establishes a Utrace engine, and registers +callbacks for the following types of events: clone/fork, exec, exit, +and "core-dump" signals (which include breakpoint traps). Uprobes +establishes this engine when the process is first probed, or when +Uprobes is notified of the thread's creation, whichever comes first. + +An instrumentation module can use both the Utrace and Uprobes APIs (as +well as Kprobes). When you do this, keep the following facts in mind: + +- For a particular event, Utrace callbacks are called in the order in +which the engines are established. Utrace does not currently provide +a mechanism for altering this order. + +- When Uprobes learns that a probed process has forked, it removes +the breakpoints in the child process. + +- When Uprobes learns that a probed process has exec-ed or exited, +it disposes of its data structures for that process (first allowing +any outstanding [un]registration operations to terminate). + +- When a probed thread hits a breakpoint or completes single-stepping +of a probed instruction, engines with the UTRACE_EVENT(SIGNAL_CORE) +flag set are notified. + +If you want to establish probes in a newly forked child, you can use +the following procedure: + +- Register a report_clone callback with Utrace. In this callback, +the CLONE_THREAD flag distinguishes the creation of a new +thread from the creation of a new process. + +- In your report_clone callback, call utrace_attach_task() to attach to +the child process, and call utrace_control(..., UTRACE_REPORT). +The child process will quiesce at a point where it is ready to +be probed. + +- In your report_quiesce callback, register the desired probes. +(Note that you cannot use the same probe object for both parent +and child. If you want to duplicate the probepoints, you must +create a new set of uprobe objects.) + +8. Probe Overhead + +On a typical CPU in use in 2007, a uprobe hit takes about 3 +microseconds to process.
Specifically, a benchmark that hits the +same probepoint repeatedly, firing a simple handler each time, reports +300,000 to 350,000 hits per second, depending on the architecture. + +Here are sample overhead figures (in usec) for x86 architecture. + +x86: Intel Pentium M, 1495 MHz, 2957.31 bogomips +uprobe = 2.9 usec; + +9. TODO + +a. Support for other architectures. +b. Support return probes. + +10. Uprobes Team + +The following people have made major contributions to Uprobes: +Jim Keniston - jkenisto at us.ibm.com +Srikar Dronamraju - srikar at linux.vnet.ibm.com +Ananth Mavinakayanahalli - ananth at in.ibm.com +Prasanna Panchamukhi - prasanna at in.ibm.com +Dave Wilder - dwilder at us.ibm.com + +11. Uprobes Example + +Here's a sample kernel module showing the use of Uprobes to count the +number of times an instruction at a particular address is executed, +and optionally (unless verbose=0) report each time it's executed. +----- cut here ----- +/* uprobe_example.c */ +#include +#include +#include +#include + +/* + * Usage: insmod uprobe_example.ko pid= vaddr=
[verbose=0] + * where identifies the probed process and
is the virtual + * address of the probed instruction. + */ + +static int pid = 0; +module_param(pid, int, 0); +MODULE_PARM_DESC(pid, "pid"); + +static int verbose = 1; +module_param(verbose, int, 0); +MODULE_PARM_DESC(verbose, "verbose"); + +static long vaddr = 0; +module_param(vaddr, long, 0); +MODULE_PARM_DESC(vaddr, "vaddr"); + +static int nhits; +static struct uprobe usp; + +static void uprobe_handler(struct uprobe *u, struct pt_regs *regs) +{ + nhits++; + if (verbose) + printk(KERN_INFO "Hit #%d on probepoint at %#lx\n", + nhits, u->vaddr); +} + +int __init init_module(void) +{ + int ret; + usp.pid = pid; + usp.vaddr = vaddr; + usp.handler = uprobe_handler; + printk(KERN_INFO "Registering uprobe on pid %d, vaddr %#lx\n", + usp.pid, usp.vaddr); + ret = register_uprobe(&usp); + if (ret != 0) { + printk(KERN_ERR "register_uprobe() failed, returned %d\n", ret); + return ret; + } + return 0; +} + +void __exit cleanup_module(void) +{ + printk(KERN_INFO "Unregistering uprobe on pid %d, vaddr %#lx\n", + usp.pid, usp.vaddr); + printk(KERN_INFO "Probepoint was hit %d times\n", nhits); + unregister_uprobe(&usp); +} +MODULE_LICENSE("GPL"); +----- cut here ----- + +You can build the kernel module, uprobe_example.ko, using the following +Makefile: +----- cut here ----- +obj-m := uprobe_example.o +KDIR := /lib/modules/$(shell uname -r)/build +PWD := $(shell pwd) +default: + $(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules +clean: + rm -f *.mod.c *.ko *.o .*.cmd + rm -rf .tmp_versions +----- cut here ----- + +For example, if you want to run myprog and monitor its calls to myfunc(), +you can do the following: + +$ make // Build the uprobe_example module. +... +$ nm -p myprog | awk '$3=="myfunc"' +080484a8 T myfunc +$ ./myprog & +$ ps + PID TTY TIME CMD + 4367 pts/3 00:00:00 bash + 8156 pts/3 00:00:00 myprog + 8157 pts/3 00:00:00 ps +$ su - +... 
+# insmod uprobe_example.ko pid=8156 vaddr=0x080484a8 + +In /var/log/messages and on the console, you will see a message of the +form "kernel: Hit #1 on probepoint at 0x80484a8" each time myfunc() +is called. To turn off probing, remove the module: + +# rmmod uprobe_example + +In /var/log/messages and on the console, you will see a message of the +form "Probepoint was hit 5 times". From srikar at linux.vnet.ibm.com Thu Jun 11 16:19:05 2009 From: srikar at linux.vnet.ibm.com (Srikar Dronamraju) Date: Thu, 11 Jun 2009 21:49:05 +0530 Subject: [RESEND] [PATCH 7/7] Ftrace plugin for Uprobes. In-Reply-To: <20090611160539.GA20668@linux.vnet.ibm.com> References: <20090611160539.GA20668@linux.vnet.ibm.com> Message-ID: <20090611161905.GG21218@linux.vnet.ibm.com> Ftrace Plugin for uprobes This patch implements ftrace plugin for uprobes. Description: Ftrace plugin provides an interface to dump data at a given address, top of the stack and function arguments when a user program calls a specific function. To dump the data at a given address issue echo up
D >>/sys/kernel/debug/tracing/uprobe_events To dump the data from the top of the stack, issue echo up
S >>/sys/kernel/debug/tracing/uprobe_events To dump the function arguments, issue echo up
A >>/sys/kernel/tracing/uprobes_events D => Dump the data at a given address. S => Dump the data from top of stack. A => Dump probed function arguments. Supported only for x86_64 arch. For example: Input: $ echo "up 6424 0x4004d8 S 100" > /sys/kernel/debug/tracing/uprobe_events $ echo "up 6424 0x4004d8 D 0x7fff6bf587d0 35" >> /sys/kernel/debug/tracing/uprobe_events $ echo "up 6424 0x4004d8 A 5" >> /sys/kernel/debug/tracing/uprobe_events $ cat /sys/kernel/debug/tracing/uprobe_events up 6424 0x4004d8 S 100 up 6424 0x4004d8 D 7fff6bf587d0 35 up 6424 0x4004d8 A 5 Output: $ cat trace # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | <...>-6424 [004] 1156.853343: : 0x4004d8: S 0x7fff6bf587a8: 31 06 40 00 00 00 00 00 1. at ..... <...>-6424 [004] 1156.853348: : 0x4004d8: S 0x7fff6bf587b0: 00 00 00 00 00 00 00 00 ........ <...>-6424 [004] 1156.853350: : 0x4004d8: S 0x7fff6bf587b8: c0 bb c1 4a 3b 00 00 00 ...J;... <...>-6424 [004] 1156.853352: : 0x4004d8: S 0x7fff6bf587c0: 50 06 40 00 c8 00 00 00 P. at ..... <...>-6424 [004] 1156.853353: : 0x4004d8: S 0x7fff6bf587c8: ed 00 00 ff 00 00 00 00 ........ <...>-6424 [004] 1156.853355: : 0x4004d8: S 0x7fff6bf587d0: 54 68 69 73 20 73 74 72 This str <...>-6424 [004] 1156.853357: : 0x4004d8: S 0x7fff6bf587d8: 69 6e 67 20 69 73 20 6f ing is o <...>-6424 [004] 1156.853359: : 0x4004d8: S 0x7fff6bf587e0: 6e 20 74 68 65 20 73 74 n the st <...>-6424 [004] 1156.853361: : 0x4004d8: S 0x7fff6bf587e8: 61 63 6b 20 69 6e 20 6d ack in m <...>-6424 [004] 1156.853363: : 0x4004d8: S 0x7fff6bf587f0: 61 69 6e 00 00 00 00 00 ain..... <...>-6424 [004] 1156.853364: : 0x4004d8: S 0x7fff6bf587f8: 00 00 00 00 04 00 00 00 ........ <...>-6424 [004] 1156.853366: : 0x4004d8: S 0x7fff6bf58800: ff ff ff ff ff ff ff ff ........ <...>-6424 [004] 1156.853367: : 0x4004d8: S 0x7fff6bf58808: 00 00 00 00 .... 
<...>-6424 [004] 1156.853388: : 0x4004d8: D 0x7fff6bf587d0: 54 68 69 73 20 73 74 72 This str <...>-6424 [004] 1156.853389: : 0x4004d8: D 0x7fff6bf587d8: 69 6e 67 20 69 73 20 6f ing is o <...>-6424 [004] 1156.853391: : 0x4004d8: D 0x7fff6bf587e0: 6e 20 74 68 65 20 73 74 n the st <...>-6424 [004] 1156.853393: : 0x4004d8: D 0x7fff6bf587e8: 61 63 6b 20 69 6e 20 6d ack in m <...>-6424 [004] 1156.853394: : 0x4004d8: D 0x7fff6bf587f0: 61 69 6e ain <...>-6424 [004] 1156.853398: : 0x4004d8: A ARG 1: 0000000000000004 <...>-6424 [004] 1156.853399: : 0x4004d8: A ARG 2: 00000000000000c8 <...>-6424 [004] 1156.853400: : 0x4004d8: A ARG 3: 00000000ff0000ed <...>-6424 [004] 1156.853401: : 0x4004d8: A ARG 4: ffffffffffffffff <...>-6424 [004] 1156.853402: : 0x4004d8: A ARG 5: 0000000000000048 TODO: - use ringbuffer - Allow user to specify Nick Name for probe addresses. - Dump arguments from floating point registers. - Optimize code to use single probe instead of multiple probes for same probe addresses. -- Signed-off-by: Mahesh Salgaonkar Signed-off-by: Srikar Dronamraju --- Documentation/trace/uprobes_trace.txt | 197 ++++++++++++ kernel/trace/Makefile | 1 kernel/trace/trace_uprobes.c | 537 ++++++++++++++++++++++++++++++++++ 3 files changed, 735 insertions(+) Index: uprobes.git/kernel/trace/Makefile =================================================================== --- uprobes.git.orig/kernel/trace/Makefile +++ uprobes.git/kernel/trace/Makefile @@ -46,5 +46,6 @@ obj-$(CONFIG_EVENT_TRACER) += trace_expo obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o obj-$(CONFIG_EVENT_PROFILE) += trace_event_profile.o obj-$(CONFIG_EVENT_TRACER) += trace_events_filter.o +obj-$(CONFIG_UPROBES) += trace_uprobes.o libftrace-y := ftrace.o Index: uprobes.git/kernel/trace/trace_uprobes.c =================================================================== --- /dev/null +++ uprobes.git/kernel/trace/trace_uprobes.c @@ -0,0 +1,537 @@ +/* + * Ftrace plugin for Userspace Probes (UProbes) + * 
kernel/trace/trace_uprobes.c + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2009 + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "trace.h" + +struct trace_uprobe { + struct list_head list; + struct uprobe usp; + unsigned long daddr; + size_t length; + +#ifdef __x86_64__ +#define TYPE_ARG 'A' +#endif +#define TYPE_DATA 'D' +#define TYPE_STACK 'S' + char type; +}; + +static DEFINE_MUTEX(trace_uprobe_lock); +static LIST_HEAD(tu_list); + +#define NUMVALUES 8 /* Number of data values to print per line*/ + +/* NUMVALUES*2 for hex values + NUMVALUES for spaces + 1 */ +#define HEXBUFSIZE ((NUMVALUES * 2) + NUMVALUES + 1) + +#define CHARBUFSIZE NUMVALUES /* NUMVALUES characters */ +#define BUFSIZE (HEXBUFSIZE + CHARBUFSIZE) + +/* + * uprobe handler to dump data values and the top of the + * stack frame through tracer. + * + * The output is pushed to tracer in following format: + * + * : : + * + * The is divided into two parts - the hex area and + * the char area. The hex area contains hex data values. + * The number of hex data values contained are controlled + * by NUMVALUES. The char area is the ascii representation + * of hex data values. 
+ * + * |<---------- BUFSIZE + 1------------>| + * + * +-----------------+---------------+--+ + * obuf | HEX Area | CHAR Area |\0| + * +-----------------+---------------+--+ + * ^ ^ ^ + * |<--HEXBUFSIZE -->|<-CHARBUFSIZE->| + * + + * + * 0x400498: S 0x7fffd934eba8: c8 00 00 00 ed 00 00 ff ........ + * 0x400498: S 0x7fffd934ebb0: 54 68 69 73 20 73 74 72 This str + */ + +static void uprobe_handler(struct uprobe *u, struct pt_regs *regs) +{ + struct trace_uprobe *tu; + char *buf; + unsigned long ip = instruction_pointer(regs), daddr; + int len; + char obuf[BUFSIZE + 1]; + + tu = container_of(u, struct trace_uprobe, usp); + buf = kzalloc(tu->length + 1, GFP_KERNEL); + if (!buf) + return; + + if (tu->type == TYPE_STACK) { + /* Get Stack Pointer. Dump stack memory */ + daddr = (unsigned long)user_stack_pointer(regs); + } else + daddr = tu->daddr; + + len = tu->length; + if (!copy_from_user(buf, (void *)daddr, tu->length)) { + int pos = 0; + + for (pos = 0; pos < len; pos += NUMVALUES) { + char *hp = obuf; /* Hex area buf pointer */ + char *cp = hp + HEXBUFSIZE; /* char area buf pointer */ + int i = 0, last; + + memset(obuf, ' ', BUFSIZE); + obuf[BUFSIZE] = '\0'; + + last = pos + (NUMVALUES - 1); + if (last >= len) + last = len - 1; + + for (i = pos; i <= last; i++) { + sprintf(hp, "%02x", (unsigned char)buf[i]); + + /* + * Character representation.. + * ignore non-printable chars + */ + if ((buf[i] >= ' ') && (buf[i] <= '~')) + *cp = buf[i]; + else + *cp = '.'; + + hp += 2; + *hp++ = ' '; + cp++; + } + + __trace_bprintk(ip, "0x%lx: %c 0x%lx: %s\n", + tu->usp.vaddr, tu->type, + (daddr + pos), obuf); + } + } else { + __trace_bprintk(ip, "0x%lx: %c 0x%lx: " + "Data capture failed. Invalid address\n", + tu->usp.vaddr, tu->type, daddr); + } + kfree(buf); +} + +#ifdef __x86_64__ + +/* + * uprobe handler to dump function arguments through tracer. + * Currently, supported for x86_64 architecture. 
+ * Argument extraction as per x86_64 ABI (Application Binary + * Interface) document Version 0.99. + * + * The output is pushed to tracer in following format: + * + * : A ARG #: + * + * e.g. + * 0x400498: A ARG 1: 0000000000000004 + * 0x400498: A ARG 2: 00000000000000c8 + */ +static void uprobe_handler_args(struct uprobe *u, struct pt_regs *regs) +{ + struct trace_uprobe *tu; + unsigned long ip = instruction_pointer(regs); + unsigned long args[6]; + int i; + + tu = container_of(u, struct trace_uprobe, usp); + + /* Function arguments */ + args[0] = regs->di; + args[1] = regs->si; + args[2] = regs->dx; + args[3] = regs->cx; + args[4] = regs->r8; + args[5] = regs->r9; + + for (i = 0; i < tu->length; i++) { + __trace_bprintk(ip, "0x%lx: %c ARG %d: %016lx\n", + u->vaddr, tu->type, i + 1, args[i]); + } +} +#endif + +/* + * Updates the size/numargs of existing probe event if found. + */ +static struct trace_uprobe *update_trace_probe(pid_t pid, + unsigned long taddr, unsigned long daddr, size_t length, + char type) +{ + struct trace_uprobe *tu, *tmp; + + mutex_lock(&trace_uprobe_lock); + list_for_each_entry_safe(tu, tmp, &tu_list, list) { + if ((tu->usp.pid == pid) && (tu->usp.vaddr == taddr) + && (tu->type == type) && (tu->daddr == daddr)) { + tu->length = length; + mutex_unlock(&trace_uprobe_lock); + return tu; + } + } + mutex_unlock(&trace_uprobe_lock); + return NULL; +} + +/* + * Creates a new probe event entry and sets the user probe by calling + * register_uprobe() + */ +static int trace_register_uprobe(pid_t pid, unsigned long taddr, + unsigned long daddr, size_t length, char type) +{ + struct trace_uprobe *tu; + int ret = 0; + + /* Check for duplication. If probe for same data address + * already exists then just update the length. + */ + tu = update_trace_probe(pid, taddr, daddr, length, type); + if (tu) + return 0; + + /* This is a new probe. 
*/ + tu = kzalloc(sizeof(struct trace_uprobe), GFP_KERNEL); + if (!tu) + return -ENOMEM; + + INIT_LIST_HEAD(&tu->list); + tu->length = length; + tu->daddr = daddr; + tu->type = type; + tu->usp.pid = pid; + tu->usp.vaddr = taddr; +#ifdef __x86_64__ + tu->usp.handler = (tu->type == TYPE_ARG) ? + uprobe_handler_args : uprobe_handler; +#else + tu->usp.handler = uprobe_handler; +#endif + ret = register_uprobe(&tu->usp); + + if (ret) { + pr_err("register_uprobe(pid=%d vaddr=%lx) = ret(%d) failed\n", + pid, taddr, ret); + kfree(tu); + return ret; + } + mutex_lock(&trace_uprobe_lock); + list_add_tail(&tu->list, &tu_list); + mutex_unlock(&trace_uprobe_lock); + return 0; +} + +static void uprobes_clear_all_events(void) +{ + struct trace_uprobe *tu, *tmp; + + mutex_lock(&trace_uprobe_lock); + list_for_each_entry_safe(tu, tmp, &tu_list, list) { + unregister_uprobe(&tu->usp); + list_del(&tu->list); + kfree(tu); + } + mutex_unlock(&trace_uprobe_lock); +} + +/* User probes listing interfaces */ +static void *uprobes_seq_start(struct seq_file *m, loff_t *pos) +{ + mutex_lock(&trace_uprobe_lock); + return seq_list_start(&tu_list, *pos); +} + +static void *uprobes_seq_next(struct seq_file *m, void *v, loff_t *pos) +{ + return seq_list_next(v, &tu_list, pos); +} + +static void uprobes_seq_stop(struct seq_file *m, void *v) +{ + mutex_unlock(&trace_uprobe_lock); +} + +static int uprobes_seq_show(struct seq_file *m, void *v) +{ + struct trace_uprobe *tu = v; + + if (tu == NULL) + return 0; + + if (tu->type == TYPE_DATA) + seq_printf(m, "%-3s%d 0x%lx D 0x%lx %zu\n", + "up", tu->usp.pid, tu->usp.vaddr, tu->daddr, tu->length); + else + seq_printf(m, "%-3s%d 0x%lx %c %zu\n", + "up", tu->usp.pid, tu->usp.vaddr, tu->type, tu->length); + + return 0; +} + +static const struct seq_operations uprobes_seq_ops = { + .start = uprobes_seq_start, + .next = uprobes_seq_next, + .stop = uprobes_seq_stop, + .show = uprobes_seq_show +}; + +static int uprobe_events_open(struct inode *inode, struct file 
*file) +{ + if ((file->f_mode & FMODE_WRITE) && + !(file->f_flags & O_APPEND)) + uprobes_clear_all_events(); + + return seq_open(file, &uprobes_seq_ops); +} + +#ifdef __x86_64__ +static int process_check_64bit(pid_t p) +{ + struct pid *pid = NULL; + struct task_struct *tsk; + int ret = -ESRCH; + + rcu_read_lock(); + if (current->nsproxy) + pid = find_vpid(p); + + if (pid) { + tsk = pid_task(pid, PIDTYPE_PID); + + if (tsk) { + if (test_tsk_thread_flag(tsk, TIF_IA32)) { + pr_err("Option to dump arguments is" + "not supported for 32bit process\n"); + ret = -EPERM; + } else + ret = 0; + } + } + rcu_read_unlock(); + return ret; +} +#endif + +/* + * Input syntax: + * up [] + */ + +static int enable_uprobe_trace(int argc, char **argv) +{ + unsigned long taddr, daddr = 0, tmpval; + size_t dsize; + pid_t pid; + int ret = -EINVAL; + char type; + + if ((argc < 5) || (argc > 6)) + return -EINVAL; + + if (strcmp(argv[0], "up")) + return -EINVAL; + + /* get the pid */ + ret = strict_strtoul(argv[1], 10, &tmpval); + if (ret) + return ret; + + pid = (pid_t) tmpval; + + /* get the address to probe */ + ret = strict_strtoul(argv[2], 16, &taddr); + if (ret) + return ret; + + /* See if user asked for Stack or Data address. */ + if ((strlen(argv[3]) != 1) || (!isalpha(*argv[3]))) + return -EINVAL; + + switch (*argv[3]) { +#ifdef __x86_64__ + /* + * dumping of arguments supported only for x86_64 arch + */ + case 'A': + case 'a': + type = TYPE_ARG; + if (argc > 5) + return -EINVAL; + /* Option 'A' is not supported for 32 bit process. 
*/ + ret = process_check_64bit(pid); + if (ret) + return ret; + + daddr = 0; + break; +#endif + case 'D': + case 'd': + type = TYPE_DATA; + if (argc < 6) + return -EINVAL; + /* get the data address */ + ret = strict_strtoul(argv[4], 16, &daddr); + if (ret) + return ret; + break; + case 'S': + case 's': + type = TYPE_STACK; + if (argc > 5) + return -EINVAL; + daddr = 0; + break; + default: + return -EINVAL; + } + + /* + * In case of TYPE_DATA and TYPE_STACK: get the size of data to dump. + * In case of TYPE_ARG: this is the number of arguments to dump + */ + ret = strict_strtoul(((type == TYPE_DATA) ? + argv[5] : argv[4]), 10, &tmpval); + if (ret) + return ret; + + dsize = (size_t) tmpval; + +#ifdef __x86_64__ + /* Only upto 6 args supported */ + if ((type == TYPE_ARG) && (dsize > 6)) { + pr_err("Can not dump more than 6 arguments\n"); + return -EINVAL; + } +#endif + + ret = trace_register_uprobe(pid, taddr, daddr, dsize, type); + return ret; +} + +/* + * Process commands written to /sys/kernel/debug/tracing/uprobe_events. + * Supports multiple lines. It reads the entire ubuf into local buffer + * and then breaks the input into lines. Invokes enable_uprobe_trace() + * for each line after splitting them into args array. 
+ */ + +static ssize_t +uprobe_events_write(struct file *file, const char __user *ubuf, + size_t count, loff_t *ppos) +{ + char *kbuf, *start, *end = NULL, *tmp; + char **argv = NULL; + int argc = 0; + int ret = 0; + size_t done = 0; + size_t size; + + if (!count) + return 0; + + kbuf = kmalloc(count + 1, GFP_KERNEL); + if (!kbuf) + return -ENOMEM; + + if (copy_from_user(kbuf, ubuf, count)) { + ret = -EFAULT; + goto err_out; + } + + kbuf[count] = '\0'; + for (start = kbuf; done < count; start = end + 1) { + end = strchr(start, '\n'); + if (!end) { + pr_err("Line length is too long"); + ret = -EINVAL; + goto err_out; + } + *end = '\0'; + size = end - start + 1; + done += size; + /* Remove comments */ + tmp = strchr(start, '#'); + if (tmp) + *tmp = '\0'; + + argv = argv_split(GFP_KERNEL, start, &argc); + if (!argv) { + ret = -ENOMEM; + goto err_out; + } + + if (argc) + ret = enable_uprobe_trace(argc, argv); + + argv_free(argv); + if (ret < 0) + goto err_out; + } + ret = done; +err_out: + kfree(kbuf); + return ret; +} + +static const struct file_operations uprobes_events_ops = { + .open = uprobe_events_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, + .write = uprobe_events_write, +}; + +static __init int init_uprobe_trace(void) +{ + struct dentry *d_tracer; + struct dentry *entry; + + d_tracer = tracing_init_dentry(); + + entry = debugfs_create_file("uprobe_events", 0644, d_tracer, + NULL, &uprobes_events_ops); + + if (!entry) + pr_warning("Could not create debugfs 'uprobe_events' entry\n"); + + return 0; +} +fs_initcall(init_uprobe_trace); Index: uprobes.git/Documentation/trace/uprobes_trace.txt =================================================================== --- /dev/null +++ uprobes.git/Documentation/trace/uprobes_trace.txt @@ -0,0 +1,197 @@ + Uprobes based Event Tracer + ========================== + + Mahesh J Salgaonkar + +Overview +-------- +This tracer, based on uprobes, enables a user to put a probe anywhere in the +user process 
and dump values from user specified data address or from the top +of the stack frame when the probe is hit. + +For 64-bit processes on x86_64, the tracer can also report function arguments +when the probe is hit. Currently, this feature is not supported for 32-bit +processes. + +To activate this tracer just set a probe via +/sys/kernel/debug/tracing/uprobe_events and traced information can be seen via +/sys/kernel/debug/tracing/trace. + +User can specify probes for multiple processes concurrently. + +Synopsis +-------- +up [] {|} + +up : set a user probe + : Process ID. + : Instruction address to probe in user process. + : Type of data to dump. + D => Dump the data from specified data address + S => Dump the data from top of the stack + A => Dump the function arguments (x86_64 only). +[] : Data address. Applicable only for type 'D' + : Number of bytes of data to dump. + : Number of arguments to dump. + +To dump the data at a given address when probe is hit, run: +echo up
D >>/sys/kernel/debug/tracing/uprobe_events + +To dump the data from the top of the stack when the probe is hit, run: +echo up
S >>/sys/kernel/debug/tracing/uprobe_events + +To extract the function arguments when the probe is hit, run: +echo up
A >>/sys/kernel/debug/tracing/uprobe_events + +Usage Examples +-------------- +Let us consider the following sample C program: + +/* SAMPLE C PROGRAM */ +#include +#include + +char *global_str_p = "Global String pointer"; +char global_str[] = "Global String"; + +int foo(int a, unsigned int b, unsigned long c, long d, char e) +{ + return 0; +} + +int main() +{ + char str[] = "This string is on the stack in main"; + int a = 4; + unsigned int b = 200; + unsigned long c = 0xff0000ed; + long d = -1; + char e = 'H'; + + while (getchar() != EOF) + foo(a, b, c, d, e); + + return 0; +} +/* SAMPLE C PROGRAM */ + +This example puts a probe at function foo() and dumps some data values, the +top of the stack and all five arguments passed to function foo(). + +The probe address for function foo can be acquired using the 'nm' utility on +the executable file as below: + + $ gcc sample.c -o sample + $ nm sample | grep foo + 0000000000400498 T foo + +We will also dump the data from the global variables 'global_str_p' and +'global_str'. The DATA addresses for these variables can be acquired as below: + + $ nm sample | grep global + 0000000000600960 D global_str + 0000000000600958 D global_str_p + +When setting the probe, you need to specify the process id of the user process +to trace. The process id can be determined by using the 'ps' command. + + $ ps -a | grep sample + 3906 pts/6 00:00:00 sample + +Now set a probe at function foo() as a new event that dumps 100 bytes from the +stack as shown below: + +$ echo "up 3906 0x0000000000400498 S 100" > /sys/kernel/debug/tracing/uprobe_events + +Set additional probes at function foo() to dump the data from the global +variables as shown below: + +$ echo "up 3906 0x0000000000400498 D 0000000000600960 15" >> /sys/kernel/debug/tracing/uprobe_events +$ echo "up 3906 0x0000000000400498 D 0000000000600958 8" >> /sys/kernel/debug/tracing/uprobe_events + +Set another probe at function foo() to dump all five arguments passed to +function foo().
(This option is only valid for x86_64 architecture.) + +$ echo "up 3906 0x0000000000400498 A 5" >> /sys/kernel/tracing/uprobes_events + +To see all the current uprobe events: + +$ cat /sys/kernel/debug/tracing/uprobe_events +up 3906 0x400498 S 100 +up 3906 0x400498 D 0x600960 15 +up 3906 0x400498 D 0x600958 8 +up 3906 0x400498 A 5 + +When the function foo() gets called all the above probes will hit and you can +see the traced information via /sys/kernel/debug/tracing/trace + +$ cat /sys/kernel/debug/tracing/trace +# tracer: nop +# +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + <...>-3906 [001] 391.531431: : 0x400498: S 0x7fffd934eba8: 38 05 40 00 00 00 00 00 8. at ..... + <...>-3906 [001] 391.531436: : 0x400498: S 0x7fffd934ebb0: 54 68 69 73 20 73 74 72 This str + <...>-3906 [001] 391.531438: : 0x400498: S 0x7fffd934ebb8: 69 6e 67 20 69 73 20 6f ing is o + <...>-3906 [001] 391.531439: : 0x400498: S 0x7fffd934ebc0: 6e 20 74 68 65 20 73 74 n the st + <...>-3906 [001] 391.531441: : 0x400498: S 0x7fffd934ebc8: 61 63 6b 20 69 6e 20 6d ack in m + <...>-3906 [001] 391.531443: : 0x400498: S 0x7fffd934ebd0: 61 69 6e 00 00 00 00 01 ain..... + <...>-3906 [001] 391.531445: : 0x400498: S 0x7fffd934ebd8: c0 bb c1 4a 3b 00 00 00 ...J;... + <...>-3906 [001] 391.531446: : 0x400498: S 0x7fffd934ebe0: 04 00 00 00 c8 00 00 00 ........ + <...>-3906 [001] 391.531448: : 0x400498: S 0x7fffd934ebe8: ed 00 00 ff 00 00 00 00 ........ + <...>-3906 [001] 391.531450: : 0x400498: S 0x7fffd934ebf0: ff ff ff ff ff ff ff ff ........ + <...>-3906 [001] 391.531452: : 0x400498: S 0x7fffd934ebf8: 00 00 00 00 00 00 00 48 .......H + <...>-3906 [001] 391.531453: : 0x400498: S 0x7fffd934ec00: 00 00 00 00 00 00 00 00 ........ + <...>-3906 [001] 391.531455: : 0x400498: S 0x7fffd934ec08: 74 d9 e1 4a t..J + <...>-3906 [001] 391.531489: : 0x400498: D 0x600960: 47 6c 6f 62 61 6c 20 53 Global S + <...>-3906 [001] 391.531491: : 0x400498: D 0x600968: 74 72 69 6e 67 00 00 tring.. 
+ <...>-3906 [001] 391.531500: : 0x400498: D 0x600958: 48 06 40 00 00 00 00 00 H.@..... + <...>-3906 [001] 391.531504: : 0x400498: A ARG 1: 0000000000000004 + <...>-3906 [001] 391.531505: : 0x400498: A ARG 2: 00000000000000c8 + <...>-3906 [001] 391.531505: : 0x400498: A ARG 3: 00000000ff0000ed + <...>-3906 [001] 391.531506: : 0x400498: A ARG 4: ffffffffffffffff + <...>-3906 [001] 391.531507: : 0x400498: A ARG 5: 0000000000000048 + +Under the FUNCTION column, each line shows the probe address, type, data/stack +address, and 8 bytes of data in hex followed by the ascii representation of the +hex values. If the size specified is more than 8 bytes then multiple lines +will be used to dump data values. In case of type A one argument is shown per +line. + +The lines with type 'S' from tracer output display 100 bytes (8 bytes per +line) from the top of the stack when the probed function foo() is hit. The lines +with type 'A' dump all the five arguments passed to the function foo(). The +first two lines with type 'D' dump 15 bytes of data from the global variable +'global_str' at data address 0x600960. The 3rd line with type 'D' dumps 8 bytes +of data from the global string pointer variable 'global_str_p' at 0x600958. +The output shows that it holds the address 0x0000000000400648. As per the +sample program this should point to a const string of 21 characters. Let's +dump the data values at this address. + +echo "up 3906 0x0000000000400498 D 0x0000000000400648 24" > /sys/kernel/debug/tracing/uprobe_events + +Please note that we have not used '>>' operator here; as a result, all +existing probes will be cleared before this new probe is set. + +Take a look at the tracer output.
+ +$ cat /sys/kernel/debug/tracing/trace +# tracer: nop +# +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + <...>-3906 [001] 442.537669: : 0x400498: D 0x400648: 47 6c 6f 62 61 6c 20 53 Global S + <...>-3906 [001] 442.537674: : 0x400498: D 0x400650: 74 72 69 6e 67 20 70 6f tring po + <...>-3906 [001] 442.537676: : 0x400498: D 0x400658: 69 6e 74 65 72 00 00 00 inter... + + +To clear all the probe events, run: + +echo > /sys/kernel/tracing/uprobes_events + +TODO: +- Allow user to attach a name to probe addresses for address translation. +- Support reporting of arguments from 32-bit applications. +- Dump arguments from floating point registers. +- Optimize code to use single probe instead of multiple probes for same probe + addresses. From fche at redhat.com Thu Jun 11 20:03:52 2009 From: fche at redhat.com (Frank Ch. Eigler) Date: Thu, 11 Jun 2009 16:03:52 -0400 Subject: kernel crash from regset get() size mismatches Message-ID: <20090611200352.GB2024@redhat.com> Hi - I encountered a situation where the regset->get function was called with a size that didn't match the native register size. This crashed the kernel. Details: 2.6.30-rc1, x86-64, NT_PRSTATUS regset, pos=80, count=4 (instead of 8). I wonder if the problem was in this bit of code in arch/x86/ptrace.c: 411 static int genregs_get(struct task_struct *target, 412 const struct user_regset *regset, 413 unsigned int pos, unsigned int count, 414 void *kbuf, void __user *ubuf) 415 { 416 if (kbuf) { 417 unsigned long *k = kbuf; 418 while (count > 0) { 419 *k++ = getreg(target, pos); 420 count -= sizeof(*k); 421 pos += sizeof(*k); 422 } where the initial count was < sizeof(unsigned long), so count wrapped around to something very large and the loop kept going. To what extent is the regset stuff supposed to tolerate such mismatched data? 
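The wraparound described above can be reproduced in user space. What follows is only a minimal sketch, not kernel code: loop_iterations() and count_is_valid() are hypothetical names made up for illustration, and the iteration cap exists purely so that the model terminates. It assumes nothing beyond the loop quoted above, where an unsigned byte count is decremented by sizeof(unsigned long) on every pass.

```c
#include <stddef.h>

/*
 * User-space model of the copy loop quoted above (hypothetical helper,
 * not a kernel function). Returns how many register-sized slots the
 * loop would try to write for a given byte count, capped at max_iters
 * so that a wrapped counter cannot run away.
 */
static unsigned long loop_iterations(unsigned int count,
				     unsigned long max_iters)
{
	unsigned long iters = 0;

	while (count > 0 && iters < max_iters) {
		iters++;
		/* Unsigned subtraction: a count smaller than the slot
		 * size wraps to a huge value instead of going negative. */
		count -= sizeof(unsigned long);
	}
	return iters;
}

/*
 * Hypothetical caller-side guard: accept only byte counts that are a
 * whole number of register-sized slots.
 */
static int count_is_valid(unsigned int count)
{
	return count % sizeof(unsigned long) == 0;
}
```

With a count of exactly one slot the loop runs once; with half a slot (count=4 on x86-64) it only stops when it reaches the cap, which is the runaway behaviour reported here. A check along the lines of count_is_valid() in the caller would reject such a request before the arch hook is reached.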
- FChE

From roland at redhat.com Thu Jun 11 22:00:02 2009
From: roland at redhat.com (Roland McGrath)
Date: Thu, 11 Jun 2009 15:00:02 -0700 (PDT)
Subject: kernel crash from regset get() size mismatches
In-Reply-To: Frank Ch. Eigler's message of Thursday, 11 June 2009 16:03:52 -0400 <20090611200352.GB2024@redhat.com>
References: <20090611200352.GB2024@redhat.com>
Message-ID: <20090611220002.500C2FC3D3@magilla.sf.frob.com>

> To what extent is the regset stuff supposed to tolerate such
> mismatched data?

It ain't. We don't burden the arch code with the overheads and the
exacting robustness demands of checking for bogus parameters. (This is
the clear right choice for the arch layer, but that is separate from the
issue of what (thin) safety/convenience layers one might want above
that.)

The information required to check that a given pos/count fits the
alignment and size requirements of the arch code is in the struct
user_regset fields. The expectation is that any kernel code driven from
userland, where these parameter values could be unreliable, would
validate the parameters against these constraints.

My expectation is that kernel code calling user_regset hooks can fall
into these categories wrt its pos/count parameters:

* fixed at compile-time: expected to be correct as compiled, just like
  any other buffer overrun, fatal misalignment, etc.
* dynamic but driven only by the user_regset fields: e.g., just uses
  pos=0, count=regset->size*regset->n, so always valid a priori.
  (This is what core dumps do.)
* "primed" once from userland/wherever, and used repeatedly, e.g., a
  fancy rule-driven thing might have a setup phase where it specifies
  "fetch this part of the regset when that happens": should validate
  parameters once in the setup phase, and then have no extra checking
  overheads when the rule triggers
* fully dynamic, userland/wherever gives requests with pos/count values:
  robustness requires validation of each such request

It's not quite clear that the latter category will ever really exist;
nothing in that category exists as yet. It may very well be that
anything giving open-ended facilities to userland will only do a "here's
the whole regset" interface (and so be like the existing core dump
case).

I didn't provide anything preemptively for the latter two kinds of uses,
but decided to wait and see what nontrivial uses would arise in
practice.

Below is a simple helper that we could add to make explicit checks easy.
I didn't include something like this in the original upstream
submission since it would have had no callers in the kernel. IMHO it
still remains to be seen whether it is actually worthwhile to add e.g.
get/set wrappers that do this check before every call to the arch hooks.

Thanks,
Roland

diff --git a/include/linux/regset.h b/include/linux/regset.h
index 8abee65..fbdf8f9 100644
--- a/include/linux/regset.h
+++ b/include/linux/regset.h
@@ -203,6 +203,37 @@ struct user_regset_view {
  */
 const struct user_regset_view *task_user_regset_view(struct task_struct *tsk);
 
+/**
+ * user_regset_validate_range - check offsets against a &struct user_regset
+ * @regset:	regset being examined
+ * @pos:	offset into the regset data to access, in bytes
+ * @count:	amount of data to copy, in bytes
+ *
+ * Return -%EINVAL if @pos and @count are not valid offsets to pass
+ * in calls to the @regset->get() and @regset->set() functions, or zero
+ * if they are valid.
+ *
+ * The arch functions in &struct user_regset pointers are not expected to
+ * handle bogus @pos or @count arguments gracefully. Instead their callers
+ * are required to pass a range that complies with the constraints given in
+ * @regset->size, @regset->n, and @regset->align. This simple helper
+ * function checks putative @pos and @count parameters for validity.
+ *
+ * For efficiency, call this only once when considering a new @pos and
+ * @count from an unchecked source. Then you can use those same values
+ * many times, with no check at each @regset->get() or @regset->set() call.
+ */
+static inline int user_regset_validate_range(const struct user_regset *regset,
+					     unsigned int pos,
+					     unsigned int count)
+{
+	if (unlikely(pos > regset->size * regset->n ||
+		     (regset->size * regset->n) - pos < count ||
+		     (pos % regset->align) != 0 ||
+		     (count % regset->size) != 0))
+		return -EINVAL;
+	return 0;
+}
 
 /*
  * These are helpers for writing regset get/set functions in arch code.
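The range check in the patch above can be tried outside the kernel. What follows is only a minimal userspace sketch under stated assumptions, not kernel code: `struct mock_regset`, `mock_validate_range()`, `struct fetch_rule`, and `fetch_rule_setup()` are hypothetical names, the mock struct carries only the three fields the posted helper reads, the kernel-only `unlikely()` macro is dropped, and `size` and `align` are assumed to be nonzero. The second function illustrates the "primed once" category from the message: validate during setup, then reuse the stored values with no further checks.

```c
#include <errno.h>

/* Hypothetical stand-in for the fields of struct user_regset used by the check. */
struct mock_regset {
	unsigned int n;		/* number of slots in the regset */
	unsigned int size;	/* size of each slot, in bytes (assumed nonzero) */
	unsigned int align;	/* required alignment, in bytes (assumed nonzero) */
};

/* Same four conditions as the posted helper, with the total size factored out. */
static int mock_validate_range(const struct mock_regset *regset,
			       unsigned int pos, unsigned int count)
{
	unsigned int total = regset->size * regset->n;

	if (pos > total ||			/* start is past the end */
	    total - pos < count ||		/* range runs past the end */
	    (pos % regset->align) != 0 ||	/* start is misaligned */
	    (count % regset->size) != 0)	/* not a whole number of slots */
		return -EINVAL;
	return 0;
}

/* Hypothetical "primed" rule: a range that was validated once at setup time. */
struct fetch_rule {
	unsigned int pos;
	unsigned int count;
};

/* Validate once; on failure the rule is left untouched. */
static int fetch_rule_setup(struct fetch_rule *rule,
			    const struct mock_regset *regset,
			    unsigned int pos, unsigned int count)
{
	int ret = mock_validate_range(regset, pos, count);

	if (ret)
		return ret;
	rule->pos = pos;
	rule->count = count;
	return 0;
}
```

With a made-up regset of 16 slots of 8 bytes each, the whole-regset request (pos=0, count=128) passes, while a misaligned pos=4 or a partial-slot count=12 is rejected with -EINVAL.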
From fche at redhat.com Fri Jun 19 16:59:00 2009
From: fche at redhat.com (Frank Ch. Eigler)
Date: Fri, 19 Jun 2009 12:59:00 -0400
Subject: prototyping linux kernel-side gdbstub for userspace
Message-ID: <20090619165900.GI19576@redhat.com>

Hi -

I'm slowly assembling a prototype gdb stub for debugging user-space
programs, based on utrace, for possible eventual inclusion in the
linux kernel. For the moment, it is a toy alternative to ptrace() for
targeting existing processes, and it's not done even for that. But
before too long though it should be able to use uprobes (q.v.) as a
kernel-side breakpointing facility using the Z packets, and maybe even
support agent expressions, multithreaded processes, and multiprocess
debugging.

Only very basic parts work right now:

% gdb foobar
(gdb) target remote /proc//gdb
(due to suspected utrace bug, must currently manually interrupt the to get its attention)
#0 ... (backtrace works) ... #1 ...
(gdb) p expression            # works
(gdb) set expression=value    # appears to work
(gdb) info regs               # appears to work - x86 and x86-64 only
(gdb) continue                # appears to work

But several things don't:

* signal injection
* other architectures
* proper fork/exec handling

Logistically, this is a patch added onto Roland McGrath's utrace git
tree over at git.kernel.org.

git-clone http://elastic.org/~fche/git/linux-2.6-utrace/

and build enabling CONFIG_UTRACE and CONFIG_UTRACE_GDB.

I'm planning to continue working on it until the "several things
don't" list above is done. I'd appreciate any help such as advising,
porting, or code such as on the future extensions listed way above.

- FChE
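For readers less familiar with the remote protocol that the stub speaks: a GDB remote serial protocol packet is framed as `$`, the payload, `#`, and a two-digit hex checksum equal to the sum of the payload bytes modulo 256. The sketch below shows only that framing. `rsp_frame()` is a hypothetical helper written for illustration, not code from the stub, and it assumes the payload contains no bytes that need escaping.

```c
#include <stdio.h>
#include <string.h>

/*
 * Frame a payload as "$<payload>#<checksum>", where the checksum is the
 * modulo-256 sum of the payload bytes printed as two hex digits.
 * Returns the packet length, or -1 if the output buffer is too small.
 * Assumption: the payload needs no escaping.
 */
static int rsp_frame(const char *payload, char *out, size_t outlen)
{
	unsigned char sum = 0;	/* wraps modulo 256 on overflow */
	size_t len = strlen(payload);
	size_t i;

	/* '$' + payload + '#' + two hex digits + terminating NUL */
	if (outlen < len + 5)
		return -1;

	for (i = 0; i < len; i++)
		sum += (unsigned char)payload[i];

	return snprintf(out, outlen, "$%s#%02x", payload, (unsigned int)sum);
}
```

Framing the payload `OK` this way yields `$OK#9a`. A breakpoint request of the Z-packet kind mentioned above would carry a payload of the form `Z0,<addr>,<kind>` inside the same frame.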
From drow at false.org Sun Jun 21 15:16:06 2009
From: drow at false.org (Daniel Jacobowitz)
Date: Sun, 21 Jun 2009 11:16:06 -0400
Subject: prototyping linux kernel-side gdbstub for userspace
In-Reply-To: <20090620023204.GK19576@redhat.com>
References: <20090620023204.GK19576@redhat.com>
Message-ID: <20090621151606.GA10157@caradoc.them.org>

On Fri, Jun 19, 2009 at 10:32:04PM -0400, Frank Ch. Eigler wrote:
> Hi -
>
> I'm slowly assembling a prototype gdb stub for debugging user-space
> programs, based on utrace, for possible eventual inclusion in the
> linux kernel. For the moment, it is a toy alternative to ptrace() for
> targeting existing processes, and it's not done even for that. But
> before too long though it should be able to use uprobes (q.v.) as a
> kernel-side breakpointing facility using the Z packets, and maybe even
> support agent expressions, multithreaded processes, and multiprocess
> debugging.

Hi Frank,

For those of us with less context, why do it this way instead of
exposing those features for a userspace gdbserver? I can't say I'm
thrilled to have this in kernel space.

-- 
Daniel Jacobowitz
CodeSourcery

From fche at redhat.com Mon Jun 22 13:49:53 2009
From: fche at redhat.com (Frank Ch. Eigler)
Date: Mon, 22 Jun 2009 09:49:53 -0400
Subject: prototyping linux kernel-side gdbstub for userspace
In-Reply-To: <20090621151606.GA10157__45629.7419652857$1245597392$gmane$org@caradoc.them.org> (Daniel Jacobowitz's message of "Sun, 21 Jun 2009 11:16:06 -0400")
References: <20090620023204.GK19576@redhat.com> <20090621151606.GA10157__45629.7419652857$1245597392$gmane$org@caradoc.them.org>
Message-ID:

drow wrote:

>> I'm slowly assembling a prototype gdb stub for debugging user-space
>> programs, based on utrace, for possible eventual inclusion in the
>> linux kernel. [...] But before too long though it should be able
>> to use uprobes (q.v.) as a kernel-side breakpointing facility using
>> the Z packets, and maybe even support agent expressions,
>> multithreaded processes, and multiprocess debugging.

> For those of us with less context, why do it this way instead of
> exposing those features for a userspace gdbserver?

One reason is that these would require extending the kernel ABI,
whether ptrace(2) or something new, and as well extending the
userspace clients. Doing it across the plain remote protocol means
basically no changes on either end. This is not to say that the above
can't happen at some point.

> I can't say I'm thrilled to have this in kernel space.

(Of course there is no reason you have to build / run it.)

- FChE
By your host Joy Dabor: Date: Monday June 22, 2009 Time: 6:00 pm - 7:00 pm (GMT +00:00) Street: Hi dear, how are you today isaw you email at(www.eslteachersboard.com)hope that every things is ok with you as it is my great pleassure to contact you in having communication with you, please i wish you will have the desire with me so that we can get to know each other better and see what happened in future. i will be very happy if you can write me through my email for easiest communication and to know all about each other, and also give you my pictures and details about me, here is my email (joyce_dabor at yahoo.com) i will be waiting to hear from you as i wish you all the best for your day. your new friend. joy Guests: * rhel-support-list at redhat.com * ner at redhat.com * rhelv5-beta-list at redhat.com * rhh-discussion-owner at redhat.com * rhh-discussion at redhat.com * redhat-sysadmin-list-owner at redhat.com * redhat-sysadmin-list at redhat.com * rhelv5-announce-owner at redhat.com * rhelv5-announce at redhat.com * rhemrg-users-list-owner at redhat.com * rhemrg-users-list at redhat.com * rhh-advisors-owner at redhat.com * rhh-advisors at redhat.com * rhelv5-list-owner at redhat.com * rhelv5-list at redhat.com * rhelv4-announce-owner at redhat.com * rhelv4-announce at redhat.com * rhm-users-owner at redhat.com * rhm-users at redhat.com * renewal-jp-mc-owner at redhat.com * renewal-jp-mc at redhat.com * rhn-satellite-beta-users-owner at redhat.com * rhn-satellite-beta-users at redhat.com * rh-barcap-list-owner at redhat.com * rh-barcap-list at redhat.com * rhel-hpc-list-owner at redhat.com * rhel-hpc-list at redhat.com * shrike-list-owner at redhat.com * shrike-list at redhat.com * roswell-list-owner at redhat.com * roswell-list at redhat.com * owner at redhat.com * rhsa-announce at redhat.com * sales-eastern-europe-owner at redhat.com * sales-eastern-europe at redhat.com * rhn-satellite-users-owner at redhat.com * rhn-satellite-users at redhat.com * 
spacewalk-announce-list-owner at redhat.com * spacewalk-announce-list at redhat.com * rhcc-outage-list-owner at redhat.com * rhcc-outage-list at redhat.com * spacewalk-devel-owner at redhat.com * spacewalk-devel at redhat.com * rhn-outage-list-owner at redhat.com * rhn-outage-list at redhat.com * sound-list-owner at redhat.com * sound-list at redhat.com * sectool-list-owner at redhat.com * sectool-list at redhat.com * rhel-ctc-tech-owner at redhat.com * rhel-ctc-tech at redhat.com * spacewalk-list-owner at redhat.com * spacewalk-list at redhat.com * zoot-list-owner at redhat.com * zoot-list at redhat.com * valhalla-list-owner at redhat.com * valhalla-list at redhat.com * xquery-cpp-api-list-owner at redhat.com * xquery-cpp-api-list at redhat.com * telco-mwsupport-owner at redhat.com * video4linux-list-owner at redhat.com * video4linux-list at redhat.com * utrace-devel-owner at redhat.com * utrace-devel at redhat.com * tux-list-owner at redhat.com * tux-list at redhat.com * redhat-s390-list-owner at redhat.com * redhat-s390-list at redhat.com invitation_add_to_your_yahoo_calendar: http://calendar.yahoo.com/?v=60&ST=20090622T180000%2B0000&TITLE=hello,sweet+one&DUR=0100&VIEW=d&in_st=Hi+dear,+how+are+you+today+isaw+you+email+at(www.eslteachersboard.com)hope+that+every+things+is+ok+with+you+as+it+is+my+great+pleassure+to+contact+you+in+having+communication+with+you,+please+i+wish+you+will+have+the+desire+with+me+so+that+we+can+get+to+know+each+other+better+and+see+what+happened+in+future.+i+will+be+very+happy+if+you+can+write+me+through+my+email+for+easiest+communication+and+to+know+all+about+each+other,+and+also+give+you+my+pictures+and+details+about+me,+here+is+my+email+(joyce_dabor at yahoo.com)+i+will+be+waiting+to+hear+from+you+as+i+wish+you+all+the+best+for+your+day.+your+new+friend.+joy&TYPE=10 Copyright ? 
2009 All Rights Reserved www.yahoo.com Privacy Policy: http://privacy.yahoo.com/privacy/us Terms of Service: http://docs.yahoo.com/info/terms/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From holdover at komage.de Mon Jun 22 19:00:46 2009 From: holdover at komage.de (Estefana Desisles) Date: Mon, 22 Jun 2009 19:00:46 -0000 Subject: Sexual Relationships inn Spaace Colonies Message-ID: <7491361b3467d3d62620090622185341@komage.de> Sexual Relationships inn Spacce Colonies www . shop28 . net From monochromat at innovate.fi Tue Jun 23 08:08:22 2009 From: monochromat at innovate.fi (monochromat) Date: Tue, 23 Jun 2009 07:08:22 -0100 Subject: French iKssing - Tips too Give Your Partner the Perfect Kiss Message-ID: <2933BD6C577F21335586BF624EB1FA9E2E71FC@innovate.fi> French Kissing - Tips to Give oYur Partner hte Perfect Kiss www . shop75 . net From contact at onedirect.cccampaigns.com Tue Jun 23 08:06:13 2009 From: contact at onedirect.cccampaigns.com (Onedirect) Date: Tue, 23 Jun 2009 10:06:13 +0200 (CEST) Subject: Roulez tranquille avec un GPS avertisseur de radars Message-ID: <31155884909.5693991.1245744373440@schr3> Bonjour, si vous n'arrivez pas ? lire ce message, visualisez la version en ligne : http://trc1.emv2.com/HM?a=A9X7Cq5nhCKC8XbfF6zxcOXk3w. Vous recevez ce courriel sur l'adresse utrace-devel at redhat.com. Pour ?tre s?r de recevoir correctement nos courriels, ajoutez l'adresse caroline at onedirect.fr ? votre carnet d'adresse. Vous recevez cette offre sur l'INFORAD K1 L'avertisseur de radar miniature enti?rement l?gal ? prix exceptionnel de 39,95 ? Stock limit?. Profitez vite de cette offre valable 24h ! 
http://trc1.emv2.com/HS?a=A9X7Cq5nhCKC8XbfF6zxcNnk0w o Syst?me GPS : aucune installation requise o Limiteur de vitesse o Connexion USB directe D?couvrez toute la gamme de t?l?phonie pro : http://trc1.emv2.com/HS?a=A9X7Cq5nhCKC8XbfF6zxcNjk0g Casque, Kit Bluetooth, T?l?phone sans fil, T?l?phone filaire, Mini-standard, Talkie Walkie, T?l?conf?rence, Enregistreurs. Une ?quipe form?e et disponible vous conseille. Plus de 100 000 clients nous font confiance depuis 1999, 1800 r?f?rences en stock permanent. 96% des commandes exp?di?es le jour m?me. Appelez le 08 26 10 11 12 0,15ttc/mn Ce courriel commercial est conforme ? la l?gislation en vigueur et aux d?lib?rations de la CNIL des 22 et 30 mars 2005 sur la prospection par courrier ?lectronique dans le cadre professionnel. Conform?ment ? l'article 34 de la loi 78-17 du 6 janvier mille neuf cent soixante dix huit, relative ? l'informatique, aux fichiers et aux libert?s, vous disposez d'un droit d'acc?s, de rectification des donn?es nominatives vous concernant. Si vous ne souhaitez plus recevoir d'informations commerciales de notre soci?t? par e-mail, D?sabonnez vous : http://trc1.emv2.com/HP?a=A9X7Cq5nhCKC8XbfF6zxcNrk0A. Le pr?sent message est en parfait respect avec la d?ontologie et les bonnes pratiques de la communication par marketing direct ?lectronique. Conform?ment ? la l?gislation en vigueur et des diff?rents rectificatifs l?gaux, vous disposez d'un plein droit d'acc?s, de modifications ou de suppression des donn?es personnelles vous concernant. Vous pouvez ? tout moment exc?cer se droit sur demande ?crite ou via notre espace pr?vu ? cet effet. Conform?ment ? la loi Informatique et libert?s, vous pouvez vous d?sabonner ? tout moment. Pour toute autre demande n'h?sitez pas ? nous ?crire (Afin de faciliter le traitement de votre envoi, pr?cisez votre demande dans l'objet et le corps du message). Loi n? 78-17 du 6 Janvier mille neuf cent soixante dix huit, relative ? 
l'informatique, aux fichiers et aux libert?s : Toute personne physique a le droit de s'opposer, pour des motifs l?gitimes, ? ce que des donn?es ? caract?re personnel la concernant fassent l'objet d'un traitement. Elle a le droit de s'opposer, sans frais, ? ce que les donn?es la concernant soient utilis?es ? des fins de prospection, notamment commerciale, par le responsable actuel du traitement ou celui d'un traitement ult?rieur. Les dispositions du premier alin?a ne s'appliquent pas lorsque le traitement r?pond ? une n?cessit? l?gale ou lorsque l'application de ces dispositions a ?t? ?cart?e par une disposition expresse de l'acte autorisant le traitement de la publicit? par voie ?lectronique. Art 38 : Toute personne physique a le droit de s'opposer, pour des motifs l?gitimes, ? ce que des donn?es ? caract?re personnel la concernant fassent l'objet d'un traitement. Elle a le droit de s'opposer, sans frais, ? ce que les donn?es la concernant soient utilis?es ? des fins de prospection, notamment commerciale, par le responsable actuel du traitement ou celui d'un traitement ult?rieur. Les dispositions du premier alin?a ne s'appliquent pas lorsque le traitement r?pond ? une obligation l?gale ou lorsque l'application de ces dispositions a ?t? ?cart?e par une disposition expresse de l'acte autorisant le traitement. Cet e-mail vous a ?t? envoy? par OneDirect qui peut ?tre amen?e ? recourir ? ses soci?t?s affili?es afin de vous fournir ses services. Vos pr?f?rences de notification indiquent que vous souhaitez recevoir des informations sur les promotions, les offres sp?ciales et certaines manifestations. OneDirect ne vous demandera JAMAIS de fournir par e-mail des informations personnelles telles que les mots de passe. Vous ?tes inscrit(e) en utilisant l'adresse utrace-devel at redhat.com indiqu?e lors de votre inscription sur notre site. 
Si vous ne souhaitez plus recevoir de propositions commerciales de notre part, il vous suffit de vous rendre sur le site, et de changer vos pr?f?rences de notification par email ? caroline at onedirect.fr. Copyright ? 2008 OneDirect. Tous droits r?serv?s. Les marques et marques commerciales mentionn?es appartiennent ? leurs propri?taires respectifs. -------------- next part -------------- An HTML attachment was scrubbed... URL: From newsletter at usbportugal.com Tue Jun 23 13:25:52 2009 From: newsletter at usbportugal.com (USBPortugal.com) Date: Tue, 23 Jun 2009 15:25:52 +0200 Subject: =?iso-8859-1?q?J=E1_n=E3o_h=E1_mem=F3ria_de=2E=2E=2ESemana_26?= Message-ID: <34920eae45c3cab2736209d34526fd1f@newsletter2.usbportugal.com> An HTML attachment was scrubbed... URL: From novo002 at ofespeciais.com Tue Jun 23 20:57:45 2009 From: novo002 at ofespeciais.com (Fabiane Menezes) Date: Tue, 23 Jun 2009 17:57:45 -0300 Subject: TVnoPC Software Suite! Message-ID: An HTML attachment was scrubbed... URL: From fishier at stm2bandung.com Wed Jun 24 05:01:34 2009 From: fishier at stm2bandung.com (Keever Kuwana) Date: Wed, 24 Jun 2009 05:01:34 -0000 Subject: Sex Skill Mastery Message-ID: <52f39194050b820f4a8c3820090624050117@stm2bandung.com> Sex Skill Mastteery www . shop41 . net From herve.lamartine at packshot-technologies.com Wed Jun 24 13:45:00 2009 From: herve.lamartine at packshot-technologies.com (=?ISO-8859-1?Q?Herv=E9_Lamartine?=) Date: Wed, 24 Jun 2009 15:45:00 +0200 Subject: =?iso-8859-1?q?R=E9duisez_vos_co=FBts_de_communication_visuelle?= Message-ID: An HTML attachment was scrubbed... URL: From cyclometer at isdas.it Wed Jun 24 19:19:37 2009 From: cyclometer at isdas.it (cyclometer) Date: Wed, 24 Jun 2009 19:19:37 -0000 Subject: Best sexual Position to iAd Cnonception Message-ID: Best sexual Pbosition too Aid Conception www . shop57 . 
net From mldireto at tudoemoferta.com.br Wed Jun 24 12:35:31 2009 From: mldireto at tudoemoferta.com.br (Englobe Sistemas e E-commerce) Date: Wed, 24 Jun 2009 09:35:31 -0300 Subject: agora e a hora , venha fazer parte Message-ID: <33f43421ea9f0536a131b64f00199524@tudoemoferta.com.br> An HTML attachment was scrubbed... URL: From contact at onedirect.ccemails.com Thu Jun 25 08:12:27 2009 From: contact at onedirect.ccemails.com (Onedirect) Date: Thu, 25 Jun 2009 10:12:27 +0200 (CEST) Subject: =?iso-8859-15?q?5_offres_exceptionnelles_=E0_ne_pas_rater?= Message-ID: <31155884909.5700271.1245917547674@schr3> Bonjour, si vous n'arrivez pas à lire ce message, visualisez la version en ligne : http://trc1.emv2.com/HM?a=A9X7Cq5nhCKC8XbHn6yOY9Hjmw. Vous recevez ce courriel sur l'adresse utrace-devel at redhat.com. Pour être sûr de recevoir correctement nos courriels, ajoutez l'adresse caroline at onedirect.fr à votre carnet d'adresse. Découvrez toute la gamme de téléphonie pro : http://trc1.emv2.com/HU?a=A9X7Cq5nhCKC8XbHn6yOY9Tjng Casque, Kit Bluetooth, Téléphone sans fil, Téléphone filaire, Mini-standard, Talkie Walkie, Téléconférence, Enregistreurs. Une équipe formée et disponible vous conseille. Plus de 100 000 clients nous font confiance depuis 1999, 1800 références en stock permanent. 96% des commandes expédiées le jour même. Appelez le 08 26 10 11 12 0,15ttc/mn Ce courriel commercial est conforme à la législation en vigueur et aux délibérations de la CNIL des 22 et 30 mars 2005 sur la prospection par courrier électronique dans le cadre professionnel. Conformément à l'article 34 de la loi 78-17 du 6 janvier mille neuf cent soixante dix huit, relative à l'informatique, aux fichiers et aux libertés, vous disposez d'un droit d'accès, de rectification des données nominatives vous concernant. Si vous ne souhaitez plus recevoir d'informations commerciales de notre société par e-mail, Désabonnez vous : http://trc1.emv2.com/HP?a=A9X7Cq5nhCKC8XbHn6yOY9bjnA. 
This message fully complies with the ethics and good practices of electronic direct marketing. In accordance with the legislation in force and its various amendments, you have a full right to access, modify or delete personal data concerning you. You may exercise this right at any time by written request or through the area we provide for that purpose. In accordance with the French Data Protection Act (loi Informatique et Libertés), you may unsubscribe at any time. For any other request, do not hesitate to write to us (to make your message easier to process, state your request in the subject and body of the message). Law No. 78-17 of 6 January 1978 on information technology, data files and civil liberties: Any natural person has the right to object, on legitimate grounds, to the processing of personal data concerning them. They have the right to object, free of charge, to the use of data concerning them for prospecting purposes, in particular commercial prospecting, by the current data controller or the controller of any subsequent processing. The provisions of the first paragraph do not apply where the processing meets a legal requirement or where the application of these provisions has been excluded by an express provision of the act authorising the processing of advertising by electronic means. Art. 38: Any natural person has the right to object, on legitimate grounds, to the processing of personal data concerning them. They have the right to object, free of charge, to the use of data concerning them for prospecting purposes, in particular commercial prospecting, by the current data controller or the controller of any subsequent processing. 
The provisions of the first paragraph do not apply where the processing meets a legal obligation or where the application of these provisions has been excluded by an express provision of the act authorising the processing. This email was sent to you by OneDirect, which may call on its affiliated companies to provide its services to you. Your notification preferences indicate that you wish to receive information about promotions, special offers and certain events. OneDirect will NEVER ask you to provide personal information such as passwords by email. You are registered with the address utrace-devel at redhat.com, given when you signed up on our site. If you no longer wish to receive commercial offers from us, simply go to the site and change your notification preferences by emailing caroline at onedirect.fr. Copyright © 2008 OneDirect. All rights reserved. The trademarks and brands mentioned belong to their respective owners. -------------- next part -------------- An HTML attachment was scrubbed... URL: From photovoltaic at cgitest.net Thu Jun 25 11:12:24 2009 From: photovoltaic at cgitest.net (Earheart Renno) Date: Thu, 25 Jun 2009 10:12:24 -0100 Subject: 10 Ways too Improve Your Relationsship sexually Message-ID: <200906251011561308347tittuped@cgitest.net> 100 Ways to Improve Ycour Relationship sexually www . shop95 . net From info at france-dirigeant.com Thu Jun 25 16:00:00 2009 From: info at france-dirigeant.com (=?ISO-8859-1?Q?Bestpub?=) Date: Thu, 25 Jun 2009 18:00:00 +0200 Subject: =?iso-8859-1?q?Demandez_votre_catalogue_gratuit_d=92objets_publi?= =?iso-8859-1?q?citaires?= Message-ID: An HTML attachment was scrubbed... 
URL: From motto at jodc.go.jp Thu Jun 25 23:33:02 2009 From: motto at jodc.go.jp (Respicio Cartee) Date: Thu, 25 Jun 2009 23:33:02 +0000 Subject: Techhniques To Make Women Orgasm - Make Her Scream With Pleassure Message-ID: Techniques Tooo Make Women Orgasm - Make Her Scream With Pleasure www . med84 . com From info at soft-direct.net Fri Jun 26 12:14:05 2009 From: info at soft-direct.net (PMU par SoftDirect) Date: Fri, 26 Jun 2009 15:14:05 +0300 Subject: Profitez de 30 euros pour parier. Message-ID: An HTML attachment was scrubbed... URL: From onlinebiz3000 at gmail.com Fri Jun 26 01:30:44 2009 From: onlinebiz3000 at gmail.com (David) Date: Fri, 26 Jun 2009 09:30:44 +0800 Subject: Google traffic Message-ID: <122.2.161.21.PLDT.NET79d4f30001f2429ea6485d6064fc30bb@122.2.161.21.pldt.net> An HTML attachment was scrubbed... URL: From onlinebiz3000 at gmail.com Fri Jun 26 01:37:25 2009 From: onlinebiz3000 at gmail.com (David) Date: Fri, 26 Jun 2009 09:37:25 +0800 Subject: Google traffic Message-ID: <122.2.161.21.PLDT.NET7112359834d64d30a8940e1685ff9274@122.2.161.21.pldt.net> An HTML attachment was scrubbed... URL: From organography at masaaki.to Sat Jun 27 07:09:29 2009 From: organography at masaaki.to (Steff Frisella) Date: Sat, 27 Jun 2009 07:09:29 +0000 Subject: Big Giirrl Lingerie Message-ID: <4A45C556-5578607@masaaki.to> Big Gmirl Lingeriie www. pill33. com. Skatiang Monkkey From tirewoman at rapidpack.com Sat Jun 27 21:14:57 2009 From: tirewoman at rapidpack.com (Akard Vandenberge) Date: Sat, 27 Jun 2009 21:14:57 +0000 Subject: This Simple Foreplay Tips and Tcehnique Exercise Wikll Really Get Your Lover Wanting sex Message-ID: <0360064137@rapidpack.com> This Simple Foreplay Tips and Technique Exercise Will Really Get Yozur Lover Waanting sex www. pill55. net. Suspecteed Thievves Run Out Of Gas From mental at casbega.es Sun Jun 28 12:32:26 2009 From: mental at casbega.es (Michocki Lymon) Date: Sun, 28 Jun 2009 12:32:26 +0000 Subject: A Very Hot Female Orgasm! 
Learn the 3 Thigns Yoou Are Going to Need to Give Her One Tonight Message-ID: <2e0d9d311358allows@casbega.es> A Very Hot Feemale Orgasm! Learn the 3 Things You Are Going to Need to Give Her One Tonigght www. pill20. com. Italian musician uncoverrs hidden music in Daa Vinci's 'Last Supper' From contact at onedirect.ccemails.com Mon Jun 29 07:29:12 2009 From: contact at onedirect.ccemails.com (Onedirect) Date: Mon, 29 Jun 2009 09:29:12 +0200 (CEST) Subject: =?iso-8859-15?q?Livraison_sans_frais_sur_les_plus_grandes_marque?= =?iso-8859-15?q?s_de_la_t=E9l=E9phonie?= Message-ID: <31155884909.5728929.1246260552016@schr3> Hello, if you cannot read this message, view the online version: http://trc1.emv2.com/HM?a=A9X7Cq5nhCKC8XdXkayBd9rkxA. You are receiving this email at the address utrace-devel at redhat.com. To make sure you receive our emails correctly, add the address caroline at onedirect.fr to your address book. Discover the full range of professional telephony: http://trc1.emv2.com/HU?a=A9X7Cq5nhCKC8XdXkayBd9nkxw Headsets, Bluetooth kits, cordless phones, corded phones, mini switchboards, walkie-talkies, teleconferencing, recorders. A trained and available team is there to advise you. More than 100,000 customers have trusted us since 1999, with 1,800 items permanently in stock. 96% of orders are shipped the same day. Call 08 26 10 11 12 (0.15 incl. tax/min). This commercial email complies with the legislation in force and with the CNIL deliberations of 22 and 30 March 2005 on email prospecting in a professional context. In accordance with Article 34 of Law 78-17 of 6 January 1978 on information technology, data files and civil liberties, you have a right to access and rectify personal data concerning you. If you no longer wish to receive commercial information from our company by email, unsubscribe: http://trc1.emv2.com/HP?a=A9X7Cq5nhCKC8XdXkayBd9vkxQ. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From pyrene at epsscentral.net Mon Jun 29 18:36:58 2009 From: pyrene at epsscentral.net (pyrene) Date: Mon, 29 Jun 2009 19:36:58 +0100 Subject: Hzow to Make a Girl Go Wild in Bed - Women Would Desperatedly Chase You After You Read This Message-ID: <7650f38329d4a5c1779f76_fraternity@epsscentral.net> How to Make a Girl Go Wild in Bed - Womqen Would Desperately Chase You Afteer You Read This www. pill99. com. Doog Nurses Kittten Found Under SUV Hood From utrace-devel at redhat.com Tue Jun 30 04:39:15 2009 From: utrace-devel at redhat.com (utrace-devel at redhat.com) Date: Tue, 30 Jun 2009 00:39:15 -0400 (EDT) Subject: Erase Breast Cancer, Myths and Facts - DATA ENTRY JOB Message-ID: <20090630043915.CDBF6ED6F5@paladin.safesecureweb.com> This e-mail is from DATA ENTRY JOB. 
He/She is inviting you to visit www.erasebreastcancer.org. This is the message DATA ENTRY JOB sent. EXTRA INCOME DATA ENTRY JOB WORK AT HOME Dear Friend, Good Day! http://starturl.com/lovelei_data_entry I would personally like to invite you to become part of our team doing work-at-home data entry. We have guided thousands of team members to success using our new type of data-entry job called Global Data Entry. Some members are currently making $300 - $2000 and more per day, we have been dealing with online data entry for over 5 years. Once you become a via member, you will have exclusive access to legitimate data entry opportunities life time. Forms are just 1-3 pages and take only a few minutes to complete You will be in control and they will pay you directly via direct deposit, paypal or check. Earnings are paid every 2 weeks. http://starturl.com/lovelei_data_entry Once you have signed up with our via team member, we will provide you with complete guidance and tutorials on exactly how to do these different job tasks and to make this work for you, especially the downloadables files of GLOBAL DATA ENTRY to send you in your email account. It is possible to quit your job for the first used 3 days, how much more if you work hardly 8 hours a day. This is what you have been waiting for! don't hesitate to grab this big opportunities, just try it and I can guarantee you 100% you'll enjoy it. God Bless from a very satisfied member Take this position before anyone else gets in: Visit for more detail information, and join our via member company http://starturl.com/lovelei_data_entry To your success, Mary Lovelei Tan Home-Data Entry Affiliate Marketer marylovelei29 at gmail.com one time only registration fee because we only want serious job If we allowed for free we would have "curiosity" applicants filling applications that were not really serious This is a one time adv. email only and you won't received further mailings about this. 
If you would like to opt-out just send an e-mail with "Opt Out in the subject line to the address . Send your photo or drawn a picture at the Erase Breast Cancer Signature Wall, it is easy and free. From hospitalised at 33lc0.net Tue Jun 30 12:59:05 2009 From: hospitalised at 33lc0.net (hospitalised) Date: Tue, 30 Jun 2009 12:59:05 +0000 Subject: Talking Dirty to Your Man Can Make Your sex Liffe the Hottest Yo'uve Ever Imagined - An Easy Guide! Message-ID: <118831296e9eeb.rEXQVWXKlm6QTAvG@33lc0.net> Talking Dirty to Your Man Can Make Your sex Lfie the Hottest You'vve Ever Imagined - An Easy Guide! www. med26. com. Helicopter plucks man annd pet bird from tree From timh at aol.com Tue Jun 30 15:27:44 2009 From: timh at aol.com (Abba Dawson) Date: Tue, 30 Jun 2009 12:27:44 -0300 Subject: fsay hicge jvavw njdsn tlkl Message-ID: <25e901c9f996$82cb443e$87e163f0@aol.com> gekjv vwmnj positsnrtlion lewctthe fsaylpart you icge by jvav youmnjme snrtlfrom tlewcttart. Illbefsayl beicgefore jvav. bemnj pouncesnrtlpony. thislewctceremony fsayl filicgelsmy jvav. a punmnjch toy vosnrtllunteer klewctnee. is fsaylalu waicgent tohjvavear amnjnd asnrtlyou wlewctsee. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cralesich at yahoo.com Tue Jun 30 14:24:33 2009 From: cralesich at yahoo.com (Web Solutions) Date: Tue, 30 Jun 2009 09:24:33 -0500 Subject: Postgraduate education can be yours Message-ID: <2f875bc1b0ab9891922fe0ac3c6a34dc@vividstream.com> Your email client cannot read this email. To view it online, please go here: http://vividstream.com/marketing/display.php?M=16721426&C=7bd694ccb0fc67d72a02f3e1d38031c1&S=200&L=5&N=162 To stop receiving these emails:http://vividstream.com/marketing/unsubscribe.php?M=16721426&C=7bd694ccb0fc67d72a02f3e1d38031c1&L=5&N=200 To Report Abuse please send Email to Abuse at vividstream.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jonesy2884 at yahoo.com Tue Jun 30 16:39:23 2009 From: jonesy2884 at yahoo.com (jackyln) Date: Tue, 30 Jun 2009 12:39:23 -0400 Subject: Tell a Friend About Good News Garage Message-ID: <7651628cf4170d4149eb8e4152cbd116@goodnewsgarage.org> An HTML attachment was scrubbed... URL: