[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: pthread_kill is racy: probably needs kernel change



> so userspace has to be sure that:
> 
>   1) the PID/TID it is passing to sys_kill()/sys_tkill() is valid
It's either valid or terminated, unlesss the struct pthread is corrupted.

>   2) the target process/thread does not exit prematurely
SUSv3 doesn't specify whether this is required or not, so we should
make it work.

> i suspect there are other problems as well that might happen if a struct
> pthread is freed while it's being used - isnt this a programming bug?
No: struct pthreads are only freed when the thread is detached.

> it's not a consequence of CLONE_DETACHED - it's a fundamental property of
> the PID/TID space. It's something that is freed.

Without CLONE_DETACHED children become zombies and thus their pid is
not freed until the parent waits on them.

Instead with CLONE_CLEARTID | CLONE_DETACHED, we need to lock against
the tid clearing that happens in do_exit.

Of course you have a problem if you get pids from pid files or /proc
but this is generally only done based on explicit user action (see
below for a way to fix this problem) or at system startup/shutdown.

> there's one thing that can be done, which i proposed a few weeks ago, to
> pass in PID and TID as well to tkill(), thus the kernel can double-check
> that the thread is really in the intended thread group. The kernel
> guarantees that the 'thread group ID' (ie. the PID) is not reused unless
> all threads exit.
But this doesn't work if the reused TID is in the same thread group.

How about instead my patch, changed to pin the page in physiscal
memory before taking tasklist_lock and to create a new syscall? (how
about sys_tkilldet?)

Alternatively, rather than pinning the page we could do like this:

get_user(pid, ppid);
retry:
if(pid == 0)
	return -ESRCH;

read_lock(&tasklist_lock);
if((p = find_task_by_pid(pid)))
	get_task_struct(p);
read_unlock(&tasklist_lock);

if(!p)
	return -ESRCH;

get_user(newpid, ppid);
if(pid != newpid)
{
	pid = newpid;
	put_task_struct(p);
	goto retry;
}

read_lock(&tasklist_lock);
error = -ESRCH;
if(!p->sig) /* task exited */
	goto out_unlock;

send signal(p);

put_task_struct(p);

out_unlock:
read_unlock(&tasklist_lock);

But it seems more contrived than pinning the page.


Another possible solution is to attach a 64-bit or 128-bit unique
identifier to tasks that is checked after the pid lookup, but this
change would be much more intrusive.

> the polite thing is to create a new syscall, sys_tkill() is something that
> exists in the 2.4 kernel as well and NGPT uses sys_tkill().
OK.

Attachment: pgp00039.pgp
Description: PGP signature


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]