linux-stable-rt/kernel
Paul Jackson 3077a260e9 [PATCH] cpuset release ABBA deadlock fix
Fix possible cpuset_sem ABBA deadlock if 'notify_on_release' set.

For a particular usage pattern, creating and destroying cpusets fairly
frequently using notify_on_release, on a very large system, this deadlock
can be seen every few days.  If you are not using the cpuset
notify_on_release feature, you will never see this deadlock.

The existing code, on task exit (or cpuset deletion), did:

  get cpuset_sem
  if cpuset marked notify_on_release and is ready to release:
    compute cpuset path relative to /dev/cpuset mount point
    call_usermodehelper() forks /sbin/cpuset_release_agent with path
  drop cpuset_sem

Unfortunately, the fork in call_usermodehelper can allocate memory, and
allocating memory can require cpuset_sem, if the mems_generation values
changed in the interim.  This results in an ABBA deadlock, trying to obtain
cpuset_sem when it is already held by the current task.
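
The hazard can be modelled in user space. The following is a minimal sketch, not kernel code: toy_sem, toy_down(), toy_up(), toy_alloc() and old_release() are hypothetical names invented for illustration, and the lock reports a re-acquire attempt instead of blocking so that the problem is observable rather than a hang.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for cpuset_sem: a non-recursive lock that records
 * whether it is held, so a second acquire is reported instead of hanging. */
struct toy_sem {
	int held;
};

/* Returns 0 on success, -1 if the caller would block on a lock it holds. */
static int toy_down(struct toy_sem *s)
{
	if (s->held)
		return -1;
	s->held = 1;
	return 0;
}

static void toy_up(struct toy_sem *s)
{
	s->held = 0;
}

/* Stand-in for a memory allocation. If the memory placement changed
 * (mems_changed != 0), the allocator needs the lock to refresh its view. */
static int toy_alloc(struct toy_sem *s, int mems_changed)
{
	if (!mems_changed)
		return 0;
	if (toy_down(s) != 0)
		return -1;	/* the real code would deadlock here */
	toy_up(s);
	return 0;
}

/* Old ordering: the helper, which allocates, runs while the lock is held. */
static int old_release(struct toy_sem *s, int mems_changed)
{
	int rc;

	if (toy_down(s) != 0)
		return -1;
	rc = toy_alloc(s, mems_changed);	/* models the fork in the helper */
	toy_up(s);
	return rc;
}
```

In this model old_release() succeeds as long as mems_changed is 0 and fails only when the allocation path needs the lock, which matches the intermittent nature of the deadlock described above.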

To fix this, I put the cpuset path (which must be computed while holding
cpuset_sem) in a temporary buffer, to be used in the call_usermodehelper
call of /sbin/cpuset_release_agent only _after_ dropping cpuset_sem.

So the new logic is:

  get cpuset_sem
  if cpuset marked notify_on_release and is ready to release:
    compute cpuset path relative to /dev/cpuset mount point
    stash path in kmalloc'd buffer
  drop cpuset_sem
  call_usermodehelper() forks /sbin/cpuset_release_agent with path
  free path

The sharp-eyed reader might notice that this patch does not contain any
calls to kmalloc.  The existing code in the check_for_release() routine was
already kmalloc'ing a buffer to hold the cpuset path.  In the old code, it
just held the buffer for a few lines, over the cpuset_release_agent() call
that in turn invoked call_usermodehelper().  In the new code, with the
application of this patch, it returns that buffer via the new char
**ppathbuf parameter, for later use and freeing in cpuset_release_agent(),
which is called after cpuset_sem is dropped.  Whereas the old code has just
one call to cpuset_release_agent(), right in the check_for_release()
routine, the new code has three calls to cpuset_release_agent(), from the
various places that a cpuset can be released.
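
The hand-off of the path buffer can be sketched in user space as follows. This is an illustration under stated assumptions, not the patch itself: a pthread mutex stands in for cpuset_sem, malloc()/free() stand in for kmalloc()/kfree(), a printf() stands in for call_usermodehelper(), and struct toy_cpuset, toy_check_for_release(), toy_release_agent() and toy_exit() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for cpuset_sem. */
static pthread_mutex_t toy_sem = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical, heavily simplified cpuset. */
struct toy_cpuset {
	const char *path;	/* path relative to the mount point */
	int notify_on_release;
	int in_use;
};

/* Call with toy_sem held. If the cpuset is ready to release, hand a heap
 * copy of its path back through *ppathbuf; otherwise leave it untouched. */
static void toy_check_for_release(const struct toy_cpuset *cs, char **ppathbuf)
{
	size_t len;
	char *buf;

	if (!cs->notify_on_release || cs->in_use)
		return;
	len = strlen(cs->path) + 1;
	buf = malloc(len);
	if (!buf)
		return;
	memcpy(buf, cs->path, len);
	*ppathbuf = buf;
}

/* Call with toy_sem NOT held. Consumes and frees pathbuf.
 * Returns 1 if the agent would have been run, 0 otherwise. */
static int toy_release_agent(char *pathbuf)
{
	if (!pathbuf)
		return 0;
	printf("would run release agent for %s\n", pathbuf);
	free(pathbuf);
	return 1;
}

/* One of the callers: compute under the lock, act after dropping it. */
static int toy_exit(struct toy_cpuset *cs)
{
	char *pathbuf = NULL;

	pthread_mutex_lock(&toy_sem);
	cs->in_use = 0;
	toy_check_for_release(cs, &pathbuf);
	pthread_mutex_unlock(&toy_sem);

	return toy_release_agent(pathbuf);
}
```

Because the buffer outlives the critical section, every caller that takes the lock has to make its own call to the agent after unlocking, which is why one call site becomes several in this arrangement.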

This patch has been built and booted on SN2, and passed a stress test that
previously hit the deadlock within a few seconds.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-09 12:08:22 -07:00
irq [PATCH] irqpoll 2005-06-28 21:20:35 -07:00
power [PATCH] Address BUG: using smp_processor_id() in preemptible [00000001] code 2005-07-27 16:25:50 -07:00
Kconfig.hz [PATCH] i386: Selectable Frequency of the Timer Interrupt 2005-06-23 09:45:10 -07:00
Kconfig.preempt [PATCH] sched: voluntary kernel preemption 2005-06-25 16:24:45 -07:00
Makefile [PATCH] kdump: Routines for copying dump pages 2005-06-25 16:24:53 -07:00
acct.c
audit.c
auditsc.c
capability.c [PATCH] kernel/capability.c: add kerneldoc 2005-07-27 16:26:06 -07:00
compat.c
configs.c
cpu.c [PATCH] i386 CPU hotplug 2005-06-25 16:24:29 -07:00
cpuset.c [PATCH] cpuset release ABBA deadlock fix 2005-08-09 12:08:22 -07:00
crash_dump.c [PATCH] kernel/crash_dump.c: add kerneldoc 2005-07-27 16:26:06 -07:00
dma.c
exec_domain.c
exit.c [PATCH] revert "timer exit cleanup" 2005-08-04 16:57:49 -07:00
extable.c
fork.c [PATCH] lower VM_DONTCOPY total_vm 2005-07-12 16:00:58 -07:00
futex.c
intermodule.c
itimer.c [PATCH] itimer fixes 2005-07-27 16:25:51 -07:00
kallsyms.c
kexec.c [PATCH] kexec: fix sparse warnings 2005-06-28 14:53:40 -07:00
kfifo.c
kmod.c [PATCH] Keys: Pass session keyring to call_usermodehelper() 2005-06-24 00:05:18 -07:00
kprobes.c [PATCH] kprobes: fix namespace problem and sparc64 build 2005-07-05 19:19:00 -07:00
ksysfs.c [PATCH] Kdump: Export crash notes section address through sysfs 2005-06-25 16:24:51 -07:00
kthread.c
module.c [PATCH] Module per-cpu alignment cannot always be met 2005-08-01 21:38:01 -07:00
panic.c [PATCH] Call emergency_reboot from panic 2005-07-26 14:35:43 -07:00
params.c
pid.c
posix-cpu-timers.c
posix-timers.c [PATCH] revert "timer exit cleanup" 2005-08-04 16:57:49 -07:00
printk.c [PATCH] CPU hotplug printk fix 2005-06-25 16:24:34 -07:00
profile.c [PATCH] mostly_read data section 2005-07-07 18:23:46 -07:00
ptrace.c
rcupdate.c
resource.c [PATCH] Use ALIGN to remove duplicate code 2005-06-25 16:25:02 -07:00
sched.c [PATCH] fix MAX_USER_RT_PRIO and MAX_RT_PRIO 2005-07-26 15:40:00 -07:00
seccomp.c
signal.c [PATCH] Cleanup patch for process freezing 2005-06-25 17:10:13 -07:00
softirq.c [PATCH] revert bogus softirq changes 2005-07-30 10:49:59 -07:00
spinlock.c
stop_machine.c
sys.c [PATCH] Remove suspend() calls from shutdown path 2005-08-04 08:20:47 -07:00
sys_ni.c [PATCH] remove sys_set_zone_reclaim() 2005-08-01 10:03:56 -07:00
sysctl.c [PATCH] s390: spin lock retry 2005-07-27 16:26:04 -07:00
time.c [PATCH] clean up inline static vs static inline 2005-07-27 16:26:20 -07:00
timer.c [PATCH] kernel/timer: fix msleep_interruptible() comment 2005-06-25 16:24:58 -07:00
uid16.c
user.c [PATCH] inotify 2005-07-12 20:38:38 -07:00
wait.c
workqueue.c