linux-stable-rt/kernel
Johannes Weiner 777c6c5f1f wait: prevent exclusive waiter starvation
With exclusive waiters, every process woken up through the wait queue must
ensure that the next waiter down the line is woken when it has finished.

Interruptible waiters don't do that when aborting due to a signal.  And if
an aborting waiter is concurrently woken up through the waitqueue, noone
will ever wake up the next waiter.

This has been observed with __wait_on_bit_lock() used by
lock_page_killable(): the first contender on the queue was aborting when
the actual lock holder woke it up concurrently.  The aborted contender
didn't acquire the lock and therefor never did an unlock followed by
waking up the next waiter.

Add abort_exclusive_wait() which removes the process' wait descriptor from
the waitqueue, iff still queued, or wakes up the next waiter otherwise.
It does so under the waitqueue lock.  Racing with a wake up means the
aborting process is either already woken (removed from the queue) and will
wake up the next waiter, or it will remove itself from the queue and the
concurrent wake up will apply to the next waiter after it.

Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
__wait_on_bit_lock() when they were interrupted by other means than a wake
up through the queue.

[akpm@linux-foundation.org: coding-style fixes]
Reported-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Mentored-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>		["after some testing"]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-05 12:56:48 -08:00
..
irq Merge branch 'core/xen' into x86/urgent 2009-02-04 14:54:56 +01:00
power Hibernation: Introduce system_entering_hibernation 2009-01-27 02:15:45 -05:00
time hrtimers: allow the hot-unplugging of all cpus 2009-01-30 22:35:29 +01:00
trace ftrace: do_each_pid_task() needs rcu lock 2009-02-03 22:50:58 +01:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.preempt
Makefile kernel/up.c: omit it if SMP=y, USE_GENERIC_SMP_HELPERS=n 2009-01-14 09:42:11 -08:00
acct.c [CVE-2009-0029] System call wrappers part 04 2009-01-14 14:15:19 +01:00
async.c kernel/async.c: fix printk warnings 2009-02-05 12:56:46 -08:00
audit.c
audit.h
audit_tree.c
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c [CVE-2009-0029] System call wrappers part 04 2009-01-14 14:15:19 +01:00
cgroup.c cgroup: fix root_count when mount fails due to busy subsystem 2009-01-29 18:04:45 -08:00
cgroup_debug.c
cgroup_freezer.c
compat.c
configs.c
cpu.c
cpuset.c cpuset: fix possible deadlock in async_rebuild_sched_domains 2009-01-19 02:44:00 +01:00
cred-internals.h
cred.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 2009-01-09 13:59:25 -08:00
delayacct.c
dma-coherent.c dma-coherent: Restore dma_alloc_from_coherent() large alloc fall back policy. 2009-01-21 18:51:53 +09:00
dma.c
exec_domain.c [CVE-2009-0029] System call wrappers part 04 2009-01-14 14:15:19 +01:00
exit.c [CVE-2009-0029] System call wrappers part 08 2009-01-14 14:15:21 +01:00
extable.c
fork.c Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-01-26 09:47:43 -08:00
freezer.c
futex.c [CVE-2009-0029] System call wrappers part 31 2009-01-14 14:15:31 +01:00
futex_compat.c
hrtimer.c hrtimer: prevent negative expiry value after clock_was_set() 2009-01-30 22:35:34 +01:00
itimer.c [CVE-2009-0029] System call wrappers part 05 2009-01-14 14:15:20 +01:00
kallsyms.c Revert "kbuild: strip generated symbols from *.ko" 2009-01-14 21:38:20 +01:00
kexec.c [CVE-2009-0029] System call wrappers part 07 2009-01-14 14:15:20 +01:00
kfifo.c
kgdb.c
kmod.c
kprobes.c kprobes: check CONFIG_FREEZER instead of CONFIG_PM 2009-01-16 14:32:17 -05:00
ksysfs.c
kthread.c
latencytop.c
lockdep.c
lockdep_internals.h
lockdep_proc.c
marker.c
module.c modules: Use a better scheme for refcounting 2009-02-02 19:17:55 -08:00
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
ns_cgroup.c ns_cgroup: remove unused spinlock 2009-01-08 08:31:02 -08:00
nsproxy.c
panic.c
params.c
pid.c pid: generalize task_active_pid_ns 2009-01-08 08:31:12 -08:00
pid_namespace.c
pm_qos_params.c
posix-cpu-timers.c
posix-timers.c [CVE-2009-0029] System call wrappers part 05 2009-01-14 14:15:20 +01:00
printk.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
profile.c
ptrace.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
rcuclassic.c rcu: add __cpuinit to rcu_init_percpu_data() 2009-01-14 11:26:40 +01:00
rcupdate.c
rcupreempt.c
rcupreempt_trace.c
rcutorture.c rcu: fix bug in rcutorture system-shutdown code 2009-01-07 23:36:25 +01:00
rcutree.c rcu: add __cpuinit to rcu_init_percpu_data() 2009-01-14 11:26:40 +01:00
rcutree_trace.c
relay.c relay: fix lock imbalance in relay_late_setup_files 2009-01-18 20:29:35 +01:00
res_counter.c memcg: memory cgroup resource counters for hierarchy 2009-01-08 08:31:05 -08:00
resource.c resources: fix parameter name and kernel-doc 2009-01-15 16:39:38 -08:00
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rtmutex_common.h
rwsem.c
sched.c wait: prevent exclusive waiter starvation 2009-02-05 12:56:48 -08:00
sched_clock.c
sched_cpupri.c
sched_cpupri.h
sched_debug.c sched: partly revert "sched debug: remove NULL checking in print_cfs_rt_rq()" 2009-01-11 02:40:32 +01:00
sched_fair.c sched: fix buddie group latency 2009-02-01 10:49:51 +01:00
sched_features.h
sched_idletask.c
sched_rt.c sched_rt: don't use first_cpu on cpumask created with cpumask_and 2009-02-01 10:49:52 +01:00
sched_stats.h
seccomp.c
semaphore.c
signal.c signals, debug: fix BUG: using smp_processor_id() in preemptible code in print_fatal_signal() 2009-01-27 00:36:19 +01:00
smp.c generic-ipi: use per cpu data for single cpu ipi calls 2009-01-30 18:31:08 +01:00
softirq.c
softlockup.c softlock: fix false panic which can occur if softlockup_thresh is reduced 2009-01-14 11:48:07 +01:00
spinlock.c
srcu.c
stacktrace.c
stop_machine.c
sys.c revert "rlimit: permit setting RLIMIT_NOFILE to RLIM_INFINITY" 2009-02-05 12:56:47 -08:00
sys_ni.c [CVE-2009-0029] Make sys_syslog a conditional system call 2009-01-14 14:15:16 +01:00
sysctl.c Merge branch 'core/debugobjects' into core/urgent 2009-01-22 10:03:02 +01:00
sysctl_check.c
taskstats.c
test_kprobes.c
time.c [CVE-2009-0029] System call wrappers part 01 2009-01-14 14:15:18 +01:00
timeconst.pl
timer.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
tracepoint.c
tsacct.c
uid16.c [CVE-2009-0029] System call wrappers part 19 2009-01-14 14:15:26 +01:00
up.c smp_call_function_single(): be slightly less stupid, fix #2 2009-01-12 16:04:37 +01:00
user.c
user_namespace.c
utsname.c
utsname_sysctl.c
wait.c wait: prevent exclusive waiter starvation 2009-02-05 12:56:48 -08:00
workqueue.c work_on_cpu: Use our own workqueue. 2009-01-19 22:36:07 +01:00