linux-stable-rt/kernel
Steven Rostedt 8f0a056fcb ftrace: introduce ftrace_preempt_disable()/enable()
Impact: add new ftrace-plugin internal APIs

Parts of the tracer needs to be careful about schedule recursion.
If the NEED_RESCHED flag is set, a preempt_enable will call schedule.
Inside the schedule function, the NEED_RESCHED flag is cleared.

The problem arises when a trace happens in the schedule function but before
NEED_RESCHED is cleared. The race is as follows:

schedule()
  >> tracer called

    trace_function()
       preempt_disable()
       [ record trace ]
       preempt_enable()  <<- here's the issue.

         [check NEED_RESCHED]
          schedule()
          [ Repeat the above, over and over again ]

The naive approach is simply to use preempt_enable_no_schedule instead.
The problem with that approach is that, although we solve the schedule
recursion issue, we now might lose a preemption check when not in the
schedule function.

  trace_function()
    preempt_disable()
    [ record trace ]
    [Interrupt comes in and sets NEED_RESCHED]
    preempt_enable_no_resched()
    [continue without scheduling]

The way ftrace handles this problem is with the following approach:

	int resched;

	resched = need_resched();
	preempt_disable_notrace();
	[record trace]
	if (resched)
		preempt_enable_no_sched_notrace();
	else
		preempt_enable_notrace();

This may seem like the opposite of what we want. If resched is set
then we call the "no_sched" version??  The reason we do this is because
if NEED_RESCHED is set before we disable preemption, there's two reasons
for that:

  1) we are in an atomic code path
  2) we are already on our way to the schedule function, and maybe even
     in the schedule function, but have yet to clear the flag.

Both the above cases we do not want to schedule.

This solution has already been implemented within the ftrace infrastructure.
But the problem is that it has been implemented several times. This patch
encapsulates this code to two nice functions.

  resched = ftrace_preempt_disable();
  [ record trace]
  ftrace_preempt_enable(resched);

This way the tracers do not need to worry about getting it right.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04 10:09:48 +01:00
..
irq irq: make variable static 2008-10-22 07:37:17 +02:00
power PM_TEST_SUSPEND should depend on RTC_CLASS, not RTC_LIB 2008-11-01 12:40:38 -07:00
time Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2 2008-10-22 09:48:06 +02:00
trace ftrace: introduce ftrace_preempt_disable()/enable() 2008-11-04 10:09:48 +01:00
.gitignore
Kconfig.freezer container freezer: implement freezer cgroup subsystem 2008-10-20 08:52:34 -07:00
Kconfig.hz
Kconfig.preempt
Makefile Merge branch 'tracing/ftrace' into tracing/urgent 2008-10-22 09:08:14 +02:00
acct.c tty: Fix abusers of current->sighand->tty 2008-10-13 09:51:42 -07:00
audit.c [PATCH] Fix the bug of using AUDIT_STATUS_RATE_LIMIT when set fail, no error output. 2008-08-01 12:15:16 -04:00
audit.h
audit_tree.c [PATCH] get rid of nameidata in audit_tree 2008-10-23 05:12:53 -04:00
auditfilter.c Re: [PATCH] the loginuid field should be output in all AUDIT_CONFIG_CHANGE audit messages 2008-08-01 12:15:03 -04:00
auditsc.c tty: Fix abusers of current->sighand->tty 2008-10-13 09:51:42 -07:00
backtracetest.c
bounds.c
capability.c security: Fix setting of PF_SUPERPRIV by __capable() 2008-08-14 22:59:43 +10:00
cgroup.c cgroup: remove unused variable 2008-10-26 09:38:17 -07:00
cgroup_debug.c cgroups: fix probable race with put_css_set[_taskexit] and find_css_set 2008-10-20 08:52:38 -07:00
cgroup_freezer.c freezer_cg: simplify freezer_change_state() 2008-10-30 11:38:45 -07:00
compat.c Merge branches 'timers/clocksource', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus 2008-10-20 13:14:06 +02:00
configs.c kernel/configs.c: remove useless comments 2008-10-20 08:52:34 -07:00
cpu.c Merge branches 'sched/devel', 'sched/cpu-hotplug', 'sched/cpusets' and 'sched/urgent' into sched/core 2008-10-08 11:31:02 +02:00
cpuset.c cpuset: use seq_*mask_* to print masks 2008-10-20 08:52:39 -07:00
delayacct.c
dma-coherent.c dma-coherent: export dma_[alloc|release]_from_coherent methods 2008-08-22 08:34:53 +02:00
dma.c kernel/dma.c: remove a CVS keyword 2008-10-16 11:21:30 -07:00
exec_domain.c proc: move /proc/execdomains to kernel/exec_domain.c 2008-10-23 14:30:41 +04:00
exit.c Merge branch 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-10-20 13:35:07 -07:00
extable.c
fork.c Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2 2008-10-22 09:48:06 +02:00
freezer.c freezer_cg: use thaw_process() in unfreeze_cgroup() 2008-10-30 11:38:45 -07:00
futex.c hrtimer: make the futex() system call use the per process slack value 2008-09-11 07:17:00 -07:00
futex_compat.c
hrtimer.c Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2 2008-10-22 09:48:06 +02:00
itimer.c timers: fix itimer/many thread hang 2008-09-14 16:25:35 +02:00
kallsyms.c kernel/kallsyms.c: fix double return 2008-10-16 11:21:32 -07:00
kexec.c kexec: fix crash_save_vmcoreinfo_init build problem 2008-10-20 15:28:50 -07:00
kfifo.c
kgdb.c kgdb: call touch_softlockup_watchdog on resume 2008-10-06 13:50:59 -05:00
kmod.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus 2008-10-16 12:38:34 -07:00
kprobes.c make kprobes.c:kretprobe_table_lock() static 2008-10-16 11:21:52 -07:00
ksysfs.c profiling: dynamically enable readprofile at runtime 2008-10-16 11:21:31 -07:00
kthread.c Merge branch 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-10-20 13:35:07 -07:00
latencytop.c
lockdep.c lockdep: fix irqs on/off ip tracing 2008-10-28 11:19:07 +01:00
lockdep_internals.h lockdep: build fix 2008-08-13 12:55:10 +02:00
lockdep_proc.c lockstat: fix numerical output rounding error 2008-08-26 10:37:46 +02:00
marker.c markers: break the redundant loop in kernel/marker.c 2008-10-27 13:02:23 +01:00
module.c Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc 2008-10-23 12:04:37 -07:00
mutex-debug.c
mutex-debug.h
mutex.c locking: fix mutex @key parameter kernel-doc notation 2008-07-28 18:12:36 +02:00
mutex.h
notifier.c ftrace: ignore functions that cannot be kprobe-ed 2008-10-14 10:34:22 +02:00
ns_cgroup.c
nsproxy.c removed unused #include <linux/version.h>'s 2008-08-23 12:14:12 -07:00
panic.c Make panic= and panic_on_oops into core_params 2008-10-22 10:00:25 +11:00
params.c Fix compile warning in kernel/params.c 2008-10-23 12:09:00 -07:00
pid.c
pid_namespace.c pid_ns: (BUG 11391) change ->child_reaper when init->group_leader exits 2008-09-02 19:21:38 -07:00
pm_qos_params.c pm_qos_requirement might sleep 2008-09-02 19:21:40 -07:00
posix-cpu-timers.c timers: fix itimer/many thread hang, v2 2008-09-23 13:38:44 +02:00
posix-timers.c Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2 2008-10-22 09:48:06 +02:00
printk.c printk: remove unused code from kernel/printk.c 2008-10-23 21:54:29 +02:00
profile.c kernel/profile: fix profile_init() section mismatch 2008-10-30 11:38:46 -07:00
ptrace.c make ptrace_untrace() static 2008-10-20 08:52:39 -07:00
rcuclassic.c rcu: RCU-based detection of stalled CPUs for Classic RCU, fix 2008-10-03 10:41:00 +02:00
rcupdate.c rcupdate: fix bug of rcu_barrier*() 2008-10-21 15:59:53 +02:00
rcupreempt.c byteorder: remove direct includes of linux/byteorder/swab[b].h 2008-10-20 08:52:40 -07:00
rcupreempt_trace.c rcu: trace fix possible mem-leak 2008-08-15 17:54:40 +02:00
rcutorture.c byteorder: remove direct includes of linux/byteorder/swab[b].h 2008-10-20 12:51:53 -07:00
relay.c relay: fix "full buffer with exactly full last subbuffer" accounting problem 2008-08-05 14:33:46 -07:00
res_counter.c
resource.c reserve_region_with_split: Fix GFP_KERNEL usage under spinlock 2008-11-01 09:53:58 -07:00
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c hrtimer: convert kernel/* to the new hrtimer apis 2008-09-05 21:35:13 -07:00
rtmutex.h
rtmutex_common.h
rwsem.c
sched.c sched: virtual time buddy preemption 2008-10-24 12:51:03 +02:00
sched_clock.c sched_clock: prevent scd->clock from moving backwards 2008-10-10 11:17:04 +02:00
sched_cpupri.c
sched_cpupri.h
sched_debug.c sched: change sched_debug's mode to 0444 2008-10-30 11:37:57 +01:00
sched_fair.c sched: virtual time buddy preemption 2008-10-24 12:51:03 +02:00
sched_features.h sched: disable the hrtick for now 2008-10-20 14:27:43 +02:00
sched_idletask.c sched: add CONFIG_SMP consistency 2008-10-22 10:01:52 +02:00
sched_rt.c Merge commit 'v2.6.28-rc1' into sched/urgent 2008-10-24 12:48:46 +02:00
sched_stats.h Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc 2008-10-23 12:04:37 -07:00
seccomp.c
semaphore.c semaphore: __down_common: use signal_pending_state() 2008-08-05 14:33:47 -07:00
signal.c 'kill sig -1' must only apply to caller's namespace 2008-10-30 11:38:46 -07:00
smp.c smp: have smp_call_function_single() detect invalid CPUs 2008-08-25 17:45:48 -07:00
softirq.c Merge branches 'timers/clocksource', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus 2008-10-20 13:14:06 +02:00
softlockup.c Make the taint flags reliable 2008-10-16 11:21:31 -07:00
spinlock.c lockdep: spin_lock_nest_lock(), checkpatch fixes 2008-08-13 13:56:51 +02:00
srcu.c
stacktrace.c
stop_machine.c Revert "Call init_workqueues before pre smp initcalls." 2008-10-25 19:53:38 -07:00
sys.c Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2 2008-10-22 09:48:06 +02:00
sys_ni.c Configure out AIO support 2008-10-16 11:21:51 -07:00
sysctl.c Merge branch 'linus' into tracing/ftrace 2008-10-31 00:38:21 +01:00
sysctl_check.c
taskstats.c
test_kprobes.c
time.c select: add a timespec_add_safe() function 2008-09-05 21:34:57 -07:00
timeconst.pl
timer.c Merge branches 'timers/clocksource', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus 2008-10-20 13:14:06 +02:00
tracepoint.c tracepoint: introduce *_noupdate APIs. 2008-11-03 10:28:52 +01:00
tsacct.c
uid16.c
user.c sched: rt-bandwidth for user grouping interface 2008-08-19 13:10:09 +02:00
user_namespace.c removed unused #include <linux/version.h>'s 2008-08-23 12:14:12 -07:00
utsname.c removed unused #include <linux/version.h>'s 2008-08-23 12:14:12 -07:00
utsname_sysctl.c sysctl: simplify ->strategy 2008-10-16 11:21:47 -07:00
wait.c wait: kill is_sync_wait() 2008-10-16 11:21:31 -07:00
workqueue.c workqueue: introduce create_rt_workqueue 2008-10-22 10:00:25 +11:00