Commit Graph

367 Commits

Author SHA1 Message Date
Al Viro 9796fdd829 [PATCH] gfp_t: kernel/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:49 -07:00
Roland McGrath 72ab373a56 [PATCH] Yet more posix-cpu-timer fixes
This just makes sure that a thread's expiry times can't get reset after
it clears them in do_exit.

This is what allowed us to re-introduce the stricter BUG_ON() check in
a362f463a6.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-27 09:08:43 -07:00
Linus Torvalds a362f463a6 Revert "remove false BUG_ON() from run_posix_cpu_timers()"
This reverts commit 3de463c7d9.

Roland has another patch that allows us to leave the BUG_ON() in place
by just making sure that the condition it tests for really is always
true.

That goes in next.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-27 09:07:33 -07:00
Oleg Nesterov 7a4ed937aa [PATCH] Fix cpu timers expiration time
There's a silly off-by-one error in the code that updates the expiration
of posix CPU timers, causing them to not be properly updated when they
hit exactly on their expiration time (which should be the normal case).

This causes them to then fire immediately again, and only _then_ get
properly updated.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-26 15:21:14 -07:00
Linus Torvalds 70ab81c2ed posix cpu timers: fix timer ordering
Pointed out by Oleg Nesterov, who has been walking over the code
forwards and backwards.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-26 11:23:06 -07:00
Andrew Morton bb32051532 [PATCH] export cpu_online_map
With CONFIG_SMP=n:

*** Warning: "cpu_online_map" [drivers/firmware/dcdbas.ko] undefined!

due to set_cpus_allowed().

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-26 10:39:43 -07:00
Oleg Nesterov a69ac4a78d [PATCH] posix-timers: fix posix_cpu_timer_set() vs run_posix_cpu_timers() race
This might be harmless, but looks like a race from code inspection (I
was unable to trigger it).  I must admit, I don't understand why we
can't return TIMER_RETRY after 'spin_unlock(&p->sighand->siglock)'
without doing bump_cpu_timer(), but this is what original code does.

posix_cpu_timer_set:

	read_lock(&tasklist_lock);

	spin_lock(&p->sighand->siglock);
	list_del_init(&timer->it.cpu.entry);
	spin_unlock(&p->sighand->siglock);

We are probaly deleting the timer from run_posix_cpu_timers's 'firing'
local list_head while run_posix_cpu_timers() does list_for_each_safe.

Various bad things can happen, for example we can just delete this timer
so that list_for_each() will not notice it and run_posix_cpu_timers()
will not reset '->firing' flag. In that case,

	....

	if (timer->it.cpu.firing) {
		read_unlock(&tasklist_lock);
		timer->it.cpu.firing = -1;
		return TIMER_RETRY;
	}

sys_timer_settime() goes to 'retry:', calls posix_cpu_timer_set() again,
it returns TIMER_RETRY ...

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-24 08:13:14 -07:00
Oleg Nesterov ca531a0a5e [PATCH] posix-timers: exit path cleanup
No need to rebalance when task exited

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-24 08:12:35 -07:00
Oleg Nesterov 3de463c7d9 [PATCH] posix-timers: remove false BUG_ON() from run_posix_cpu_timers()
do_exit() clears ->it_##clock##_expires, but nothing prevents
another cpu to attach the timer to exiting process after that.

After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
before do_exit() calls 'schedule() local timer interrupt can find
tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
does sys_wait4) interrupted task has ->signal == NULL.

At this moment exiting task has no pending cpu timers, they were cleaned
up in __exit_signal()->posix_cpu_timers_exit{,_group}(), so we can just
return from irq.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-24 08:12:35 -07:00
Oleg Nesterov 108150ea78 [PATCH] posix-timers: fix cleanup_timers() and run_posix_cpu_timers() races
1. cleanup_timers() sets timer->task = NULL under tasklist + ->sighand locks.
   That means that this code in posix_cpu_timer_del() and posix_cpu_timer_set()

   		lock_timer(timer);
		if (timer->task == NULL)
			return;
		read_lock(tasklist);
		put_task_struct(timer->task)

   is racy. With this patch timer->task modified and accounted only under
   timer->it_lock. Sadly, this means that dead task_struct won't be freed
   until timer deleted or armed.

2. run_posix_cpu_timers() collects expired timers into local list under
   tasklist + ->sighand again. That means that posix_cpu_timer_del()
   should check timer->it.cpu.firing under these locks too.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-24 08:12:35 -07:00
Linus Torvalds e80eda94d3 Posix timers: limit number of timers firing at once
Bursty timers aren't good for anybody, very much including latency for
other programs when we trigger lots of timers in interrupt context.  So
set a random limit, after which we'll handle the rest on the next timer
tick.

Noted by Oleg Nesterov <oleg@tv-sign.ru>

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-23 10:02:50 -07:00
Roland McGrath 25f407f0b6 [PATCH] Call exit_itimers from do_exit, not __exit_signal
When I originally moved exit_itimers into __exit_signal, that was the only
place where we could reliably know it was the last thread in the group
dying, without races.  Since then we've gotten the signal_struct.live
counter, and do_exit can reliably do group-wide cleanup work.

This patch moves the call to do_exit, where it's made without locks.  This
avoids the deadlock issues that the old __exit_signal code's comment talks
about, and the one that Oleg found recently with process CPU timers.

[ This replaces e03d13e985, which is why
  it was just reverted. ]

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-21 15:38:08 -07:00
Linus Torvalds 9465bee863 Revert "Fix cpu timers exit deadlock and races"
Revert commit e03d13e985, to be replaced
by a much nicer fix from Roland.
2005-10-21 15:36:00 -07:00
Alan Stern d1209d049b [PATCH] Threads shouldn't inherit PF_NOFREEZE
The PF_NOFREEZE process flag should not be inherited when a thread is
forked.  This patch (as585) removes the flag from the child.

This problem is starting to show up more and more as drivers turn to the
kthread API instead of using kernel_thread().  As a result, their kernel
threads are now children of the kthread worker instead of modprobe, and
they inherit the PF_NOFREEZE flag.  This can cause problems during system
suspend; the kernel threads are not getting frozen as they ought to be.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-19 23:04:31 -07:00
Roland McGrath e03d13e985 [PATCH] Fix cpu timers exit deadlock and races
Oleg Nesterov reported an SMP deadlock.  If there is a running timer
tracking a different process's CPU time clock when the process owning
the timer exits, we deadlock on tasklist_lock in posix_cpu_timer_del via
exit_itimers.

That code was using tasklist_lock to check for a race with __exit_signal
being called on the timer-target task and clearing its ->signal.
However, there is actually no such race.  __exit_signal will have called
posix_cpu_timers_exit and posix_cpu_timers_exit_group before it does
that.  Those will clear those k_itimer's association with the dying
task, so posix_cpu_timer_del will return early and never reach the code
in question.

In addition, posix_cpu_timer_del called from exit_itimers during execve
or directly from timer_delete in the process owning the timer can race
with an exiting timer-target task to cause a double put on timer-target
task struct.  Make sure we always access cpu_timers lists with sighand
lock held.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Chris Wright <chrisw@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-19 23:02:01 -07:00
Eric Dumazet 5ee832dbc6 [PATCH] rcu: keep rcu callback event counter
This makes call_rcu() keep track of how many events there are on the RCU
list, and cause a reschedule event when the list gets too long.

This helps keep RCU event lists down.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17 15:27:58 -07:00
Oleg Nesterov 47d6b08334 [PATCH] posix-timers: fix task accounting
Make sure we release the task struct properly when releasing pending
timers.

release_task() does write_lock_irq(&tasklist_lock), so it can't race
with run_posix_cpu_timers() on any cpu.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17 15:00:00 -07:00
Linus Torvalds 2cc78eb52b Increase default RCU batching sharply
Dipankar made RCU limit the batch size to improve latency, but that
approach is unworkable: it can cause the RCU queues to grow without
bounds, since the batch limiter ended up limiting the callbacks.

So make the limit much higher, and start planning on instead limiting
the batch size by doing RCU callbacks more often if the queue looks like
it might be growing too long.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17 09:10:15 -07:00
Takashi Iwai c6ecf7ed31 [PATCH] Add missing export of getnstimeofday()
Adds the missing EXPORT_SYMBOL_GPL for getnstimeofday() when
CONFIG_TIME_INTERPOLATION isn't set.  Needed by drivers/char/mmtimer.c

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-14 17:10:12 -07:00
Harald Welte 46113830a1 [PATCH] Fix signal sending in usbdevio on async URB completion
If a process issues an URB from userspace and (starts to) terminate
before the URB comes back, we run into the issue described above.  This
is because the urb saves a pointer to "current" when it is posted to the
device, but there's no guarantee that this pointer is still valid
afterwards.

In fact, there are three separate issues:

1) the pointer to "current" can become invalid, since the task could be
   completely gone when the URB completion comes back from the device.

2) Even if the saved task pointer is still pointing to a valid task_struct,
   task_struct->sighand could have gone meanwhile.

3) Even if the process is perfectly fine, permissions may have changed,
   and we can no longer send it a signal.

So what we do instead, is to save the PID and uid's of the process, and
introduce a new kill_proc_info_as_uid() function.

Signed-off-by: Harald Welte <laforge@gnumonks.org>
[ Fixed up types and added symbol exports ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-10 16:16:33 -07:00
Rafael J. Wysocki 3dd083255d [PATCH] x86_64: Set up safe page tables during resume
The following patch makes swsusp avoid the possible temporary corruption
of page translation tables during resume on x86-64.  This is achieved by
creating a copy of the relevant page tables that will not be modified by
swsusp and can be safely used by it on resume.

The problem is that during resume on x86-64 swsusp may temporarily
corrupt the page tables used for the direct mapping of RAM.  If that
happens, a page fault occurs and cannot be handled properly, which leads
to the solid hang of the affected system.  This leads to the loss of the
system's state from before suspend and may result in the loss of data or
the corruption of filesystems, so it is a serious issue.  Also, it
appears to happen quite often (for me, as often as 50% of the time).

The problem is related to the fact that (at least) one of the PMD
entries used in the direct memory mapping (starting at PAGE_OFFSET)
points to a page table the physical address of which is much greater
than the physical address of the PMD entry itself.  Moreover,
unfortunately, the physical address of the page table before suspend
(i.e.  the one stored in the suspend image) happens to be different to
the physical address of the corresponding page table used during resume
(i.e.  the one that is valid right before swsusp_arch_resume() in
arch/x86_64/kernel/suspend_asm.S is executed).  Thus while the image is
restored, the "offending" PMD entry gets overwritten, so it does not
point to the right physical address any more (i.e.  there's no page
table at the address pointed to by it, because it points to the address
the page table has been at during suspend).  Consequently, if the PMD
entry is used later on, and it _is_ used in the process of copying the
image pages, a page fault occurs, but it cannot be handled in the normal
way and the system hangs.

In principle we can call create_resume_mapping() from
swsusp_arch_resume() (ie.  from suspend_asm.S), but then the memory
allocations in create_resume_mapping(), resume_pud_mapping(), and
resume_pmd_mapping() must be made carefully so that we use _only_
NosaveFree pages in them (the other pages are overwritten by the loop in
swsusp_arch_resume()).  Additionally, we are in atomic context at that
time, so we cannot use GFP_KERNEL.  Moreover, if one of the allocations
fails, we should free all of the allocated pages, so we need to trace
them somehow.

All of this is done in the appended patch, except that the functions
populating the page tables are located in arch/x86_64/kernel/suspend.c
rather than in init.c.  It may be done in a more elegan way in the
future, with the help of some swsusp patches that are in the works now.

[AK: move some externs into headers, renamed a function]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-10 08:36:46 -07:00
Al Viro dd0fc66fb3 [PATCH] gfp flags annotations - part 1
- added typedef unsigned int __nocast gfp_t;

 - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
   the same warnings as far as sparse is concerned, doesn't change
   generated code (from gcc point of view we replaced unsigned int with
   typedef) and documents what's going on far better.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08 15:00:57 -07:00
Oleg Nesterov 788e05a67c [PATCH] fix do_coredump() vs SIGSTOP race
Let's suppose we have 2 threads in thread group:
	A - does coredump
	B - has pending SIGSTOP

thread A						thread B

do_coredump:						get_signal_to_deliver:

  lock(->sighand)
  ->signal->flags = SIGNAL_GROUP_EXIT
  unlock(->sighand)

							lock(->sighand)
							signr = dequeue_signal()
								->signal->flags |= SIGNAL_STOP_DEQUEUED
								return SIGSTOP;

							do_signal_stop:
							    unlock(->sighand)

  coredump_wait:

      zap_threads:
          lock(tasklist_lock)
          send SIGKILL to B
              // signal_wake_up() does nothing
          unlock(tasklist_lock)

							    lock(tasklist_lock)
							    lock(->sighand)
							    re-check sig->flags & SIGNAL_STOP_DEQUEUED, yes
							    set_current_state(TASK_STOPPED);
							    finish_stop:
							        schedule();
							            // ->state == TASK_STOPPED

      wait_for_completion(&startup_done)
         // waits for complete() from B,
         // ->state == TASK_UNINTERRUPTIBLE

We can't wake up 'B' in any way:

	SIGCONT will be ignored because handle_stop_signal() sees
	->signal->flags & SIGNAL_GROUP_EXIT.

	sys_kill(SIGKILL)->__group_complete_signal() will choose
	uninterruptible 'A', so it can't help.

	sys_tkill(B, SIGKILL) will be ignored by specific_send_sig_info()
	because B already has pending SIGKILL.

This scenario is not possbile if 'A' does do_group_exit(), because
it sets sig->flags = SIGNAL_GROUP_EXIT and delivers SIGKILL to
subthreads atomically, holding both tasklist_lock and sighand->lock.
That means that do_signal_stop() will notice !SIGNAL_STOP_DEQUEUED
after re-locking ->sighand. And it is not possible to any other
thread to re-add SIGNAL_STOP_DEQUEUED later, because dequeue_signal()
can only return SIGKILL.

I think it is better to change do_coredump() to do sigaddset(SIGKILL)
and signal_wake_up() under sighand->lock, but this patch is much
simpler.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08 14:53:31 -07:00
Linus Torvalds 14bf01bb05 Fix inequality comparison against "task->state"
We should always use bitmask ops, rather than depend on some ordering of
the different states.  With the TASK_NONINTERACTIVE flag, the inequality
doesn't really work.

Oleg Nesterov argues (likely correctly) that this test is unnecessary in
the first place.  However, the minimal fix for now is to at least make
it work in the presense of TASK_NONINTERACTIVE.  Waiting for consensus
from Roland & co on potential bigger cleanups.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-01 11:04:18 -07:00
Al Viro eacaa1f5aa [PATCH] cpuset crapectomy
Switched cpuset_common_file_read() to simple_read_from_buffer(), killed
a bunch of useless (and not quite correct - e.g.  min(size_t,ssize_t))
code.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30 08:42:24 -07:00
Roland McGrath 5acbc5cb50 [PATCH] Fix task state testing properly in do_signal_stop()
Any tests using < TASK_STOPPED or the like are left over from the time
when the TASK_ZOMBIE and TASK_DEAD bits were in the same word, and it
served to check for "stopped or dead".  I think this one in
do_signal_stop is the only such case.  It has been buggy ever since
exit_state was separated, and isn't testing the exit_state value.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-29 15:20:47 -07:00
Paul Jackson 5134fc15b6 [PATCH] cpuset read past eof memory leak fix
Don't leak a page of memory if user reads a cpuset file past eof.

Signed-off-by: KUROSAWA Takahiro <kurosawa@valinux.co.jp>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28 07:58:51 -07:00
Rafael J. Wysocki 0f7347c20c [PATCH] swsusp: avoid problems if there are too many pages to save
The following patch makes swsusp avoid problems during resume if there are
too many pages to save on suspend.  It adds a constant that allows us to
verify if we are going to save too many pages and implements the check
(this is done as early as we can tell that the check will trigger, which is
in swsusp_alloc()).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28 07:46:41 -07:00
Rusty Russell f36462f078 [PATCH] Ignore trailing whitespace on kernel parameters correctly
Dave Jones says:

... if the modprobe.conf has trailing whitespace, modules fail to load
with the following helpful message..

	snd_intel8x0: Unknown parameter `'

Previous version truncated last argument.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28 07:46:41 -07:00
Rafael J. Wysocki f2d613799a [PATCH] swsusp: prevent possible memory leak
Prevent swsusp from leaking some memory in case of an error in
read_pagedir().  It also prevents the BUG_ON() from triggering if there's
an error while reading swap.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28 07:46:40 -07:00
Rafael J. Wysocki 254b54771c [PATCH] swsusp: remove wrong code from data_free
The following patch removes some wrong code from the data_free() function
in swsusp.

This function could only be called if there's an error while writing the
suspend image to swap, so it is not triggered easily.  However, if
triggered, it would probably corrupt some memory.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28 07:46:40 -07:00
Linus Torvalds 188a1eafa0 Make sure SIGKILL gets proper respect
Bhavesh P. Davda <bhavesh@avaya.com> noticed that SIGKILL wouldn't
properly kill a process under just the right cicumstances: a stopped
task that already had another signal queued would get the SIGKILL
queued onto the shared queue, and there it would remain until SIGCONT.

This simplifies the signal acceptance logic, and fixes the bug in the
process.

Losely based on an earlier patch by Bhavesh.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-23 13:22:21 -07:00
Pavel Machek 8686bcd0a5 [PATCH] swsusp: fix comments
Fix comments in swsusp.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22 22:17:36 -07:00
Rafael J. Wysocki 57487f4376 [PATCH] swsusp: do not trigger BUG_ON() if there is not enough memory
The following patch makes swsusp avoid triggering the BUG_ON() in
swsusp_suspend() if there is not enough memory for suspend.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22 22:17:35 -07:00
Randy Dunlap 720b9429e8 [PATCH] SOFTWARE_SUSPEND needs HOTPLUG_CPU on SMP
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22 22:17:34 -07:00
Eric W. Biederman 88d10bbaae [PATCH] suspend: cleanup calling of power off methods.
In the lead up to 2.6.13 I fixed a large number of reboot problems by
making the calling conventions consistent.  Despite checking and double
checking my work it appears I missed an obvious one.

The S4 suspend code for PM_DISK_PLATFORM was also calling device_shutdown
without setting system_state, and was not calling the appropriate
reboot_notifier.

This patch fixes the bug by replacing the call of device_suspend with
kernel_poweroff_prepare.

Various forms of this failure have been fixed and tracked for a while.

Thanks for tracking this down go to: Alexey Starikovskiy, Meelis Roos
<mroos@linux.ee>, Nigel Cunningham <ncunningham@cyclades.com>, Pierre
Ossman <drzeus-list@drzeus.cx>

History of this bug is at:
http://bugme.osdl.org/show_bug.cgi?id=4320

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22 22:17:33 -07:00
Eric W. Biederman e4c94330e3 [PATCH] reboot: comment and factor the main reboot functions
In the lead up to 2.6.13 I fixed a large number of reboot problems by
making the calling conventions consistent.  Despite checking and double
checking my work it appears I missed an obvious one.

This first patch simply refactors the reboot routines so all of the
preparation for various kinds of reboots are in their own functions.
Making it very hard to get the various kinds of reboot out of sync.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22 22:17:33 -07:00
Andrew Morton 31f6d9d628 [PATCH] Add printk_clock()
ia64's sched_clock() accesses per-cpu data which isn't set up at boot time.
Hence ia64 cannot use printk timestamping, because printk() will crash in
sched_clock().

So make printk() use printk_clock(), defaulting to sched_clock(), overrideable
by the architecture via attribute(weak).

Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-21 10:11:54 -07:00
Dipankar Sarma 4fb3a53860 [PATCH] files: fix preemption issues
With the new fdtable locking rules, you have to protect fdtable with either
->file_lock or rcu_read_lock/unlock().  There are some places where we
aren't doing either.  This patch fixes those places.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-17 11:50:02 -07:00
Michael Kerrisk 2030c0fd3d [PATCH] PR_GET_DUMPABLE returns incorrect info
2.6.13 incorporated Alan Cox's patch for /proc/sys/fs/suid_dumpable (one
version of this patch can be found here
http://marc.theaimsgroup.com/?l=linux-kernel&m=109647550421014&w=2 ).

This patch also made corresponding changes in kernel/sys.c to change the
prctl() PR_SET_DUMPABLE operation so that the permitted range of 'arg2' was
modified from 0..1 to 0..2.

However, a corresponding change was not made for PR_GET_DUMPABLE: if the
dumpable flag is non-zero, then PR_GET_DUMPABLE always returns 1, so that
the caller can't determine the true setting of this flag.

Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-17 11:50:01 -07:00
Srivatsa Vaddagiri 26ff6ad978 [PATCH] CPU hotplug breaks wake_up_new_task
Fix a problem wherein a new-born task is added to a dead CPU.

Signed-off-by: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Acked-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-17 11:49:59 -07:00
Ingo Molnar da04c03503 [PATCH] Fix spinlock owner debugging
fix up the runqueue lock owner only if we truly did a context-switch
with the runqueue lock held. Impacts ia64, mips, sparc64 and arm.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 09:59:04 -07:00
Linus Torvalds 5d54e69c68 Merge master.kernel.org:/pub/scm/linux/kernel/git/dwmw2/audit-2.6 2005-09-13 09:47:30 -07:00
Randy Dunlap 9f1583339a [PATCH] use add_taint() for setting tainted bit flags
Use the add_taint() interface for setting tainted bit flags instead of
doing it manually.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 08:22:29 -07:00
Andrew Morton 8a1c17574a [PATCH] schedule_timeout_[un]interruptible() speedup
These functions don't need schedule_timeout()'s barrier.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 08:22:29 -07:00
Andi Kleen 3f74478b5f [PATCH] x86-64: Some cleanup and optimization to the processor data area.
- Remove unused irqrsp field
- Remove pda->me
- Optimize set_softirq_pending slightly

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:58 -07:00
Paul Jackson b3426599af [PATCH] cpuset semaphore depth check optimize
Optimize the deadlock avoidance check on the global cpuset
semaphore cpuset_sem.  Instead of adding a depth counter to the
task struct of each task, rather just two words are enough, one
to store the depth and the other the current cpuset_sem holder.

Thanks to Nikita Danilov for the idea.

Signed-off-by: Paul Jackson <pj@sgi.com>

[ We may want to change this further, but at least it's now
  a totally internal decision to the cpusets code ]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 09:16:27 -07:00
Linus Torvalds 1df5c10a5b Mark ia64-specific MCA/INIT scheduler hooks as dangerous
..and only enable them for ia64. The functions are only valid
when the whole system has been totally stopped and no scheduler
activity is ongoing on any CPU, and interrupts are globally
disabled.

In other words, they aren't useful for anything else. So make
sure that nobody can use them by mistake.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 07:59:21 -07:00
Keith Owens a2a979821b [PATCH] MCA/INIT: scheduler hooks
Scheduler hooks to see/change which process is deemed to be on a cpu.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-11 14:01:30 -07:00
Nishanth Aravamudan 75bcc8c5e1 [PATCH] kernel: fix-up schedule_timeout() usage
Use schedule_timeout_{,un}interruptible() instead of
set_current_state()/schedule_timeout() to reduce kernel size.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:37 -07:00