Commit Graph

181 Commits

Author SHA1 Message Date
David S. Miller cab3f16feb [SPARC64]: Fix >8K I/O mappings.
Increment the PFN field of the PTE so that the tests
on vm_pfn in mm/memory.c match up.  The TLB ignores these
lower bits for larger page sizes, so it's OK to set things
like this.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 13:59:03 -08:00
David S. Miller 5cd9194a1b [PATCH] sparc: convert IO remapping to VM_PFNMAP
Here are the Sparc bits.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-28 14:35:36 -08:00
Hugh Dickins f3d48f0373 [PATCH] unpaged: fix sound Bad page states
Earlier I unifdefed PageCompound, so that snd_pcm_mmap_control_nopage and
others can give out a 0-order component of a higher-order page, which won't
be mistakenly freed when zap_pte_range unmaps it.  But many Bad page states
reported a PG_reserved was freed after all: I had missed that we need to
say __GFP_COMP to get compound page behaviour.

Some of these higher-order pages are allocated by snd_malloc_pages, some by
snd_malloc_dev_pages; or if SBUS, by sbus_alloc_consistent - but that has
no gfp arg, so add __GFP_COMP into its sparc32/64 implementations.

I'm still rather puzzled that DRM seems not to need a similar change.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-22 09:13:43 -08:00
Hugh Dickins 0b14c179a4 [PATCH] unpaged: VM_UNPAGED
Although we tend to associate VM_RESERVED with remap_pfn_range, quite a few
drivers set VM_RESERVED on areas which are then populated by nopage.  The
PageReserved removal in 2.6.15-rc1 changed VM_RESERVED not to free pages in
zap_pte_range, without changing those drivers not to set it: so their pages
just leak away.

Let's not change miscellaneous drivers now: introduce VM_UNPAGED at the core,
to flag the special areas where the ptes may have no struct page, or if they
have then it's not to be touched.  Replace most instances of VM_RESERVED in
core mm by VM_UNPAGED.  Force it on in remap_pfn_range, and the sparc and
sparc64 io_remap_pfn_range.

Revert addition of VM_RESERVED to powerpc vdso, it's not needed there.  Is it
needed anywhere?  It still governs the mm->reserved_vm statistic, and special
vmas not to be merged, and areas not to be core dumped; but could probably be
eliminated later (the drivers are probably specifying it because in 2.4 it
kept swapout off the vma, but in 2.6 we work from the LRU, which these pages
don't get on).

Use the VM_SHM slot for VM_UNPAGED, and define VM_SHM to 0: it serves no
purpose whatsoever, and should be removed from drivers when we clean up.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-22 09:13:42 -08:00
Christoph Hellwig 9ffb83bcc5 [SBUSFB]: implement ->compat_ioctl
This patch adds a new function, sbusfb_compat_ioctl() to
drivers/video/sbuslib.c and uses it as compat_ioctl in all sbus fb
drivers

This remove the last per-arch compat ioctl bits in
arch/sparc64/kernel/ioctl32.c so it would be nice if people could test
if this actually copiles and works and if yes apply it :)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-12 12:11:12 -08:00
David S. Miller 4d45cbacb8 [SPARC64]: Restore 2.4.x /proc/cpuinfo behavior for "ncpus probed" field.
Noticed by Tom 'spot' Callaway.

Even on uniprocessor we always reported the number of physical
cpus in the system via /proc/cpuinfo.  But when this got changed
to use num_possible_cpus() it always reads as "1" on uniprocessor.
This change was unintentional.

So scan the firmware device tree and count the number of cpu
nodes, and report that, as we always did.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11 12:48:56 -08:00
Christoph Hellwig 8ca2bdc7a9 [SPARC] sbus rtc: implement ->compat_ioctl
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 12:07:18 -08:00
Tobias Klauser 84c1a13a30 [SPARC64]: Use ARRAY_SIZE macro
Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a
duplicate of ARRAY_SIZE which is never used anyways.

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 12:03:42 -08:00
Nick Piggin 64c7c8f885 [PATCH] sched: resched and cpu_idle rework
Make some changes to the NEED_RESCHED and POLLING_NRFLAG to reduce
confusion, and make their semantics rigid.  Improves efficiency of
resched_task and some cpu_idle routines.

* In resched_task:
- TIF_NEED_RESCHED is only cleared with the task's runqueue lock held,
  and as we hold it during resched_task, then there is no need for an
  atomic test and set there. The only other time this should be set is
  when the task's quantum expires, in the timer interrupt - this is
  protected against because the rq lock is irq-safe.

- If TIF_NEED_RESCHED is set, then we don't need to do anything. It
  won't get unset until the task get's schedule()d off.

- If we are running on the same CPU as the task we resched, then set
  TIF_NEED_RESCHED and no further action is required.

- If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set
  after TIF_NEED_RESCHED has been set, then we need to send an IPI.

Using these rules, we are able to remove the test and set operation in
resched_task, and make clear the previously vague semantics of
POLLING_NRFLAG.

* In idle routines:
- Enter cpu_idle with preempt disabled. When the need_resched() condition
  becomes true, explicitly call schedule(). This makes things a bit clearer
  (IMO), but haven't updated all architectures yet.

- Many do a test and clear of TIF_NEED_RESCHED for some reason. According
  to the resched_task rules, this isn't needed (and actually breaks the
  assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock
  held). So remove that. Generally one less locked memory op when switching
  to the idle thread.

- Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the inner
  most polling idle loops. The above resched_task semantics allow it to be
  set until before the last time need_resched() is checked before going into
  a halt requiring interrupt wakeup.

  Many idle routines simply never enter such a halt, and so POLLING_NRFLAG
  can be always left set, completely eliminating resched IPIs when rescheduling
  the idle task.

  POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:33 -08:00
Nick Piggin 5bfb5d690f [PATCH] sched: disable preempt in idle tasks
Run idle threads with preempt disabled.

Also corrected a bugs in arm26's cpu_idle (make it actually call schedule()).
How did it ever work before?

Might fix the CPU hotplugging hang which Nigel Cunningham noted.

We think the bug hits if the idle thread is preempted after checking
need_resched() and before going to sleep, then the CPU offlined.

After calling stop_machine_run, the CPU eventually returns from preemption and
into the idle thread and goes to sleep.  The CPU will continue executing
previous idle and have no chance to call play_dead.

By disabling preemption until we are ready to explicitly schedule, this bug is
fixed and the idle threads generally become more robust.

From: alexs <ashepard@u.washington.edu>

  PPC build fix

From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>

  MIPS build fix

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:33 -08:00
Christoph Hellwig 7e4c54a2a4 [PATCH] remove ioctl32_handler_t
Some architectures define and use this type in their compat_ioctl code, but
all of them can easily use the identical ioctl_trans_handler_t type that is
defined in common code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:00 -08:00
Hugh Dickins da1605465e [SPARC64] mm: update get_user_insn comment
Update comment on get_user_insn to the more general "pte lock", which may
or may not be the page_table_lock.  Note vmtruncate handled like kswapd.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-08 10:00:55 -08:00
David S. Miller dd3e2dcf34 [SPARC64]: Kill some unnecessary includes from ioctl32.c
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:13:46 -08:00
Christoph Hellwig f48497e383 [SPARC64]: remove drm compat ioctl handling
drivers/drm/ now implements proper ->compat_ioctl methods, so this isn't
needed anymore.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:13:27 -08:00
Christoph Hellwig b66621fef3 [SPARC] cpwatchdog: implement ->compat_ioctl
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:13:14 -08:00
Christoph Hellwig 1d5d00bd9c [SPARC] display7seg: implement ->unlocked_ioctl and ->compat_ioctl
all ioctls are 32bit compat clean, so the driver can use ->compat_ioctl
and ->unlocked_ioctl easily.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:13:01 -08:00
Christoph Hellwig b31023fc24 [SPARC] openprom: implement ->compat_ioctl
implement a compat_ioctl handle in the driver instead of having table
entries in sparc64 ioctl32.c (I plan to get rid of the arch ioctl32.c
file eventually)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:12:47 -08:00
Christoph Hellwig 1928f8e541 [SPARC] envctrl: implement ->unlocked_ioctl and ->compat_ioctl
all the ioctls in the driver are 32bit compat clean and don't need BKL,
so we can switch it to ->unlocked_ioctl and ->compat_ioctl trivially.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:12:34 -08:00
Christoph Hellwig 16cf0d8165 [SPARC]: Kill remaining kbio.h references.
Would you mind applying the following patch that kills those two + the
m68k and Documentation/ references?

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:12:21 -08:00
Christoph Hellwig 261b033afc [SPARC64]: remove duplicated compat ioctl entries
all these are handled by fs/compat_ioctls.c already.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:11:49 -08:00
Christoph Hellwig 59f85dc95e [SPARC]: remove vuid_event.h
I don't know if we ever implemented this, but the only user in any 2.6
tree are the compat ioctls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:11:38 -08:00
Christoph Hellwig e1413315b8 [SPARC]: remove kbio.h
The old keyboard driver is gone in 2.6, so the only user left are the
compat ioctls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:11:25 -08:00
Christoph Hellwig 9d3c7d1bfd [SPARC]: remove audioio.h
The old sound drivers are gone in 2.6, so the only user left are the
compat ioctls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:11:14 -08:00
Christoph Hellwig e0436b3164 [SPARC64]: remove alloc_user_space()
this inline routine in arch/sparc64/kernel/ioctl32.c is completely
unused and superceeded by compat_alloc_user_space()

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:11:02 -08:00
David S. Miller fc3214952f [SPARC64]: Kill off dummy_tick_ops.
It only serves to generate false-positive buildcheck warnings.
Just set it initially to tick_operations which uses the v9
%tick register which every sparc64 processor has.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:10:10 -08:00
David S. Miller 62dbec78be [SPARC64] mm: Do not flush TLB mm in tlb_finish_mmu()
It isn't needed any longer, as noted by Hugh Dickins.

We still need the flush routines, due to the one remaining
call site in hugetlb_prefault_arch_hook().  That can be
eliminated at some later point, however.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:09:58 -08:00
Hugh Dickins dedeb0029b [SPARC64] mm: context switch ptlock
sparc64 is unique among architectures in taking the page_table_lock in
its context switch (well, cris does too, but erroneously, and it's not
yet SMP anyway).

This seems to be a private affair between switch_mm and activate_mm,
using page_table_lock as a per-mm lock, without any relation to its uses
elsewhere.  That's fine, but comment it as such; and unlock sooner in
switch_mm, more like in activate_mm (preemption is disabled here).

There is a block of "if (0)"ed code in smp_flush_tlb_pending which would
have liked to rely on the page_table_lock, in switch_mm and elsewhere;
but its comment explains how dup_mmap's flush_tlb_mm defeated it.  And
though that could have been changed at any time over the past few years,
now the chance vanishes as we push the page_table_lock downwards, and
perhaps split it per page table page.  Just delete that block of code.

Which leaves the mysterious spin_unlock_wait(&oldmm->page_table_lock)
in kernel/fork.c copy_mm.  Textual analysis (supported by Nick Piggin)
suggests that the comment was written by DaveM, and that it relates to
the defeated approach in the sparc64 smp_flush_tlb_pending.  Just delete
this block too.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:09:01 -08:00
Hugh Dickins b8ae48656d [SPARC64] mm: don't re-evaluate *ptep
sparc64 prom_callback and new_setup_frame32 each operates on a user page
table without holding lock, and no doubt they've good reason.  But I'd
feel more confident if they were to do a "pte = *ptep" and then operate
on pte, rather than re-evaluating *ptep.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:08:46 -08:00
Jesper Juhl b2325fe1b7 [PATCH] kfree cleanup: arch
This is the arch/ part of the big kfree cleanup patch.

Remove pointless checks for NULL prior to calling kfree() in arch/.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:54:06 -08:00
Ananth N Mavinakayanahalli d217d5450f [PATCH] Kprobes: preempt_disable/enable() simplification
Reorganize the preempt_disable/enable calls to eliminate the extra preempt
depth.  Changes based on Paul McKenney's review suggestions for the kprobes
RCU changeset.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli 991a51d83a [PATCH] Kprobes: Use RCU for (un)register synchronization - arch changes
Changes to the arch kprobes infrastructure to take advantage of the locking
changes introduced by usage of RCU for synchronization.  All handlers are now
run without any locks held, so they have to be re-entrant or provide their own
synchronization.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli f215d985e9 [PATCH] Kprobes: Track kprobe on a per_cpu basis - sparc64 changes
Sparc64 changes to track kprobe execution on a per-cpu basis.  We now track
the kprobe state machine independently on each cpu using an arch specific
kprobe control block.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli 66ff2d0691 [PATCH] Kprobes: rearrange preempt_disable/enable() calls
The following set of patches are aimed at improving kprobes scalability.  We
currently serialize kprobe registration, unregistration and handler execution
using a single spinlock - kprobe_lock.

With these changes, kprobe handlers can run without any locks held.  It also
allows for simultaneous kprobe handler executions on different processors as
we now track kprobe execution on a per processor basis.  It is now necessary
that the handlers be re-entrant since handlers can run concurrently on
multiple processors.

All changes have been tested on i386, ia64, ppc64 and x86_64, while sparc64
has been compile tested only.

The patches can be viewed as 3 logical chunks:

patch 1: 	Reorder preempt_(dis/en)able calls
patches 2-7: 	Introduce per_cpu data areas to track kprobe execution
patches 8-9: 	Use RCU to synchronize kprobe (un)registration and handler
		execution.

Thanks to Maneesh Soni, James Keniston and Anil Keshavamurthy for their
review and suggestions. Thanks again to Anil, Hien Nguyen and Kevin Stafford
for testing the patches.

This patch:

Reorder preempt_disable/enable() calls in arch kprobes files in preparation to
introduce locking changes.  No functional changes introduced by this patch.

Signed-off-by: Ananth N Mavinakayahanalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:45 -08:00
Prasanna S Panchamukhi cd6b0762a0 [PATCH] Move Kprobes and Oprofile to "Instrumentation Support" menu
Andrew Morton suggested to move kprobes from kernel hacking menu, since
kernel hacking menu is in-appropriate for the Kprobes.  This patch moves
Kprobes and Oprofile under instrumentation menu.

(akpm: it's not a natural fit, but things like djprobes and the s390 guys'
statistics library need a home)

Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Philippe Elie <phil.el@wanadoo.fr>
Cc: John Levon <levon@movementarian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:35 -08:00
Thomas Gleixner ecea8d19c9 [PATCH] jiffies_64 cleanup
Define jiffies_64 in kernel/timer.c rather than having 24 duplicated
defines in each architecture.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:25 -08:00
Christoph Hellwig 9c0cbd54ce [PATCH] TIOC* compat ioctl handling
TIOCSTART and TIOCSTOP are defined in asm/ioctls.h and asm/termios.h by
various architectures but not actually implemented anywhere but in the IRIX
compatibility layer, so remove their COMPATIBLE_IOCTL from parisc, ppc64
and sparc64.

Move the TIOCSLTC COMPATIBLE_IOCTL to common code, guided by an ifdef to
only show up on architectures that support it (same as the code handling it
in tty_ioctl.c), aswell as it's brother TIOCGLTC that wasn't handled so
far.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
Hugh Dickins b462705ac6 [PATCH] mm: arches skip ptlock
Convert those few architectures which are calling pud_alloc, pmd_alloc,
pte_alloc_map on a user mm, not to take the page_table_lock first, nor drop it
after.  Each of these can continue to use pte_alloc_map, no need to change
over to pte_alloc_map_lock, they're neither racy nor swappable.

In the sparc64 io_remap_pfn_range, flush_tlb_range then falls outside of the
page_table_lock: that's okay, on sparc64 it's like flush_tlb_mm, and that has
always been called from outside of page_table_lock in dup_mmap.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
Nick Piggin b5810039a5 [PATCH] core remove PageReserved
Remove PageReserved() calls from core code by tightening VM_RESERVED
handling in mm/ to cover PageReserved functionality.

PageReserved special casing is removed from get_page and put_page.

All setting and clearing of PageReserved is retained, and it is now flagged
in the page_alloc checks to help ensure we don't introduce any refcount
based freeing of Reserved pages.

MAP_PRIVATE, PROT_WRITE of VM_RESERVED regions is tentatively being
deprecated.  We never completely handled it correctly anyway, and is be
reintroduced in future if required (Hugh has a proof of concept).

Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
be trivially removed.

Last real user of PageReserved is swsusp, which uses PageReserved to
determine whether a struct page points to valid memory or not.  This still
needs to be addressed (a generic page_is_ram() should work).

A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
thus mapcounted and count towards shared rss).  These writes to the struct
page could cause excessive cacheline bouncing on big systems.  There are a
number of ways this could be addressed if it is an issue.

Signed-off-by: Nick Piggin <npiggin@suse.de>

Refcount bug fix for filemap_xip.c

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Hugh Dickins 404351e67a [PATCH] mm: mm_init set_mm_counters
How is anon_rss initialized?  In dup_mmap, and by mm_alloc's memset; but
that's not so good if an mm_counter_t is a special type.  And how is rss
initialized?  By set_mm_counter, all over the place.  Come on, we just need to
initialize them both at once by set_mm_counter in mm_init (which follows the
memcpy when forking).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:38 -07:00
Hugh Dickins fc2acab31b [PATCH] mm: tlb_finish_mmu forget rss
zap_pte_range has been counting the pages it frees in tlb->freed, then
tlb_finish_mmu has used that to update the mm's rss.  That got stranger when I
added anon_rss, yet updated it by a different route; and stranger when rss and
anon_rss became mm_counters with special access macros.  And it would no
longer be viable if we're relying on page_table_lock to stabilize the
mm_counter, but calling tlb_finish_mmu outside that lock.

Remove the mmu_gather's freed field, let tlb_finish_mmu stick to its own
business, just decrement the rss mm_counter in zap_pte_range (yes, there was
some point to batching the update, and a subsequent patch restores that).  And
forget the anal paranoia of first reading the counter to avoid going negative
- if rss does go negative, just fix that bug.

Remove the mmu_gather's flushes and avoided_flushes from arm and arm26: no use
was being made of them.  But arm26 alone was actually using the freed, in the
way some others use need_flush: give it a need_flush.  arm26 seems to prefer
spaces to tabs here: respect that.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:37 -07:00
Hugh Dickins 4d6ddfa924 [PATCH] mm: tlb_is_full_mm was obscure
tlb_is_full_mm?  What does that mean?  The TLB is full?  No, it means that the
mm's last user has gone and the whole mm is being torn down.  And it's an
inline function because sparc64 uses a different (slightly better)
"tlb_frozen" name for the flag others call "fullmm".

And now the ptep_get_and_clear_full macro used in zap_pte_range refers
directly to tlb->fullmm, which would be wrong for sparc64.  Rather than
correct that, I'd prefer to scrap tlb_is_full_mm altogether, and change
sparc64 to just use the same poor name as everyone else - is that okay?

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:37 -07:00
Al Viro 53f9fc93f9 [PATCH] gfp_t: remaining bits of arch/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:51 -07:00
David S. Miller b4d1b82578 [SPARC64]: Fix powering off on SMP.
Doing a "SUNW,stop-self" firmware call on the other cpus is not the
correct thing to do when dropping into the firmware for a halt,
reboot, or power-off.

For now, just do nothing to quiet the other cpus, as the system should
be quiescent enough.  Later we may decide to implement smp_send_stop()
like the other SMP platforms do.

Based upon a report from Christopher Zimmermann.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-14 15:26:08 -07:00
David S. Miller 688cb30bdc [SPARC64]: Eliminate PCI IOMMU dma mapping size limit.
The hairy fast allocator in the sparc64 PCI IOMMU code
has a hard limit of 256 pages.  Certain devices can
exceed this when performing very large I/Os.

So replace with a more simple allocator, based largely
upon the arch/ppc64/kernel/iommu.c code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-13 22:15:24 -07:00
David S. Miller 51e8513615 [SPARC64]: Consolidate common PCI IOMMU init code.
All the PCI controller drivers were doing the same thing
setting up the IOMMU software state, put it all in one spot.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-13 21:10:08 -07:00
David S. Miller c9c1083074 [SPARC64]: Fix boot failures on SunBlade-150
The sequence to move over to the Linux trap tables from
the firmware ones needs to be more air tight.  It turns
out that to be %100 safe we do need to be able to translate
OBP mappings in our TLB miss handlers early.

In order not to eat up a lot of kernel image memory with
static page tables, just use the translations array in
the OBP TLB miss handlers.  That solves the bulk of the
problem.

Furthermore, to make sure the OBP TLB miss path will work
even before the fixed MMU globals are loaded, explicitly
load %g1 to TLB_SFSR at the beginning of the i-TLB and
d-TLB miss handlers.

To ease the OBP TLB miss walking of the prom_trans[] array,
we sort it then delete all of the non-OBP entries in there
(for example, there are entries for the kernel image itself
which we're not interested in at all).

We also save about 32K of kernel image size with this change.
Not a bad side effect :-)

There are still some reasons why trampoline.S can't use the
setup_trap_table() yet.  The most noteworthy are:

1) OBP boots secondary processors with non-bias'd stack for
   some reason.  This is easily fixed by using a small bootup
   stack in the kernel image explicitly for this purpose.

2) Doing a firmware call via the normal C call prom_set_trap_table()
   goes through the whole OBP enter/exit sequence that saves and
   restores OBP and Linux kernel state in the MMUs.  This path
   unfortunately does a "flush %g6" while loading up the OBP locked
   TLB entries for the firmware call.

   If we setup the %g6 in the trampoline.S code properly, that
   is in the PAGE_OFFSET linear mapping, but we're not on the
   kernel trap table yet so those addresses won't translate properly.

   One idea is to do a by-hand firmware call like we do in the
   early bootup code and elsewhere here in trampoline.S  But this
   fails as well, as aparently the secondary processors are not
   booted with OBP's special locked TLB entries loaded.  These
   are necessary for the firwmare to processes TLB misses correctly
   up until the point where we take over the trap table.

This does need to be resolved at some point.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-12 12:22:46 -07:00
David S. Miller b1b510aa28 [SPARC64]: Fix net booting on Ultra5
We were not doing alignment properly when remapping the kernel image.

What we want is a 4MB aligned physical address to map at KERNBASE.
Mistakedly we were 4MB aligning the virtual address where the kernel
initially sits, that's wrong.

Instead, we should PAGE align the virtual address, then 4MB align the
physical address result the prom gives to us.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-11 15:45:16 -07:00
David S. Miller 5d8e1b181c [SPARC64]: Fix Ultra5, Ultra60, et al. boot failures.
On the boot processor, we need to do the move onto the Linux trap
table a little bit differently else we'll take unhandlable faults in
the firmware address space.

Previously we would do the following:

1) Disable PSTATE_IE in %pstate.
2) Set %tba by hand to sparc64_ttable_tl0
3) Initialize alternate, mmu, and interrupt global
   trap registers.
4) Call prom_set_traptable()

That doesn't work very well actually with the way we boot the kernel
VM these days.  It worked by luck on many systems because the firmware
accesses for the prom_set_traptable() call happened to be loaded into
the TLB already, something we cannot assume.

So the new scheme is this:

1) Clear PSTATE_IE in %pstate and set %pil to 15
2) Call prom_set_traptable()
3) Initialize alternate, mmu, and interrupt global
   trap registers.

and this works quite well.  This sequence has been moved into a
callable function in assembler named setup-trap_table().  The idea is
that eventually trampoline.S can use this code as well.  That isn't
possible currently due to some complications, but eventually we should
be able to do it.

Thanks to Meelis Roos for the Ultra5 boot failure report.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 16:12:13 -07:00
Sven Hartge 2e457ef667 [SPARC64]: Fix compile error in irq.c
irq.c is missing the inclusion of asm/io.h, which causes
readb() and writeb() the be undefined.

Signed-off-by: Sven Hartge <hartge@ds9.argh.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-08 21:12:04 -07:00
David S. Miller ba6399334d [SPARC64]: Fix userland FPU state corruption.
We need to use stricter memory barriers around the block
load and store instructions we use to save and restore the
FPU register file.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-07 13:30:49 -07:00