442 Commits

Author SHA1 Message Date
David S. Miller a2bd4fd179 [SPARC64]: Add of_device layer and make ebus/isa use it.
Sparcspkr and power drivers are converted, to make sure it works.
Eventually the SBUS device layer will use this as a sub-class.

I really cannot cut loose on that bit until sparc32 is given the
same infrastructure.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:43 -07:00
David S. Miller 8cd24ed4f8 [SPARC64]: Expand of_*() interfaces some more.
Import some more stuff from powerpc.

Add of_device_is_compatible(), and of_find_compatible_node().
Export some more of the other routines to modules.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:41 -07:00
David S. Miller 92c4e22593 [SPARC64]: Kill unused local vars in map_prom_timers().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:38 -07:00
David S. Miller 25c7581bcd [SPARC64]: Kill off some more prom_getproperty() remnants.
The remaining ones occur before we have imported the
device tree.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:36 -07:00
David S. Miller 44bdef5e8f [SPARC64]: Convert Cheetah memory controller driver to in-kernel PROM tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:34 -07:00
David S. Miller cecc4e9222 [SPARC64]: Convert central bus layer to in-kernel PROM device tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:32 -07:00
David S. Miller 9c10a58ed6 [SPARC64]: Kill ebus/isa range and interrupt mapping struct members.
Unused outside of initial bus probe scan.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:30 -07:00
David S. Miller 690c8fd31f [SPARC64]: Use in-kernel PROM tree for EBUS and ISA.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:28 -07:00
David S. Miller de8d28b16f [SPARC64]: Convert sparc64 PCI layer to in-kernel device tree.
One thing this change pointed out was that we really should
pull the "get 'local-mac-address' property" logic into a helper
function all the network drivers can call.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:26 -07:00
David S. Miller 765b5f3273 [SPARC64]: Must run smp_setup_cpu_possible_map() after paging_init()
Otherwise the in-kernel PROM device tree isn't built yet,
and therefore the present cpu bits don't get set properly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:23 -07:00
David S. Miller c2a5a46be4 [SPARC64]: Fix for Niagara memory corruption.
On some sun4v systems, after netboot the ethernet controller and its
DMA mappings can be left active.  The net result is that the kernel
can end up using memory the ethernet controller will continue to DMA
into, resulting in corruption.

To deal with this, we are more careful about importing IOMMU
translations which OBP has left in the IO-TLB.  If the mapping maps
into an area the firmware claimed was free and available memory for
the kernel to use, we demap instead of import that IOMMU entry.

This is going to cause the network chip to take a PCI master abort on
the next DMA it attempts, if it has been left going like this.  All
tests show that this is handled properly by the PCI layer and the e1000
drivers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:21 -07:00
David S. Miller 07f8e5f358 [SPARC64]: Convert cpu_find_by_*() interface to in-kernel PROM device tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:17 -07:00
David S. Miller 6d307724cb [SPARC64]: Add of_getintprop_default().
This encodes a common idiomatic coding pattern used when
dealing with integer properties.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:15 -07:00
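
The idiom this commit names, "read an integer property, fall back to a default when it is missing", can be sketched in userspace C. This is a minimal sketch under stated assumptions: `sketch_node`, `sketch_prop`, and `sketch_getintprop_default` are hypothetical stand-ins, and the kernel helper's real signature and node layout may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a device-tree node and its integer properties. */
struct sketch_prop {
    const char *name;
    int value;
};

struct sketch_node {
    const struct sketch_prop *props;
    size_t nprops;
};

/* Return the named integer property, or def when the node does not have it. */
static int sketch_getintprop_default(const struct sketch_node *np,
                                     const char *name, int def)
{
    assert(np != NULL && name != NULL);
    for (size_t i = 0; i < np->nprops; i++) {
        if (strcmp(np->props[i].name, name) == 0)
            return np->props[i].value;
    }
    return def;
}
```

A caller would pass a sentinel such as -1 as the default and test for it, which replaces the open-coded "look up, check for NULL, dereference" sequence at every call site.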
David S. Miller 6760d28bc6 [SPARC64]: Convert sun4v virtual-device layer to in-kernel PROM device tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:13 -07:00
David S. Miller 27cc64c7cc [SPARC64]: Rate limited kernel unaligned trap logging, ala IA64.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:11 -07:00
David S. Miller 20edac8ad4 [SPARC64]: Disable verbose PCI IRQ probing messages by default.
Allow them to be enabled with "pci=irq_verbose" on the
boot command line.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:09 -07:00
David S. Miller e87dc35020 [SPARC64]: Use in-kernel OBP device tree for PCI controller probing.
It can be pushed even further down, but this is a first step.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:07 -07:00
David S. Miller aaf7cec276 [SPARC64]: Add of_find_node_by_{name,type}().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:04 -07:00
David S. Miller 372b07bb5a [SPARC64]: Import OBP device tree into kernel data structures.
The basic framework is based on the PowerPC OF code.

This code even tries to get the device addressing components
correct in the full path names.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:02 -07:00
David S. Miller 8fae097deb [SBUS]: Start cleaning up generic sbus support layer.
In particular, move the IRQ probing out to sparc32/sparc64
arch specific code where it belongs.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:00 -07:00
David S. Miller c8bfcd95de [SPARC64]: Don't double-export synchronize_irq.
It is done by the generic IRQ layer now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:23:56 -07:00
David S. Miller e18e2a00ef [SPARC64]: Move over to GENERIC_HARDIRQS.
This is the long overdue conversion of sparc64 over to
the generic IRQ layer.

The kernel image is slightly larger, but the BSS is ~60K
smaller due to the reduced size of struct ino_bucket.

A lot of IRQ implementation details, including ino_bucket,
were moved out of asm-sparc64/irq.h and are now private to
arch/sparc64/kernel/irq.c, and most of the code in irq.c
totally disappeared.

One thing that's different at the moment is IRQ distribution:
we do it at enable_irq() time.  If the cpu mask is ALL then
we round-robin using a global rotating cpu counter, else
we pick the first cpu in the mask to support single cpu
targeting.  This is similar to what powerpc's XICS IRQ
support code does.

This works fine on my UP SB1000, and the SMP build goes
fine and runs on that machine, but lots of testing on
different setups is needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:23:32 -07:00
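
The IRQ distribution policy described in this commit (round-robin when the mask covers every cpu, otherwise the first cpu in the mask) might look roughly like the sketch below. It is a userspace illustration, not the code in arch/sparc64/kernel/irq.c: the 4-cpu limit, the bitmask type, and the names `sketch_pick_cpu` and `rover` are assumptions, and it ignores whether cpus are online.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NCPUS    4                            /* assumed cpu count */
#define SKETCH_MASK_ALL ((1u << SKETCH_NCPUS) - 1)

static unsigned int rover;   /* global rotating cpu counter */

/* Choose the cpu that should receive an IRQ whose affinity is mask. */
static unsigned int sketch_pick_cpu(uint32_t mask)
{
    assert(mask != 0 && (mask & ~SKETCH_MASK_ALL) == 0);

    if (mask == SKETCH_MASK_ALL) {
        /* Mask is ALL: hand out cpus in rotation. */
        unsigned int cpu = rover;
        rover = (rover + 1) % SKETCH_NCPUS;
        return cpu;
    }

    /* Restricted mask: take the first cpu set in it. */
    for (unsigned int cpu = 0; cpu < SKETCH_NCPUS; cpu++) {
        if (mask & (1u << cpu))
            return cpu;
    }
    return 0;   /* unreachable because mask is non-zero */
}
```

Picking the first cpu of a restricted mask is what makes single-cpu targeting work: a mask with one bit set always resolves to that cpu.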
David S. Miller 8047e247c8 [SPARC64]: Virtualize IRQ numbers.
Inspired by PowerPC XICS interrupt support code.

All IRQs are virtualized in order to keep NR_IRQS from needing
to be too large.  Interrupts on sparc64 are arbitrary 11-bit
values, but we don't need to define NR_IRQS to 2048 if we
virtualize the IRQs.

As PCI and SBUS controller drivers build device IRQs, we divvy
out virtual IRQ numbers incrementally starting at 1.  Zero is
a special virtual IRQ used for the timer interrupt.

So device drivers all see virtual IRQs, and all the normal
interfaces such as request_irq(), enable_irq(), etc. translate
that into a real IRQ number in order to configure the IRQ.

At this point knowledge of the struct ino_bucket is almost
entirely contained within arch/sparc64/kernel/irq.c.  There are
a few small bits in the PCI controller drivers that need to
be swept away before we can remove ino_bucket's definition
out of asm-sparc64/irq.h and privately into kernel/irq.c.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:22:35 -07:00
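
A rough sketch of the virtual IRQ scheme described in this commit: real interrupt numbers are 11-bit values, virtual numbers are handed out incrementally starting at 1, and a small table maps them back. The table size and all `sketch_` names are assumptions made for illustration; the kernel's own data structures may look different.

```c
#include <assert.h>

#define SKETCH_NR_IRQS 256   /* assumed table size, well below 2048 */

/* virt_to_real[v] holds the real interrupt number behind virtual IRQ v.
 * Slot 0 is never handed out here; the commit reserves virtual IRQ 0
 * for the timer interrupt. */
static unsigned int virt_to_real[SKETCH_NR_IRQS];
static unsigned int next_virt = 1;

/* Hand out the next virtual IRQ for real_irq; returns 0 when the table is full. */
static unsigned int sketch_virt_irq_alloc(unsigned int real_irq)
{
    assert(real_irq < 2048);   /* real interrupts are 11-bit values */
    if (next_virt >= SKETCH_NR_IRQS)
        return 0;
    virt_to_real[next_virt] = real_irq;
    return next_virt++;
}

/* Translate a virtual IRQ back to the real number, as an interface such as
 * request_irq() would need to do before programming the hardware. */
static unsigned int sketch_virt_to_real(unsigned int virt)
{
    assert(virt > 0 && virt < SKETCH_NR_IRQS);
    return virt_to_real[virt];
}
```

Drivers only ever see the small virtual numbers, so NR_IRQS can stay far below the 2048 possible real values.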
David S. Miller 37cdcd9e82 [SPARC64]: Kill ino_bucket->pil
And reuse that struct member for virt_irq, which will
be used in future changesets for the implementation of
mapping between real and virtual IRQ numbers.

This nicely kills off a ton of SBUS and PCI controller
PIL assignment code which is no longer necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:21:57 -07:00
David S. Miller 6a76267f0e [SPARC64]: bp->pil can never be zero
Only pil0_dummy_bucket had a pil of zero and we just killed that
off, so we can delete all special case code that used bp->pil==0
as a way to identify a dummy bucket.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:20:30 -07:00
David S. Miller fd0504c321 [SPARC64]: Send all device interrupts via one PIL.
This is the first in a series of cleanups that will hopefully
allow a seamless attempt at using the generic IRQ handling
infrastructure in the Linux kernel.

Define PIL_DEVICE_IRQ and vector all device interrupts through
there.

Get rid of the ugly pil0_dummy_{bucket,desc}, instead vector
the timer interrupt directly to a specific handler since the
timer interrupt is the only event that will be signaled on
PIL 14.

The irq_worklist is now in the per-cpu trap_block[].

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:20:00 -07:00
David S. Miller ccefb5f3f6 [SPARC64]: Do not double-export sys_close() when CONFIG_SOLARIS_EMUL_MODULE
It is already exported by fs/open.c

Noticed by Ben Collins.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11 21:05:25 -07:00
David S. Miller 9145bcf635 [SPARC64]: Set appropriate max_cache_size.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-10 22:02:17 -07:00
David S. Miller 46b304934d [SPARC64]: Avoid JBUS errors on some Niagara systems.
Doing PCI config space accesses to non-present PCI slots
can result in fatal JBUS errors if the PCI config access
hypervisor call is performed on cpus other than the boot
cpu.

PCI config space accesses to present PCI slots works just
fine.

Recursively traverse the OBP device tree under the PCI
controller node and record all present device IDs into
a small hash table.

Avoid the hypervisor call for any PCI config space access
attempt for a device not recorded in the hash table.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-10 01:06:25 -07:00
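
The approach in this commit (record present devices in a small hash table, then refuse config accesses for anything not recorded) can be sketched as follows. Everything here is illustrative: the `devhandle` encoding, the table sizes, the stand-in hypervisor call, and the choice to return all-ones for an absent device (borrowed from ordinary PCI convention) are assumptions, not details taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_HASH_SIZE 16   /* assumed bucket count */
#define SKETCH_POOL_SIZE 64   /* assumed upper bound on present devices */

struct sketch_dev {
    uint32_t devhandle;        /* hypothetical bus/device/function id */
    struct sketch_dev *next;
};

static struct sketch_dev pool[SKETCH_POOL_SIZE];
static unsigned int pool_used;
static struct sketch_dev *buckets[SKETCH_HASH_SIZE];
static unsigned int hvcall_count;   /* how often the stand-in hypercall ran */

static unsigned int sketch_hash(uint32_t devhandle)
{
    return (devhandle ^ (devhandle >> 8)) % SKETCH_HASH_SIZE;
}

/* Called once per device found while walking the firmware device tree. */
static bool sketch_record_present(uint32_t devhandle)
{
    if (pool_used == SKETCH_POOL_SIZE)
        return false;
    struct sketch_dev *d = &pool[pool_used++];
    unsigned int h = sketch_hash(devhandle);
    d->devhandle = devhandle;
    d->next = buckets[h];
    buckets[h] = d;
    return true;
}

static bool sketch_is_present(uint32_t devhandle)
{
    for (struct sketch_dev *d = buckets[sketch_hash(devhandle)]; d; d = d->next) {
        if (d->devhandle == devhandle)
            return true;
    }
    return false;
}

/* Stand-in for the hypervisor config access; it must never see an absent device. */
static uint32_t sketch_hvcall_config_read(uint32_t devhandle)
{
    assert(sketch_is_present(devhandle));
    hvcall_count++;
    return 0x12345678u;   /* dummy register contents */
}

/* Absent devices read as all-ones without ever reaching the hypervisor. */
static uint32_t sketch_config_read(uint32_t devhandle)
{
    if (!sketch_is_present(devhandle))
        return 0xffffffffu;
    return sketch_hvcall_config_read(devhandle);
}
```

The lookup costs one hash and a short chain walk per config access, which is small next to a hypervisor call.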
David S. Miller 5224e6cc3a [SPARC64]: Dump local cpu registers in sun4v_log_error()
This makes the debugging information more usable.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-09 12:03:49 -07:00
David S. Miller 951bc82c53 [SPARC64]: Make smp_processor_id() functional before start_kernel()
Uses of smp_processor_id() get pushed earlier and earlier in
the start_kernel() sequence.  So just get it working before
we call start_kernel() to avoid all possible problems.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-31 01:24:02 -07:00
David S. Miller 42f142371e [SPARC64]: Respect gfp_t argument to dma_alloc_coherent().
Using asm-generic/dma-mapping.h does not work because pushing
the call down to pci_alloc_coherent() causes the gfp_t argument
of dma_alloc_coherent() to be ignored.

Fix this by implementing things directly, and adding a gfp_t
argument we can use in the internal call down to the PCI DMA
implementation of pci_alloc_coherent().

This fixes massive memory corruption when using the sound driver
layer, which passes things like __GFP_COMP down into these
routines and (correctly) expects that to work.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-23 02:07:22 -07:00
David S. Miller 353b28bafd [SPARC]: Add robust futex syscall entries.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-21 21:22:53 -07:00
David S. Miller 06a1be167e [SPARC]: Handle UNWIND_INFO properly.
For sparc32 we need R_SPARC_UA32 relocation support, for
sparc64 we need to handle R_SPARC_DISP32 relocations.

Based upon reports and initial patch by Martin Habets.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-12 12:45:50 -07:00
David S. Miller 8c45112b82 [SPARC]: Hook up vmsplice into syscall tables.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 13:55:46 -07:00
Al Viro 5411be59db [PATCH] drop task argument of audit_syscall_{entry,exit}
... it's always current, and that's a good thing - allows simpler locking.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-05-01 06:06:18 -04:00
Prasanna S Panchamukhi 07fab8da80 [PATCH] Switch Kprobes inline functions to __kprobes for sparc64
Andrew Morton pointed out that the compiler might not inline the functions
marked for inline in kprobes, thereby allowing the insertion of probes
on these kprobes routines, which might cause recursion.

This patch removes all such inline and adds them to the kprobes section,
thereby disallowing probes on all such routines.  Some of the routines
can even still be inlined, since these routines get executed after
kprobes has done the necessary setup for reentrancy.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:53 -07:00
David S. Miller 5fdfd42e3a [SPARC64]: Export pcibios_resource_to_bus().
SYM2 driver uses it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-17 13:34:44 -07:00
David S. Miller 5fdef39495 [SPARC]: Hook up sys_tee() into syscall tables.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 15:29:32 -07:00
Kyle McMartin 894b5779ce [PATCH] No arch-specific strpbrk implementations
While cleaning up parisc_ksyms.c earlier, I noticed that strpbrk wasn't
being exported from lib/string.c.  Investigating further, I noticed a
changeset that removed its export and added it to _ksyms.c on a few more
architectures.  The justification was that "other arches do it."

I think this is wrong, since no architecture currently defines
__HAVE_ARCH_STRPBRK, there's no reason for any of them to be exporting it
themselves.  Therefore, consolidate the export to lib/string.c.

Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:40 -07:00
KAMEZAWA Hiroyuki a283a52520 [PATCH] for_each_possible_cpu: sparc64
for_each_cpu() actually iterates across all possible CPUs.  We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs.  This is inefficient and
possibly buggy.

We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.

This patch replaces for_each_cpu with for_each_possible_cpu
for sparc64.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:31 -07:00
David S. Miller aa1d1a0af6 [SPARC64]: smp_call_function() fixups...
1) Take doc-book function comment from i386 implementation.
2) cacheline align call_lock, taken from powerpc
3) Need memory barrier after setting call_data
4) Remove timeout

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:56:44 -07:00
David S. Miller 731bbe431f [SPARC64]: Translate PTRACE_GETEVENTMSG for 32-bit tasks.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:56:41 -07:00
David S. Miller 955c054f79 [SPARC64]: Print out return PC in cheetah_log_errors().
This makes debugging things a little bit easier.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:56:37 -07:00
David S. Miller 1759e58ed2 [SPARC64]: Add dummy PTRACE_PEEKUSR for gdb.
GDB uses a PTRACE_PEEKUSR call with offset 0 to see
if a thread is alive, so provide a success return for
this particular special case.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:56:35 -07:00
David S. Miller 289eee6fa7 [SPARC]: Wire up sys_sync_file_range() into syscall tables.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-31 23:49:34 -08:00
David S. Miller 1339713a32 [SPARC]: Wire up sys_splice() into the syscall tables.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-31 23:03:38 -08:00
David S. Miller 6f25f3986a [SPARC64]: Make tsb_sync() mm comparison more precise.
switch_mm() changes the mm state and does a tsb_context_switch()
first, then we do the cpu register state switch which changes
current_thread_info() and current().

So it's safer to check the PGD physical address stored in the
trap block (which will be updated by the tsb_context_switch() in
switch_mm()) than current->active_mm.

Technically we should never run here in between those two
updates, because interrupts are disabled during the entire
context switch operation.  But some day we might like to leave
interrupts enabled during the context switch and this change
allows that to happen without any surprises.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-31 23:03:34 -08:00
Matt Mackall 3dedf53bb1 [PATCH] RTC: Remove RTC UIP synchronization on Sparc64
Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:00 -08:00
Alan Stern e041c68341 [PATCH] Notifier chain update: API changes
The kernel's implementation of notifier chains is unsafe.  There is no
protection against entries being added to or removed from a chain while the
chain is in use.  The issues were discussed in this thread:

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2

We noticed that notifier chains in the kernel fall into two basic usage
classes:

	"Blocking" chains are always called from a process context
	and the callout routines are allowed to sleep;

	"Atomic" chains can be called from an atomic context and
	the callout routines are not allowed to sleep.

We decided to codify this distinction and make it part of the API.  Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name).  New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain.  The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.

With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed.  For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections.  (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)

There are some limitations, which should not be too hard to live with.  For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem.  Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain.  (This did happen in a couple of places and the code
had to be changed to avoid it.)

Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization.  Instead we use RCU.  The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent than calling a chain.

Here is the list of chains that we adjusted and their classifications.  None
of them use the raw API, so for the moment it is only a placeholder.

  ATOMIC CHAINS
  -------------
arch/i386/kernel/traps.c:		i386die_chain
arch/ia64/kernel/traps.c:		ia64die_chain
arch/powerpc/kernel/traps.c:		powerpc_die_chain
arch/sparc64/kernel/traps.c:		sparc64die_chain
arch/x86_64/kernel/traps.c:		die_chain
drivers/char/ipmi/ipmi_si_intf.c:	xaction_notifier_list
kernel/panic.c:				panic_notifier_list
kernel/profile.c:			task_free_notifier
net/bluetooth/hci_core.c:		hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_expect_chain
net/ipv6/addrconf.c:			inet6addr_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_expect_chain
net/netlink/af_netlink.c:		netlink_chain

  BLOCKING CHAINS
  ---------------
arch/powerpc/platforms/pseries/reconfig.c:	pSeries_reconfig_chain
arch/s390/kernel/process.c:		idle_chain
arch/x86_64/kernel/process.c		idle_notifier
drivers/base/memory.c:			memory_chain
drivers/cpufreq/cpufreq.c		cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c		cpufreq_transition_notifier_list
drivers/macintosh/adb.c:		adb_client_list
drivers/macintosh/via-pmu.c		sleep_notifier_list
drivers/macintosh/via-pmu68k.c		sleep_notifier_list
drivers/macintosh/windfarm_core.c	wf_client_list
drivers/usb/core/notify.c		usb_notifier_list
drivers/video/fbmem.c			fb_notifier_list
kernel/cpu.c				cpu_chain
kernel/module.c				module_notify_list
kernel/profile.c			munmap_notifier
kernel/profile.c			task_exit_notifier
kernel/sys.c				reboot_notifier_list
net/core/dev.c				netdev_chain
net/decnet/dn_dev.c:			dnaddr_chain
net/ipv4/devinet.c:			inetaddr_chain

It's possible that some of these classifications are wrong.  If they are,
please let us know or submit a patch to fix them.  Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)

The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.

[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:50 -08:00
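
To make the "raw" flavor described in this commit concrete, here is a minimal userspace sketch of a notifier chain with no locking at all, which is the property the commit attributes to raw chains. The `sketch_` names and struct layout are hypothetical, and the priority ordering is an assumption that the commit message does not spell out; the real definitions are the ones the commit points to in include/linux/notifier.h.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_notifier {
    int (*call)(struct sketch_notifier *nb, unsigned long event, void *data);
    struct sketch_notifier *next;
    int priority;   /* assumed: higher priority entries run first */
};

struct sketch_raw_head {
    struct sketch_notifier *head;
};

/* Insert n, keeping the list sorted by descending priority.  No locking:
 * as with raw chains, callers must provide their own protection. */
static void sketch_raw_register(struct sketch_raw_head *h,
                                struct sketch_notifier *n)
{
    struct sketch_notifier **pp = &h->head;

    assert(n->call != NULL);
    while (*pp && (*pp)->priority >= n->priority)
        pp = &(*pp)->next;
    n->next = *pp;
    *pp = n;
}

/* Unlink n; returns 0 on success, -1 if it was not on the chain. */
static int sketch_raw_unregister(struct sketch_raw_head *h,
                                 struct sketch_notifier *n)
{
    for (struct sketch_notifier **pp = &h->head; *pp; pp = &(*pp)->next) {
        if (*pp == n) {
            *pp = n->next;
            return 0;
        }
    }
    return -1;
}

/* Invoke every callback in order; returns how many were called. */
static int sketch_raw_call(struct sketch_raw_head *h,
                           unsigned long event, void *data)
{
    int calls = 0;

    for (struct sketch_notifier *n = h->head; n; n = n->next) {
        n->call(n, event, data);
        calls++;
    }
    return calls;
}

/* Example callback: adds the event value to the counter passed as data. */
static int sketch_count_cb(struct sketch_notifier *nb,
                           unsigned long event, void *data)
{
    (void)nb;
    *(unsigned long *)data += event;
    return 0;
}
```

The atomic and blocking variants would wrap the same list walk in RCU or an rwsem respectively, which is where the guarantees described above come from.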
David S. Miller 5d5d7727a8 [SPARC64]: Kill duplicate exports of string library functions.
Kbuild now points these out.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-26 15:30:29 -08:00
Akinobu Mita 2d78d4beb6 [PATCH] bitops: sparc64: use generic bitops
- remove __{,test_and_}{set,clear,change}_bit() and test_bit()
- remove ffz()
- remove __ffs()
- remove generic_fls()
- remove generic_fls64()
- remove sched_find_first_bit()
- remove ffs()

- unless defined(ULTRA_HAS_POPULATION_COUNT)

  - remove generic_hweight{64,32,16,8}()

- remove find_{next,first}{,_zero}_bit()
- remove ext2_{set,clear,test,find_first_zero,find_next_zero}_bit()
- remove minix_{test,set,test_and_clear,test,find_first_zero}_bit()

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:14 -08:00
Prasanna S Panchamukhi b67000962f [PATCH] kprobes: fix broken fault handling for sparc64
Provide proper kprobes fault handling if user-specified pre/post handlers
try to access user address space through copy_from_user(), get_user(), etc.

The user-specified fault handler gets called only if the fault occurs while
executing user-specified handlers.  In such a case the user-specified handler is
allowed to fix it first; if the user-specified fault handler does not fix
it, we try to fix it by calling fix_exception().

The user-specified handler will not be called if the fault happens when single
stepping the original instruction, instead we reset the current probe and
allow the system page fault handler to fix it up.

I could not test this patch for sparc64.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:05 -08:00
bibo,mao 2326c77017 [PATCH] kprobe handler: discard user space trap
Currently kprobe handler traps only happen in kernel space, so function
kprobe_exceptions_notify should skip traps which happen in user space.
This patch modifies this, and it is based on 2.6.16-rc4.

Signed-off-by: bibo mao <bibo.mao@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Cc: <hiramatu@sdl.hitachi.co.jp>
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:04 -08:00
Stephen Rothwell 3158e9411a [PATCH] consolidate sys32/compat_adjtimex
Create compat_sys_adjtimex and use it in all appropriate places.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:57 -08:00
Stephen Rothwell 88959ea968 [PATCH] create struct compat_timex and use it everywhere
We had a copy of the compatibility version of struct timex in each 64 bit
architecture.  This patch just creates a global one and replaces all the
usages of the old ones.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:57 -08:00
David S. Miller 7d3aee9a96 [SPARC64]: Keep cpu_present_map in sync with phys_cpu_present_map.
Don't rely on fixup_cpu_present_map() to do this as that function
is about to be removed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-25 13:00:17 -08:00
Alexey Dobriyan 53b3531bbb [PATCH] s/;;/;/g
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:24 -08:00
Andrew Morton 394e3902c5 [PATCH] more for_each_cpu() conversions
When we stop allocating percpu memory for not-possible CPUs we must not touch
the percpu data for not-possible CPUs at all.  The correct way of doing this
is to test cpu_possible() or to use for_each_cpu().

This patch is a kernel-wide sweep of all instances of NR_CPUS.  I found very
few instances of this bug, if any.  But the patch converts lots of open-coded
tests to use the preferred helper macros.

Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Christian Zankel <chris@zankel.net>
Cc: Philippe Elie <phil.el@wanadoo.fr>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Jens Axboe <axboe@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:17 -08:00
David S. Miller dcc1e8dd88 [SPARC64]: Add a secondary TSB for hugepage mappings.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 01:15:14 -08:00
David S. Miller 14778d9072 [SPARC]: Respect vm_page_prot in io_remap_page_range().
Make sure the callers do a pgprot_noncached() on
vma->vm_page_prot.

Pointed out by Hugh Dickins.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 01:15:13 -08:00
Andrew Morton 467418f350 [SPARC64]: CONFIG_BLK_DEV_RAM fix
init/do_mounts_rd.c depends upon CONFIG_BLK_DEV_RAM, not CONFIG_BLK_DEV_INITRD.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:41 -08:00
David S. Miller bb8646d834 [SPARC64]: Optimized TSB table initialization.
We only need to write an invalid tag every 16 bytes,
so taking advantage of this can save many instructions
compared to the simple memset() call we make now.

A prefetching version is implemented for sun4u
and a block-init store version is implemented for Niagara.

The next trick is to be able to perform an init and
a copy_tsb() in parallel when growing a TSB table.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:41 -08:00
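
The observation behind this commit is that a TSB only needs an invalid tag written once per 16-byte entry, rather than having every byte cleared. A plain C sketch of that idea follows; the entry layout, the `SKETCH_TAG_INVALID` value, and the function name are assumptions, and the prefetching and block-init store variants mentioned in the commit are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: one 8-byte tag followed by one 8-byte PTE, 16 bytes total. */
struct sketch_tsb_entry {
    uint64_t tag;
    uint64_t pte;
};

#define SKETCH_TAG_INVALID (UINT64_C(1) << 46)   /* placeholder invalid marker */

/* Mark every entry invalid by storing only its tag word.  The PTE word is
 * left alone, so half as many bytes are written as with a full memset(). */
static void sketch_tsb_init(struct sketch_tsb_entry *tsb, size_t nentries)
{
    assert(sizeof(struct sketch_tsb_entry) == 16);
    for (size_t i = 0; i < nentries; i++)
        tsb[i].tag = SKETCH_TAG_INVALID;
}
```

A lookup that sees the invalid tag never consults the PTE word, which is why leaving stale data there is harmless in this sketch.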
David S. Miller 05f9ca8359 [SPARC64]: Randomize mm->mmap_base when PF_RANDOMIZE is set.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:37 -08:00
David S. Miller d61e16df94 [SPARC64]: Increase top of 32-bit process stack.
Put it one page below the top of the 32-bit address space.
This gives us ~16MB more address space to work with.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:36 -08:00
David S. Miller a91690ddd0 [SPARC64]: Top-down address space allocation for 32-bit tasks.
Currently allocations are very constrained for 32-bit processes.
The mmap area grows bottom-up from 0x70000000 to 0xf0000000, which gives about
2GB of stack + dynamic mmap() space.

So support the top-down method, and we need to override the
generic helper function in order to deal with D-cache coloring.

With these changes I was able to squeeze out a mmap() just over
3.6GB in size in a 32-bit process.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:35 -08:00
David S. Miller 7a1ac52641 [SPARC64]: Fix and re-enable dynamic TSB sizing.
This is good for up to a 50% performance improvement in some test cases.
The problem has been the race conditions, and hopefully I've plugged
them all up here.

1) There was a serious race in switch_mm() wrt. lazy TLB
   switching to and from kernel threads.

   We could erroneously skip a tsb_context_switch() and thus
   use a stale TSB across a TSB grow event.

   There is a big comment now in that function describing
   exactly how it can happen.

2) All code paths that do something with the TSB need to be
   guarded with the mm->context.lock spinlock.  This makes
   page table flushing paths properly synchronize with both
   TSB growing and TLB context changes.

3) TSB growing events are moved to the end of successful fault
   processing.  Previously it was in update_mmu_cache() but
   that is deadlock prone.  At the end of do_sparc64_fault()
   we hold no spinlocks that could deadlock the TSB grow
   sequence.  We also have dropped the address space semaphore.

While we're here, add prefetching to the copy_tsb() routine
and put it in assembler into the tsb.S file.  This piece of
code is quite time critical.

There are some small negative side effects to this code which
can be improved upon.  In particular we grab the mm->context.lock
even for the tsb insert done by update_mmu_cache() now and that's
a bit excessive.  We can get rid of that locking, and the same
lock taking in flush_tsb_user(), by disabling PSTATE_IE around
the whole operation including the capturing of the tsb pointer
and tsb_nentries value.  That would work because anyone growing
the TSB won't free up the old TSB until all cpus respond to the
TSB change cross call.

I'm not quite so confident in that optimization to put it in
right now, but eventually we might be able to and the description
is here for reference.

This code seems very solid now.  It passes several parallel GCC
bootstrap builds, and our favorite "nut cruncher" stress test which is
a full "make -j8192" build of a "make allmodconfig" kernel.  That puts
about 256 processes on each cpu's run queue, makes lots of process cpu
migrations occur, causes lots of page table and TLB flushing activity,
incurs many context version number changes, and it swaps the machine
real far out to disk even though there is 16GB of ram on this test
system. :-)

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:33 -08:00
David S. Miller 0c51ed93ca [SPARC64]: First cut at VIS simulator for Niagara.
Niagara does not implement some of the VIS instructions in
hardware, so we have to emulate them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:26 -08:00
David S. Miller 90a6646bf6 [SPARC64]: Fix system type in /proc/cpuinfo and remove bogus OBP check.
Report 'sun4v' when appropriate in /proc/cpuinfo.

Remove all the verifications of the OBP version string.  Just
make sure it's there, and report it raw in the bootup logs and
via /proc/cpuinfo.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:25 -08:00
David S. Miller 8935dced54 [SPARC64]: Add SMT scheduling support for Niagara.
The mapping is a simple "(cpuid >> 2) == core" for now.
Later we'll add more sophisticated code that will walk
the sun4v machine description and figure this out from
there.
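
A minimal sketch of that mapping in plain C, outside the kernel. The
helper names cpu_to_core() and cpus_share_core() are hypothetical and
only illustrate the "(cpuid >> 2) == core" rule quoted above:

```c
#include <assert.h>

/* Hypothetical helper: four consecutive cpuids share one core. */
static int cpu_to_core(int cpuid)
{
    assert(cpuid >= 0);
    return cpuid >> 2;
}

/* Two cpus are SMT siblings when they map to the same core. */
static int cpus_share_core(int a, int b)
{
    return cpu_to_core(a) == cpu_to_core(b);
}
```

With this rule cpuids 4 through 7 all land on core 1, while cpuids 3
and 4 sit on different cores.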

We should also add core mappings for jaguar and panther
processors.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:24 -08:00
David S. Miller d1112018b4 [SPARC64]: Move over to sparsemem.
This has been pending for a long time, and the fact
that we waste a ton of ram on some configurations
kind of pushed things over the edge.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:22 -08:00
David S. Miller ee29074d3b [SPARC64]: Fix new context version SMP handling.
Don't piggy back the SMP receive signal code to do the
context version change handling.

Instead allocate another fixed PIL number for this
asynchronous cross-call.  We can't use smp_call_function()
because this thing is invoked with interrupts disabled
and a few spinlocks held.

Also, fix smp_call_function_mask() to count "cpus" correctly.
There is no guarantee that the local cpu is in the mask
yet that is exactly what this code was assuming.
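
The counting issue can be sketched in plain C as follows.  This is
not the kernel code: a 64-bit integer stands in for the cpumask, and
count_remote_cpus() is a hypothetical name.  The point is to clear
the local cpu from the mask first and then count, rather than count
and subtract one:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means cpu n is a target. */
static int count_remote_cpus(uint64_t mask, int this_cpu)
{
    int cpus = 0;

    assert(this_cpu >= 0 && this_cpu < 64);
    mask &= ~(UINT64_C(1) << this_cpu);   /* drop the local cpu, if present */
    while (mask) {
        mask &= mask - 1;                 /* clear the lowest set bit */
        cpus++;
    }
    return cpus;
}
```

For a mask of cpus 1 and 3 seen from cpu 0, this counts two remote
cpus, where "weight minus one" would have counted only one.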

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:21 -08:00
Eric Sesterhenn 9132983ae1 [SPARC64]: kzalloc() conversion
This patch converts arch/sparc64 to kzalloc usage.
Crosscompile tested with allyesconfig.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:19 -08:00
David S. Miller 74ae998772 [SPARC64]: Simplify TSB insert checks.
Don't try to avoid putting non-base page sized entries
into the user TSB.  It actually costs us more to check
this than it helps.

Eventually we'll have a multiple TSB scheme for user
processes.  Once a process starts using larger pages,
we'll allocate and use such a TSB.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:18 -08:00
David S. Miller 3cab0c3e86 [SPARC64]: More SUN4V cpu mondo bug fixing.
This cpu mondo sending interface isn't all that easy to
use correctly...

We were clearing out the wrong bits from the "mask" after getting
something other than EOK from the hypervisor.

It turns out we can just resend the hypervisor the same cpu_list[]
array, with the 0xffff "done" entries still in there, and it will do
the right thing.

So don't update or try to rebuild the cpu_list[] array to condense it.

This requires the "forward_progress" check to be done slightly
differently, but this new scheme is less bug prone than what we were
doing before.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:17 -08:00
David S. Miller bcc28ee0bf [SPARC64]: Fix sun4v mna winfixup handling.
We were clobbering a base register before we were done
using it.  Fix a comment typo while we're here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:16 -08:00
David S. Miller c4f8ef77f9 [SPARC64]: Fix mini RTC driver reading.
Need to subtract 1900 from year and 1 from month before
giving it back to userspace.
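
A small sketch of that adjustment, assuming the userspace side
follows the struct tm convention (years counted from 1900, months
from 0).  calendar_to_tm() is a hypothetical helper, not the driver
code:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Convert a calendar date (e.g. 2006, 3, 20) to struct tm conventions. */
static void calendar_to_tm(int year, int month, int day, struct tm *out)
{
    assert(out != NULL && month >= 1 && month <= 12);
    out->tm_year = year - 1900;   /* years since 1900 */
    out->tm_mon  = month - 1;     /* months are 0..11 */
    out->tm_mday = day;           /* day of month is already 1-based */
}
```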

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:15 -08:00
David S. Miller 8bcd174116 [SPARC64]: Do not allow mapping pages within 4GB of 64-bit VA hole.
The UltraSPARC T1 manual recommends this because the chip
could instruction prefetch into the VA hole, and this would
also make decoding certain kinds of memory access traps
more difficult (because the chip sign extends certain pieces
of trap state).

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:14 -08:00
David S. Miller 45f791eb0f [SPARC64]: Fix _PAGE_EXEC handling.
First of all, use the known _PAGE_EXEC_{4U,4V} value instead
of loading _PAGE_EXEC from memory.  We either know which one
to use by context, or we can code patch the test.

Next, we need to check executability of a PTE in the generic
TSB miss handler.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:13 -08:00
David S. Miller 92daa77e9a [SPARC64]: Fix typo in SUN4V D-TLB miss handler.
Should put FAULT_CODE_DTLB into %g3 not FAULT_CODE_ITLB.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:12 -08:00
David S. Miller 8ba706a95b [SPARC64]: Add mini-RTC driver for Starfire and SUN4V.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:10 -08:00
David S. Miller b830ab665a [SPARC64]: Fix bugs in SUN4V cpu mondo dispatch.
There were several bugs in the SUN4V cpu mondo dispatch code.

In fact, if we ever got an EWOULDBLOCK or other error from
the hypervisor call, we'd potentially send a cpu mondo multiple
times to the same cpu and even worse we could loop until the
timeout resending the same mondo over and over to such cpus.

So let's bulletproof this thing as follows:

1) Implement cpu_mondo_send() and cpu_state() hypervisor calls
   in arch/sparc64/kernel/entry.S, add prototypes to asm/hypervisor.h

2) Don't build and update the cpulist using inline functions; this
   was causing the cpu mask to not get updated in the caller.

3) Disable interrupts during the entire mondo send, otherwise our
   cpu list and/or mondo block could get overwritten if we take
   an interrupt and do a cpu mondo send on the current cpu.

4) Check for all possible error return types from the cpu_mondo_send()
   hypervisor call.  In particular:

   HV_EOK) Our work is done, all cpus have received the mondo.
   HV_CPUERROR) One or more of the cpus in the cpu list we passed
                to the hypervisor are in error state.  Use cpu_state()
                calls over the entries in the cpu list to see which
                ones.  Record them in "error_mask" and report this
                after we are done sending the mondo to cpus which are
                not in error state.
   HV_EWOULDBLOCK) We need to keep trying.

   Any other error we consider fatal: we report the event and exit
   immediately.

5) We only timeout if forward progress is not made.  Forward progress
   is defined as having at least one cpu get the mondo successfully
   in a given cpu_mondo_send() call.  Otherwise we bump a counter
   and delay a little.  If the counter hits a limit, we signal an
   error and report the event.
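
The retry policy in 4) and 5) can be sketched in plain C as below.
This is an illustration, not the kernel code: the numeric status
values, STALL_LIMIT, the send_fn callback and the two fake
"hypervisors" are all hypothetical, HV_EOTHER stands for any status
other than the two named ones, and the handling of cpus in error
state is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Names follow the list above; the numeric values are placeholders. */
enum hv_status { HV_EOK, HV_EWOULDBLOCK, HV_EOTHER };

#define STALL_LIMIT 3   /* hypothetical retry budget */

/* Hypothetical stand-in for cpu_mondo_send(): updates *pending in place. */
typedef enum hv_status (*send_fn)(int *pending);

/* Returns 0 once every cpu has the mondo, -1 on a fatal error or timeout. */
static int mondo_send_loop(send_fn send, int *pending)
{
    int stalls = 0;

    assert(send != NULL && pending != NULL);
    for (;;) {
        int before = *pending;
        enum hv_status st = send(pending);

        if (st == HV_EOK)
            return 0;
        if (st != HV_EWOULDBLOCK)
            return -1;          /* any other error is fatal */
        if (*pending < before)
            continue;           /* forward progress: no penalty */
        if (++stalls >= STALL_LIMIT)
            return -1;          /* timed out without progress */
        /* a real implementation would delay a little here */
    }
}

/* Fake hypervisor that delivers to one cpu per call. */
static enum hv_status deliver_one(int *pending)
{
    if (*pending > 0)
        (*pending)--;
    return *pending == 0 ? HV_EOK : HV_EWOULDBLOCK;
}

/* Fake hypervisor that never makes progress. */
static enum hv_status deliver_none(int *pending)
{
    (void)pending;
    return HV_EWOULDBLOCK;
}
```

With deliver_one() the loop keeps going for as long as each call
delivers to at least one cpu; with deliver_none() it gives up after
STALL_LIMIT calls that make no progress.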

Also, smp_call_function_mask() error handling reports the number
of cpus incorrectly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:09 -08:00
David S. Miller aac0aadf09 [SPARC64]: Fix bugs in SMP TLB context version expiration handling.
1) We must flush the TLB, duh.

2) Even if the sw context was seen to be valid, the local cpu's
   hw context can be out of date, so reload it unconditionally.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:08 -08:00
David S. Miller 6889331a12 [SPARC64]: Fix indexing into kpte_linear_bitmap.
Need to shift back up by 3 bits to get 8-byte entry
index.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:07 -08:00
David S. Miller 2a3a5f5ddb [SPARC64]: Bulletproof hypervisor TLB flushing.
Check TLB flush hypervisor calls for errors and report them.

Pass HV_MMU_ALL always for now, we can add back the optimization
to avoid the I-TLB flush later.

Always explicitly page align the virtual address arguments.
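
As an illustration of the alignment step only, here is a sketch that
assumes 8KB base pages.  The SKETCH_* macros and page_align_down()
are hypothetical names, not the kernel's:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 13   /* assumption: 8KB base pages */
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Round a virtual address down to the start of its page. */
static uint64_t page_align_down(uint64_t vaddr)
{
    uint64_t aligned = vaddr & SKETCH_PAGE_MASK;

    assert((aligned & (SKETCH_PAGE_SIZE - 1)) == 0);
    return aligned;
}
```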

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:05 -08:00
David S. Miller 6cc80cfab8 [SPARC64]: Report mondo error correctly in hypervisor_xcall_deliver().
It's in "arg0" not "func".

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:04 -08:00
David S. Miller 3634476239 [SPARC64]: Niagara optimized XOR functions for RAID.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:03 -08:00
Andrew Morton c4e9249b19 [SPARC64]: Fix binfmt_aout32.c build.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:02 -08:00
David S. Miller a0663a79ad [SPARC64]: Fix TLB context allocation with SMT style shared TLBs.
The context allocation scheme we use depends upon there being a 1<-->1
mapping from cpu to physical TLB for correctness.  Chips like Niagara
break this assumption.

So what we do is notify all cpus with a cross call when the context
version number changes, and if necessary this makes them allocate
a valid context for the address space they are running at the time.

Stress tested with make -j1024, make -j2048, and make -j4096 kernel
builds on a 32-strand, 8 core, T2000 with 16GB of ram.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:00 -08:00
David S. Miller 074d82cf68 [SPARC64]: Put syscall tables after trap table.
Otherwise with too much stuff enabled in the kernel config
we can end up with an unaligned trap table.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:59 -08:00
David S. Miller fc50492867 [SPARC64]: Drop %gl to 0 before re-enabling PSTATE_IE in rtrap
If we take a window fault, on SUN4V set %gl to zero before we
turn PSTATE_IE back on in %pstate.  Otherwise if we take an
interrupt we'll end up with corrupt register state.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:57 -08:00
David S. Miller d7744a0950 [SPARC64]: Create a separate kernel TSB for 4MB/256MB mappings.
It can map all of the linear kernel mappings with zero TSB hash
conflicts for systems with 16GB or less ram.  In such cases, on
SUN4V, once we load up this TSB the first time with all the
mappings, we never take a linear kernel mapping TLB miss ever
again, the hypervisor handles them all.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:56 -08:00
David S. Miller 9cc3a1ac9a [SPARC64]: Make use of Niagara 256MB PTEs for kernel mappings.
We use a bitmap, one bit for every 256MB of memory.  If the
bit is set we can use a 256MB PTE for linear mappings, else
we have to use a 4MB PTE.
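
A sketch of such a bitmap lookup in plain C.  The array size and
the names linear_bitmap, mark_256mb_ok() and linear_pte_size() are
hypothetical; the actual lookup is not shown in this log:

```c
#include <assert.h>
#include <stdint.h>

#define SHIFT_256MB  28
#define SIZE_256MB   (UINT64_C(1) << SHIFT_256MB)
#define SIZE_4MB     (UINT64_C(1) << 22)
#define BITMAP_WORDS 4          /* hypothetical: 4 x 64 bits covers 64GB */

/* One bit per 256MB of physical memory. */
static uint64_t linear_bitmap[BITMAP_WORDS];

/* Record that the 256MB chunk containing paddr may use a 256MB PTE. */
static void mark_256mb_ok(uint64_t paddr)
{
    uint64_t bit = paddr >> SHIFT_256MB;

    assert(bit < BITMAP_WORDS * 64);
    linear_bitmap[bit / 64] |= UINT64_C(1) << (bit % 64);
}

/* Page size, in bytes, to use for the linear mapping of paddr. */
static uint64_t linear_pte_size(uint64_t paddr)
{
    uint64_t bit = paddr >> SHIFT_256MB;

    assert(bit < BITMAP_WORDS * 64);
    if ((linear_bitmap[bit / 64] >> (bit % 64)) & 1)
        return SIZE_256MB;
    return SIZE_4MB;
}
```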

SUN4V support is there, and we can very easily add support
for Panther cpu 256MB PTEs in the future.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:55 -08:00
David S. Miller 30c91d576e [SPARC64]: Use sun4v_cpu_idle() in cpu_idle() on SUN4V.
We have to turn off the "polling nrflag" bit when we sleep
the cpu like this, so that we'll get a cross-cpu interrupt
to wake the processor up from the yield.

We also have to disable PSTATE_IE in %pstate around the yield
call and recheck need_resched() in order to avoid any races.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:54 -08:00
David S. Miller 6f5374c91f [SPARC64]: Add sun4v_cpu_yield().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:52 -08:00
David S. Miller 1bd0cd74d1 [SPARC64]: Kill cpudata->idle_volume.
Set, but never used.

We used to use this for dynamic IRQ retargeting, but that
code died a long time ago.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:51 -08:00
David S. Miller 8ca2557c48 [SPARC64]: Niagara optimized memset/bzero/clear_user.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:50 -08:00
David S. Miller d371c0c174 [SPARC64]: Pass multiple CPUs at once to hypervisor cross-call API.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:49 -08:00
David S. Miller 55555633bd [SPARC64]: Typo in sun4v_data_access_exception log message.
Should be "Dax" not "Iax".

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:46 -08:00
David S. Miller d82965c167 [SPARC64]: Handle zero-length map requests in pci_sun4v.c
By simply changing the do-while loop into a plain
while loop.
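
The difference can be shown with a tiny stand-alone sketch (not the
pci_sun4v.c code; the function names are hypothetical and "mapped++"
stands in for mapping one page):

```c
#include <assert.h>
#include <stddef.h>

/* do-while form: the body always runs once, even for npages == 0. */
static size_t count_mapped_do_while(size_t npages)
{
    size_t mapped = 0;

    do {
        mapped++;
    } while (mapped < npages);
    return mapped;
}

/* while form: a zero-length request maps nothing. */
static size_t count_mapped_while(size_t npages)
{
    size_t mapped = 0;

    while (mapped < npages)
        mapped++;
    return mapped;
}

/* Zero-length requests are where the two forms disagree. */
static void check_zero_length(void)
{
    assert(count_mapped_do_while(0) == 1);
    assert(count_mapped_while(0) == 0);
}
```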

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:45 -08:00