Commit Graph

3702 Commits

Author SHA1 Message Date
Kumar Gala 976164497d [PATCH] ppc: Fix warnings related to seq_file
When we moved things around in irq.h seq_file became an issue.  Fix
warnings related to its usage.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-17 17:26:09 -08:00
Russell King 02b3083922 [ARM] Fix some corner cases in new mm initialisation
Document that the VMALLOC_END address must be aligned to 2MB since
it must align with a PGD boundary.

Allocate the vectors page early so that the flush_cache_all() later
will cause any dirty cache lines in the direct mapping will be safely
written back.

Move the flush_cache_all() to the second local_flush_cache_tlb() and
remove the now redundant first local_flush_cache_tlb().

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-17 22:43:30 +00:00
Grant Grundler 9d7d57567c [PARISC] Remove unused variable in signal.c
Remove unused variable "struct siginfo si" in signal.c

Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:43:52 -05:00
Christoph Hellwig 784412f74c [PARISC] remove drm compat ioctls handlers
Remove drm compat_ioctl handlers. The drm drivers have proper
compat_ioctl methods these days.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:41:26 -05:00
Christoph Hellwig ad7dd338fb [PARISC] move PA perf driver over to ->compat_ioctl
Move PA perf driver over to ->compat_ioctl.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Randolph Chung <tausq@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:40:31 -05:00
Matthew Wilcox 83aceb5b6a [PARISC] Fix some compile problems in ptrace.c
Fix some compile problems:
- ret wasn't being initialised in all code paths
- I'm pretty sure 'goto out' should have been 'goto out_tsk'

Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:37:24 -05:00
Matthew Wilcox 4269b0d371 [PARISC] Improve the error message when we get a clashing mod path
Improve the error message when we get a clashing mod path, and
actually display the IODC data and path for the conflicting device.

Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:33:56 -05:00
Matthew Wilcox ba5c4f1bae [PARISC] Return PDC_OK when alloc_pa_dev fails to enumerate all devices
Return PDC_OK when device registration fails so that we enumerate all
subsequent devices, even when we get two devices with the same hardware
path (which should never happen, but does with at least one revision of
rp8400 firmware).

Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:33:29 -05:00
Carlos O'Donell aa0eecb07f [PARISC] Document some register usages in assembly files
Document clobbers and args in entry.S and syscall.S.

entry.S: Add comment to indicate that cr27 may recycle and EDEADLOCK
detection is not 100% correct. Since this is only enabled when using
ENABLE_LWS_DEBUG, the user is warned by the comment.

Signed-off-by: Carlos O'Donell <carlos@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:32:46 -05:00
Ryan Bradetich 75be99a8c5 [PARISC] Make redirecting irq messages less noisy
Make the "redirecting irq" message to not display on the console by
setting the severity to KERN_DEBUG.  The console was basically unusable.

Signed-off-by: Ryan Bradetich <rbrad@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:29:50 -05:00
Grant Grundler 03afe22f07 [PARISC] irq_affinityp[] only available for SMP builds
irq_affinityp[] only available for SMP builds, make code that uses
it conditional on CONFIG_SMP.

Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:29:16 -05:00
James Bottomley c2ab64d098 [PARISC] Add IRQ affinities
This really only adds them for the machines I can check SMP on, which
is CPU interrupts and IOSAPIC (so not any of the GSC based machines).

With this patch, irqbalanced can be used to maintain irq balancing.
Unfortunately, irqbalanced is a bit x86 centric, so it doesn't do an
incredibly good job, but it does work.

Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:28:37 -05:00
Kyle McMartin 1d4c452a85 [PARISC] Fix uniprocessor build by dummying smp_send_all_nop()
Since irq.c uses smp_send_all_nop, we must define it for UP builds
as well. Make it a static inline so it gets optimized away. This forces
irq.c to include <asm/smp.h> though.

Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:27:44 -05:00
James Bottomley d911aed8ad [PARISC] Fix our interrupts not to use smp_call_function
Fix our interrupts not to use smp_call_function

On K and D class smp, the generic code calls this under an irq
spinlock, which causes the WARN_ON() message in smp_call_function()
(and is also illegal because it could deadlock).

The fix is to use a new scheme based on the IPI_NOP.

Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:27:02 -05:00
Grant Grundler 3f902886a8 [PARISC] Disable nesting of interrupts
Disable nesting of interrupts - still has holes

The offending sequence starts out like this:
1) take external interrupt
2) set_eiem() to only allow TIMER_IRQ; local interrupts still disabled
3) read the EIRR to get a "list" of pending interrupts
4) clear EIRR of pending interrupts we intend to handle
5) call __do_IRQ() to handle IRQ.
6) handle_IRQ_event() enables local interrupts (I-Bit)
7) take a timer interrupt
8) read EIRR to get a new list of pending interrupts
9) clear EIRR of pending interrupts we just read
10) handle pending interrupts found in (8)
11) set_eiem(cpu_eiem) and return
        [ TROUBLE! all enabled CPU IRQs are unmasked. }
12) handle remaining interrupts pending from (3)
        e.g. call __do_IRQ() -> handle_IRQ_event()..etc
        [ TROUBLE! call to handle_IRQ_event() can now enable *any* IRQ. }
13) set_eiem(cpu_eiem) and return

The problem is we now get into ugly race conditions with Timer and IPI
interrupts at this point.  I'm not exactly sure what happens when
things go wrong (perhaps nest calls to IPI or timer interrupt?).
But I'm certain it's not good.

This sequence will break sooner if (10) would accidentally leave
interrupts enabled.

I'm pretty sure the right answer is now to make cpu_eiem
a per CPU variable since all external interrupts on parisc
are per CPU. This means we will NOT need to send an IPI to
every CPU in the system when enabling or disabling an IRQ
since only one CPU needs to change it's EIEM.

Thanks to James Bottomley for (once again) pointing out the problem.

Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:26:20 -05:00
James Bottomley 9a8b458406 [PARISC] Make sure timer and IPI execute with interrupts disabled
Fix a longstanding smp bug

The problem is that both the timer and ipi interrupts are being called
with interrupts enabled, which isn't what anyone is expecting.

The IPI issue has just started to show up by causing a BUG_ON in the
slab debugging code.  The timer issue never shows up because there's an
eiem work around in our irq.c

The fix is to label both these as SA_INTERRUPT which causes the generic
irq code not to enable interrupts.

I also suspect the smp_call_function timeouts we're seeing might be
connected with the fact that we disable IPIs when handling any other
type of interrupt.  I've put a WARN_ON in the code for executing
smp_call_function() with IPIs disabled.

Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-11-17 16:24:52 -05:00
Linus Torvalds 7652aab77f Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus 2005-11-17 10:56:34 -08:00
Chen, Kenneth W e8aabc4716 [IA64] polish comments for tlb fault handler in ivt.S
Polish the comments specifically in vhpt_miss and nested_dtlb_miss
handlers.  I think it's better to explicitly name each page table
level with its name instead of numerically name them.  i.e., use
pgd, pud, pmd, and pte instead of referring as L1, L2, L3 etc.
Along the line, remove some magic number in the comments like:
"PTA + (((IFA(61,63) << 7) | IFA(33,39))*8)".  No code change at
all, pure comment update.  Feel free to shoot anything you have,
darts or tomahawk cruise missile.  I will duck behind a bunker ;-)

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Robin Holt <holt@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-11-17 09:48:15 -08:00
Chen, Kenneth W fedb25fae7 [IA64] 4 level page table bug fix in vhpt_miss
From source code inspection, I think there is a bug with 4 level
page table with vhpt_miss handler.  In the code path of rechecking
page table entry against previously read value after tlb insertion,
*pte value in register r18 was overwritten with value newly read
from pud pointer, render the check of new *pte against previous
*pte completely wrong.  Though the bug is none fatal and the penalty
is to purge the entry and retry.  For functional correctness, it
should be fixed.  The fix is to use a different register so new
*pud don't trash *pte.  (btw, the comments in the cmp statement is
wrong as well, which I will address in the next patch).

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-11-17 09:47:18 -08:00
Russell King 67a1901ff4 [ARM] __ioremap doesn't use 4th argument
The "align" argument in ARMs __ioremap is unused and provides a
misleading expectation that it might do something.  It doesn't.
Remove it.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-17 16:48:00 +00:00
Linus Torvalds d0fa7e9f8e Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-11-17 08:45:42 -08:00
Russell King 728f5c076a [ARM] Improve comment about ASSERT()s in vmlinux.lds.S
Provide folk with an idea what to do if the ASSERT statements fail
with their linker.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-17 16:43:14 +00:00
Ralf Baechle aec8b7557c [MIPS] Update defconfigs
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:58 +00:00
Ralf Baechle 1a6ea3ec67 [MIPS] SEAD: More build fixes.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:57 +00:00
Ralf Baechle 09b696efd9 [MIPS] TX3927: Try to glue the PCI code.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:56 +00:00
Ralf Baechle 3d5d440176 [MIPS] Ocelot G: Use CPU_MASK_NONE instead of 0 to initialize cpu mask.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:55 +00:00
Ralf Baechle c32cf78c02 [MIPS] JMR3927: Fix compilation by including <linux/ds1742rtc.h>.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:53 +00:00
Ralf Baechle 5135b0cdb2 [MIPS] JMR3927: need include/asm-mips/mach-jmr3927 in it's include path.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:52 +00:00
Ralf Baechle 8bf4057bdd [MIPS] JMR3927: It's ops-tx3927.o not ops-jmr3927.o
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:51 +00:00
Pantelis Antoniou b60ccd575c [MIPS] Alchemy: Console output fixup
This is needed to make console output appear with the new driver...
    
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:50 +00:00
Arnaud Giersch 59f145d28c [MIPS] IP32: Fix sparse warnings.
Add __iomem qualifier to crime and mace pointers.
    
Signed-off-by: Arnaud Giersch <arnaud.giersch@free.fr>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:48 +00:00
Arnaud Giersch 19ce1cfb2d [MIPS] IP32: Export mace symbol.
Export mace symbol so that it can be used in modules.
    
Signed-off-by: Arnaud Giersch <arnaud.giersch@free.fr>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:47 +00:00
Ralf Baechle 70ad7d1840 [MIPS] JMR3927: Fix syntax error.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:46 +00:00
Ralf Baechle efd9412d85 [MIPS] JMR3927: Undo accidental rename.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:45 +00:00
Ralf Baechle d93efab838 [MIPS] DDB5477: Fix unused variable warning.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-11-17 16:23:45 +00:00
David Woodhouse 1e28a7ddd3 [PATCH] Avoid use of uninitialised spinlock in EEH.
If the kernel supports both G5 and pSeries, and CONFIG_EEH is enabled,
eeh_init() is (quite reasonably) never called when we boot on a G5. Yet
eeh_check_failure() still gets called. We should avoid doing that if
!eeh_subsystem_enabled.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-17 16:53:38 +11:00
Russell King 45e109d072 [ARM] sa1111.c needs asm/sizes.h
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-16 18:29:51 +00:00
Russell King 1b12050f17 [ARM] Move zone adjustment for SA1111 on SA11x0 platforms
Unfortunately, using PAGE_SHIFT in asm/arch/memory.h is unsafe, and we
can't include asm/page.h into this file because then we have a circular
dependency.  Move the offending code to arch/arm/common/sa1111.c
instead.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-16 17:38:40 +00:00
Linus Torvalds d58a75ef75 Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge 2005-11-16 07:58:48 -08:00
Ben Dooks 994e128053 [ARM] 3162/1: S3C2410 - updated defconfig
Patch from Ben Dooks

Minor changes, including add SysRq, selecting the DM9000
as a built-in driver, not as a module, and selecting the
framebuffer.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-16 15:05:13 +00:00
Ben Dooks b526bf23fd [ARM] 3161/1: BAST - fix commas on end of structs
Patch from Ben Dooks

Make the use of , on the lsat entry structs consistenent
through arch/arm/mach-s3c2410/mach-bast.c

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-16 15:05:12 +00:00
Russell King 224b5be6dd [ARM] compressed/head.S debugging defaults to asm/arch/debug-macro.S
Since we want new platforms to use debug-macro.S, make the decompressor
debugging method default to using this include file rather than having
new platforms add to an #if defined().

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-16 14:59:51 +00:00
Russell King 0a5709b2dc [ARM] Include asm/hardware.h instead of asm/arch/hardware.h
Rationalise hardware.h include.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-16 14:51:20 +00:00
Russell King ce07d90aa8 [ARM] Fix arch-realview/system.h to use __io_address()
Move __io_address to arch-realview/hardware.h, drop core.h from platsmp.c
and localtimer.c, and include asm/io.h where required.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-16 14:38:19 +00:00
Benjamin Herrenschmidt 5d66da3d71 [PATCH] powerpc: Make the vDSO functions set error code (#2)
The vDSO functions should have the same calling convention as a syscall.
Unfortunately, they currently don't set the cr0.so bit which is used to
indicate an error. This patch makes them clear this bit unconditionally
since all functions currently succeed. The syscall fallback done by some
of them will eventually override this if the syscall fails.

This also changes the symbol version of all vdso exports to make sure
glibc can differenciate between old and fixed calls for existing ones
like __kernel_gettimeofday.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 14:05:11 +11:00
Benjamin Herrenschmidt d3ed658320 [PATCH] ppc: Fix build with CONFIG_CHRP not set
Building ARCH=ppc for multiplatforms with CONFIG_CHRP not set fails
due to some unshielded code in xmon

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 14:05:07 +11:00
Paul Mackerras 94b212c29f powerpc: Move ppc64 boot wrapper code over to arch/powerpc
This also extends the code to handle 32-bit ELF vmlinux files as well
as 64-bit ones.  This is sufficient for booting on new-world 32-bit
powermacs (i.e. all recent machines).

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 13:52:21 +11:00
Olof Johansson 950fc0025f [PATCH] powerpc: add new powerbooks to feature table
Hi,

The previous PowerBook patch didn't contain the feature table updates
for ARCH=powerpc. Here they are.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 13:52:06 +11:00
Benjamin Herrenschmidt de93f0d62c [PATCH] ppc: Fix boot with yaboot with ARCH=ppc
The merge of machine types broke boot with yaboot & ARCH=ppc due to the
old code still retreiving the old-syle machine type passed in by yaboot.
This patch fixes it by translating those old numbers. Since that whole
mecanism is deprecated, this is a temporary fix until ARCH=ppc uses the
new prom_init that the merged architecture now uses for both ppc32 and
ppc64 (after 2.6.15)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 13:51:58 +11:00
Benjamin Herrenschmidt b5166cc252 [PATCH] powerpc: pci_64 fixes & cleanups
I discovered that in some cases (PowerMac for example) we wouldn't
properly map the PCI IO space on recent kernels. In addition, the code
for initializing PCI host bridges was scattered all over the place with
some duplication between platforms.

This patch fixes the problem and does a small cleanup by creating a
pcibios_alloc_controller() in pci_64.c that is similar to the one in
pci_32.c (just takes an additional device node argument) that takes care
of all the grunt allocation and initialisation work. It should work for
both boot time and dynamically allocated PHBs.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 13:29:46 +11:00
Michael Ellerman f9e4ec57c6 [PATCH] powerpc: More debugging fixups
Add a few more missing includes of udbg.h

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 13:29:40 +11:00
Michael Ellerman eb481899aa [PATCH] powerpc: Fixup debugging in lmb.c
Somewhere we lost the include of udbg.h in lmb.c. While we're there, add a DBG
macro like every other file has and use it in lmb_dump_all().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 13:28:40 +11:00
Benjamin Herrenschmidt 5444a5e9e8 [PATCH] powerpc: update defconfigs
My patch moving ppc64 RTC to genrtc was supposed to update all
defconfigs, but for some reason, the patch actually posted only had the
pseries one... ouch. This patch properly updates all defconfigs.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 13:28:32 +11:00
Marcelo Tosatti eb07d964b4 [PATCH] ppc32 8xx: update_mmu_cache() needs unconditional tlbie
Currently 8xx fails to boot due to endless pagefaults.

Seems the bug is exposed by the change which avoids flushing the
TLB when not necessary (in case the pte has not changed), introduced
recently:

__handle_mm_fault():

        entry = pte_mkyoung(entry);
        if (!pte_same(old_entry, entry)) {
                ptep_set_access_flags(vma, address, pte, entry, write_access);
                update_mmu_cache(vma, address, entry);
                lazy_mmu_prot_update(entry);
        } else {
                /*
                 * This is needed only for protection faults but the arch code
                 * is not yet telling us if this is a protection fault or not.
                 * This still avoids useless tlb flushes for .text page faults
                 * with threads.
                 */
                if (write_access)
                        flush_tlb_page(vma, address);
        }

The "update_mmu_cache()" call was unconditional before, which caused the TLB
to be flushed by:

        if (pfn_valid(pfn)) {
                struct page *page = pfn_to_page(pfn);
                if (!PageReserved(page)
                    && !test_bit(PG_arch_1, &page->flags)) {
                        if (vma->vm_mm == current->active_mm) {
#ifdef CONFIG_8xx
                        /* On 8xx, cache control instructions (particularly 
                         * "dcbst" from flush_dcache_icache) fault as write 
                         * operation if there is an unpopulated TLB entry 
                         * for the address in question. To workaround that, 
                         * we invalidate the TLB here, thus avoiding dcbst 
                         * misbehaviour.
                         */
                                _tlbie(address);
#endif
                                __flush_dcache_icache((void *) address);
                        } else
                                flush_dcache_icache_page(page);
                        set_bit(PG_arch_1, &page->flags);
                }

Which worked to due to pure luck: PG_arch_1 was always unset before, but
now it isnt.

The root of the problem are the changes against the 8xx TLB handlers introduced
during v2.6. What happens is the TLBMiss handlers load the zeroed pte into
the TLB, causing the TLBError handler to be invoked (thats two TLB faults per 
pagefault), which then jumps to the generic MM code to setup the pte.

The bug is that the zeroed TLB is not invalidated (the same reason
for the "dcbst" misbehaviour), resulting in infinite TLBError faults.

The "two exception" approach requires a TLB flush (to nuke the zeroed TLB)
at each PTE update for correct behaviour:

Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16 13:28:22 +11:00
Paul Mackerras fb6d73d301 [PATCH] powerpc: Fix sparsemem with memory holes [was Re: ppc64 oops..]
This patch should fix the crashes we have been seeing on 64-bit
powerpc systems with a memory hole when sparsemem is enabled.
I'd appreciate it if people who know more about NUMA and sparsemem
than me could look over it.

There were two bugs.  The first was that if NUMA was enabled but there
was no NUMA information for the machine, the setup_nonnuma() function
was adding a single region, assuming memory was contiguous.  The
second was that the loops in mem_init() and show_mem() assumed that
all pages within the span of a pgdat were valid (had a valid struct
page).

I also fixed the incorrect setting of num_physpages that Mike Kravetz
pointed out.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-15 16:57:12 -08:00
Chen, Kenneth W 1e185b97b4 [PATCH] ia64: cpu_idle performance bug fix
Our performance validation on 2.6.15-rc1 caught a disastrous performance
regression on ia64 with netperf (-98%) and volanomark (-58%) compares to
previous kernel version 2.6.14-git7.  See the following chart (result
group 1 & 2).

  http://kernel-perf.sourceforge.net/results.machine_id=26.html

We have root caused it to commit 64c7c8f885

This changeset broke the ia64 task resched notification.  In
sched.c:resched_task(), a reschedule IPI is conditioned upon
TIF_POLLING_NRFLAG.  However, the above changeset unconditionally set
the polling thread flag for idle tasks regardless whether pal_halt_light
is in use or not.  As a result, resched IPI is not sent from
resched_task().  And since the default behavior on ia64 is to use
pal_halt_light, we end up delaying the rescheduling task until next
timer tick, and thus cause the performance regression.

This fixes the performance bug.  I'm glad our performance suite is
turning up bad performance bug like this in time.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-15 15:50:51 -08:00
Linus Torvalds 47227d50c4 Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-11-15 12:46:57 -08:00
Russell King 72724382d3 [ARM] Initialise SA1111 core before SA1111 PCMCIA
This avoids a BUG_ON with kref.c when SA1111 tries to register
a driver with an unregistered bus type.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-15 19:04:22 +00:00
Christoph Hellwig 0c53508980 [PATCH] v850: use generic hardirq code
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Miles Bader <miles@gnu.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-15 08:59:20 -08:00
Miles Bader 228322f13f [PATCH] v850: Fix show_interrupts
A variable was being used in multiple conflicting ways.  I also restructured
the code a bit for clarity.

Signed-off-by: Miles Bader <miles@gnu.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-15 08:59:19 -08:00
Ben Collins fa63b22982 [PATCH] Add missing EXPORT_SYMBOLS() for __ide_mm_* functions on powerpc
These exported symbols are in arch/ppc/ but missing from arch/powerpc/ for
ppc32 builds.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-15 08:59:19 -08:00
Vivek Goyal 19842d6734 [PATCH] drop "[PATCH] i386 kexec-on-panic: Don't shutdown the apics"
A patch by Eric was merged (f2b36db692)
and later on reverted back (1e4c85f97f).

Along with above patch, another patch was posted and has been merged
(3d1675b41b).  That patch was dependent on
the above patch and now it should also be reverted.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-15 08:59:18 -08:00
Russell King eceab4ac8d [ARM] Use kernel/power/Kconfig
Rather than defining our own PM option, use kernel/power/Kconfig.
This fixes build errors introduced by
bca73e4bf8

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-15 11:31:41 +00:00
Linus Torvalds 4060994c3e Merge x86-64 update from Andi 2005-11-14 19:56:02 -08:00
Bob Picco d3ee871e63 [PATCH] x86_64: Fix sparse mem
Fix up booting with sparse mem enabled. Otherwise it would just
cause an early PANIC at boot.

Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:18 -08:00
Andi Kleen 8893166ff8 [PATCH] x86_64: Increase the maximum number of local APICs to the maximum
This is needed for large multinode IBM systems which have a sparse
APIC space in clustered mode, fully covering the available 8 bits.

The previous kernels would limit the local APIC number to 127,
which caused it to reject some of the CPUs at boot.

I increased the maximum and shrunk the apic_version array a bit
to make up for that (the version is only 8 bit, so don't need
an full int to store)

Cc:  Chris McDermott <lcm@us.ibm.com>

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:17 -08:00
Andi Kleen 9e43e1b7c7 [PATCH] x86_64: Remove CONFIG_CHECKING and add command line option for pagefault tracing
CONFIG_CHECKING covered some debugging code used in the early times
of the port. But it wasn't even SMP safe for quite some time
and the bugs it checked for seem to be gone.

This patch removes all the code to verify GS at kernel entry. There
haven't been any new bugs in this area for a long time.

Previously it also covered the sysctl for the page fault tracing.
That didn't make much sense because that code was unconditionally
compiled in. I made that a boot option now because it is typically
only useful at boot.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:17 -08:00
Magnus Damm ffd10a2b77 [PATCH] x86_64: Make node boundaries consistent
The current x86_64 NUMA memory code is inconsequent when it comes to node
memory ranges. The exact behaviour varies depending on which config option
that is used.

setup_node_bootmem() has start and end as arguments and these are used to
calculate the size of the node like this: (end - start). This is all fine
if end is pointing to the first non-available byte. The problem is that the
current x86_64 code sometimes treats it as the last present byte and sometimes
as the first non-available byte. The result is that some configurations might
lose a page at the end of the range.

This patch tries to fix CONFIG_ACPI_NUMA, CONFIG_K8_NUMA and CONFIG_NUMA_EMU
so they all treat the end variable as the first non-available byte. This is
the same way as the single node code.

The patch is boot tested on dual x86_64 hardware with the above configurations,
but maybe the removed code is needed as some workaround?

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:17 -08:00
Andi Kleen e583538f07 [PATCH] x86_64: Log machine checks from boot on Intel systems
The logging for boot errors was turned off because it was broken
on some AMD systems. But give Intel EM64T systems a chance because they are
supposed to be correct there.

The advantage is that there is a chance to actually log uncorrected
machine checks after the reset.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:17 -08:00
Ravikiran G Thirumalai b0bd35e622 [PATCH] x86_64: Make ACPI NUMA and NUMA emulation peers of K8_NUMA in Kconfig
On x86_64 arches, there is no way to choose ACPI_NUMA without having to choose
K8_NUMA.  CONFIG_K8_NUMA is not needed for Intel EM64T NUMA boxes.  It also
looks odd if you have to select ACPI_NUMA from the power management menu.
This patch fixes those oddities.  Patch does the following:

1. Makes NUMA a config option like other arches
2. Makes topology detection options like K8_NUMA dependent on NUMA
3. Choosing ACPI NUMA detection can be done from the standard
   "Processor type and features" menu

AK: I fixed up the dependencies and changed the help texts a bit
on top of Kiran's patch.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:17 -08:00
Paolo 'Blaisorblade' Giarrusso efbbdce94f [PATCH] x86_64: Use common sys_time64
Keeping this function does not makes sense because it's a copied (and
buggy) copy of sys_time.  The only difference is that now.tv_sec (which is
a time_t, i.e.  a 64-bit long) is copied (and truncated) into a int
(32-bit).

The prototype is the same (they both take a long __user *), so let's drop
this and redirect it to sys_time (and make sure it exists by defining
__ARCH_WANT_SYS_TIME).

Only disadvantage is that the sys_stime definition is also compiled (may be
fixed if needed by adding a separate __ARCH_WANT_SYS_STIME macro, and
defining it for all arch's defining __ARCH_WANT_SYS_TIME except x86_64).

Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:17 -08:00
Andi Kleen a5b250a428 [PATCH] x86_64: Remove optimization for B stepping AMD K8
B stepping were the first shipping Opterons. memcpy/memset/copy_page/
clear_page had special optimized version for them. These are really
old and in the minority now and the difference to the generic versions
(using rep microcode) is not that big anyways. So just remove them.

TODO: figure out optimized versions for Intel Netburst based EM64T

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:17 -08:00
Andi Kleen a6f5deb2be [PATCH] x86_64: Reduce number of retries for reset through keyboard controller
Old code could retry for 10 seconds worst time. Only try it
for one second now.

Suggested by Yinghai Lu

Cc: Yinghai.Lu@amd.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:16 -08:00
Siddha, Suresh B 2b0918758d [PATCH] x86_64: x86_64/i386 fix Intel cache detection code assumption about threads sharing
Fix the Intel cache detection code assumption that number of threads
sharing the cache will either be equal to number of HT or core siblings.

This also cleans up the code in general a bit.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:16 -08:00
Siddha, Suresh B 94605eff57 [PATCH] x86-64/i386: Intel HT, Multi core detection fixes
Fields obtained through cpuid vector 0x1(ebx[16:23]) and
vector 0x4(eax[14:25], eax[26:31]) indicate the maximum values and might not
always be the same as what is available and what OS sees.  So make sure
"siblings" and "cpu cores" values in /proc/cpuinfo reflect the values as seen
by OS instead of what cpuid instruction says. This will also fix the buggy BIOS
cases (for example where cpuid on a single core cpu says there are "2" siblings,
even when HT is disabled in the BIOS.
http://bugzilla.kernel.org/show_bug.cgi?id=4359)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:16 -08:00
Andi Kleen 3506229ff9 [PATCH] x86_64: Don't enable interrupt unconditionally in reboot path
When they were disabled before (e.g. after a panic) it's better
to keep them off, otherwise followon panics can happen from timer
interrupt handlers etc.

Drawback is that pageup in the console won't work anymore though.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:16 -08:00
Andi Kleen a88cde13ba [PATCH] x86_64: Formatting fixes for arch/x86_64/kernel/process.c
No functional changes.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:16 -08:00
Andi Kleen ea0be473a1 [PATCH] x86_64: Allow modular build of ia32 aout loader
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:16 -08:00
Shaohua Li af9c142de9 [PATCH] x86_64: Force correct address space size for MTRR on some 64bit Intel Xeons
They report 40bit, but only have 36bits of physical address space.
This caused problems with setting up the correct masks for MTRR.

CPUID workaround for steppings 0F33h(supporting x86) and 0F34h(supporting x86
and EM64T). Detail info can be found at:
http://download.intel.com/design/Xeon/specupdt/30240216.pdf
http://download.intel.com/design/Pentium4/specupdt/30235221.pdf

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:16 -08:00
Eric Dumazet 529a340402 [PATCH] x86_64: Optimize NUMA node hash function
Compute the highest possible value for memnode_shift, in order to reduce
footprint of memnodemap[] to the minimum, thus making all users
(phys_to_nid(), kfree()), more cache friendly.

Before the patch :

 Node 0 MemBase 0000000000000000 Limit 00000001ffffffff
 Node 1 MemBase 0000000200000000 Limit 00000003ffffffff
 Using 23 for the hash shift. Max adder is 3ffffffff

After the patch :

 Node 0 MemBase 0000000000000000 Limit 00000001ffffffff
 Node 1 MemBase 0000000200000000 Limit 00000003ffffffff
 Using 33 for the hash shift.

In this case, only 2 bytes of memnodemap[] are used, instead of 2048

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:15 -08:00
Bryan Ford e4e5d324b9 [PATCH] x86_64: Save/restore CS in 64bit signal handlers and force __USER_CS for CS
This allows to run 64bit signal handlers in 64bit processes that run small
code snippets in compat mode.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:15 -08:00
Andi Kleen 420f8f68c9 [PATCH] x86_64: New heuristics to find out hotpluggable CPUs.
With a NR_CPUS==128 kernel with CPU hotplug enabled we would waste 4MB
on per CPU data of all possible CPUs.  The reason was that HOTPLUG
always set up possible map to NR_CPUS cpus and then we need to allocate
that much (each per CPU data is roughly ~32k now)

The underlying problem is that ACPI didn't tell us how many hotplug CPUs
the platform supports.  So the old code just assumed all, which would
lead to this memory wastage.

This implements some new heuristics:

 - If the BIOS specified disabled CPUs in the ACPI/mptables assume they
   can be enabled later (this is bending the ACPI specification a bit,
   but seems like a obvious extension)
 - The user can overwrite it with a new additionals_cpus=NUM option
 - Otherwise use half of the available CPUs or 2, whatever is more.

Cc: ashok.raj@intel.com
Cc: len.brown@intel.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:15 -08:00
Andi Kleen 5917089104 [PATCH] x86_64: Replace swiotlb extern with include
Minor victory on the continuous quest against all stray extern.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:15 -08:00
Andi Kleen 4d74dbd79a [PATCH] x86_64: Replace cpu_pda extern with include
Minor cleanup - remove obsolete extern

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:15 -08:00
Andi Kleen 2bc0414ee0 [PATCH] x86_64: Only use asm/sections.h to declare section symbols
Adding __initdata_* to asm-generic/sections.h
Replaces a lot of open coded externs in arch/x86_64/*
I had to change __bss_end to __bss_stop to match the other architectures.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
Siddha, Suresh B f6c2e3330d [PATCH] x86_64: Unmap NULL during early bootup
We should zap the low mappings, as soon as possible, so that we can catch
kernel bugs more effectively. Previously early boot had NULL mapped
and didn't trap on NULL references.

This patch introduces boot_level4_pgt, which will always have low identity
addresses mapped.  Druing boot, all the processors will use this as their
level4 pgt.  On BP, we will switch to init_level4_pgt as soon as we enter C
code and zap the low mappings as soon as we are done with the usage of
identity low mapped addresses.  On AP's we will zap the low mappings as
soon as we jump to C code.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
Andi Kleen 69d81fcde7 [PATCH] x86_64: Speed up numa_node_id by putting it directly into the PDA
Not go from the CPU number to an mapping array.
Mode number is often used now in fast paths.

This also adds a generic numa_node_id to all the topology includes

Suggested by Eric Dumazet

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
Andi Kleen 50895c5d76 [PATCH] x86_64: Fix gcc 4 warning in aperture.c
Fix

  arch/x86_64/kernel/aperture.c: In function #iommu_hole_init#:
  arch/x86_64/kernel/aperture.c:199: warning: #aper_order# may be used uninitialized in this function

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
Suresh Siddha f5f786d045 [PATCH] x86-64/i386: Fix CPU model for family 6
According to cpuid instruction in IA32 SDM-Vol2, when computing cpu model,
we need to consider extended model ID for family 0x6 also.

AK: Also added fixes/simplifcation from Petr Vandrovec

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
Ashok Raj e9b59d834f [PATCH] x86_64: Remove duplicate __cpuinit define
Remove duplicate __cpuinit in smp.c. Already defined in init.h which is
already included.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
Andi Kleen 47492d3667 [PATCH] x86_64: Use the DMA32 zone for dma_alloc_coherent()/pci_alloc_consistent
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
James Cleverdon 6004e1b7ef [PATCH] i386/x86-64: Share interrupt vectors when there is a large number of interrupt sources
Here's a patch that builds on Natalie Protasevich's IRQ compression
patch and tries to work for MPS boots as well as ACPI.  It is meant for
a 4-node IBM x460 NUMA box, which was dying because it had interrupt
pins with GSI numbers > NR_IRQS and thus overflowed irq_desc.

The problem is that this system has 270 GSIs (which are 1:1 mapped with
I/O APIC RTEs) and an 8-node box would have 540.  This is much bigger
than NR_IRQS (224 for both i386 and x86_64).  Also, there aren't enough
vectors to go around.  There are about 190 usable vectors, not counting
the reserved ones and the unused vectors at 0x20 to 0x2F.  So, my patch
attempts to compress the GSI range and share vectors by sharing IRQs.

Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com>

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Jacob Shin 89b831ef8b [PATCH] x86_64: Support for AMD specific MCE Threshold.
MC4_MISC - DRAM Errors Threshold Register realized under AMD K8 Rev F.
This register is used to count correctable and uncorrectable ECC errors that occur during DRAM read operations.
The user may interface through sysfs files in order to change the threshold configuration.

bank%d/error_count - reads current error count, write to clear.
bank%d/interrupt_enable - set/clear interrupt enable.
bank%d/threshold_limit - read/write the threshold limit.

APIC vector 0xF9 in hw_irq.h.
5 software defined bank ids in mce.h.
new apic.c function to setup threshold apic lvt.
defaults to interrupt off, count enabled, and threshold limit max.
sysfs interface created on /sys/devices/system/threshold.

AK: added some ifdefs to make it compile on UP

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Andi Kleen e18c6874a5 [PATCH] x86_64: Account mem_map in VM holes accounting
The VM needs to know about lost memory in zones to accurately
balance dirty pages. This patch accounts mem_map in there too,
which fixes a constant errror of a few percent. Also some
other misc mappings and the kernel text itself are accounted
too.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Andi Kleen fed644132f [PATCH] x86_64: Make i386 compile again with fourth DMA32 zone
The code should deal with an additional empty zone, so fix up the
#error.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Andi Kleen d1e3dfdc2c [PATCH] x86_64: Set compatibility flag for 4GB zone on IA64
IA64 traditionally had a 4GB DMA32 zone. Set the compatibility flag
to keep old drivers working.

For new drivers it would be better to use ZONE_DMA32 now.

Cc: tony.luck@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Andi Kleen a2f1b42490 [PATCH] x86_64: Add 4GB DMA32 zone
Add a new 4GB GFP_DMA32 zone between the GFP_DMA and GFP_NORMAL zones.

As a bit of historical background: when the x86-64 port
was originally designed we had some discussion if we should
use a 16MB DMA zone like i386 or a 4GB DMA zone like IA64 or
both. Both was ruled out at this point because it was in early
2.4 when VM is still quite shakey and had bad troubles even
dealing with one DMA zone.  We settled on the 16MB DMA zone mainly
because we worried about older soundcards and the floppy.

But this has always caused problems since then because
device drivers had trouble getting enough DMA able memory. These days
the VM works much better and the wide use of NUMA has proven
it can deal with many zones successfully.

So this patch adds both zones.

This helps drivers who need a lot of memory below 4GB because
their hardware is not accessing more (graphic drivers - proprietary
and free ones, video frame buffer drivers, sound drivers etc.).
Previously they could only use IOMMU+16MB GFP_DMA, which
was not enough memory.

Another common problem is that hardware who has full memory
addressing for >4GB misses it for some control structures in memory
(like transmit rings or other metadata).  They tended to allocate memory
in the 16MB GFP_DMA or the IOMMU/swiotlb then using pci_alloc_consistent,
but that can tie up a lot of precious 16MB GFPDMA/IOMMU/swiotlb memory
(even on AMD systems the IOMMU tends to be quite small) especially if you have
many devices.  With the new zone pci_alloc_consistent can just put
this stuff into memory below 4GB which works better.

One argument was still if the zone should be 4GB or 2GB. The main
motivation for 2GB would be an unnamed not so unpopular hardware
raid controller (mostly found in older machines from a particular four letter
company) who has a strange 2GB restriction in firmware. But
that one works ok with swiotlb/IOMMU anyways, so it doesn't really
need GFP_DMA32. I chose 4GB to be compatible with IA64 and because
it seems to be the most common restriction.

The new zone is so far added only for x86-64.

For other architectures who don't set up this
new zone nothing changes. Architectures can set a compatibility
define in Kconfig CONFIG_DMA_IS_DMA32 that will define GFP_DMA32
as GFP_DMA. Otherwise it's a nop because on 32bit architectures
it's normally not needed because GFP_NORMAL (=0) is DMA able
enough.

One problem is still that GFP_DMA means different things on different
architectures. e.g. some drivers used to have #ifdef ia64  use GFP_DMA
(trusting it to be 4GB) #elif __x86_64__ (use other hacks like
the swiotlb because 16MB is not enough) ... . This was quite
ugly and is now obsolete.

These should be now converted to use GFP_DMA32 unconditionally. I haven't done
this yet. Or best only use pci_alloc_consistent/dma_alloc_coherent
which will use GFP_DMA32 transparently.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Andi Kleen 56720367cd [PATCH] x86_64: Update defconfig
Rerun and enable autofs 4, relayfs and softdog

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:12 -08:00
Paul Mackerras ba76cd575f powerpc: Remove __init from a function used in suspend/resume.
Suspend/resume on powermacs uses the pmac_get_boot_time function,
so it can't be marked as __init.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-15 11:17:09 +11:00
Paul Mackerras cc657f5392 powerpc: Fix clearing of the FPSCR when invoking a signal handler
As pointed out by Gary Byers, we were clearing the image of the FPSCR
(floating point status and control register) in the thread_struct before
copying it to the user stack when invoking a signal.  Thus the task
would see its FPSCR getting cleared when it took a signal.

While fixing it I noticed that our swapcontext system call was also
clearing FPSCR.  It shouldn't, so I fixed that too.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-15 11:11:32 +11:00