These are defined as static cpumask_var_t so if MAXSMP is not used,
they are cleared already. Avoid surprises when MAXSMP is enabled.
Signed-off-by: Yinghai Lu <yinghai.lu@kernel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* 'x86/uv' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: UV BAU distribution and payload MMRs
x86: UV: BAU partition-relative distribution map
x86, uv: add Kconfig dependency on NUMA for UV systems
x86: prevent /sys/firmware/sgi_uv from being created on non-uv systems
x86, UV: Fix for nodes with memory and no cpus
x86, UV: system table in bios accessed after unmap
x86: UV BAU messaging timeouts
x86: UV BAU and nodes with no memory
This patch correctly sets BAU memory mapped registers to point
to the sending activation descriptor table and target payload table.
The "Broadcast Assist Unit" is used for TLB shootdown in UV.
The memory mapped registers that point to sending and receiving
memory structures contain node numbers.
In one case the __pa() function did not provide the node id of
memory on blade zero in configurations where that id is nonzero.
In another case, it was assumed that memory was allocated on
the local node. That assumption is not true in a configuration
in which the node has no memory.
Tested on the UV hardware simulator.
[ Impact: fix possible runtime crash due to incorrect TLB logic ]
Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1LuR5Z-0007An-B8@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch enables each partition's BAU distribution bit map
to be partition-relative.
The distribution bitmap had been constructed assuming 0 as the base
node number. That construct would not have allowed a total system of
greater than 256 nodes.
It also corrects an error that occurred when the first blade's nasid
was not zero. That nasid was stored as the base node.
The base node number gets added by hardware to the node numbers implied
in the distribution bitmap, resulting in invalid target nasids.
Tested on the UV hardware simulator.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1Ltl0C-0004Ob-37@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch replaces a 'nop' uv_enable_timeouts() in the
UV TLB shootdown code. (somehow, long ago that function got
eviscerated)
If any cpu in the destination node does not get interrupted by the
message and post completion in a reasonable time the hardware
should respond to the sender with an error. This function
enables such timeouts.
Tested on the UV hardware simulator.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1LpjXU-00007e-Qh@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes BAU initialization for systems containing
nodes with no memory and for systems with non-consecutive
node numbers.
Fixes and clarifies situations where pnode should be used instead
of node id.
Tested on the UV hardware simulator.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1LpjX3-00007N-12@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix boot crash on UV systems
Commit 76ba0ecda0 "cpumask: use
cpumask_var_t in uv_flush_tlb_others" used cur_cpu as an iterator;
it was supposed to be zero for the code below it.
Reported-by: Cliff Wickman <cpw@sgi.com>
Original-From: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: steiner@sgi.com
Cc: <stable@kernel.org>
LKML-Reference: <200903180822.31196.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In uv_flush_tlb_others() (arch/x86/kernel/tlb_uv.c),
the "WARN_ON(!in_atomic())" fails if CONFIG_PREEMPT is not enabled.
And CONFIG_PREEMPT is not enabled by default in the distribution that
most UV owners will use.
We could #ifdef CONFIG_PREEMPT the warning, but that is not good form.
And there seems to be no suitable fix to in_atomic() when CONFIG_PREMPT
is not on.
As Ingo commented:
> and we have no proper primitive to test for atomicity. (mainly
> because we dont know about atomicity on a non-preempt kernel)
So we drop the WARN_ON.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix possible tlb mis-flushing on UV
uv_flush_send_and_wait() should return a pointer if the broadcast
remote tlb shootdown requests fail. That causes the conventional IPI
method of shootdown to be used.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: cleanup
Make the following uv related cleanups.
* collect visible uv related definitions and interfaces into uv/uv.h
and use it. this cleans up the messy situation where on 64bit, uv
is defined properly, on 32bit generic it's dummy and on the rest
undefined. after this clean up, uv is defined on 64 and dummy on
32.
* update uv_flush_tlb_others() such that it takes cpumask of
to-be-flushed cpus as argument, instead of that minus self, and
returns yet-to-be-flushed cpumask, instead of modifying the passed
in parameter. this interface change will ease dummy implementation
of uv_flush_tlb_others() and makes uv tlb flush related stuff
defined in tlb_uv proper.
Signed-off-by: Tejun Heo <tj@kernel.org>
The function uv_wait_completion() spins on reads of a memory-mapped
register, waiting for completion of BAU hardware replies.
It should call "cpu_relax()" between those reads to improve performance
on hyperthreaded configurations.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: reduce stack usage, use new cpumask API.
This is made a little more tricky by uv_flush_tlb_others which
actually alters its argument, for an IPI to be sent to the remaining
cpus in the mask.
I solve this by allocating a cpumask_var_t for this case and falling back
to IPI should this fail.
To eliminate temporaries in the caller, all flush_tlb_others implementations
now do the this-cpu-elimination step themselves.
Note also the curious "cpus_or(f->flush_cpumask, cpumask, f->flush_cpumask)"
which has been there since pre-git and yet f->flush_cpumask is always zero
at this point.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: fix crash on x86/UV
UV is the SGI "UltraViolet" machine, which is x86_64 based.
BAU is the "Broadcast Assist Unit", used for TLB shootdown in UV.
This patch removes the allocation and initialization of an unused table.
This table is left over from a development test mode. It is unused in
the present code.
And it was incorrectly initialized: 8 entries allocated but 17 initialized,
causing slab corruption.
This patch should go into 2.6.27 and 2.6.28 as well as the current tree.
Diffed against 2.6.28 (linux-next, 12/30/08)
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix double entry creation in /proc
There is a collision between two UV functions:
both uv_ptc_init() and gru_proc_init() try to make /proc/sgi_uv
So move it's creation to a single place: uv_system_init()
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For some reason tlb_uv was including linux/mc146818rtc.h. It really
just needs linux/seq_file.h
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citix.com>
Cc: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The UV TLB shootdown mechanism needs a system interrupt vector.
Its vector had been hardcoded as 200, but needs to moved to the reserved
system vector range so that it does not collide with some device vector.
This is still temporary until dynamic system IRQ allocation is provided.
But it will be needed when real UV hardware becomes available and runs 2.6.27.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Someone could write 0 bytes to /proc/sgi_uv/ptc_statistics,
causing
optstr[count - 1] = '\0';
to write to who-knows-where.
(Andi Kleen noticed this need from a patch I sent for
similar code in the ia64 world (sn2_ptc_proc_write()).)
(count less than zero is not possible here, as count is unsigned)
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
v6: 6/19 close the security hole in uv_ptc_proc_write())
> Found a potential security hole while doing that:
> static ssize_t uv_ptc_proc_write(struct file *file, const char __user *user,
> size_t count, loff_t *data)
> if (copy_from_user(optstr, user, count))
> return -EFAULT;
>
> is count guaranteed to never be larger than 64?
is fixed below.
It adds tlb_uv.o to the Makefile.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: mingo@elte.hu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
arch/x86/kernel/tlb_uv.c: In function ‘uv_table_bases_init':
arch/x86/kernel/tlb_uv.c:612: error: ‘bau_tabsp' undeclared (first use in this function)
arch/x86/kernel/tlb_uv.c:612: error: (Each undeclared identifier is reported only once
arch/x86/kernel/tlb_uv.c:612: error: for each function it appears in.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
TLB shootdown for SGI UV.
v1: 6/2 original
v2: 6/3 corrections/improvements per Ingo's review
v3: 6/4 split atomic operations off to a separate patch (Jeremy's review)
v4: 6/12 include <mach_apic.h> rather than <asm/mach-bigsmp/mach_apic.h>
(fixes a !SMP build problem that Ingo found)
fix the index on uv_table_bases[blade]
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
TLB shootdown for SGI UV.
Depends on patch (in tip/x86/irq):
x86-update-macros-used-by-uv-platform.patch Jack Steiner May 29
This patch provides the ability to flush TLB's in cpu's that are not on
the local node. The hardware mechanism for distributing the flush
messages is the UV's "broadcast assist unit".
The hook to intercept TLB shootdown requests is a 2-line change to
native_flush_tlb_others() (arch/x86/kernel/tlb_64.c).
This code has been tested on a hardware simulator. The real hardware
is not yet available.
The shootdown statistics are provided through /proc/sgi_uv/ptc_statistics.
The use of /sys was considered, but would have required the use of
many /sys files. The debugfs was also considered, but these statistics
should be available on an ongoing basis, not just for debugging.
Issues to be fixed later:
- The IRQ for the messaging interrupt is currently hardcoded as 200
(see UV_BAU_MESSAGE). It should be dynamically assigned in the future.
- The use of appropriate udelay()'s is untested, as they are a problem
in the simulator.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>