Commit Graph

48 Commits

Author SHA1 Message Date
Linus Torvalds fb7a0e3653 Merge kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6.git
Do arch/ia64/defconfig by hand.
2005-06-22 12:22:12 -07:00
Jes Sorensen f14f75b811 [PATCH] ia64 uncached alloc
This patch contains the ia64 uncached page allocator and the generic
allocator (genalloc).  The uncached allocator was formerly part of the SN2
mspec driver but there are several other users of it so it has been split
off from the driver.

The generic allocator can be used by device driver to manage special memory
etc.  The generic allocator is based on the allocator from the sym53c8xx_2
driver.

Various users on ia64 needs uncached memory.  The SGI SN architecture requires
it for inter-partition communication between partitions within a large NUMA
cluster.  The specific user for this is the XPC code.  Another application is
large MPI style applications which use it for synchronization, on SN this can
be done using special 'fetchop' operations but it also benefits non SN
hardware which may use regular uncached memory for this purpose.  Performance
of doing this through uncached vs cached memory is pretty substantial.  This
is handled by the mspec driver which I will push out in a seperate patch.

Rather than creating a specific allocator for just uncached memory I came up
with genalloc which is a generic purpose allocator that can be used by device
drivers and other subsystems as they please.  For instance to handle onboard
device memory.  It was derived from the sym53c7xx_2 driver's allocator which
is also an example of a potential user (I am refraining from modifying sym2
right now as it seems to have been under fairly heavy development recently).

On ia64 memory has various properties within a granule, ie.  it isn't safe to
access memory as uncached within the same granule as currently has memory
accessed in cached mode.  The regular system therefore doesn't utilize memory
in the lower granules which is mixed in with device PAL code etc.  The
uncached driver walks the EFI memmap and pulls out the spill uncached pages
and sticks them into the uncached pool.  Only after these chunks have been
utilized, will it start converting regular cached memory into uncached memory.
Hence the reason for the EFI related code additions.

Signed-off-by: Jes Sorensen <jes@wildopensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:18 -07:00
Martin Hicks 753ee72896 [PATCH] VM: early zone reclaim
This is the core of the (much simplified) early reclaim.  The goal of this
patch is to reclaim some easily-freed pages from a zone before falling back
onto another zone.

One of the major uses of this is NUMA machines.  With the default allocator
behavior the allocator would look for memory in another zone, which might be
off-node, before trying to reclaim from the current zone.

This adds a zone tuneable to enable early zone reclaim.  It is selected on a
per-zone basis and is turned on/off via syscall.

Adding some extra throttling on the reclaim was also required (patch
4/4).  Without the machine would grind to a crawl when doing a "make -j"
kernel build.  Even with this patch the System Time is higher on
average, but it seems tolerable.  Here are some numbers for kernbench
runs on a 2-node, 4cpu, 8Gig RAM Altix in the "make -j" run:

			wall  user   sys   %cpu  ctx sw.  sleeps
			----  ----   ---   ----   ------  ------
No patch		1009  1384   847   258   298170   504402
w/patch, no reclaim     880   1376   667   288   254064   396745
w/patch & reclaim       1079  1385   926   252   291625   548873

These numbers are the average of 2 runs of 3 "make -j" runs done right
after system boot.  Run-to-run variability for "make -j" is huge, so
these numbers aren't terribly useful except to seee that with reclaim
the benchmark still finishes in a reasonable amount of time.

I also looked at the NUMA hit/miss stats for the "make -j" runs and the
reclaim doesn't make any difference when the machine is thrashing away.

Doing a "make -j8" on a single node that is filled with page cache pages
takes 700 seconds with reclaim turned on and 735 seconds without reclaim
(due to remote memory accesses).

The simple zone_reclaim syscall program is at
http://www.bork.org/~mort/sgi/zone_reclaim.c

Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:14 -07:00
Matthew Chapman 4ea78729b8 [IA64] ptrace and restore_sigcontext() allow ar.rsc.pl==0
This patch fixes handling of accesses to ar.rsc via ptrace & restore_sigcontext
[With Thanks to Chris Wright for noticing the restore_sigcontext path]

Signed-off-by: Matthew Chapman <matthewc@hp.com>
Acked-by: David Mosberger <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-21 16:19:20 -07:00
Ken Chen 0393eed5c3 [IA64] fix nested_dtlb_miss handler for hugetlb address
The nested_dtlb_miss handler currently does not handle fault from
hugetlb address correctly.  It walks the page table assuming PAGE_SIZE.
Thus when taking a fault triggered from hugetlb address, it would not
calculate the pgd/pmd/pte address correctly and thus result an incorrect
invocation of ia64_do_page_fault().  In there, kernel will signal SIGBUS
and application dies (The faulting address is perfectly legal and we
have a valid pte for the corresponding user hugetlb address as well).
This patch fix the described kernel bug.  Since nested_dtlb_miss is a
rare event and a slow path anyway, I'm making the change without #ifdef
CONFIG_HUGETLB_PAGE for code readability.  Tony, please apply.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-21 14:40:31 -07:00
Christophe Lucas 52a0de2cd2 [IA64] printk needs KERN_INFO arch/ia64/kernel/smp.c
printk() calls should include appropriate KERN_* constant.

Signed-off-by: Christophe Lucas <clucas@rotomalug.org>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-21 14:21:17 -07:00
David Mosberger-Tang 34b727c135 [IA64] Drop spurious paren in entry.h
The latest assembler catches this typo.  (reported by Jim Wilson).

Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-20 09:34:02 -07:00
Christoph Lameter a2a64769d0 [IA64] Fix race condition in the rt_sigprocmask fastcall
current->blocked will be set to the value of current->thread_info->flags if the
cmpxchg to update thread_info->flags fails. For performance reasons the store into
current->blocked was placed in the cmpxchg loop. However, the cmpxchg overwrites the
register holding the value to be stored. In the rare case of a retry the value of
thread_info->flags will be written into current->blocked.

The fix is to use another register so that the register containing the current->blocked
value is not overwritten.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-09 13:04:30 -07:00
Peter Chubb 05062d96a2 [PATCH] ia64: fix floating-point preemption problem
There've been reports of problems with CONFIG_PREEMPT=y and the high
floating point partition.  This is caused by the possibility of preemption
and rescheduling on a different processor while saving or restioirng the
high partition.

The only places where the FPU state is touched are in ptrace, in
switch_to(), and where handling a floating-point exception.  In switch_to()
preemption is off.  So it's only in trap.c and ptrace.c that we need to
prevent preemption.

Here is a patch that adds commentary to make the conditions clear, and adds
appropriate preempt_{en,dis}able() calls to make it so.  In trap.c I use
preempt_enable_no_resched(), as we're about to return to user space where
the preemption flag will be checked anyway.

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-08 16:21:14 -07:00
Keith Owens 70aa488cff [IA64] Extract correct break number for break.b
break.b does not store the break number in cr.iim, instead it stores 0,
which makes all break.b instructions look like BUG().  Extract the
break number from the instruction itself.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-08 12:25:24 -07:00
Tony Luck 86ebacd360 [IA64] Update comment to describe modes set in default control register.
Christian Hildner pointed out that the comment did not match what the
code does in cpu_init() when we set up the default control register.
Patch based on suggestions from Ken Chen.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-08 12:12:48 -07:00
Keith Owens 866ba633a8 [IA64] Module gp must point to valid memory
Some bits of the kernel assume that gp always points to valid memory,
in particular PHYSICAL_MODE_ENTER() assumes that both gp and sp are
valid virtual addresses with associated physical pages.  The IA64
module loader puts gp well past the end of the module, with no physical
backing.  Offsets on gp are still valid, but physical mode addressing
breaks for modules.  Ensure that gp always falls within the module
body.  Also ensure that gp is 8 byte aligned.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-08 11:41:31 -07:00
Peter Chubb b655913bf3 [IA64] Cleanup compile warnings for ski config
The attached patch cleans up a compilation warning when ACPI
is turned off (i.e., when compiling for the Ski simulator).

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-01 15:20:17 -07:00
Tony Luck fffcc150a2 [IA64] Use "PER_CPU" form of EXPORT macro
I was gently reminded that there are per-cpu forms of the EXPORT_SYMBOL macros.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-31 10:38:32 -07:00
Zhang Yanmin d11cf326bd [IA64] sys_mmap doesn't follow posix.1 when parameter len=0
In IA64 kernel, sys_mmap calls do_mmap2 and do_mmap2 returns addr if
len=0, which means the mmap sys call succeeds.

Posix.1 says:
The mmap() function shall fail if:
[EINVAL] The value of len is zero. 

Here is a patch to fix it.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Acked-by: David Mosberger <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-26 10:19:07 -07:00
Tony Luck fe12e25ebd [IA64] initialize spinlock pfm_alt_install_check
I applied the penultimate version of the perfmon patch, which didn't have
the initialization of the new spinlock that was added.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-18 17:09:06 -07:00
Tony Luck a1ecf7f6e6 [IA64] alternate perfmon handler
Patch from Charles Spirakis

Some linux customers want to optimize their applications on the latest
hardware but are not yet willing to upgrade to the latest kernel. This
patch provides a way to plug in an alternate, basic, and GPL'ed PMU
subsystem to help with their monitoring needs or for specialty work. It
can also be used in case of serious unexpected bugs in perfmon. Mutual
exclusion between the two subsystems is guaranteed, hence no conflict
can arise from both subsystem being present.

Acked-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-18 16:14:30 -07:00
Russ Anderson bb68c12b40 [IA64-SGI] cpe interrupts are not being enabled.
acpi_request_vector() is called in ia64_mca_init() to get the cpe_vector.
The problem is that acpi_request_vector() looks in platform_intr_list[] to 
get the vector, but platform_intr_list[] is not initialized with a valid
vector until later (in sn_setup()).  Without a valid vector the code
defaults to polling mode.

This patch moves the call to acpi_request_vector() from ia64_mca_init()
to ia64_mca_late_init(), which is after platform_intr_list[] is initialized.

Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-17 12:52:43 -07:00
David Mosberger-Tang 02a017a9f3 [IA64] Correct convert_to_non_syscall()
convert_to_non_syscall() has the same problem that unwind_to_user()
used to have.  Fix it likewise.

Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-17 12:33:15 -07:00
David Mosberger-Tang bfd6859408 [IA64] Avoid .spillpsp directive in handcoded assembly
Some time ago, GAS was fixed to bring the .spillpsp directive in line
with the Intel assembler manual (there was some disagreement as to
whether or not there is a built-in 16-byte offset).  Unfortunately,
there are two places in the kernel where this directive is used in
handwritten assembly files and those of course relied on the "buggy"
behavior.  As a result, when using a "fixed" assembler, the kernel
picks up the UNaT bits from the wrong place (off by 16) and randomly
sets NaT bits on the scratch registers.  This can be noticed easily by
looking at a coredump and finding various scratch registers with
unexpected NaT values.  The patch below fixes this by using the
.spillsp directive instead, which works correctly no matter what
assembler is in use.

Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-10 13:52:00 -07:00
David Mosberger-Tang 66302f211a [IA64] fix "section mismatch" compile-time-error
I noticed this typo when trying to compile a kernel which had
CONFIG_HOTPLUG turned off.  In that case, __devinit is no longer a
no-op and the compiler then detects a section-conflict.  Fix by using
__devinitdata instead of __devinit.

Same patch also submitted by Darren Williams to fix compilation error
using sim_defconfig (which has CONFIG_HOTPLUG=n).

Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by:  Darren Williams <dsw@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-09 10:16:17 -07:00
David Mosberger-Tang 966dc11fcc [IA64] Fix stack placement when INIT hits in kernel mode.
Without this patch, the stack is placed _below_ the current task
structure, which is risky at best.

Tony, I think this patch needs to go into 2.6.12, since it fixes a
real bug.  Without it, INIT may case secondary errors, which would be
most unpleasant.

Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-06 10:16:07 -07:00
David Woodhouse bfd4bda097 Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-05-05 13:59:37 +01:00
Tony Luck a71f62edc9 [IA64] Fix two warnings introduced by perfmon patches.
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-03 16:21:45 -07:00
stephane eranian a5a70b75d9 [IA64] another perfmon fix (take2)
- pfm_context_load(): change return value from EINVAL to EBUSY
  when context is already loaded.

- pfm_check_task_state(): pass test if context state is MASKED.
  It is safe to give access on PFM_CTX_MASKED because the PMU
  state (PMD) is stable and saved in software state.
  This helps multiplexing programs such as the example given
  in libpfm-3.1.

Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-03 15:47:58 -07:00
Stephane Eranian 8df5a500a3 [IA64] perfmon & PAL_HALT again
The pmu_active test is based on the values of PSR.up. THIS IS THE PROBLEM as
it does not take into account the lazy restore logic which is as follow (simplified):

context switch out:
	save PMDs
	clear psr.up
	release ownership

context switch in:
	if (ctx->last_cpu == smp_processor_id() && ctx->cpu_activation == cpu_activation) {
		set psr.up
		return
	}
	restore PMD
	restore PMC
	ctx->last_cpu   = smp_processor_id();
	ctx->activation = ++cpu_activation;
	set psr.up

The key here is that on context switch out, we clear psr.up and on context switch in
we check if nobody else used the PMU on that processor since last time we came. In
that case, we assume the PMD/PMC are ours and we simply reactivate.

The Caliper problem is that between the moment we context switch out and the moment we
come back, nobody effectively used the PMU BUT the processor went idle. Normally this
would have no incidence but PAL_HALT does alter the PMU registers.  In default_idle(),
the test on psr.up is not strong enough to cover this case and we go into PAL which
trashed the PMU resgisters. When we come back we falsely assume that this is our state
yet it is corrupted. Very nasty indeed.

To avoid the problem it is necessary to forbid going to PAL_HALT as soon as perfmon
installs some valid state in the PMU registers. This happens with an application
attaches a context to a thread or CPU. It is not enough to check the psr/dcr bits.
Hence I propose the attached patch. It adds a callback in process.c to modify the
condition to enter PAL on idle. Basically, now it is conditional to pal_halt=1 AND
perfmon saying it is okay.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-03 15:44:48 -07:00
Russ Anderson b1b901c202 [IA64] MCA recovery improvements
Jack Steiner uncovered some opportunities for improvement in
the MCA recovery code.

  1) Set bsp to save registers on the kernel stack.
  2) Disable interrupts while in the MCA recovery code.
  3) Change the way the user process is killed, to avoid 
     a panic in schedule.

Testing shows that these changes make the recovery code much 
more reliable with the 2.6.12 kernel.

Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-03 13:47:42 -07:00
David Woodhouse 446b8831f5 [IA64] fix ia64 syscall auditing
Attached is a patch against David's audit.17 kernel that adds checks
for the TIF_SYSCALL_AUDIT thread flag to the ia64 system call and
signal handling code paths.  The patch enables auditing of system
calls set up via fsys_bubble_down, as well as ensuring that
audit_syscall_exit() is called on return from sigreturn.

Neglecting to check for TIF_SYSCALL_AUDIT at these points results in
incorrect information in audit_context, causing frequent system panics
when system call auditing is enabled on an ia64 system.

I have tested this patch and have seen no problems with it.

[Original patch from Amy Griffis ported to current kernel by David Woodhouse]

From: Amy Griffis <amy.griffis@hp.com>
From: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Chris Wright <chrisw@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-03 13:45:39 -07:00
Zwane Mwaikambo 7d5f9c0f10 [IA64] reduce cacheline bouncing in cpu_idle_wait
Andi noted that during normal runtime cpu_idle_map is bounced around a lot,
and occassionally at a higher frequency than the timer interrupt wakeup
which we normally exit pm_idle from.  So switch to a percpu variable.

I didn't move things to the slow path because it would involve adding
scheduler code to wakeup the idle thread on the cpus we're waiting for.

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-03 13:40:18 -07:00
Alex Williamson bb0fc08545 [IA64] use common pxm function
This patch simplifies a couple places where we search for _PXM
values in ACPI namespace.  Thanks,

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-03 13:33:18 -07:00
David Mosberger-Tang 9df6f705c0 [IA64] fix typos caught by new assembler
Patch below fixes 3 trivial typos which are caught by the new
assembler (v2.169.90).  Please apply.

[Note: fix to memcpy that was also part of this patch was separately
 applied from patches by H.J. and Andreas ... so the delta here only
 has the other two fixes. -Tony]

Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-03 10:56:42 -07:00
David Woodhouse 27b030d58c Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-05-03 08:14:09 +01:00
Jesper Juhl 7ed20e1ad5 [PATCH] convert that currently tests _NSIG directly to use valid_signal()
Convert most of the current code that uses _NSIG directly to instead use
valid_signal().  This avoids gcc -W warnings and off-by-one errors.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:14 -07:00
Stephen Rothwell 7d87e14c23 [PATCH] consolidate sys_shmat
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:12 -07:00
Amy Griffis 3ac3ed555b [PATCH] fix ia64 syscall auditing
Attached is a patch against David's audit.17 kernel that adds checks
for the TIF_SYSCALL_AUDIT thread flag to the ia64 system call and
signal handling code paths.The patch enables auditing of system
calls set up via fsys_bubble_down, as well as ensuring that
audit_syscall_exit() is called on return from sigreturn.

Neglecting to check for TIF_SYSCALL_AUDIT at these points results in
incorrect information in audit_context, causing frequent system panics
when system call auditing is enabled on an ia64 system.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2005-04-29 16:12:55 +01:00
2fd6f58ba6 [AUDIT] Don't allow ptrace to fool auditing, log arch of audited syscalls.
We were calling ptrace_notify() after auditing the syscall and arguments,
but the debugger could have _changed_ them before the syscall was actually
invoked. Reorder the calls to fix that.

While we're touching ever call to audit_syscall_entry(), we also make it
take an extra argument: the architecture of the syscall which was made,
because some architectures allow more than one type of syscall.

Also add an explicit success/failure flag to audit_syscall_exit(), for
the benefit of architectures which return that in a condition register
rather than only returning a single register.

Change type of syscall return value to 'long' not 'int'.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2005-04-29 16:08:28 +01:00
Kenji Kaneshige b9e41d7fb6 [IA64] iosapic.c: typo ... s/spin_unlock_irq/spin_unlock/
vector sharing patch had a typo ... mismatched spin_lock() with
a spin_unlock_irq().  Fix from Kenji Kaneshige.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-25 13:27:48 -07:00
Tony Luck e1ed81ab7a [IA64] print "siblings" before {physical,core,thread} id
Rohit and Suresh changed their mind about the order to print things
in /proc/cpuinfo, but didn't include the change in the version of
the patch they sent to me.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-25 13:27:12 -07:00
Kenji Kaneshige 24eeb568ae [IA64] vector sharing (Large I/O system support)
Current ia64 linux cannot handle greater than 184 interrupt sources
because of the lack of vectors. The following patch enables ia64 linux
to handle greater than 184 interrupt sources by allowing the same
vector number to be shared by multiple IOSAPIC's RTEs. The design of
this patch is besed on "Intel(R) Itanium(R) Processor Family Interrupt
Architecture Guide".

Even if you don't have a large I/O system, you can see the behavior of
vector sharing by changing IOSAPIC_LAST_DEVICE_VECTOR to fewer value.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-25 13:26:23 -07:00
Suresh Siddha e927ecb05e [IA64] multi-core/multi-thread identification
Version 3 - rediffed to apply on top of Ashok's hotplug cpu
patch.  /proc/cpuinfo output in step with x86.

This is an updated MC/MT identification patch based on the 
previous discussions on list. 

Add the Multi-core and Multi-threading detection for IPF.
  - Add new core and threading related fields in /proc/cpuinfo.
		Physical id
		Core id
		Thread id
		Siblings
  - setup the cpu_core_map and cpu_sibling_map appropriately
  - Handles Hot plug CPU
 
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Gordon Jin <gordon.jin@intel.com>
Signed-off-by: Rohit Seth <rohit.seth@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-25 13:25:06 -07:00
David Mosberger-Tang a37d98f6a9 [IA64] fix syscall-optimization goof
Sadly, I goofed in this syscall-tuning patch:

ChangeSet 1.1966.1.40 2005/01/22 13:31:05 davidm@hpl.hp.com
  [IA64] Improve ia64_leave_syscall() for McKinley-type cores.

  Optimize ia64_leave_syscall() a bit better for McKinley-type cores.
  The patch looks big, but that's mostly due to renaming r16/r17 to r2/r3.
  Good for a 13 cycle improvement.

The problem is that the size of the physical stacked registers was
loaded into the wrong register (r3 instead of r17).  Since r17 by
coincidence always had the value 1, this had the effect of turning
rse_clear_invalid into a no-op.  That poses the risk of leaking kernel
state back to user-land and is hence not acceptable.

The fix below is simple, but unfortunately it costs us about 28 cycles
in syscall overhead. ;-(

Unfortunately, there isn't much we can do about that since those
registers have to be cleared one way or another.

	--david

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-25 13:20:38 -07:00
Stephane Eranian 4944930ab7 [IA64] perfmon: make pfm_sysctl a global, and other cleanup
- make pfm_sysctl a global such that it is possible
  to enable/disable debug printk in sampling formats
  using PFM_DEBUG.

- remove unused pfm_debug_var variable

- fix a bug in pfm_handle_work where an BUG_ON() could
  be triggered. There is a path where pfm_handle_work()
  can be called with interrupts enabled, i.e., when
  TIF_NEED_RESCHED is set. The fix correct the masking
  and unmasking of interrupts in pfm_handle_work() such
  that we restore the interrupt mask as it was upon entry.

signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-25 13:08:30 -07:00
David Mosberger-Tang 30325d1771 [IA64] speed up syscall path a bit more
Recently I noticed that clearing ar.ssd/ar.csd right before srlz.d is
causing significant stalling in the syscall path.  The patch below
fixes that by moving the register-writes after srlz.d.  On a Madison,
this drops break-based getpid() from 241 to 226 cycles (-15 cycles).

Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-25 13:03:16 -07:00
Keith Owens e8d1cb2f28 [IA64] Tighten up unw_unwind_to_user check
Detect user space by the unwind frame with predicate PRED_USER_STACK
set, instead of a user space IP.  Tighten up the last ditch check for
running off the top of the kernel stack.

Based on a suggestion by David Mosberger, reworked to fit the current
tree.  This survives my stress test which used to break 2.6.9 kernels.
Unlike 2.6.11, the stress test now unwinds to the correct point, so
gdb can get the user space registers.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-25 11:45:26 -07:00
David Mosberger-Tang 8297511530 [IA64] add missing cpu_relax() in ITC syncing code
Call cpu_relax() in busy-waiting loops of the ITC-syncing code.

Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-25 11:44:02 -07:00
Ashok Raj df6c6804ce [IA64] Fix build errors for !HOTPLUG case.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-22 14:46:24 -07:00
Ashok Raj b8d8b883e6 [IA64] cpu hotplug: return offlined cpus to SAL
This patch is required to support cpu removal for IPF systems. Existing code
just fakes the real offline by keeping it run the idle thread, and polling
for the bit to re-appear in the cpu_state to get out of the idle loop.

For the cpu-offline to work correctly, we need to pass control of this CPU 
back to SAL so it can continue in the boot-rendez mode. This gives the
SAL control to not pick this cpu as the monarch processor for global MCA
events, and addition does not wait for this cpu to checkin with SAL
for global MCA events as well. The handoff is implemented as documented in 
SAL specification section 3.2.5.1 "OS_BOOT_RENDEZ to SAL return State"

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-22 14:44:40 -07:00
Linus Torvalds 1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00