Commit Graph

123 Commits

Author SHA1 Message Date
Stuart Menefy d3ea9fa0a5 sh: Minor optimisations to FPU handling
A number of small optimisations to FPU handling, in particular:

 - move the task USEDFPU flag from the thread_info flags field (which
   is accessed asynchronously to the thread) to a new status field,
   which is only accessed by the thread itself. This allows locking to
   be removed in most cases, or can be reduced to a preempt_lock().
   This mimics the i386 behaviour.

 - move the modification of regs->sr and thread_info->status flags out
   of save_fpu() to __unlazy_fpu(). This gives the compiler a better
   chance to optimise things, as well as making save_fpu() symmetrical
   with restore_fpu() and init_fpu().

 - implement prepare_to_copy(), so that when creating a thread, we can
   unlazy the FPU prior to copying the thread data structures.

Also make sure that the FPU is disabled while in the kernel, in
particular while booting, and for newly created kernel threads,

In a very artificial benchmark, the execution time for 2500000
context switches was reduced from 50 to 45 seconds.

Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-11-24 17:45:38 +09:00
Paul Mundt 49fb2cd257 Merge branch 'master' into sh/st-integration 2009-11-24 16:32:11 +09:00
Giuseppe CAVALLARO a0458b07c1 sh: add sleazy FPU optimization
sh port of the sLeAZY-fpu feature currently implemented for some architectures
such us i386.

Right now the SH kernel has a 100% lazy fpu behaviour.
This is of course great for applications that have very sporadic or no FPU use.
However for very frequent FPU users...  you take an extra trap every context
switch.
The patch below adds a simple heuristic to this code: after 5 consecutive
context switches of FPU use, the lazy behavior is disabled and the context
gets restored every context switch.
After 256 switches, this is reset and the 100% lazy behavior is returned.

Tests with LMbench showed no regression.
I saw a little improvement due to the prefetching (~2%).

The tests below also show that, with this sLeazy patch, indeed,
the number of FPU exceptions is reduced.
To test this. I hacked the lat_ctx LMBench to use the FPU a little more.

   sLeasy implementation
   ===========================================
   switch_to calls            |  79326
   sleasy   calls             |  42577
   do_fpu_state_restore  calls|  59232
   restore_fpu   calls        |  59032

   Exceptions:  0x800 (FPU disabled  ): 16604

   100% Leazy (default implementation)
   ===========================================
   switch_to  calls            |  79690
   do_fpu_state_restore calls  |  53299
   restore_fpu  calls          |   53101

   Exceptions: 0x800 (FPU disabled  ):  53273

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-11-24 16:23:38 +09:00
Paul Mundt c4e708dc52 sh: Fix up the CONFIG_PERF_EVENTS=n build for SH-4.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-11-12 16:20:36 +09:00
Paul Mundt 1d823323f2 sh: perf events: Add support for SH7750-style counters.
This adds perf events support for the SH7750/SH7750S/SH7091 performance
counters.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-11-05 17:02:03 +09:00
Paul Mundt ac6a0cf671 Merge branch 'master' into sh/smp
Conflicts:
	arch/sh/mm/cache-sh4.c
2009-09-01 13:54:14 +09:00
Kuninori Morimoto b37c7c66f0 sh: fix CPU_SH7723/7724 numbering bug
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-27 11:35:46 +09:00
Yoshihiro Shimoda c01f0f1a4a sh: Add initial support for SH7757 CPU subtype
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-21 17:25:47 +09:00
Paul Mundt eccee7457d sh: Kill off the unhandled pvr case in SH-4 CPU probing.
This is superfluous, as the default CPU type and family are already
established by the initial cpuinfo definition. Given that we are still
able to probe for the CPU family even if we are not able to detect the
subtype, it's preferable to let the probing code fill out what it can and
leave the rest.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-15 13:15:02 +09:00
Paul Mundt e82da214d2 sh: Track the CPU family in sh_cpuinfo.
This adds a family member to struct sh_cpuinfo, which allows us to fall
back more on the probe routines to work out what sort of subtype we are
running on. This will be used by the CPU cache initialization code in
order to first do family-level initialization, followed by subtype-level
optimizations.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-15 10:48:13 +09:00
Magnus Damm 955c9863bb sh: convert processor device setup functions to arch_initcall()
Convert the processor platform device setup
functions from __initcall() and sometimes
device_initcall() to arch_initcall().

This makes sure that the platform devices are
registered a bit earlier so the devices are
available when drivers register using initcall
levels earlier than device_initcall().

A good example is platform devices needed by
i2c-sh_mobile.c which registers a bit earlier
using subsys_initcall().

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-07-23 13:06:07 +09:00
Paul Mundt 0bf8513ed0 sh: Tidy up SH-4A boot_cpu_data.flags probing.
This tidies up the boot_cpu_data.flags probing on SH-4A. All of them have
a few things in common, which we can blindly set, rather than having each
subtype have to set the same flags. We can also make assumptions about
cache ways and the validity of PTEA, so this also kills off CPU_HAS_PTEA
as a config option. There was also a bug in the FPU probing, which is now
tidied up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-06-01 19:50:08 +09:00
Paul Mundt 7863d3f7ae sh: Tidy up the optional L2 probing, wire it up for SH7786.
This tidies up the L2 probing, as it may or may not be implemented on a
CPU, regardless of whether it is supported. This converts the cvr
validity checks from BUG_ON()'s to simply clearing the CPU_HAS_L2_CACHE
flag and moving on with life.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-06-01 19:38:41 +09:00
Kuninori Morimoto 98fbe45bea sh: SH7724 has an L2 cache.
Add the CPU_HAS_L2_CACHE flag to SH7724.

Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-06-01 15:59:03 +09:00
Paul Mundt 253b0887b3 sh: clkfwk: Rework legacy CPG clock handling.
This moves out the old legacy CPG clocks to their own file, and converts
over the existing users. With these clocks going away and each CPU
dealing with them on their own, CPUs can gradually move over to the new
interface.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-13 17:38:11 +09:00
Paul Mundt af777ce42d sh: clkfwk: module_clk -> peripheral_clk rename.
For consistenct naming, and to allow us to fix up some confusion in the
SH-Mobile clock framework, amongst other places.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-13 16:59:40 +09:00
Paul Mundt fd5b12458b Merge branch 'master' into sh/clkfwk 2009-05-12 19:54:36 +09:00
Magnus Damm 67d889bd82 sh: add sh4-202 INTC tables
This patch adds INTC tables for sh4-202 with support
for HUDI, TMU0, TMU1, TMU2, RTC, SCIF and WDT.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-12 19:37:26 +09:00
Magnus Damm 5f8a29ba39 sh: TMU platform data for sh4-202
This patch adds TMU platform data for sh4-202. Both clockevent
and clocksource support is enabled.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-12 19:37:21 +09:00
Paul Mundt 9fe5ee0efb sh: clkfwk: Use arch_clk_init() for on-chip clock registration.
CPUs registering on-chip clocks should be using arch_clk_init() with the
new scheme so that the CPUs have the opportunity to establish the
topology prior to the initial root clock rate propagation. This ensures
that CPUs with on-chip clocks that use CLK_ENABLE_ON_INIT are properly
enabled at the initial propagation time, without having to further poke
the root clocks.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-12 19:29:04 +09:00
Paul Mundt f5c84cf508 sh: clkfwk: Tidy up on-chip clock registration and rate propagation.
This tidies up the set_rate hack that the on-chip clocks were abusing to
trigger rate propagation, which is now handled generically.

Additionally, now that CLK_ENABLE_ON_INIT is wired up where it needs to
be for these clocks, the clk_enable() can go away. In some cases this was
bumping up the refcount higher than it should have been.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-12 05:59:27 +09:00
Paul Mundt 4ff29ff8e8 sh: clkfwk: Consolidate the ALWAYS_ENABLED / NEEDS_INIT mess.
There is no real distinction here in behaviour, either a clock needs to
be enabled on initialiation or not. The ALWAYS_ENABLED flag was always
intended to only apply to clocks that were physically always on and could
simply not be disabled at all from software. Unfortunately over time this
was abused and the meaning became a bit blurry.

So, we kill off both of all of those paths now, as well as the newer
NEEDS_INIT flag, and consolidate on a CLK_ENABLE_ON_INIT. Clocks that
need to be enabled on initialization can set this, and it will purposely
enable them and bump the refcount up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-12 05:14:53 +09:00
Paul Mundt b68d820143 sh: clkfwk: Make recalc return an unsigned long.
This is prep work for cleaning up some of the rate propagation bits.
Trivial conversion.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-12 03:45:08 +09:00
Magnus Damm 3d6ad46021 sh: multiple vectors per irq - sh7760
Update intc tables and platform data to use one linux irq
per maskable interrupt source instead of keeping the one-to-one
mapping between vectors and linux irqs.

This fixes potential irq masking issues for sh7760 hardware
blocks such as DMAC/TMU2/REF.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-11 21:59:58 +09:00
Magnus Damm c42f32dca3 sh: TMU platform data for sh7760
This patch adds TMU platform data for sh7760. Both clockevent
and clocksource support is enabled.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-11 18:45:52 +09:00
Magnus Damm 03f408f1aa sh: TMU platform data for sh775x
This patch adds TMU platform data for sh775x. Both clockevent
and clocksource support is enabled.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-11 18:45:49 +09:00
Kuninori Morimoto 0207a2efb4 sh: Add support for SH7724 (SH-Mobile R2R) CPU subtype.
This implements initial support for the SH-Mobile R2R CPU.
Based on Rev 0.11 of the initial SH7724 hardware manual.

Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-04-16 14:40:56 +09:00
Paul Mundt e8208828dc sh: Kill off broken direct-mapped cache mode.
Forcing direct-mapped worked on certain older 2-way set associative
parts, but was always error prone on 4-way parts. As these are the
norm these days, there is not much point in continuing to support this
mode. Most of the folks that used direct-mapped mode generally just
wanted writethrough caching in the first place..

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-04-02 17:40:16 +09:00
Paul Mundt 8263a67e16 sh: Support for extended ASIDs on PTEAEX-capable SH-X3 cores.
This adds support for extended ASIDs (up to 16-bits) on newer SH-X3 cores
that implement the PTAEX register and respective functionality. Presently
only the 65nm SH7786 (90nm only supports legacy 8-bit ASIDs).

The main change is in how the PTE is written out when loading the entry
in to the TLB, as well as in how the TLB entry is selectively flushed.

While SH-X2 extended mode splits out the memory-mapped U and I-TLB data
arrays for extra bits, extended ASID mode splits out the address arrays.
While we don't use the memory-mapped data array access, the address
array accesses are necessary for selective TLB flushes, so these are
implemented newly and replace the generic SH-4 implementation.

With this, TLB flushes in switch_mm() are almost non-existent on newer
parts.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-03-17 17:49:49 +09:00
Magnus Damm 2ef7f0dab6 sh: hibernation support
Add Suspend-to-disk / swsusp / CONFIG_HIBERNATION support
to the SuperH architecture.

To suspend, use "swapon /dev/sda2; echo disk > /sys/power/state"
To resume, pass "resume=/dev/sda2" on the kernel command line.

The patch "pm: rework includes, remove arch ifdefs V2" is
needed to allow the generic swsusp code to build properly.

Hibernation is not enabled with this patch though, a patch
setting ARCH_HIBERNATION_POSSIBLE will be submitted later.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-03-10 12:55:40 +09:00
Kuninori Morimoto 55ba99eb21 sh: Add support for SH7786 CPU subtype.
This adds preliminary support for the SH7786 CPU subtype.

While this is a dual-core CPU, only UP is supported for now. L2 cache
support is likewise not yet implemented.

More information on this particular CPU subtype is available at:

	http://www.renesas.com/fmwk.jsp?cnt=sh7786_root.jsp&fp=/products/mpumcu/superh_family/sh7780_series/sh7786_group/

Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-03-03 15:40:25 +09:00
Magnus Damm 69977e7e25 sh: multiple vectors per irq - sh7750
Update intc tables and platform data to use one linux irq
per maskable interrupt source instead of keeping the one-to-one
mapping between vectors and linux irqs.

This fixes potential irq masking issues for sh775x hardware
blocks such as SCI/SCIF/RTC/DMAC/TMU2/REF.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-02-27 16:53:50 +09:00
Carmelo AMOROSO 0f6dee232f sh: fcnvds fix with denormalized numbers on SH-4 FPU.
This fixes a bug in the FPU exception handler for the FCNVDS instruction.
To get the register number the instruction is shifted right by 9,
though it should be shifted right by 8.

More information at ST Linux bugzilla:

	https://bugzilla.stlinux.com/show_bug.cgi?id=4892

Signed-off-by: Giuseppe Di Giore <giuseppe.di-giore@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-01-29 11:56:02 +09:00
Paul Mundt e9bf51e5cc sh: __udivdi3 -> do_div() in softfloat lib.
Inhibit the generation of __udivdi3 for the softfloat lib, use do_div()
outright.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-12-22 18:42:53 +09:00
Luca Santini 53abf911fa sh: Enable IRLM mode for SH7760 IRQ_MODE_IRQ.
Follows the same setting as SH7750.

Signed-off-by: Luca Santini <luca.santini@spesonline.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-09-08 11:54:56 +09:00
Carl Shaw b6ad1e8c3f sh: Subnormal double to float conversion
This patch adds support for the SH4 to convert a subnormal double
into a float by catching the FPE and implementing the FCNVDS
instruction in software.

Signed-off-by: Carl Shaw <carl.shaw@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-09-08 10:35:05 +09:00
Paul Mundt 6a9545bd95 sh: Fix up broken kerneldoc comments.
These were completely unparseable, so fix them up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-08-04 12:51:06 +09:00
Paul Mundt f15cbe6f1a sh: migrate to arch/sh/include/
This follows the sparc changes a439fe51a1.

Most of the moving about was done with Sam's directions at:

http://marc.info/?l=linux-sh&m=121724823706062&w=2

with subsequent hacking and fixups entirely my fault.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-07-29 08:09:44 +09:00
Paul Mundt 068f59143d sh: Record the major cut revision for probed SH-4A parts.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-07-28 18:10:32 +09:00
Stuart Menefy 3611ee7acc sh: Stub in silicon cut in CPU info.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-07-28 18:10:32 +09:00
Magnus Damm b76baf4cf5 sh: add probe support for new sh7723 cut
This patch adds support for sh7723 silicon with a prr value of 0x51.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-05-23 11:53:20 +09:00
Paul Mundt 440fc172ae sh: Fix up L2 cache probe.
SH7723 is the first hard silicon to implement the L2, and unsurprisingly,
does the precise inverse of what the specification alleges. XOR the
URAM/L2 size bits to get back in line with the existing parsing logic.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-04-18 09:50:07 -07:00
Paul Mundt e5a4c65bef sh: Fix up SH-4A part probe.
The SH-4A series probe we were relying on doesn't work any more on the
newer parts, bump this up to use CVR.CHIP instead so we have consistent
behaviour across all of the parts, which is what this should have been
testing in the first place.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-04-18 09:50:07 -07:00
Paul Mundt 178dd0cd28 sh: Add support for SH7723 CPU subtype.
This adds basic support for the SH7723 MobileR2 CPU.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-04-18 09:50:07 -07:00
Paul Mundt 9bbafce2ee sh: Fix occasional FPU register corruption under preempt.
Presently with preempt enabled there's the possibility to be preempted
after the TIF_USEDFPU test and the register save, leading to bogus
state post-__switch_to(). Use an explicit preempt_disable()/enable()
pair around unlazy_fpu()/clear_fpu() to avoid this. Follows the x86
change.

Reported-by: Takuo Koguchi <takuo.koguchi.sw@hitachi.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-03-26 19:02:47 +09:00
Harvey Harrison 866e6b9e50 sh: replace remaining __FUNCTION__ occurrences
__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-03-06 11:18:22 +09:00
Paul Mundt 96de1a8f02 serial: Move asm-sh/sci.h to linux/serial_sci.h.
This header is needed on other architectures as well (namely h8300),
which currently fails to build without this in place. Rather than
duplicating the port definition completely there, just move this to a
common location instead.

This should get h8300 working again for 2.6.25, in addition to the
changes already pushed by Sato-san in -rc2.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-02-26 14:52:45 +09:00
Magnus Damm 9109a30e5a sh: add support for sh7366 processor
This patch adds sh7366 cpu supports. Just the most basic things like interrupt
controller, clocks and serial port are included at this point.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-02-14 14:22:10 +09:00
Magnus Damm 1cfb629cfa sh: add probe support for new sh7722 cut
This patch adds support for sh7722 devices with prr value 0xa1.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-02-14 14:22:07 +09:00
Alexey Dobriyan 25478445c4 Fix container_of() usage
Using "attr" twice is not OK, because it effectively prohibits such
container_of() on variables not named "attr".

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:32 -08:00