These should not be in the Git history - they are auto-generated.
Extend the Makefile rules of the parser files to include the generation
run.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120327183335.GA27621@gmail.com
[ committer note: Fixed up O= handling ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kernel.org is hosting patches and kernel compressed with xz (lzma2+).
Allow scripts/patch-kernel to decompress these files.
Signed-off-by: Shawn Landden <shawnlandden@gmail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Provide a -r option to display when fragments contain redundant
options. This is really useful when breaking apart a config into
fragments, as well as cleaning up older fragments.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Acked-by: Bruce Ashfield <bruce.ashfield@windriver.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Somehow the merge_config.sh script didn't get its execute bit
set when it was merged. Fix this.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Acked-by: Bruce Ashfield <bruce.ashfield@windriver.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
ACPI 5.0 adds the BGRT, a table that contains a pointer to the firmware
boot splash and associated metadata. This simple driver exposes it via
/sys/firmware/acpi in order to allow bootsplash applications to draw their
splash around the firmware image and reduce the number of jarring graphical
transitions during boot.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Drivers may wish to add entries to /sys/firmware/acpi, so export acpi_kobj
in order to let them do that.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Make sure the removal of mappings uses the same logic that put the
mappings in place.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The function apei_estatus_print() and apei_estatus_check() forget to move ahead
the gdata pointer when dealing with multiple generic error data sections.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The acpi_processor_cst_has_changed() function is invoked from a
CPU_ONLINE or CPU_DEAD function, which might well execute on CPU 0
even though the CPU being hotplugged is some other CPU. In addition,
acpi_processor_cst_has_changed() invokes smp_processor_id() without
protection, resulting in splats when onlining CPUs.
This commit therefore changes the smp_processor_id() to pr->id, as is
used elsewhere in the code, for example, in acpi_processor_add().
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Yong Zhang <yong.zhang0@gmail.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
... so that acpi_unmap()'s behavior gets in sync with acpi_map()'s.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Len Brown <len.brown@intel.com>
During testing pci root bus removal, found some root bus bridge is not freed.
If booting with pnpacpi=off, those hostbridge could be freed without problem.
It turns out that some devices reference are not released during acpi_pnp_match.
that match should not hold one device ref during every calling.
Add pu_device calling before returning.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
When processor is being hot-added to the system, acpi_map_lsapic invokes
ACPI _MAT method to find APIC ID and flags, verifies that returned structure
is indeed ACPI's local APIC structure, and that flags contain MADT_ENABLED
bit. Then saves APIC ID, frees structure - and accesses structure when
computing arguments for acpi_register_lapic call. Which sometime leads
to acpi_register_lapic call being made with second argument zero, failing
to bring processor online with error 'Unable to map lapic to logical cpu
number'.
As lapic->lapic_flags & ACPI_MADT_ENABLED was already confirmed to be non-zero
few lines above, we can just pass unconditional ACPI_MADT_ENABLED to the
acpi_register_lapic.
Signed-off-by: Petr Vandrovec <petr@vmware.com>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The function acpi_processor_add is stored in the ops.add field of a
acpi_driver structure. This function is then called in
acpi_bus_driver_init. On failure, this function clears the field
device->driver_data, but does not free its contents. Thus the free has to
be done by the add function. In acpi_processor_add, the corresponding
value is pr. This value is currently freed on failure before storing it in
device->driver_data, but not after. This free is added in the error
handling code at the end of the function. The per_cpu variable
processors is also cleared so that it does not refer to a dangling pointer.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The current code incorrectly assumes that
(1) the APEI register bit width is always 8, 16, 32, or 64 and
(2) the APEI register bit width is always equal to the APEI
register access width.
ERST serialization instructions entries such as:
[030h 0048 1] Action : 00 [Begin Write Operation]
[031h 0049 1] Instruction : 03 [Write Register Value]
[032h 0050 1] Flags (decoded below) : 01
Preserve Register Bits : 1
[033h 0051 1] Reserved : 00
[034h 0052 12] Register Region : [Generic Address Structure]
[034h 0052 1] Space ID : 00 [SystemMemory]
[035h 0053 1] Bit Width : 03
[036h 0054 1] Bit Offset : 00
[037h 0055 1] Encoded Access Width : 03 [DWord Access:32]
[038h 0056 8] Address : 000000007F2D7038
[040h 0064 8] Value : 0000000000000001
[048h 0072 8] Mask : 0000000000000007
break this assumption by yielding:
[Firmware Bug]: APEI: Invalid bit width in GAR [0x7f2d7038/3/0]
I have found no ACPI specification requirements corresponding
with the above assumptions. There is even a good example in
the Serialization Instruction Entries section (ACPI 4.0 section
17.4,1.2, ACPI 4.0a section 2.5.1.2, ACPI 5.0 section 18.5.1.2)
that mentions a serialization instruction with a bit range of
[6:2] which is 5 bits wide, _not_ 8, 16, 32, or 64 bits wide.
Compile and boot tested with 3.3.0-rc7 on a IBM HX5.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Add description of parameter notrigger in the einj.txt.
One can utilize this new parameter to do some SRAR injection
test. Pay attention, the operation is highly depended on the
BIOS implementation. If no proper BIOS supports it, even if
enabling this parameter, expected result will not happen.
v2:
Update the documentation suggested by Tony
Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Some APEI firmware implementation will access injected address
specified in param1 to trigger the error when injecting memory
error, which means if one SRAR error is injected, the crash
always happens because it is executed in kernel context. This
new parameter can disable trigger action and control is taken
over by the user. In this way, an SRAR error can happen in user
context instead of crashing the system. This function is highly
depended on BIOS implementation so please ensure you know the
BIOS trigger procedure before you enable this switch.
v2:
notrigger should be created together with param1/param2
Tested-by: Tony Luck <tony.luck@lintel.com>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
On the platforms with ACPI4.x support, parameter extension
is not always doable, which means only parameter extension
is enabled, einj_param can take effect.
v2->v1: stopping early in einj_get_parameter_address for einj_param
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This fixes a trivial copy & paste error in ERST header length check.
It's just for future safety because sizeof(struct acpi_table_einj)
equals to sizeof(struct acpi_table_erst) with current ACPI5.0
specification. It applies to v3.3-rc6.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
power_usage is always assigned a negative value and should be declared
a signed integer
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Currently when a CPU is off-lined it enters either MWAIT-based idle or,
if MWAIT is not desired or supported, HLT-based idle (which places the
processor in C1 state). This patch allows processors without MWAIT
support to stay in states deeper than C1.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.15 (GNU/Linux)
iEYEABECAAYFAk91TL0ACgkQGkmNcg7/o7hEjwCgmuz6QQKkow7e5q0x7DR5Z2NH
1YoAn3TpODDmpaBiou26uMRPhcR6e1qC
=JCA0
-----END PGP SIGNATURE-----
Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
Pull SuperH updates from Paul Mundt.
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh: (25 commits)
sh: Support I/O space swapping where needed.
sh: use set_current_blocked() and block_sigmask()
sh: no need to reset handler if SA_ONESHOT
sh: intc: Fix up section mismatch for intc_ack_data
sh: select ARCH_DISCARD_MEMBLOCK.
sh: Consolidate duplicate _32/_64 unistd definitions.
sh: ecovec: switch SDHI controllers to card polling
sh: Avoid exporting unimplemented syscalls.
sh: add platform_device for RSPI in setup-sh7757
SH: pci-sh7780: enable big-endian operation.
serial: sh-sci: fix a race of DMA submit_tx on transfer
sh: dma: Collect up CHCR of SH7763, SH7764, SH7780 and SH7785
sh: dma: Collect up CHCR of SH7723 and SH7730
sh/next: Fix build fail by asm/system.h in asm/bitops.h
arch/sh/drivers/dma/{dma-g2,dmabrg}.c: ensure arguments to request_irq and free_irq are compatible
sh: cpufreq: Wire up scaling_available_freqs support.
sh: cpufreq: notify about rate rounding fallback.
sh: cpufreq: Support CPU clock frequency table.
sh: cpufreq: struct device lookup from CPU topology.
sh: cpufreq: percpu struct clk accounting.
...
acpi_processor_install_hotplug_notify() registers processor objects to
receive ACPI CPU hotplug event notifications. This patch additionally
registers processor device objects (ACPI0007) to receive the notifications
as well.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Print physical address info in a style consistent with the %pR style used
elsewhere in the kernel.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Len Brown <len.brown@intel.com>
An HP laptop (Pavilion G4-1016tx) has the following code in _TMP:
Store (\_SB.PCI0.LPCB.EC0.RTMP, Local0)
If (LGreaterEqual (Local0, S4TP))
{
Store (One, HTS4)
}
S4TP is initialised at 0 and not programmed further until either _HOT or
_CRT is called. If we evaluate _TMP before the trip points then HTS4 will
always be set, causing the firmware to generate a message on boot
complaining that the system shut down because of overheating. The simplest
solution is just to reverse the checking of trip points and _TMP in thermal
init.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
acpi_dev_run_wake() is a generic function which can be used by
other subsystem too. Rename it to acpi_pm_device_run_wake, to be
consistent with acpi_pm_device_sleep_wake.
Then move it to ACPI core.
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
As far as I can see, this field is never used in the code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Len Brown <len.brown@intel.com>
All the modules name are ro-data, it is never copied to the array.
eg.
static struct cpuidle_driver intel_idle_driver = {
.name = "intel_idle",
.owner = THIS_MODULE,
};
It safe to assign the pointer of this ro-data to a const char *.
By this way we save 12 bytes.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Tested-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
If the state_count is not initialized for the device use
the driver's state count as the default. That will prevent
to add it manually in the cpuidle driver initialization
routine and will save us from duplicate line of code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Some C states of new CPU might be not good. One reason is BIOS might
configure them incorrectly. To help developers root cause it quickly, the
patch adds a new sysfs entry, so developers could disable specific C state
manually.
In addition, C state might have much impact on performance tuning, as it
takes much time to enter/exit C states, which might delay interrupt
processing. With the new debug option, developers could check if a deep C
state could impact performance and how much impact it could cause.
Also add this option in Documentation/cpuidle/sysfs.txt.
[akpm@linux-foundation.org: check kstrtol return value]
Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
Reviewed-by: Yanmin Zhang <yanmin_zhang@intel.com>
Reviewed-and-Tested-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Devices may share same list of power resources in _PR0, for example
Device(Dev0)
{
Name (_PR0, Package (0x01)
{
P0PR,
P1PR
})
}
Device(Dev1)
{
Name (_PR0, Package (0x01)
{
P0PR,
P1PR
}
}
Assume Dev0 and Dev1 were runtime suspended.
Then Dev0 is resumed first and it goes into D0 state.
But Dev1 is left in D0_Uninitialised state.
This is wrong. In this case, Dev1 must be resumed too.
In order to hand this case, each power resource maintains a list of
devices which relies on it.
When power resource is ON, it will check if the devices on its list
can be resumed. The device can only be resumed when all the power
resouces of its _PR0 are ON.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
If a device has _PR3, it means the device supports D3_COLD.
Add the ability to validate and enter D3_COLD state in ACPI.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Version 20120320.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Repair a common problem with objects that are defined to return
a variable-length Package of sub-objects. If there is only one
sub-object, some BIOS code mistakenly simply declares the single
object instead of a Package with one sub-object. This function
attempts to repair this error by wrapping a Package object around
the original object, creating the correct and expected Package
with one sub-object.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
We accidentally removed the check for NULL in 3aac0ef10b "Input: wacom -
isolate input registration".
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Bagwell <chris@cnpbagwell.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Some ACPI interrupt actions may need to wait, and it's easiest to
have a thread context for this. So turn the ACPI interrupt
into a threaded interrupt.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
WARN() is not supposed to have side effects, so move the request_regions
outside.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
jump_label.c needs asm/cacheflush.h to get flushi().
kgdb_64.c needs asm/cacheflush.h to get flushw_all().
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes this build error:
kernel/signal.c: In function 'ptrace_stop':
kernel/signal.c:1860:3: error: implicit declaration of function 'synchronize_user_stack' [-Werror=implicit-function-declaration]
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull urgent cgroup fix from Tejun Heo:
"Commit 61d1d219c4 ('cgroup: remove extra calls to
find_existing_css_set') which was part of the rc1 cgroup pull request
made writes to the cgroup "tasks" file return an uninitialized retval
on success which can cause boot failures with systemd.
The change stayed in linux-next for quite some time but gcc
interestingly failed to emit warning about using uninitialized
variable and the problem seems to materialize only for certain build
combinations (probably depends on register allocation).
It's just missing local variable initialization and the fix is trivial
& safe. As the problem is critical when it materializes, I'm
fast-tracking it. Also included is Li's email address change in
MAINTAINERS."
* 'for-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: cgroup_attach_task() could return -errno after success
cgroup: update MAINTAINERS entry
61d1d219c4 "cgroup: remove extra calls to find_existing_css_set" made
cgroup_task_migrate() return void. An unfortunate side effect was
that cgroup_attach_task() was depending on that function's return
value to clear its @retval on the success path. On cgroup mounts
without any subsystem with ->can_attach() callback,
cgroup_attach_task() ended up returning @retval without initializing
it on success.
For some reason, gcc failed to warn about it and it didn't cause
cgroup_attach_task() to return non-zero value in many cases, probably
due to difference in register allocation. When the problem
materializes, systemd fails to populate /systemd cgroup mount and
fails to boot.
Fix it by initializing @retval to zero on declaration.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <alpine.LNX.2.00.1203282354440.25526@pobox.suse.cz>
Reviewed-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: Li Zefan <lizefan@huawei.com>
This is a first pass of some of the merge window fallout for ARM platforms.
Nothing controversial:
* A system.h fallout fix for OMAP
* PXA fixes for breakage caused by the regulator struct changes
* GPIO fixes for OMAP to properly deal with dynamic IRQ allocation
* A mismerge in our arm-soc tree of an lpc32xx change for networking
* A fix for USB setup on tegra
* An undo of __init annotation of display mux setup on OMAP that's needed
at runtime
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPdSzXAAoJEIwa5zzehBx3b6sQAIhmAQbkYmlkU5GefUmPR++B
JIu8XFKEgWYaviRPj5cUZ4Lx3DWIIcwcGD2AmFdA2RY6JJmULY2Hw2v9yn8mhDPS
7319kMK/Niu3c/DIwxyXHihOQ0LSF+V4dKyHcX1zhG8MFXtUVNudNMg9DwbWMqhk
IXNNREuTTc5313i6cocSIKizGMqED6xxhnuAAt4cTzT/vfSDicNq7DdH6kmPqTgV
azuAlgvDI+k7RvPQY360hLVXFBEGDZV/uOWJFfgtmZcxwhfRVJuEp6+lGFDHNLXU
jbYt/NeLOOv4C+KEDKolR2YJ/xe/gM//rLD2vErT+DTPUuEyQblo8JXEGsRCdNhF
UWK2PW/gBgVW/tFAaDSCLRYPxWL5IUoE2NA7XOF4Gi8hPn1o8JaRXeAhDRK47xiv
NnS4nYJF6dFsxYCH+SJkiotY8/y+CkDPWhE4lChPWxJTDeMuREGrDkUNCRs06by8
I7AVXqBv00q+e3QjE8zE+1AxQEXBty+DwhyzefGKwI8PntfqhuVSVXb2x7DO2hXk
hBAJt/6nBy0IHa0pRPZ4QkA+phXRES5zAyZ9R3oDEW37t1EqhKEnOsmLU4s9CxAk
Lqelzx31SzTpN1VkqjCkRDxXNC5Ww02WmAO+mWoeCUlvnM4GwpmWriXn9lej87KQ
fbIgpnFen1ftSJn6fLnN
=36Eb
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull arm-soc fixes from Olof Johansson:
"This is a first pass of some of the merge window fallout for ARM
platforms.
Nothing controversial:
- A system.h fallout fix for OMAP
- PXA fixes for breakage caused by the regulator struct changes
- GPIO fixes for OMAP to properly deal with dynamic IRQ allocation
- A mismerge in our arm-soc tree of an lpc32xx change for networking
- A fix for USB setup on tegra
- An undo of __init annotation of display mux setup on OMAP that's
needed at runtime"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: pxa: fix build issue on stargate2
ARM: pxa: fix build issue on cm-x300
ARM: pxa: fix build failure for regulator consumer in em-x270.c
ARM: LPC32xx: clock.c: Fix lpc-eth clock reference
ARM: OMAP: pm: fix compilation break
ARM: OMAP: Remove OMAP_GPIO_IRQ macro definition
drivers: input: Fix OMAP_GPIO_IRQ with gpio_to_irq() in ams_delta_serio_exit()
ARM: OMAP: boards: Fix OMAP_GPIO_IRQ usage with gpio_to_irq()
ARM: pxa: fix regulator related build fail in magician_defconfig
ARM: tegra: Fix device tree AUXDATA for USB/EHCI
ARM: OMAP2+: Remove __init from DSI mux functions
Sometimes users have turbostat running in interval mode
when they take processors offline/online.
Previously, turbostat would survive, but not gracefully.
Tighten up the error checking so turbostat notices
changesn sooner, and print just 1 line on change:
turbostat: re-initialized with num_cpus %d
Signed-off-by: Len Brown <len.brown@intel.com>
turbostat uses /dev/cpu/*/msr interface to read MSRs.
For modern systems, it reads 10 MSR/CPU. This can
be observed as 10 "Function Call Interrupts"
per CPU per sample added to /proc/interrupts.
This overhead is measurable on large idle systems,
and as Yoquan Song pointed out, it can even trick
cpuidle into thinking the system is busy.
Here turbostat re-schedules itself in-turn to each
CPU so that its MSR reads will always be local.
This replaces the 10 "Function Call Interrupts"
with a single "Rescheduling interrupt" per sample
per CPU.
On an idle 32-CPU system, this shifts some residency from
the shallow c1 state to the deeper c7 state:
# ./turbostat.old -s
%c0 GHz TSC %c1 %c3 %c6 %c7 %pc2 %pc3 %pc6 %pc7
0.27 1.29 2.29 0.95 0.02 0.00 98.77 20.23 0.00 77.41 0.00
0.25 1.24 2.29 0.98 0.02 0.00 98.75 20.34 0.03 77.74 0.00
0.27 1.22 2.29 0.54 0.00 0.00 99.18 20.64 0.00 77.70 0.00
0.26 1.22 2.29 1.22 0.00 0.00 98.52 20.22 0.00 77.74 0.00
0.26 1.38 2.29 0.78 0.02 0.00 98.95 20.51 0.05 77.56 0.00
^C
i# ./turbostat.new -s
%c0 GHz TSC %c1 %c3 %c6 %c7 %pc2 %pc3 %pc6 %pc7
0.27 1.20 2.29 0.24 0.01 0.00 99.49 20.58 0.00 78.20 0.00
0.27 1.22 2.29 0.25 0.00 0.00 99.48 20.79 0.00 77.85 0.00
0.27 1.20 2.29 0.25 0.02 0.00 99.46 20.71 0.03 77.89 0.00
0.28 1.26 2.29 0.25 0.01 0.00 99.46 20.89 0.02 77.67 0.00
0.27 1.20 2.29 0.24 0.01 0.00 99.48 20.65 0.00 78.04 0.00
cc: Youquan Song <youquan.song@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Pull x86 cleanups from Peter Anvin:
"The biggest textual change is the cleanup to use symbolic constants
for x86 trap values.
The only *functional* change and the reason for the x86/x32 dependency
is the move of is_ia32_task() into <asm/thread_info.h> so that it can
be used in other code that needs to understand if a system call comes
from the compat entry point (and therefore uses i386 system call
numbers) or not. One intended user for that is the BPF system call
filter. Moving it out of <asm/compat.h> means we can define it
unconditionally, returning always true on i386."
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Move is_ia32_task to asm/thread_info.h from asm/compat.h
x86: Rename trap_no to trap_nr in thread_struct
x86: Use enum instead of literals for trap values
v2: 2nd draft
- Editorial cleanups (Randy Dunlap and Stephen Warren)
- Added missing Microblaze reference (Stephen Neuendorffer)
- Make example of platform_device creation clearer (Shawn Guo)
- Expand on PowerPC history and mention i2c mess (David Gibson)
- convert to plain text (remove bits of html formating)
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>