original_kernel

Commit Graph

Author	SHA1	Message	Date
Al Viro	6450578f32	[PATCH] ia64: task_pt_regs() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-12 09:08:58 -08:00
akpm@osdl.org	198e2f1811	[PATCH] scheduler cache-hot-autodetect
From: Ingo Molnar <mingo@elte.hu>
This is the latest version of the scheduler cache-hot-auto-tune patch.
The first problem was that detection time scaled with O(N^2), which is unacceptable on larger SMP and NUMA systems. To solve this:
- I've added a 'domain distance' function, which is used to cache measurement results. Each distance is only measured once. This means that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT distances 0 and 1, and on SMP distance 0 is measured. The code walks the domain tree to determine the distance, so it automatically follows whatever hierarchy an architecture sets up. This cuts down on the boot time significantly and removes the O(N^2) limit. The only assumption is that migration costs can be expressed as a function of domain distance - this covers the overwhelming majority of existing systems, and is a good guess even for more asymmetric systems.
[ People hacking systems that have asymmetries that break this assumption (e.g. different CPU speeds) should experiment a bit with the cpu_distance() function. Adding a ->migration_distance factor to the domain structure would be one possible solution - but let's first see the problem systems, if they exist at all. Let's not overdesign. ]
Another problem was that only a single cache-size was used for measuring the cost of migration, and most architectures didn't set that variable up. Furthermore, a single cache-size does not fit NUMA hierarchies with L3 caches and does not fit HT setups, where different CPUs will often have different 'effective cache sizes'. To solve this problem:
- Instead of relying on a single cache-size provided by the platform and sticking to it, the code now auto-detects the 'effective migration cost' between two measured CPUs, via iterating through a wide range of cache sizes. The code searches for the maximum migration cost, which occurs when the working set of the test-workload falls just below the 'effective cache size'. I.e. real-life optimized search is done for the maximum migration cost, between two real CPUs.
This, amongst other things, has the positive effect that if e.g. two CPUs share an L2/L3 cache, a different (and accurate) migration cost will be found than between two CPUs on the same system that don't share any caches.
(The reliable measurement of migration costs is tricky - see the source for details.)
Furthermore I've added various boot-time options to override/tune migration behavior.
Firstly, there's a blanket override for autodetection:
  migration_cost=1000,2000,3000
will override the depth 0/1/2 values with 1msec/2msec/3msec values.
Secondly, there's a global factor that can be used to increase (or decrease) the autodetected values:
  migration_factor=120
will increase the autodetected values by 20%. This option is useful to tune things in a workload-dependent way - e.g. if a workload is cache-insensitive then CPU utilization can be maximized by specifying migration_factor=0.
I've tested the autodetection code quite extensively on x86, on 3 P3/Xeon/2MB, and the autodetected values look pretty good:
Dual Celeron (128K L2 cache):
---------------------
migration cost matrix (max_cache_size: 131072, cpu: 467 MHz):
---------------------
          [00]    [01]
[00]:       -   1.7(1)
[01]:   1.7(1)      -
---------------------
cacheflush times [2]: 0.0 (0) 1.7 (1784008)
---------------------
Here the slow memory subsystem dominates system performance, and even though caches are small, the migration cost is 1.7 msecs.
Dual HT P4 (512K L2 cache):
---------------------
migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz):
---------------------
          [00]    [01]    [02]    [03]
[00]:       -   0.4(1)  0.0(0)  0.4(1)
[01]:   0.4(1)      -   0.4(1)  0.0(0)
[02]:   0.0(0)  0.4(1)      -   0.4(1)
[03]:   0.4(1)  0.0(0)  0.4(1)      -
---------------------
cacheflush times [2]: 0.0 (33900) 0.4 (448514)
---------------------
Here it can be seen that there is no migration cost between two HT siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory system makes inter-physical-CPU migration pretty cheap: 0.4 msecs.
8-way P3/Xeon [2MB L2 cache]:
---------------------
migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
---------------------
           [00]     [01]     [02]     [03]     [04]     [05]     [06]     [07]
[00]:        -   19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)
[01]:   19.2(1)       -   19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)
[02]:   19.2(1)  19.2(1)       -   19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)
[03]:   19.2(1)  19.2(1)  19.2(1)       -   19.2(1)  19.2(1)  19.2(1)  19.2(1)
[04]:   19.2(1)  19.2(1)  19.2(1)  19.2(1)       -   19.2(1)  19.2(1)  19.2(1)
[05]:   19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)       -   19.2(1)  19.2(1)
[06]:   19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)       -   19.2(1)
[07]:   19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)  19.2(1)       -
---------------------
cacheflush times [2]: 0.0 (0) 19.2 (19281756)
---------------------
This one has huge caches and a relatively slow memory subsystem - so the migration cost is 19 msecs.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: <wilder@us.ibm.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-12 09:08:50 -08:00
Ingo Molnar	4dc7a0bbeb	[PATCH] sched: add cacheflush() asm Add per-arch sched_cacheflush() which is a write-back cacheflush used by the migration-cost calibration code at bootup time. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-12 09:08:49 -08:00
Tony Luck	38c0b2c2aa	[IA64] Fix compile warnings in setup.c arch/ia64/kernel/setup.c: In function `show_cpuinfo': arch/ia64/kernel/setup.c:576: warning: long unsigned int format, different type arg (arg 12) arch/ia64/kernel/setup.c:576: warning: long unsigned int format, different type arg (arg 13) Introduced by `95235ca2c2` Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-01-05 13:30:52 -08:00
Venkatesh Pallipadi	95235ca2c2	[CPUFREQ] CPU frequency display in /proc/cpuinfo What is the value shown in "cpu MHz" of /proc/cpuinfo when CPUs are capable of changing frequency? Today the answer is: It depends. On i386: SMP kernel - It is always the boot frequency UP kernel - Scales with the frequency change and shows the frequency that was last set. On x86_64: There is one single variable cpu_khz that gets written by all the CPUs. So, the frequency set by the last CPU will be seen on /proc/cpuinfo of all the CPUs in the system. What you see also depends on whether you have a constant_tsc capable CPU or not. On ia64: It is always the boot time frequency of a particular CPU that gets displayed. The patch below changes this to: Show the last known frequency of the particular CPU, when cpufreq is present. If the CPU does not support changing frequency through cpufreq, then the boot frequency will be shown. The patch affects i386, x86_64 and ia64 architectures. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2005-12-06 19:35:11 -08:00
Tony Luck	7669a22592	Pull context-bitmap into release branch	2005-11-10 10:39:49 -08:00
John W. Linville	e1531b4218	[PATCH] ia64: re-implement dma_get_cache_alignment to avoid EXPORT_SYMBOL The current ia64 implementation of dma_get_cache_alignment does not work for modules because it relies on a symbol which is not exported. Direct access to a global is a little ugly anyway, so this patch re-implements dma_get_cache_alignment in a manner similar to what is currently used for x86_64. Signed-off-by: John W. Linville <linville@tuxdriver.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-07 07:53:23 -08:00
Peter Keilty	dcc17d1bae	[IA64] Use bitmaps for efficient context allocation/free Corrects the very inefficient method of finding free context_ids in get_mmu_context(). Instead of walking the task_list of all processes, 2 bitmaps are used to efficiently store and look up state, inuse and needs flushing. The entire rid address space is now used before calling wrap_mmu_context and global tlb flushing. Special thanks to Ken and Rohit for their review and modifications in using a bit flushmap. Signed-off-by: Peter Keilty <peter.keilty@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-10-31 14:36:05 -08:00
Tony Luck	5a2b1722e1	Pull proc-cpuinfo-siblings into release branch	2005-10-28 14:33:35 -07:00
Tony Luck	5833f1420b	Pull new-efi-memmap into release branch	2005-10-28 14:32:30 -07:00
Siddha, Suresh B	ce6e71ad48	[IA64] fix siblings field value in /proc/cpuinfo Fix the "siblings" field value in /proc/cpuinfo so that it now shows the number of siblings as seen by OS, instead of what is available from hardware perspective. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-10-25 15:00:36 -07:00
Tony Luck	d719948e62	[IA64] end of kernel 'data' is at _end, not _edata /proc/iomem describes a block of memory as "Kernel data", but the end address is derived from "_edata". The kernel actually has many other sections beyond _edata. Get the real end address from _end. Acked-by: Khalid Aziz <khalid_aziz@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-09-28 16:09:46 -07:00
Bjorn Helgaas	44c451208d	[IA64] ia64: add ar.k0 usage note Update comment about how ar.k0 is used. Make the initialization the same as in start_secondary() (no functional change, just make it look more similar). Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-09-19 15:55:48 -07:00
Khalid Aziz	be379124c0	[IA64] include EFI memory information in /proc/iomem User mode kexec tools expect to find information about physical memory in /proc/iomem (as they do on x86) to validate the addresses that the new kernel will use. Signed-off-by: Khalid Aziz <khalid.aziz@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-09-19 15:42:36 -07:00
Tony Luck	d8c97d5f3a	[IA64] simplified efi memory map parsing New version leaves the original memory map unmodified. Also saves any granule trimmings for use by the uncached memory allocator. Inspired by Khalid Aziz (various traces of his patch still remain). Fixes to uncached_build_memmap() and sn2 testing by Martin Hicks. Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-09-08 12:39:59 -07:00
Len Brown	888ba6c62b	[ACPI] delete CONFIG_ACPI_BOOT it has been a synonym for CONFIG_ACPI since 2.6.12 Signed-off-by: Len Brown <len.brown@intel.com>	2005-08-24 12:08:54 -04:00
Tony Luck	99ad25a313	Auto merge with /home/aegl/GIT/linus	2005-07-13 12:15:43 -07:00
Zoltan Menyhart	08357f82d4	[IA64] improve flush_icache_range() Check with PAL to see what the i-cache line size is for each level of the cache, and so use the correct stride when flushing the cache. Acked-by: David Mosberger Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-07-12 15:33:18 -07:00
Len Brown	5028770a42	[ACPI] merge acpi-2.6.12 branch into latest Linux 2.6.13-rc... Signed-off-by: Len Brown <len.brown@intel.com>	2005-07-12 17:21:56 -04:00
Venkatesh Pallipadi	6c4fa56033	[ACPI] fix C1 patch for IA64 http://bugzilla.kernel.org/show_bug.cgi?id=4233 Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2005-07-12 00:10:20 -04:00
Mark Maule	66b7f8a304	[IA64-SGI] pcdp: add PCDP pci interface support Resend 2 with changes per Bjorn Helgaas' comments. Changes from original: + Change globals to vga_console_iobase/vga_console_membase and make them unconditional. + Address style-related comments. Patch to extend the PCDP vga setup code to support PCI io/mem translations for the legacy vga ioport and ram spaces on architectures (e.g. altix) which need them. Summary of the changes: drivers/firmware/pcdp.c drivers/firmware/pcdp.h ----------------------- + add declaration for the spec-defined PCI interface struct (pcdp_if_pci) as well as support macros. + extend setup_vga_console() to know about pcdp_if_pci and add a couple of globals to hold the io and mem translation offsets if present. arch/ia64/kernel/setup.c ------------------------ + tweak early_console_setup() to allow multiple early console setup routines to be called. include/asm-ia64/vga.h ---------------------- + make VGA_MAP_MEM vga_console_membase aware Signed-off-by: Mark Maule <maule@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-06-28 09:09:06 -07:00
Tony Luck	86ebacd360	[IA64] Update comment to describe modes set in default control register. Christian Hildner pointed out that the comment did not match what the code does in cpu_init() when we set up the default control register. Patch based on suggestions from Ken Chen. Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-06-08 12:12:48 -07:00
Tony Luck	e1ed81ab7a	[IA64] print "siblings" before {physical,core,thread} id Rohit and Suresh changed their mind about the order to print things in /proc/cpuinfo, but didn't include the change in the version of the patch they sent to me. Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-04-25 13:27:12 -07:00
Suresh Siddha	e927ecb05e	[IA64] multi-core/multi-thread identification Version 3 - rediffed to apply on top of Ashok's hotplug cpu patch. /proc/cpuinfo output in step with x86. This is an updated MC/MT identification patch based on the previous discussions on list. Add the Multi-core and Multi-threading detection for IPF. - Add new core and threading related fields in /proc/cpuinfo. Physical id Core id Thread id Siblings - setup the cpu_core_map and cpu_sibling_map appropriately - Handles Hot plug CPU Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Gordon Jin <gordon.jin@intel.com> Signed-off-by: Rohit Seth <rohit.seth@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-04-25 13:25:06 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00
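
The message of commit 198e2f1811 describes two ideas that are easy to lose in the prose: each domain distance is measured only once and the result is cached, and migration_factor= scales autodetected values while migration_cost= replaces them outright. The following is a minimal sketch of that logic, not the kernel's code. All names here (struct migration_costs, migration_cost_lookup(), fake_measure() and the MAX_DOMAIN_DISTANCE bound) are hypothetical and chosen for illustration, and the measurement itself is replaced by a stub that returns a fixed value.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DOMAIN_DISTANCE 32      /* assumed bound, not a kernel constant */
#define COST_UNMEASURED (-1LL)

/* Hypothetical callback: measure the migration cost (usecs) between two CPUs. */
typedef long long (*measure_fn)(int cpu1, int cpu2);

struct migration_costs {
    long long cost_usec[MAX_DOMAIN_DISTANCE]; /* one cached value per distance */
    unsigned int factor_percent;              /* models migration_factor= */
};

void migration_costs_init(struct migration_costs *mc, unsigned int factor_percent)
{
    for (size_t i = 0; i < MAX_DOMAIN_DISTANCE; i++)
        mc->cost_usec[i] = COST_UNMEASURED;
    mc->factor_percent = factor_percent;
}

/* Models migration_cost=a,b,c: fixed values for distances 0, 1, 2, ... */
void migration_costs_override(struct migration_costs *mc,
                              const long long *usec, size_t n)
{
    for (size_t i = 0; i < n && i < MAX_DOMAIN_DISTANCE; i++)
        mc->cost_usec[i] = usec[i];
}

/*
 * Return the cost for a given domain distance. The expensive measurement
 * runs only the first time a distance is seen; only measured (not
 * overridden) values are scaled by the factor.
 */
long long migration_cost_lookup(struct migration_costs *mc, int distance,
                                int cpu1, int cpu2, measure_fn measure)
{
    assert(distance >= 0 && distance < MAX_DOMAIN_DISTANCE);
    if (mc->cost_usec[distance] == COST_UNMEASURED)
        mc->cost_usec[distance] =
            measure(cpu1, cpu2) * (long long)mc->factor_percent / 100;
    return mc->cost_usec[distance];
}

/* Stand-in for a real measurement: counts calls and reports 1000 usecs. */
int fake_measure_calls = 0;

long long fake_measure(int cpu1, int cpu2)
{
    (void)cpu1;
    (void)cpu2;
    fake_measure_calls++;
    return 1000;
}
```

With a factor of 120, the first lookup at a given distance calls the stub once and caches 1200 usecs; a second lookup at the same distance for a different CPU pair reuses the cached value, which is what removes the O(N^2) measurement cost the message mentions.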

25 Commits
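
Commit dcc17d1bae replaces a walk over all tasks with two bitmaps, one recording which context ids are in use and one recording which freed ids still need a flush, so that a global TLB flush is only needed once the whole id space is exhausted. The sketch below illustrates that scheme under simplifying assumptions: the id space is tiny (MAX_CTX), id 0 is reserved to mean "none", there is no locking, and the flush is represented by a counter. The names (struct ctx_allocator, ctx_alloc(), ctx_free()) are hypothetical and do not come from the ia64 source.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define MAX_CTX 64   /* tiny id space for illustration only */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define CTX_WORDS ((MAX_CTX + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct ctx_allocator {
    unsigned long inuse[CTX_WORDS];    /* ids handed out and not yet reclaimed */
    unsigned long flushmap[CTX_WORDS]; /* freed ids that still need a flush */
    unsigned int global_flushes;       /* stands in for global TLB flushes */
};

static int bit_test(const unsigned long *map, unsigned int n)
{
    return (int)((map[n / BITS_PER_WORD] >> (n % BITS_PER_WORD)) & 1UL);
}

static void bit_set(unsigned long *map, unsigned int n)
{
    map[n / BITS_PER_WORD] |= 1UL << (n % BITS_PER_WORD);
}

void ctx_init(struct ctx_allocator *a)
{
    memset(a, 0, sizeof(*a));
}

/* Return the lowest id whose inuse bit is clear, or 0 if there is none. */
static unsigned int ctx_find_free(const struct ctx_allocator *a)
{
    for (unsigned int id = 1; id < MAX_CTX; id++)
        if (!bit_test(a->inuse, id))
            return id;
    return 0;
}

/* Allocate an id; reclaim freed ids (one global flush) only when exhausted. */
unsigned int ctx_alloc(struct ctx_allocator *a)
{
    unsigned int id = ctx_find_free(a);

    if (id == 0) {
        for (size_t i = 0; i < CTX_WORDS; i++) {
            a->inuse[i] &= ~a->flushmap[i]; /* freed ids become reusable */
            a->flushmap[i] = 0;
        }
        a->global_flushes++;
        id = ctx_find_free(a);
        if (id == 0)
            return 0;                       /* every id is genuinely in use */
    }
    bit_set(a->inuse, id);
    return id;
}

/* Freeing is cheap: the id is only marked as needing a flush. */
void ctx_free(struct ctx_allocator *a, unsigned int id)
{
    assert(id > 0 && id < MAX_CTX && bit_test(a->inuse, id));
    bit_set(a->flushmap, id);
}
```

The design point this is meant to show is that ctx_free() does no flushing at all; the cost is deferred and batched into a single reclaim step when ctx_alloc() runs out of clear bits.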