linux-stable-rt

Commit Graph

Author	SHA1	Message	Date
Avi Kivity	e48bb497b9	KVM: MMU: Fix memory leak on guest demand faults While backporting `72dc67a696`, a gfn_to_page() call was duplicated instead of moved (due to an unrelated patch not being present in mainline). This caused a page reference leak, resulting in a fairly massive memory leak. Fix by removing the extraneous gfn_to_page() call. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-25 10:22:17 +02:00
Marcelo Tosatti	707a18a51d	KVM: VMX: convert init_rmode_tss() to slots_lock init_rmode_tss was forgotten during the conversion from mmap_sem to slots_lock. INFO: task qemu-system-x86:3748 blocked for more than 120 seconds. Call Trace: [<ffffffff8053d100>] __down_read+0x86/0x9e [<ffffffff8053fb43>] do_page_fault+0x346/0x78e [<ffffffff8053d235>] trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8053dcad>] error_exit+0x0/0xa9 [<ffffffff8035a7a7>] copy_user_generic_string+0x17/0x40 [<ffffffff88099a8a>] :kvm:kvm_write_guest_page+0x3e/0x5f [<ffffffff880b661a>] :kvm_intel:init_rmode_tss+0xa7/0xf9 [<ffffffff880b7d7e>] :kvm_intel:vmx_vcpu_reset+0x10/0x38a [<ffffffff8809b9a5>] :kvm:kvm_arch_vcpu_setup+0x20/0x53 [<ffffffff8809a1e4>] :kvm:kvm_vm_ioctl+0xad/0x1cf [<ffffffff80249dea>] __lock_acquire+0x4f7/0xc28 [<ffffffff8028fad9>] vfs_ioctl+0x21/0x6b [<ffffffff8028fd75>] do_vfs_ioctl+0x252/0x26b [<ffffffff8028fdca>] sys_ioctl+0x3c/0x5e [<ffffffff8020b01b>] system_call_after_swapgs+0x7b/0x80 Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-25 10:22:17 +02:00
Marcelo Tosatti	15aaa819e2	KVM: MMU: handle page removal with shadow mapping Do not assume that a shadow mapping will always point to the same host frame number. Fixes crash with madvise(MADV_DONTNEED). [avi: move after first printk(), add another printk()] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-25 10:22:17 +02:00
Avi Kivity	4b1a80fa65	KVM: MMU: Fix is_rmap_pte() with io ptes is_rmap_pte() doesn't take into account io ptes, which have the avail bit set. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-25 10:22:16 +02:00
Avi Kivity	5dc8326282	KVM: VMX: Restore tss even on x86_64 The vmx hardware state restore restores the tss selector and base address, but not its length. Usually, this does not matter since most of the tss contents is within the default length of 0x67. However, if a process is using ioperm() to grant itself I/O port permissions, an additional bitmap within the tss, but outside the default length is consulted. The effect is that the process will receive a SIGSEGV instead of transparently accessing the port. Fix by restoring the tss length. Note that i386 had this working already. Closes bugzilla 10246. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-25 10:22:16 +02:00
Linus Torvalds	b9e76a0074	x86-32: Pass the full resource data to ioremap() It appears that 64-bit PCI resources cannot possibly ever have worked on x86-32 even when the RESOURCES_64BIT config option was set, because any driver that tried to [pci_]ioremap() the resource would have been unable to do so because the high 32 bits would have been silently dropped on the floor by the ioremap() routines that only used "unsigned long". Change them to use "resource_size_t" instead, which properly encodes the whole 64-bit resource data if RESOURCES_64BIT is enabled. Acked-by: H. Peter Anvin <hpa@kernel.org> Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-24 11:22:39 -07:00
Thomas Gleixner	9e9630481e	x86: revert: reserve dma32 early for gart Revert commit `f62f1fc9ef` Author: Yinghai Lu <yhlu.kernel@gmail.com> Date: Fri Mar 7 15:02:50 2008 -0800 x86: reserve dma32 early for gart The patch has a dependency on bootmem modifications which are not .25 material that late in the -rc cycle. The problem which is addressed by the patch is limited to machines with 256G and more memory booted with NUMA disabled. This is not a .25 regression and the audience which is affected by this problem is very limited, so it's safer to do the revert than pulling in intrusive bootmem changes right now. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-22 19:25:41 +01:00
Yinghai Lu	37bff62e98	x86_64: free_bootmem should take phys so use nodedata_phys directly. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Yinghai Lu	5dca6a1bb0	x86: trim mtrr don't close gap for resource allocation. fix the bug reported here: http://bugzilla.kernel.org/show_bug.cgi?id=10232 use update_memory_range() instead of add_memory_range() directly to avoid closing the gap. ( the new code only affects and runs on systems where the MTRR workaround triggers. ) Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Heinz-Ado Arnolds	fc1c8925c8	x86: fix reboot problem with Dell Optiplex 745, 0KW626 board we have seen a little problem in rebooting Dell Optiplex 745 with the 0KW626 board. Here is a small patch enabling reboot with this board, which forces the default reboot path it into the BIOS reboot mode. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Jiri Slaby	e215f3c2c5	x86: fix fault_msg nul termination The fault_msg text is not explictly nul terminated now in startup assembly. Do so by converting .ascii to .asciz. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Pavel Machek	2050d45d7c	x86: fix long standing bug with usb after hibernation with 4GB ram aperture_64.c takes a piece of memory and makes it into iommu window... but such window may not be saved by swsusp -- that leads to oops during hibernation. Signed-off-by: Pavel Machek <pavel@suse.cz> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Zbigniew Luszpinski	96bcf458cb	x86: hpet clock enable quirk on nVidia nForce 430 this patch allows hpet=force on nVidia nForce 430 southbridge. This patch was tested by me on my old Asus A8N-VM CSM (where bios does not support hpet and does not advertise it via acpi entry). My nForce430 version: lspci -nn \| grep LPC 00:0a.0 ISA bridge [0601]: nVidia Corporation MCP51 LPC Bridge [10de:0260] (rev a2) Kernel 2.6.24.3 after patching and using hpet=force reports this: dmesg \| grep -i hpet Kernel command line: root=/dev/sda8 ro vga=773 video=vesafb:mtrr:4,ywrap vt.default_utf8=0 hpet=force Force enabled HPET at base address 0xfed00000 hpet clockevent registered Time: hpet clocksource has been installed. grep -i hpet /proc/timer_list Clock Event Device: hpet set_next_event: hpet_legacy_next_event set_mode: hpet_legacy_set_mode grep Clock /proc/timer_list (before patching) Clock Event Device: pit Clock Event Device: lapic grep Clock /proc/timer_list (after patching) Clock Event Device: hpet Clock Event Device: lapic Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Yinghai Lu	f62f1fc9ef	x86: reserve dma32 early for gart a system with 256 GB of RAM, when NUMA is disabled crashes the following way: Your BIOS doesn't leave a aperture memory hole Please enable the IOMMU option in the BIOS setup This costs you 64 MB of RAM Cannot allocate aperture memory hole (ffff8101c0000000,65536K) Kernel panic - not syncing: Not enough memory for aperture Pid: 0, comm: swapper Not tainted 2.6.25-rc4-x86-latest.git #33 Call Trace: [<ffffffff84037c62>] panic+0xb2/0x190 [<ffffffff840381fc>] ? release_console_sem+0x7c/0x250 [<ffffffff847b1628>] ? __alloc_bootmem_nopanic+0x48/0x90 [<ffffffff847b0ac9>] ? free_bootmem+0x29/0x50 [<ffffffff847ac1f7>] gart_iommu_hole_init+0x5e7/0x680 [<ffffffff847b255b>] ? alloc_large_system_hash+0x16b/0x310 [<ffffffff84506a2f>] ? _etext+0x0/0x1 [<ffffffff847a2e8c>] pci_iommu_alloc+0x1c/0x40 [<ffffffff847ac795>] mem_init+0x45/0x1a0 [<ffffffff8479ff35>] start_kernel+0x295/0x380 [<ffffffff8479f1c2>] _sinittext+0x1c2/0x230 the root cause is : memmap PMD is too big, [ffffe200e0600000-ffffe200e07fffff] PMD ->ffff81383c000000 on node 0 almost near 4G..., and vmemmap_alloc_block will use up the ram under 4G. solution will be: 1. make memmap allocation get memory above 4G... 2. reserve some dma32 range early before we try to set up memmap for all. and release that before pci_iommu_alloc, so gart or swiotlb could get some range under 4g limit for sure. the patch is using method 2. because method1 may need more code to handle SPARSEMEM and SPASEMEM_VMEMMAP will get Your BIOS doesn't leave a aperture memory hole Please enable the IOMMU option in the BIOS setup This costs you 64 MB of RAM Mapping aperture over 65536 KB of RAM @ 4000000 Memory: 264245736k/268959744k available (8484k kernel code, 4187464k reserved, 4004k data, 724k init) Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Coleman Kane	fc115bf19b	x86: add the DFF (Desktop Form Factor) Dell Optiplex 745 to the reboot errata list We recently got some of the "Desktop Form Factor" Optiplex 745's in. I noticed that there's an entry for the SFF one's, but the BIOS model number of the DFF differs from that of the SFF. We have been reliably experiencing the same (as far as I can tell) reboot bug as the SFF boxes. Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Randy Dunlap	3178071560	x86/visws: fix printk format warnings Fix visws printk format warnings: /local/linsrc/linux-2.6.24-git15/arch/x86/mach-visws/traps.c:50: warning: format '%#lx' expects type 'long unsigned int', but argument 2 has type 'u32' /local/linsrc/linux-2.6.24-git15/arch/x86/mach-visws/traps.c:50: warning: format '%#lx' expects type 'long unsigned int', but argument 3 has type 'u32' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Yinghai Lu	7d2de13762	x86: tight online check in setup_per_cpu_areas when numa disabled I got this compile warning: arch/x86/kernel/setup64.c: In function setup_per_cpu_areas: arch/x86/kernel/setup64.c:147: warning: the address of contig_page_data will always evaluate as true it seems we missed checking if the node is online before we try to refer NODE_DATA. Fix it. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Yinghai Lu	6721fc0a0d	x86: fix dma_alloc_pages memory-less node support: this patch uses updated dev_to_node, because dev_to_node already makes sure it returns an online node. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:14 +01:00
Randy Dunlap	53471121a8	documentation: Move power-related files to Documentation/power/ Move 00-INDEX entries to power/00-INDEX (and add entry for pm_qos_interface.txt). Update references to moved filenames. Fix some trailing whitespace. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-03-12 18:10:51 -04:00
Thomas Gleixner	985a34bd75	x86: remove quicklists quicklists cause a serious memory leak on 32-bit x86, as documented at: http://bugzilla.kernel.org/show_bug.cgi?id=9991 the reason is that the quicklist pool is a special-purpose cache that grows out of proportion. It is not accounted for anywhere and users have no way to even realize that it's the quicklists that are causing RAM usage spikes. It was supposed to be a relatively small pool, but as demonstrated by KOSAKI Motohiro, they can grow as large as: Quicklists: 1194304 kB given how much trouble this code has caused historically, and given that Andrew objected to its introduction on x86 (years ago), the best option at this point is to remove them. [ any performance benefits of caching constructed pgds should be implemented in a more generic way (possibly within the page allocator), while still allowing constructed pages to be allocated by other workloads. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-11 17:11:55 +01:00
Roland McGrath	40f0933d51	x86: ia32 syscall restart fix The code to restart syscalls after signals depends on checking for a negative orig_ax, and for particular negative -ERESTART* values in ax. These fields are 64 bits and for a 32-bit task they get zero-extended. The syscall restart behavior is lost, a regression from a native 32-bit kernel and from 64-bit tasks' behavior. This patch fixes the problem by doing sign-extension where it matters. For orig_ax, the only time the value should be -1 but winds up as 0x0ffffffff is via a 32-bit ptrace call. So the patch changes ptrace to sign-extend the 32-bit orig_eax value when it's stored; it doesn't change the checks on orig_ax, though it uses the new current_syscall() inline to better document the subtle importance of the used of signedness there. The ax value is stored a lot of ways and it seems hard to get them all sign-extended at their origins. So for that, we use the current_syscall_ret() to sign-extend it only for 32-bit tasks at the time of the -ERESTART* comparisons. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-11 17:11:54 +01:00
Ingo Molnar	9a46d7e5b6	x86: ioremap, remove WARN_ON() Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-11 17:11:54 +01:00
Ingo Molnar	f5dbb55b99	fix BIOS PCI config cycle buglet causing ACPI boot regression I figured out another ACPI related regression today. randconfig testing triggered an early boot-time hang on a laptop of mine (32-bit x86, config attached) - the screen was scrolling ACPI AML exceptions [with no serial port and no early debugging available]. v2.6.24 works fine on that laptop with the same .config, so after a few hours of bisection (had to restart it 3 times - other regressions interacted), it honed in on this commit: \| `10270d4838` is first bad commit \| \| Author: Linus Torvalds <torvalds@woody.linux-foundation.org> \| Date: Wed Feb 13 09:56:14 2008 -0800 \| \| acpi: fix acpi_os_read_pci_configuration() misuse of raw_pci_read() reverting this commit ontop of -rc5 gave a correctly booting kernel. But this commit fixes a real bug so the real question is, why did it break the bootup? After quite some head-scratching, the following change stood out: - pci_id->bus = tu8; + pci_id->bus = val; pci_id->bus is defined as u16: struct acpi_pci_id { u16 segment; u16 bus; ... and 'tu8' changed from u8 to u32. So previously we'd unconditionally mask the return value of acpi_os_read_pci_configuration() (raw_pci_read()) to 8 bits, but now we just trust whatever comes back from the PCI access routines and only crop it to 16 bits. But if the high 8 bits of that result contains any noise then we'll write that into ACPI's PCI ID descriptor and confuse the heck out of the rest of ACPI. So lets check the PCI-BIOS code on that theory. We have this codepath for 8-bit accesses (arch/x86/pci/pcbios.c:pci_bios_read()): switch (len) { case 1: __asm__("lcall (%%esi); cld\n\t" "jc 1f\n\t" "xor %%ah, %%ah\n" "1:" : "=c" (value), "=a" (result) : "1" (PCIBIOS_READ_CONFIG_BYTE), "b" (bx), "D" ((long)reg), "S" (&pci_indirect)); Aha! The "=a" output constraint puts the full 32 bits of EAX into value. But if the BIOS's routines set any of the high bits to nonzero, we'll return a value with more set in it than intended. The other, more common PCI access methods (v1 and v2 PCI reads) clear out the high bits already, for example pci_conf1_read() does: switch (len) { case 1: value = inb(0xCFC + (reg & 3)); which explicitly converts the return byte up to 32 bits and zero-extends it. So zero-extending the result in the PCI-BIOS read routine fixes the regression on my laptop. ( It might fix some other long-standing issues we had with PCI-BIOS during the past decade ... ) Both 8-bit and 16-bit accesses were buggy. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-10 18:09:05 -07:00
Rusty Russell	4357bd9453	lguest: Revert `1ce70c4fac`, fix real problem. Ahmed managed to crash the Host in release_pgd(), which cannot be a Guest bug, and indeed it wasn't. The bug was that handing a 0 as the address of the toplevel page table being manipulated can cause the lookup code in find_pgdir() to return an uninitialized cache entry (we shadow up to 4 top level page tables for each Guest). Commit `37cc8d7f96` introduced this behaviour in the Guest, uncovering the bug. The patch which he submitted (which removed the /4 from the index calculation) simply ensured that these high-indexed entries hit the early exit path of guest_set_pmd(). But you get lots of segfaults in guest userspace as the PMDs aren't being updated. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-03-11 09:35:58 +11:00
Rusty Russell	3fabc55f34	lguest: Sanitize the lguest clock. Now the TSC code handles a zero return from calculate_cpu_khz(), lguest can simply pass through the value it gets from the Host: if non-zero, all the normal TSC code applies. Otherwise (or if the Host really doesn't support TSC), the clocksource code will fall back to the slower but reasonable lguest clock. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-03-11 09:35:57 +11:00
Roland McGrath	84c6f6046c	x86_64: make ptrace always sign-extend orig_ax to 64 bits This makes 64-bit ptrace calls setting the 64-bit orig_ax field for a 32-bit task sign-extend the low 32 bits up to 64. This matches what a 64-bit debugger expects when tracing a 32-bit task. This follows on my "x86_64 ia32 syscall restart fix". This didn't matter until that was fixed. The debugger ignores or zeros the high half of every register slot it sets (including the orig_rax pseudo-register) uniformly. It expects that the setting of the low 32 bits always has the same meaning as a 32-bit debugger setting those same 32 bits with native 32-bit facilities. This never arose before because the syscall restart check never matched any -ERESTART* values due to lack of sign extension. Before that fix, even 32-bit ptrace setting orig_eax to -1 failed to trigger the restart check anyway. So this was never noticed as a regression of 64-bit debuggers vs 32-bit debuggers on the same 64-bit kernel. Signed-off-by: Roland McGrath <roland@redhat.com> [ Changed to just do the sign-extension unconditionally on x86-64, since orig_ax is always just a small integer and doesn't need the full 64-bit range ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-07 19:05:58 -08:00
Peter Korsgaard	1722770f13	x86-boot: don't request VBE2 information The new x86 setup code (`4fd06960f1`) broke booting on an old P3/500MHz with an onboard Voodoo3 of mine. After debugging it, it turned out to be caused by the fact that the vesa probing now asks for VBE2 data. Disassembing the video BIOS shows that it overflows the vesa_general_info structure when VBE2 data is requested because the source addresses for the information strings which get strcpy'ed to the buffer lie outside the 32K BIOS code (and hence contain long sequences of 0xff's). E.G.: get_vbe_controller_info: 00002A9C 60 pushaw 00002A9D 1E push ds 00002A9E 0E push cs 00002A9F 1F pop ds 00002AA0 2BC9 sub cx,cx 00002AA2 6626813D56424532 cmp dword [es:di],0x32454256 ; "VBE2" 00002AAA 7501 jnz .1 00002AAC 41 inc cx .1: 00002AAD 51 push cx 00002AAE B91400 mov cx,0x14 00002AB1 BED47F mov si, controller_header 00002AB4 57 push di 00002AB5 F3A4 rep movsb ; copy vbe1.2 header 00002AB7 B9EC00 mov cx,0xec 00002ABA 2AC0 sub al,al 00002ABC F3AA rep stosb ; zero pad remainder 00002ABE 5F pop di 00002ABF E8EB0D call word get_memory 00002AC2 C1E002 shl ax,0x2 00002AC5 26894512 mov [es:di+0x12],ax ; total memory 00002AC9 26C745040003 mov word [es:di+0x4],0x300 ; VBE version 00002ACF 268C4D08 mov [es:di+0x8],cs 00002AD3 268C4D10 mov [es:di+0x10],cs 00002AD7 59 pop cx 00002AD8 E361 jcxz .done ; VBE2 requested? 00002ADA 8D9D0001 lea bx,[di+0x100] 00002ADE 53 push bx 00002ADF 87DF xchg bx,di ; di now points to 2nd half 00002AE1 26C747140001 mov word [es:bx+0x14],0x100 ; sw rev 00002AE7 26897F06 mov [es:bx+0x6],di ; oem string 00002AEB 268C4708 mov [es:bx+0x8],es 00002AEF BE5280 mov si,0x8052 ; oem string 00002AF2 E87A1B call word strcpy 00002AF5 26897F0E mov [es:bx+0xe],di ; video mode list 00002AF9 268C4710 mov [es:bx+0x10],es 00002AFD B91E00 mov cx,0x1e 00002B00 BEE87F mov si,vidmodes 00002B03 F3A5 rep movsw 00002B05 26897F16 mov [es:bx+0x16],di ; oem vendor 00002B09 268C4718 mov [es:bx+0x18],es 00002B0D BE2480 mov si,0x8024 ; oem vendor 00002B10 E85C1B call word strcpy 00002B13 26897F1A mov [es:bx+0x1a],di ; oem product 00002B17 268C471C mov [es:bx+0x1c],es 00002B1B BE3880 mov si,0x8038 ; oem product 00002B1E E84E1B call word strcpy 00002B21 26897F1E mov [es:bx+0x1e],di ; oem product rev 00002B25 268C4720 mov [es:bx+0x20],es 00002B29 BE4580 mov si,0x8045 ; oem product rev 00002B2C E8401B call word strcpy 00002B2F 58 pop ax 00002B30 B90001 mov cx,0x100 00002B33 2BCF sub cx,di 00002B35 03C8 add cx,ax 00002B37 2AC0 sub al,al 00002B39 F3AA rep stosb ; zero pad .done: 00002B3B 1F pop ds 00002B3C 61 popaw 00002B3D B84F00 mov ax,0x4f 00002B40 C3 ret (The full BIOS can be found at http://peter.korsgaard.com/vgabios.bin if interested). The old setup code didn't ask for VBE2 info, and the new code doesn't actually do anything with the extra information, so the fix is to simply not request it. Other BIOS'es might have the same problem. Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-07 16:39:14 +01:00
Ingo Molnar	7432d149fd	x86: re-add reboot fixups Jan Beulich noticed that the reboot fixups went missing during reboot.c unification. (commit `4d022e35fd`) Geode and a few other rare boards with special reboot quirks are affected. Reported-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-07 16:39:14 +01:00
Jan Beulich	d032b31a3a	x86: fix typo in step.c TIF_DEBUGCTLMSR has no meaning in the actual MSR... Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-07 16:39:14 +01:00
Jan Beulich	609b5297bc	x86: fix merge mistake in i387.c convert_fxsr_to_user() in 2.6.24's i387_32.c did this, and convert_to_fxsr() also does the inverse, so I assume it's an oversight that it is no longer being done. [ mingo@elte.hu: we encode it this way because there's no space for the 'FPU Last Instruction Opcode' (->fop) field in the legacy user_i387_ia32_struct that PTRACE_GETFPREGS/PTRACE_SETFPREGS uses. it's probably pure legacy - i'd be surprised if any user-space relied on the FPU Last Opcode in any way. But indeed we used to do it previously so the most conservative thing is to preserve that piece of information. ] Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-07 16:39:14 +01:00
Aurelien Jarno	e40cd10ccf	x86: clear DF before calling signal handler The Linux kernel currently does not clear the direction flag before calling a signal handler, whereas the x86/x86-64 ABI requires that. Linux had this behavior/bug forever, but this becomes a real problem with gcc version 4.3, which assumes that the direction flag is correctly cleared at the entry of a function. This patches changes the setup_frame() functions to clear the direction before entering the signal handler. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: H. Peter Anvin <hpa@zytor.com>	2008-03-07 16:39:14 +01:00
Dave Jones	0e5aa8d621	[CPUFREQ] Remove debugging message from e_powersaver We don't need to printk a message every time we transition. Leave the code there, but ifdef'd out, as it's useful when adding support for new processors. Reported-by: Petr Titěra <P.Titera@century.cz> Signed-off-by: Dave Jones <davej@redhat.com>	2008-03-05 14:45:31 -05:00
Ananth N Mavinakayanahalli	9edddaa200	Kprobes: indicate kretprobe support in Kconfig Add CONFIG_HAVE_KRETPROBES to the arch/<arch>/Kconfig file for relevant architectures with kprobes support. This facilitates easy handling of in-kernel modules (like samples/kprobes/kretprobe_example.c) that depend on kretprobes being present in the kernel. Thanks to Sam Ravnborg for helping make the patch more lean. Per Mathieu's suggestion, added CONFIG_KRETPROBES and fixed up dependencies. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:11 -08:00
Hugh Dickins	fcab59a318	x86: a P4 is a P6 not an i486 P4 has been coming out as CPU_FAMILY=4 instead of 6: fix MPENTIUM4 typo. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 11:55:34 -08:00
Linus Torvalds	34f10fc988	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86/xen: fix DomU boot problem x86: not set node to cpu_to_node if the node is not online x86, i387: fix ptrace leakage using init_fpu()	2008-03-04 09:22:32 -08:00
Linus Torvalds	67171a3f03	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: x86: disable KVM for Voyager and friends KVM: VMX: Avoid rearranging switched guest msrs while they are loaded KVM: MMU: Fix race when instantiating a shadow pte KVM: Route irq 0 to vcpu 0 exclusively KVM: Avoid infinite-frequency local apic timer KVM: make MMU_DEBUG compile again KVM: move alloc_apic_access_page() outside of non-preemptable region KVM: SVM: fix Windows XP 64 bit installation crash KVM: remove the usage of the mmap_sem for the protection of the memory slots. KVM: emulate access to MSR_IA32_MCG_CTL KVM: Make the supported cpuid list a host property rather than a vm property KVM: Fix kvm_arch_vcpu_ioctl_set_sregs so that set_cr0 works properly KVM: SVM: set NM intercept when enabling CR0.TS in the guest KVM: SVM: Fix lazy FPU switching	2008-03-04 09:22:05 -08:00
Ian Campbell	87d034f313	x86/xen: fix DomU boot problem Construct Xen guest e820 map with a hole between 640K-1M. It's pure luck that Xen kernels have gotten away with it in the past. The patch below seems like the right thing to do. It certainly boots in a domU without the DMI problem (without any of the other related patches such as Alexander's). Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Tested-by: Mark McLoughlin <markmc@redhat.com> Acked-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-04 17:10:12 +01:00
Yinghai Lu	7c9e92b6cd	x86: not set node to cpu_to_node if the node is not online resolve boot problem reported by Mel Gorman: http://lkml.org/lkml/2008/2/13/404 init_cpu_to_node will use cpu->apic (from MADT or mptable) and apic->node(from SRAT or AMD config space with k8_bus_64.c) to have cpu->node mapping, and later identify_cpu will overwrite them again...(with nearby_node...) this patch checks if the node is online, otherwise it will not update cpu_node map. so keep cpu_node map to online node before identify_cpu..., to prevent possible error. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-04 17:10:12 +01:00
Suresh Siddha	18a8622101	x86, i387: fix ptrace leakage using init_fpu() This bug got introduced by the recent i387 merge: commit `4421011120` Author: Roland McGrath <roland@redhat.com> Date: Wed Jan 30 13:31:50 2008 +0100 x86: x86 i387 user_regset Current usage of unlazy_fpu() in ptrace specific routines is wrong. unlazy_fpu() will not init fpu if the task never used math. So the ptrace calls can expose the parent tasks FPU data in some cases. Replace it with the init_fpu() which will init the math state, if the task never used math before. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-04 17:10:12 +01:00
Randy Dunlap	1a4e3f89c6	x86: disable KVM for Voyager and friends Most classic Pentiums don't have hardware virtualization extension, and building kvm with Voyager, Visual Workstation, or NUMAQ generates spurious failures. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>	2008-03-04 17:42:55 +02:00
Avi Kivity	33f9c505ed	KVM: VMX: Avoid rearranging switched guest msrs while they are loaded KVM tries to run as much as possible with the guest msrs loaded instead of host msrs, since switching msrs is very expensive. It also tries to minimize the number of msrs switched according to the guest mode; for example, MSR_LSTAR is needed only by long mode guests. This optimization is done by setup_msrs(). However, we must not change which msrs are switched while we are running with guest msr state: - switch to guest msr state - call setup_msrs(), removing some msrs from the list - switch to host msr state, leaving a few guest msrs loaded An easy way to trigger this is to kexec an x86_64 linux guest. Early during setup, the guest will switch EFER to not include SCE. KVM will stop saving MSR_LSTAR, and on the next msr switch it will leave the guest LSTAR loaded. The next host syscall will end up in a random location in the kernel. Fix by reloading the host msrs before changing the msr list. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-04 15:19:50 +02:00
Avi Kivity	f7d9c7b7b9	KVM: MMU: Fix race when instantiating a shadow pte For improved concurrency, the guest walk is performed concurrently with other vcpus. This means that we need to revalidate the guest ptes once we have write-protected the guest page tables, at which point they can no longer be modified. The current code attempts to avoid this check if the shadow page table is not new, on the assumption that if it has existed before, the guest could not have modified the pte without the shadow lock. However the assumption is incorrect, as the racing vcpu could have modified the pte, then instantiated the shadow page, before our vcpu regains control: vcpu0 vcpu1 fault walk pte modify pte fault in same pagetable instantiate shadow page lookup shadow page conclude it is old instantiate spte based on stale guest pte We could do something clever with generation counters, but a test run by Marcelo suggests this is unnecessary and we can just do the revalidation unconditionally. The pte will be in the processor cache and the check can be quite fast. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-04 15:19:49 +02:00
Avi Kivity	0b975a3c2d	KVM: Avoid infinite-frequency local apic timer If the local apic initial count is zero, don't start a an hrtimer with infinite frequency, locking up the host. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-04 15:19:48 +02:00
Marcelo Tosatti	24993d5349	KVM: make MMU_DEBUG compile again the cr3 variable is now inside the vcpu->arch structure. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-04 15:19:47 +02:00
Marcelo Tosatti	5e4a0b3c1b	KVM: move alloc_apic_access_page() outside of non-preemptable region alloc_apic_access_page() can sleep, while vmx_vcpu_setup is called inside a non preemptable region. Move it after put_cpu(). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-04 15:19:46 +02:00
Joerg Roedel	a2938c8070	KVM: SVM: fix Windows XP 64 bit installation crash While installing Windows XP 64 bit wants to access the DEBUGCTL and the last branch record (LBR) MSRs. Don't allowing this in KVM causes the installation to crash. This patch allow the access to these MSRs and fixes the issue. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-04 15:19:45 +02:00
Izik Eidus	72dc67a696	KVM: remove the usage of the mmap_sem for the protection of the memory slots. This patch replaces the mmap_sem lock for the memory slots with a new kvm private lock, it is needed beacuse untill now there were cases where kvm accesses user memory while holding the mmap semaphore. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-04 15:19:40 +02:00
Rafael J. Wysocki	9b5cf48b06	x86: revert "x86: CPA: avoid split of alias mappings" Revert: commit `8be8f54bae` Author: Thomas Gleixner <tglx@linutronix.de> Date: Sat Feb 23 20:43:21 2008 +0100 x86: CPA: avoid split of alias mappings because it clearly mishandles the case when __change_page_attr(), called from __change_page_attr_set_clr(), changes cpa->processed to 1 and cpa_process_alias(cpa) is executed right after that. This crashes my x86-64 test box early in the boot process (ref. http://bugzilla.kernel.org/show_bug.cgi?id=10140#c4). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-03 14:18:27 +01:00
Joerg Roedel	c7ac679c16	KVM: emulate access to MSR_IA32_MCG_CTL Injecting an GP when accessing this MSR lets Windows crash when running some stress test tools in KVM. So this patch emulates access to this MSR. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-03 11:22:37 +02:00
Avi Kivity	674eea0fc4	KVM: Make the supported cpuid list a host property rather than a vm property One of the use cases for the supported cpuid list is to create a "greatest common denominator" of cpu capabilities in a server farm. As such, it is useful to be able to get the list without creating a virtual machine first. Since the code does not depend on the vm in any way, all that is needed is to move it to the device ioctl handler. The capability identifier is also changed so that binaries made against -rc1 will fail gracefully. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-03-03 11:22:25 +02:00

1 2 3 4 5 ...

1458 Commits