The method used to work out whether we were booted by EFI firmware or
via a boot loader is broken. Because efi_main() is always executed
when booting from a boot loader we will dereference invalid pointers
either on the stack (CONFIG_X86_32) or contained in %rdx
(CONFIG_X86_64) when searching for an EFI System Table signature.
Instead of dereferencing these invalid system table pointers, add a
new entry point that is only used when booting from EFI firmware, when
we know the pointer arguments will be valid. With this change legacy
boot loaders will no longer execute efi_main(), but will instead skip
EFI stub initialisation completely.
[ hpa: Marking this for urgent/stable since it is a regression when
the option is enabled; without the option the patch has no effect ]
Signed-off-by: Matt Fleming <matt.hfleming@intel.com>
Link: http://lkml.kernel.org/r/1334584744.26997.14.camel@mfleming-mobl1.ger.corp.intel.com
Reported-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org> v3.3
There is currently a large divide between kernel development and the
development of EFI boot loaders. The idea behind this patch is to give
the kernel developers full control over the EFI boot process. As
H. Peter Anvin put it,
"The 'kernel carries its own stub' approach [has] been very successful in
dealing with BIOS, and would make a lot of sense to me for EFI as
well."
This patch introduces an EFI boot stub that allows an x86 bzImage to
be loaded and executed by EFI firmware. The bzImage appears to the
firmware as an EFI application. Luckily there are enough free bits
within the bzImage header so that it can masquerade as an EFI
application, thereby coercing the EFI firmware into loading it and
jumping to its entry point. The beauty of this masquerading approach
is that both BIOS and EFI boot loaders can still load and run the same
bzImage, thereby allowing a single kernel image to work in any boot
environment.
The EFI boot stub supports multiple initrds, but they must exist on
the same partition as the bzImage. Command-line arguments for the
kernel can be appended after the bzImage name when run from the EFI
shell, e.g.
Shell> bzImage console=ttyS0 root=/dev/sdb initrd=initrd.img
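As a rough illustration of the command-line convention above, the
following C sketch extracts successive initrd= arguments from an ASCII
command line. The helper name next_initrd() is hypothetical and is not
taken from the stub; the real code works on firmware-provided strings
and handles more cases.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find the next "initrd=" argument at or after *pos.
 * On the first call *pos must point at the start of the command line.
 * On success, copy the file name into buf (NUL-terminated, truncated to
 * buflen - 1 bytes), advance *pos past the argument and return 1.
 * Return 0 when no further initrd= argument exists.
 */
static int next_initrd(const char **pos, char *buf, size_t buflen)
{
	static const char key[] = "initrd=";
	const char *p = *pos;

	while ((p = strstr(p, key)) != NULL) {
		/* Only accept the key at the start of an argument. */
		if (p == *pos || p[-1] == ' ') {
			size_t n = 0;

			p += sizeof(key) - 1;
			/* A name ends at a space, a newline or the end of the string. */
			while (*p && *p != ' ' && *p != '\n') {
				if (n + 1 < buflen)
					buf[n++] = *p;
				p++;
			}
			if (buflen)
				buf[n] = '\0';
			*pos = p;
			return 1;
		}
		p += sizeof(key) - 1;
	}
	return 0;
}
```

Calling next_initrd() in a loop until it returns 0 yields each initrd
name in command-line order.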
v7:
- Fix checkpatch warnings.
v6:
- Try to allocate initrd memory just below hdr->initrd_addr_max.
v5:
- load_options_size is UTF-16, which needs dividing by 2 to convert
to the corresponding ASCII size.
v4:
- Don't read more than image->load_options_size
v3:
- Fix following warnings when compiling CONFIG_EFI_STUB=n
arch/x86/boot/tools/build.c: In function ‘main’:
arch/x86/boot/tools/build.c:138:24: warning: unused variable ‘pe_header’
arch/x86/boot/tools/build.c:138:15: warning: unused variable ‘file_sz’
- As reported by Matthew Garrett, some Apple machines have GOPs that
don't have hardware attached. We need to weed these out by
searching for ones that handle the PCIIO protocol.
- Don't allocate memory if no initrds are on cmdline
- Don't trust image->load_options_size
Maarten Lankhorst noted:
- Don't strip first argument when booted from efibootmgr
- Don't allocate too much memory for cmdline
- Don't update cmdline_size, the kernel considers it read-only
- Don't accept '\n' for initrd names
v2:
- File alignment was too large, was 8192 should be 512. Reported by
Maarten Lankhorst on LKML.
- Added UGA support for graphics
- Use VIDEO_TYPE_EFI instead of hard-coded number.
- Move linelength assignment until after we've assigned depth
- Dynamically fill out AddressOfEntryPoint in tools/build.c
- Don't use magic number for GDT/TSS stuff. Requested by Andi Kleen
- The bzImage may need to be relocated as it may have been loaded at
a high address by the firmware. This was required to get my
macbook booting because the firmware loaded it at 0x7cxxxxxx, which
triggers this error in decompress_kernel(),
if (heap > ((-__PAGE_OFFSET-(128<<20)-1) & 0x7fffffff))
error("Destination address too large");
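To make the relocation requirement concrete, the sketch below evaluates
the limit from the check quoted above in plain C. It assumes the common
32-bit value __PAGE_OFFSET = 0xC0000000, which is configuration
dependent and not stated in this log, and the helper names are
hypothetical.

```c
#include <stdint.h>

/* Assumed value; __PAGE_OFFSET depends on the kernel configuration. */
#define ASSUMED_PAGE_OFFSET 0xC0000000u

/* Upper bound used by the quoted check, computed in 32-bit arithmetic. */
static uint32_t heap_limit(void)
{
	return ((uint32_t)0 - ASSUMED_PAGE_OFFSET - (128u << 20) - 1u) &
	       0x7fffffffu;
}

/* Returns 1 when the quoted check would report "Destination address too large". */
static int heap_too_high(uint32_t heap)
{
	return heap > heap_limit();
}
```

Under that assumption the limit works out to 0x37ffffff, so a heap
placed near 0x7c000000 fails the check unless the image is relocated
first.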
Cc: Mike Waychison <mikew@google.com>
Cc: Matthew Garrett <mjg@redhat.com>
Tested-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1321383097.2657.9.camel@mfleming-mobl1.ger.corp.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
In order for global variables and functions to work in the
decompressor, we need to fix up the GOT in assembly code.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <4C57382E.8050501@zytor.com>
This has the consequence of changing the section name used for head
code from ".text.head" to ".head.text".
Linus suggested that we merge the ".text.head" section with ".text"
(presumably while preserving the fact that the head code starts at 0).
When I tried this it caused the kernel to not boot.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Make the kernel_alignment field adjustable; this allows us to set it
to a large value (intended to be 16 MB to avoid ZONE_DMA contention,
memory holes and other weirdness) while a smart bootloader can still
force a loading at a lesser alignment if absolutely necessary.
Also export pref_address (preferred loading address, corresponding to
the link-time address) and init_size, the total amount of linear
memory the kernel will require during initialization.
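A boot loader could use these fields roughly as follows. This is only a
sketch: the struct and function names are hypothetical, only the three
field names come from the text above, and a real loader must also
consult the firmware memory map.

```c
#include <stdint.h>

/* Hypothetical subset of the header fields named in the text above. */
struct kernel_placement_info {
	uint32_t kernel_alignment;	/* required alignment, a power of two */
	uint64_t pref_address;		/* preferred (link-time) load address */
	uint32_t init_size;		/* linear memory needed during init */
};

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Pick a load address inside the free region [base, base + len).
 * Prefer pref_address when the whole init_size footprint fits there;
 * otherwise fall back to the lowest suitably aligned address.
 * Returns 0 when the region cannot hold the kernel.
 */
static uint64_t choose_load_address(const struct kernel_placement_info *k,
				    uint64_t base, uint64_t len)
{
	uint64_t end = base + len;
	uint64_t addr;

	if (k->pref_address >= base && k->pref_address <= end &&
	    end - k->pref_address >= k->init_size)
		return k->pref_address;

	addr = align_up(base, k->kernel_alignment);
	if (addr <= end && end - addr >= k->init_size)
		return addr;
	return 0;
}
```

The sketch does not model a loader lowering kernel_alignment; it only
shows how the exported values bound the placement decision.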
[ Impact: allows better kernel placement, gives bootloader more info ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Remove a couple of lines of dead code from
arch/x86/boot/compressed/head_*.S; all of these update registers that
are dead in the current code.
[ Impact: cleanup ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Determine the compressed code offset (from the kernel runtime address)
at compile time. This allows some minor optimizations in
arch/x86/boot/compressed/head_*.S, but more importantly it makes this
value available to the build process, which will enable a future patch
to export the necessary linear memory footprint into the bzImage
header.
[ Impact: cleanup, future patch enabling ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
In the pre-decompression code, use the appropriate largest possible
rep movs and rep stos to move code and clear bss, respectively. For
reverse copy, do note that the initial values are supposed to be the
address of the first (highest) copy datum, not one byte beyond the end
of the buffer.
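The point about initial values for the reverse copy can be shown in C.
The sketch below is an analogy for a descending rep movs, not the
assembly from this patch, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy count 32-bit words from src to dst, highest address first, the
 * way a descending string copy does. The starting index is count - 1,
 * the first (highest) datum, not count, which would be one element
 * beyond the end of the buffer.
 */
static void copy_words_backward(uint32_t *dst, const uint32_t *src,
				size_t count)
{
	size_t i;

	if (count == 0)
		return;

	i = count - 1;	/* index of the first (highest) datum */
	for (;;) {
		dst[i] = src[i];
		if (i == 0)
			break;
		i--;
	}
}
```

Copying from the top down is what makes the move safe when the
destination overlaps the source at a higher address.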
rep strings are not necessarily the fastest way to perform these
operations on all current processors, but are likely to be in the
future, and perhaps more importantly, we want to encourage the
architecturally right thing to do here.
This also fixes a couple of trivial inefficiencies on 64 bits.
[ Impact: trivial performance enhancement, increase code similarity ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The 64-bit code already clears EFLAGS as soon as it has a stack. This
seems like a reasonable precaution, so do it on 32 bits as well.
[ Impact: extra paranoia ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Set up the decompression stack as soon as we know where it needs to
go. That way we have a full-service stack as soon as possible, rather
than relying on the BP_scratch field.
Note that the stack does need to be empty during bss zeroing (or
else the stack needs to be moved out of the bss segment, which is also
an option.)
[ Impact: cleanup, minor paranoia ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Both on 32 and 64 bits, we copy all the way up to the end of bss,
except that on 64 bits there is a hack to avoid copying on top of the
page tables. There is no point in copying bss at all, especially
since we are just about to zero it all anyway.
To clean up and unify the handling, we now do:
- copy from startup_32 to _bss.
- zero from _bss to _ebss.
- the _ebss symbol is aligned to an 8-byte boundary.
- the page tables are moved to a separate section.
Use _bss as the copy endpoint since _edata may be misaligned.
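The unified handling can be sketched in C as a copy followed by a
clear. The layout struct and function name are hypothetical; the
offsets stand in for the _bss and _ebss linker symbols, measured from
the start of the image.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical image layout as byte offsets from the start of the
 * image: [0, bss) is code and data, [bss, ebss) is bss.
 */
struct image_layout {
	size_t bss;
	size_t ebss;
};

static void relocate_image(unsigned char *dst, const unsigned char *src,
			   const struct image_layout *l)
{
	/* Copy from the start of the image up to _bss only. */
	memmove(dst, src, l->bss);
	/* Zero from _bss to _ebss instead of copying stale bss contents. */
	memset(dst + l->bss, 0, l->ebss - l->bss);
}
```

Whatever the source held in its bss range never reaches the
destination, which is the saving the text describes.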
[ Impact: cleanup, trivial performance improvement ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reformat arch/x86/boot/compressed/head_32.S to be closer to currently
preferred kernel assembly style, that is:
- opcode and operand separated by tab
- operands separated by ", "
- C-style comments
This also makes it more similar to head_64.S.
[ Impact: cleanup, no object code change ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Look at the:
diff -u arch/x86/boot/compressed/vmlinux_*.lds
output and realize that they're basically exactly the same except for
trivial naming differences, and the fact that the 64-bit version has a
"pgtable" thing.
So unify them.
There's some trivial cleanup there (make the output format a Kconfig thing
rather than doing #ifdef's for it, and unify both 32-bit and 64-bit BSS
end to "_ebss", where 32-bit used to use the traditional "_end"), but
other than that it's really very mindless and straight conversion.
For example, I think we should aim to remove "startup_32" vs "startup_64",
and just call it "startup", and get rid of one more difference. I didn't
do that.
Also, notice the comment in the unified vmlinux.lds.S talks about
"head_64" and "startup_32" which is an odd and incorrect mix, but that was
actually what the old 64-bit only lds file had, so the confusion isn't
new, and now that mixing is arguably more accurate thanks to the
vmlinux.lds.S file being shared between the two cases ;)
[ Impact: cleanup, unification ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Linker script will put startup_32 at predefined
address so using startup_32 will not bloat the
code size.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In general, the only definitions that assembly files can use
are in _types.S headers (where available), so convert them.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Comments in arch/x86/boot/compressed/head_32.S erroneously refer to the
real mode pointer as the second and the heap area as the third argument
to decompress_kernel(). In fact, these have been the first and second
argument, respectively, since v2.6.20.
This patch corrects the comments. It introduces no code changes.
Signed-off-by: Philipp Kohlbecher <xt28@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The kernel decompressor wrapper uses memory located beyond the
end of the image. This might lead to hard-to-debug problems,
but even if it can be proven to be safe, it is at the very
least unclean. I don't see any advantages either, unless you
count it not being zeroed out as an advantage. This patch
moves the boot-heap area to the bss segment.
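As a loose illustration of a heap that lives inside the image's bss
rather than beyond its end, consider the C sketch below. The names and
the 4096-byte size are arbitrary assumptions for the example and do not
reflect the decompressor's actual allocator.

```c
#include <stddef.h>

#define SKETCH_HEAP_SIZE 4096	/* arbitrary size for the sketch */

/* Zero-initialised static storage, which toolchains place in .bss. */
static unsigned char sketch_heap[SKETCH_HEAP_SIZE];
static size_t sketch_heap_used;

/*
 * Bump allocator handing out chunks at 8-byte-aligned offsets within
 * sketch_heap. Returns NULL when the request does not fit.
 */
static void *sketch_alloc(size_t size)
{
	size_t start = (sketch_heap_used + 7) & ~(size_t)7;

	if (start > SKETCH_HEAP_SIZE || size > SKETCH_HEAP_SIZE - start)
		return NULL;

	sketch_heap_used = start + size;
	return &sketch_heap[start];
}
```

Because the array is part of the image's own bss, its extent is known
at link time and it is cleared together with the rest of bss.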
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The kernel only ever supports 1 version of the boot protocol
so there is no need to check the boot protocol revision to
see if a feature is supported.
Both x86 and x86_64 support the same boot protocol so we need
to implement KEEP_SEGMENTS on x86_64 as well. It isn't
just paravirt bootloaders that could use this functionality.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@suse.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch uses the updated boot protocol to do paravirtualized boot.
If the boot version is >= 2.07, then it will do two things:
1. Check the bootparams loadflags to see if we should reload the
segment registers and clear interrupts. This is appropriate
for normal native boot and some paravirtualized environments, but
inappropriate for others.
2. Check the hardware architecture, and dispatch to the appropriate
kernel entrypoint. If the bootloader doesn't set this, then we
simply do the normal boot sequence.
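The two checks above can be modelled in C as follows. This is a sketch,
not the entry code: the function names are hypothetical, and the
KEEP_SEGMENTS bit position and the hardware_subarch field name are
taken from my reading of the x86 boot protocol documentation rather
than from this log.

```c
#include <stdint.h>

/* Version field encoding: (major << 8) | minor, so 2.07 is 0x0207. */
#define BOOT_PROTOCOL_2_07	0x0207u
/* loadflags bit; the value is an assumption based on the boot protocol. */
#define KEEP_SEGMENTS		(1u << 6)

/* Returns 1 when the entry code should reload segments and clear interrupts. */
static int must_reload_segments(uint16_t version, uint8_t loadflags)
{
	if (version < BOOT_PROTOCOL_2_07)
		return 1;	/* the flag is not defined before 2.07 */
	return (loadflags & KEEP_SEGMENTS) == 0;
}

/* Returns the subarchitecture to dispatch on; 0 means the normal boot path. */
static uint32_t select_subarch(uint16_t version, uint32_t hardware_subarch)
{
	if (version < BOOT_PROTOCOL_2_07)
		return 0;	/* the field is not defined before 2.07 */
	return hardware_subarch;
}
```

A boot loader that leaves the field at zero therefore gets the normal
boot sequence, as the text states.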
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>