The pushf/popf in switch_to are ONLY used to switch IOPL. Making this
explicit in C code is more clear. This pushf/popf pair was added as a
bugfix for leaking IOPL to unprivileged processes when using
sysenter/sysexit based system calls (sysexit does not restore flags).
When requesting an IOPL change in sys_iopl(), it is just as easy to change
the current flags and the flags in the stack image (in case an IRET is
required), but there is no reason to force an IRET if we came in from the
SYSENTER path.
This change is the minimal solution for supporting a paravirtualized Linux
kernel that allows user processes to run with I/O privilege. Other
solutions require radical rewrites of part of the low level fault / system
call handling code, or do not fully support sysenter based system calls.
Unfortunately, this added one field to the thread_struct. But as a bonus,
on P4, the fastest time measured for switch_to() went from 312 to 260
cycles, a win of about 17% in the fast case through this performance
critical path.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Privilege checking cleanup. Originally, these diffs were much greater, but
recent cleanups in Linux have already done much of the cleanup. I added
some explanatory comments in places where the reasoning behind certain
tests is rather subtle.
Also, in traps.c, we can skip the user_mode check in handle_BUG(). The
reason is, there are only two call chains - one via die_if_kernel() and one
via do_page_fault(), both entering from die(). Both of these paths already
ensure that a kernel mode failure has happened. Also, the original check
here, if (user_mode(regs)) was insufficient anyways, since it would not
rule out BUG faults from V8086 mode execution.
Saving the %ss segment in show_regs() rather than assuming a fixed value
also gives better information about the current kernel state in the
register dump.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some more assembler cleanups I noticed along the way.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Noticed by Chuck Ebbert: the .ldt entry of the TSS was set up incorrectly.
It never mattered since this was a leftover from old times, so remove it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Also, setting PDPEs in PAE mode does not require atomic operations, since the
PDPEs are cached by the processor, and only reloaded on an explicit or
implicit reload of CR3.
Since the four PDPEs must always be present in an active root, and the kernel
PDPE is never updated, we are safe even from SMIs and interrupts / NMIs using
task gates (which reload CR3). Actually, much of this is moot, since the user
PDPEs are never updated either, and the only usage of task gates is by the
doublefault handler. It appears the only place PGDs get updated in PAE mode
is in init_low_mappings() / zap_low_mapping() for initial page table creation
and recovery from ACPI sleep state, and these sites are safe by inspection.
Getting rid of the cmpxchg8b saves code space and 720 cycles in pgd_alloc on
P4.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
GCC can generate better code around descriptor update and access functions
when there is not an explicit "eax" register constraint.
Testing: You won't boot if this is messed up, since the TSS descriptor will be
corrupted. Verified the assembler and booted.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
i386 inline assembler cleanup.
This change encapsulates descriptor and task register management. Also,
it is possible to improve assembler generation in two cases; savesegment
may store the value in a register instead of a memory location, which
allows GCC to optimize stack variables into registers, and MOV MEM, SEG
is always a 16-bit write to memory, making the casting in math-emu
unnecessary.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
i386 arch cleanup. Introduce the serialize macro to serialize processor
state. Why the microcode update needs it I am not quite sure, since wrmsr()
is already a serializing instruction, but it is a microcode update, so I will
keep the semantic the same, since this could be a timing workaround. As far
as I can tell, this has always been there since the original microcode update
source.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
i386 Inline asm cleanup. Use cr/dr accessor functions.
Also, a potential bugfix. Also, some CR accessors really should be volatile.
Reads from CR0 (numeric state may change in an exception handler), writes to
CR4 (flipping CR4.TSD) and reads from CR2 (page fault) prevent instruction
re-ordering. I did not add memory clobber to CR3 / CR4 / CR0 updates, as it
was not there to begin with, and in no case should kernel memory be clobbered,
except when doing a TLB flush, which already has memory clobber.
I noticed that page invalidation does not have a memory clobber. I can't find
a bug as a result, but there is definitely a potential for a bug here:
#define __flush_tlb_single(addr) \
__asm__ __volatile__("invlpg %0": :"m" (*(char *) addr))
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is subarch update for ES7000. I've modified platform check code and
removed unnecessary OEM table parsing for newer systems that don't use OEM
information during boot. Parsing the table in fact is causing problems,
and the platform doesn't get recognized. The patch only affects the ES7000
subach.
Signed-off-by: <Natalie.Protasevich@unisys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
i386 generic subarchitecture requires explicit dmi strings or command line
to enable bigsmp mode. The patch below removes that restriction, and uses
bigsmp as soon as it finds more than 8 logical CPUs, Intel processors and
xAPIC support.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The memory descriptors that comprise the EFI memory map are not fixed in
stone such that the size could change in the future. This uses the memory
descriptor size obtained from EFI to iterate over the memory map entries
during boot. This enables the removal of an x86 specific pad (and ifdef)
in the EFI header. I also couldn't stomach the broken up nature of the
function to put EFI runtime calls into virtual mode any longer so I fixed
that up a bit as well.
For reference, this patch only impacts x86.
Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch has fixed the following warnings.
arch/mips/kernel/genex.S:250:5: warning: "CONFIG_64BIT" is not defined
arch/mips/math-emu/cp1emu.c:1128:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:1206:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:1270:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:323:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:808:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:953:5: warning: "__mips64" is not defined
arch/mips/mm/tlbex.c:519:5: warning: "CONFIG_64BIT" is not defined
include/asm/reg.h:73:5: warning: "CONFIG_64BIT" is not defined
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the MIPS coherency configuration such that we always keep the mapping
state in <asm/pci.h> when we need to on non-coherent platforms.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- MIPS Denmark does no longer exist; the PCI vendor ID is now owned by
MIPS Technologies.
- Add ID for SOC-it, MIPS's system controller.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add support for the virtual MIPS system that is emulated by Qemu. See
http://www.linux-mips.org/wiki/Qemu for a detailed current status.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove the one file which managed to survive the removel of HP Laserjet
support.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There seem to be no more users or interest in the NEC Osprey evaluation
system for the NEC VR4181 SOC which is an old part anyway, so remove the
code. More information on the Osprey can be found at
http://www.linux-mips.org/wiki/Osprey.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch has updated IRQ handling for vr41xx.
o added common IRQ dispatch
o changed IRQ number in int-handler.S
o added resource management to icu.c
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We need to indicate to the hypervisor that it needs to save our VMX
registers when switching partitions on a shared-processor system, just as
it needs to for FP and PMC registers.
This could be made to be on-demand when VMX is used, but we don't do that
for FP nor PMC right now either so let's not overcomplicate things.
Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: <engebret@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Updates and enhancement to the ppc32 mv64x60 code:
- move code to get mem size from mem ctlr to bootwrapper
- address some errata in the mv64360 pic code
- some minor cleanups
- export one of the bridge's regs via sysfs so user daemon can watch for
extraction events
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add declaration and cacheable_memcpy(). I'll be needing this function in
new 4xx EMAC driver I'm going to submit to netdev soon.
IMHO, the better place for the declaration would be asm-powerpc/string.h,
unfortunately, ppc64 doesn't have this function, so asm-ppc/system.h is the
next best place.
Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add dcr_base field to ocp_func_mal_data. This is preparation step for the
new EMAC driver.
Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move 4xx PHY_MODE_XXX defines to asm-ppc/ibm_ocp.h. This is a preparation
step for the new EMAC driver.
Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
While ppc32 has the CONFIG_HZ Kconfig option, it wasnt actually being used.
Connect it up and set all platforms to 250Hz. This pretty much mimics the
ppc64 patch from Anton Blanchard.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add the ability to identify an SOC by a name and id. There are cases in
which the integer identifier is not sufficient to specify a specific SOC.
In these cases we can use a string to further qualify the match.
Signed-off-by: Vitaly Bordug <vbordug@ru.mvista.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
flush_dcache_icache_page() will be called on an instruction page fault. We
can't sleep in the fault handler, so use kmap_atomic() instead of just
kmap() for the Book-E case.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Renamed global variables used to convey if the watchdog is enabled and
periodicity of the timer and moved the declarations into a header for these
variables
Signed-off-by: Matt McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a field to struct ocp_func_emac_data that allows
platform-specific unsupported PHY features to be passed in to the ibm_emac
ethernet driver.
This patch also adds some logic for the Bamboo eval board to populate this
field based on the dip switches on the board. This is a workaround for the
improperly biased RJ-45 sockets on the Rev. 0 Bamboo.
Signed-off-by: Wade Farnsworth <wfarnsworth@mvista.com>
Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Added ppc_sys device and system definitions for PowerQUICC II devices.
This will allow drivers for PQ2 to be proper platform device drivers.
Which can be shared on PQ3 processors with the same peripherals.
Signed-off-by: Matt McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
GFP flags must be passed as unisgned int __nocast these days, else we'll
get tons of sparse warnings in every driver.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Support for the SPD823TS board is no longer maintained and thus being removed
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Support for the REDWOOD board is no longer maintained and thus being removed
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Support for the OAK board is no longer maintained and thus being removed
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Support for the MCPN765 board is no longer maintained and thus being removed
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Support for the ASH board is no longer maintained and thus being removed
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add page_state info to the per-node meminfo file in sysfs. This is mostly
just for informational purposes.
The lack of this information was brought up recently during a discussion
regarding pagecache clearing, and I put this patch together to test out one
of the suggestions.
It seems like interesting info to have, so I'm submitting the patch.
Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Any architecture that has hardware updated A/D bits that require
synchronization against other processors during PTE operations can benefit
from doing non-atomic PTE updates during address space destruction.
Originally done on i386, now ported to x86_64.
Doing a read/write pair instead of an xchg() operation saves the implicit
lock, which turns out to be a big win on 32-bit (esp w PAE).
Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a new accessor for PTEs, which passes the full hint from the mmu_gather
struct; this allows architectures with hardware pagetables to optimize away
atomic PTE operations when destroying an address space. Removing the
locked operation should allow better pipelining of memory access in this
loop. I measured an average savings of 30-35 cycles per zap_pte_range on
the first 500 destructions on Pentium-M, but I believe the optimization
would win more on older processors which still assert the bus lock on xchg
for an exclusive cacheline.
Update: I made some new measurements, and this saves exactly 26 cycles over
ptep_get_and_clear on Pentium M. On P4, with a PAE kernel, this saves 180
cycles per ptep_get_and_clear, for a whopping 92160 cycles savings for a
full address space destruction.
pte_clear_full is not yet used, but is provided for future optimizations
(in particular, when running inside of a hypervisor that queues page table
updates, the full hint allows us to avoid queueing unnecessary page table
update for an address space in the process of being destroyed.
This is not a huge win, but it does help a bit, and sets the stage for
further hypervisor optimization of the mm layer on all architectures.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: Christoph Lameter <christoph@lameter.com>
Cc: <linux-mm@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is used only in slab.c and each architecture gets to define whcih
underlying type is to be used.
Seems a bit silly - move it to slab.c and use the same type for all
architectures: unsigned int.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I don't think we need to call hugetlb_clean_stale_pgtable() anymore
in 2.6.13 because of the rework with free_pgtables(). It now collect
all the pte page at the time of munmap. It used to only collect page
table pages when entire one pgd can be freed and left with staled pte
pages. Not anymore with 2.6.13. This function will never be called
and We should turn it into a BUG_ON.
I also spotted two problems here, not Adam's fault :-)
(1) in huge_pte_alloc(), it looks like a bug to me that pud is not
checked before calling pmd_alloc()
(2) in hugetlb_clean_stale_pgtable(), it also missed a call to
pmd_free_tlb. I think a tlb flush is required to flush the mapping
for the page table itself when we clear out the pmd pointing to a
pte page. However, since hugetlb_clean_stale_pgtable() is never
called, so it won't trigger the bug.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a macro pte_huge(pte) for i386/x86_64 which is needed by a
patch later in the series. Instead of repeating (_PAGE_PRESENT |
_PAGE_PSE), I've added __LARGE_PTE to i386 to match x86_64.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: <linux-mm@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Version 6 of the ARM architecture introduces the concept of 16MB pages
(supersections) and 36-bit (40-bit actually, but nobody uses this) physical
addresses. 36-bit addressed memory and I/O and ARMv6 can only be mapped
using supersections and the requirement on these is that both virtual and
physical addresses be 16MB aligned. In trying to add support for ioremap()
of 36-bit I/O, we run into the issue that get_vm_area() allows for a
maximum of 512K alignment via the IOREMAP_MAX_ORDER constant. To work
around this, we can:
- Allocate a larger VM area than needed (size + (1ul << IOREMAP_MAX_ORDER))
and then align the pointer ourselves, but this ends up with 512K of
wasted VM per ioremap().
- Provide a new __get_vm_area_aligned() API and make __get_vm_area() sit
on top of this. I did this and it works but I don't like the idea
adding another VM API just for this one case.
- My preferred solution which is to allow the architecture to override
the IOREMAP_MAX_ORDER constant with it's own version.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
_PAGE_FILE does not indicate whether a file is in page / swap cache, it is
set just for non-linear PTE's. Correct the comment for i386, x86_64, UML.
Also clearify _PAGE_NONE.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a capability check to sys_set_zone_reclaim(). This syscall is not
something that should be available to a user.
Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This bitop does not need to be atomic because it is performed when there will
be no references to the page (ie. the page is being freed).
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch was recently discussed on linux-mm:
http://marc.theaimsgroup.com/?t=112085728500002&r=1&w=2
I inherited a large code base from Ray for page migration. There was a
small patch in there that I find to be very useful since it allows the
display of the locality of the pages in use by a process. I reworked that
patch and came up with a /proc/<pid>/numa_maps that gives more information
about the vma's of a process. numa_maps is indexes by the start address
found in /proc/<pid>/maps. F.e. with this patch you can see the page use
of the "getty" process:
margin:/proc/12008 # cat maps
00000000-00004000 r--p 00000000 00:00 0
2000000000000000-200000000002c000 r-xp 00000000 08:04 516 /lib/ld-2.3.3.so
2000000000038000-2000000000040000 rw-p 00028000 08:04 516 /lib/ld-2.3.3.so
2000000000040000-2000000000044000 rw-p 2000000000040000 00:00 0
2000000000058000-2000000000260000 r-xp 00000000 08:04 54707842 /lib/tls/libc.so.6.1
2000000000260000-2000000000268000 ---p 00208000 08:04 54707842 /lib/tls/libc.so.6.1
2000000000268000-2000000000274000 rw-p 00200000 08:04 54707842 /lib/tls/libc.so.6.1
2000000000274000-2000000000280000 rw-p 2000000000274000 00:00 0
2000000000280000-20000000002b4000 r--p 00000000 08:04 9126923 /usr/lib/locale/en_US.utf8/LC_CTYPE
2000000000300000-2000000000308000 r--s 00000000 08:04 60071467 /usr/lib/gconv/gconv-modules.cache
2000000000318000-2000000000328000 rw-p 2000000000318000 00:00 0
4000000000000000-4000000000008000 r-xp 00000000 08:04 29576399 /sbin/mingetty
6000000000004000-6000000000008000 rw-p 00004000 08:04 29576399 /sbin/mingetty
6000000000008000-600000000002c000 rw-p 6000000000008000 00:00 0 [heap]
60000fff7fffc000-60000fff80000000 rw-p 60000fff7fffc000 00:00 0
60000ffffff44000-60000ffffff98000 rw-p 60000ffffff44000 00:00 0 [stack]
a000000000000000-a000000000020000 ---p 00000000 00:00 0 [vdso]
cat numa_maps
2000000000000000 default MaxRef=43 Pages=11 Mapped=11 N0=4 N1=3 N2=2 N3=2
2000000000038000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
2000000000040000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
2000000000058000 default MaxRef=43 Pages=61 Mapped=61 N0=14 N1=15 N2=16 N3=16
2000000000268000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
2000000000274000 default MaxRef=1 Pages=3 Mapped=3 Anon=3 N0=3
2000000000280000 default MaxRef=8 Pages=3 Mapped=3 N0=3
2000000000300000 default MaxRef=8 Pages=2 Mapped=2 N0=2
2000000000318000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N2=1
4000000000000000 default MaxRef=6 Pages=2 Mapped=2 N1=2
6000000000004000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
6000000000008000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
60000fff7fffc000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
60000ffffff44000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
getty uses ld.so. The first vma is the code segment which is used by 43
other processes and the pages are evenly distributed over the 4 nodes.
The second vma is the process specific data portion for ld.so. This is
only one page.
The display format is:
<startaddress> Links to information in /proc/<pid>/map
<memory policy> This can be "default" "interleave={}", "prefer=<node>" or "bind={<zones>}"
MaxRef= <maximum reference to a page in this vma>
Pages= <Nr of pages in use>
Mapped= <Nr of pages with mapcount >
Anon= <nr of anonymous pages>
Nx= <Nr of pages on Node x>
The content of the proc-file is self-evident. If this would be tied into
the sparsemem system then the contents of this file would not be too
useful.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The idea of a swap_device_lock per device, and a swap_list_lock over them all,
is appealing; but in practice almost every holder of swap_device_lock must
already hold swap_list_lock, which defeats the purpose of the split.
The only exceptions have been swap_duplicate, valid_swaphandles and an
untrodden path in try_to_unuse (plus a few places added in this series).
valid_swaphandles doesn't show up high in profiles, but swap_duplicate does
demand attention. However, with the hold time in get_swap_pages so much
reduced, I've not yet found a load and set of swap device priorities to show
even swap_duplicate benefitting from the split. Certainly the split is mere
overhead in the common case of a single swap device.
So, replace swap_list_lock and swap_device_lock by spinlock_t swap_lock
(generally we seem to prefer an _ in the name, and not hide in a macro).
If someone can show a regression in swap_duplicate, then probably we should
add a hashlock for the swap_map entries alone (shorts being anatomic), so as
to help the case of the single swap device too.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
get_swap_page has often shown up on latency traces, doing lengthy scans while
holding two spinlocks. swap_list_lock is already dropped, now scan_swap_map
drop swap_device_lock before scanning the swap_map.
While scanning for an empty cluster, don't worry that racing tasks may
allocate what was free and free what was allocated; but when allocating an
entry, check it's still free after retaking the lock. Avoid dropping the lock
in the expected common path. No barriers beyond the locks, just let the
cookie crumble; highest_bit limit is volatile, but benign.
Guard against swapoff: must check SWP_WRITEOK before allocating, must raise
SWP_SCANNING reference count while in scan_swap_map, swapoff wait for that to
fall - just use schedule_timeout, we don't want to burden scan_swap_map
itself, and it's very unlikely that anyone can really still be in
scan_swap_map once swapoff gets this far.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The swap header's unsigned int last_page determines the range of swap pages,
but swap_info has been using int or unsigned long in some cases: use unsigned
int throughout (except, in several places a local unsigned long is useful to
avoid overflows when adding).
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The "Adding %dk swap" message shows the number of swap extents, as a guide to
how fragmented the swapfile may be. But a useful further guide is what total
extent they span across (sometimes scarily large).
And there's no need to keep nr_extents in swap_info: it's unused after the
initial message, so save a little space by keeping it on stack.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There are several comments that swap's extent_list.prev points to the lowest
extent: that's not so, it's extent_list.next which points to it, as you'd
expect. And a couple of loops in add_swap_extent which go all the way through
the list, when they should just add to the other end.
Fix those up, and let map_swap_page search the list forwards: profiles shows
it to be twice as quick that way - because prefetch works better on how the
structs are typically kmalloc'ed? or because usually more is written to than
read from swap, and swap is allocated ascendingly?
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Someone mentioned that almost all the architectures used basically the same
implementation of get_order. This patch consolidates them into
asm-generic/page.h and includes that in the appropriate places. The
exceptions are ia64 and ppc which have their own (presumably optimised)
versions.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This splits up sparse_index_alloc() into two pieces. This is needed
because we'll allocate the memory for the second level in a different place
from where we actually consume it to keep the allocation from happening
underneath a lock
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With cleanups from Dave Hansen <haveblue@us.ibm.com>
SPARSEMEM_EXTREME makes mem_section a one dimensional array of pointers to
mem_sections. This two level layout scheme is able to achieve smaller
memory requirements for SPARSEMEM with the tradeoff of an additional shift
and load when fetching the memory section. The current SPARSEMEM
implementation is a one dimensional array of mem_sections which is the
default SPARSEMEM configuration. The patch attempts isolates the
implementation details of the physical layout of the sparsemem section
array.
SPARSEMEM_EXTREME requires bootmem to be functioning at the time of
memory_present() calls. This is not always feasible, so architectures
which do not need it may allocate everything statically by using
SPARSEMEM_STATIC.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A new option for SPARSEMEM is ARCH_SPARSEMEM_EXTREME. Architecture
platforms with a very sparse physical address space would likely want to
select this option. For those architecture platforms that don't select the
option, the code generated is equivalent to SPARSEMEM currently in -mm.
I'll be posting a patch on ia64 ml which uses this new SPARSEMEM feature.
ARCH_SPARSEMEM_EXTREME makes mem_section a one dimensional array of
pointers to mem_sections. This two level layout scheme is able to achieve
smaller memory requirements for SPARSEMEM with the tradeoff of an
additional shift and load when fetching the memory section. The current
SPARSEMEM -mm implementation is a one dimensional array of mem_sections
which is the default SPARSEMEM configuration. The patch attempts isolates
the implementation details of the physical layout of the sparsemem section
array.
ARCH_SPARSEMEM_EXTREME depends on 64BIT and is by default boolean false.
I've boot tested under aim load ia64 configured for ARCH_SPARSEMEM_EXTREME.
I've also boot tested a 4 way Opteron machine with !ARCH_SPARSEMEM_EXTREME
and tested with aim.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is part of Thomas Gleixner's generic IRQ patch, which converts
ARM to use the generic IRQ subsystem. Here, we wrap calls to
desc->handler() in an inline function, desc_handle_irq(). This
reduces the size of Thomas' patch since the changes become more
localised.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This is part of Thomas Gleixner's generic IRQ patch, which converts
ARM to use the generic IRQ subsystem. Here, we rename two of the
irq_chip methods - wake becomes set_wake, and type becomes set_type.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Adds a new ios for setting the chip select pin on MMC cards. Needed on
SD controllers which use this pin for other things and therefore cannot
have it pulled high at all times.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch is against 2.6.10, but still applies cleanly. It's just
s/driverfs/sysfs/ in these two files.
Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Need pfn_valid macro, even on MMUless platforms.
Enclose the macro args of __pa and __va in parentheses.
Signed-off-by: Greg Ungerer <gerg@uclinux.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
All we need to do is resegment the queue so that
we record SACK information accurately. The edges
of the SACK blocks guide our resegmenting decisions.
With help from Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>
I've finally found a potential cause of the sk_forward_alloc underflows
that people have been reporting sporadically.
When tcp_sendmsg tacks on extra bits to an existing TCP_PAGE we don't
check sk_forward_alloc even though a large amount of time may have
elapsed since we allocated the page. In the mean time someone could've
come along and liberated packets and reclaimed sk_forward_alloc memory.
This patch makes tcp_sendmsg check sk_forward_alloc every time as we
do in do_tcp_sendpages.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces sk_stream_wmem_schedule as a short-hand for
the sk_forward_alloc checking on egress.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The crypto layer currently uses in_atomic() to determine whether it is
allowed to sleep. This is incorrect since spin locks don't always cause
in_atomic() to return true.
Instead of that, this patch returns to an earlier idea of a per-tfm flag
which determines whether sleeping is allowed. Unlike the earlier version,
the default is to not allow sleeping. This ensures that no existing code
can break.
As usual, this flag may either be set through crypto_alloc_tfm(), or
just before a specific crypto operation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently tun/tap only supports the EN10MB ARP type. For use with
wireless and other networking types it should be possible to set the
ARP type via an ioctl.
Patch v2: Included check that the tap interface is down before changing the
link type out from underneath it
Signed-off-by: Mike Kershaw <dragorn@kismetwireless.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Nicolas Pitre
The prototype for sys_fadvise64_64() is:
long sys_fadvise64_64(int fd, loff_t offset, loff_t len, int advice)
The argument list is therefore as follows on legacy ABI:
fd: type int (r0)
offset: type long long (r1-r2)
len: type long long (r3-sp[0])
advice: type int (sp[4])
With EABI this becomes:
fd: type int (r0)
offset: type long long (r2-r3)
len: type long long (sp[0]-sp[4])
advice: type int (sp[8])
Not only do we have ABI differences here, but the EABI version requires
one additional word on the syscall stack.
To avoid the ABI mismatch and the extra stack space required with EABI
this syscall is now defined with a different argument ordering
on ARM as follows:
long sys_arm_fadvise64_64(int fd, int advice, loff_t offset, loff_t len)
This gives us the following ABI independent argument distribution:
fd: type int (r0)
advice: type int (r1)
offset: type long long (r2-r3)
len: type long long (sp[0]-sp[4])
Now, since the syscall entry code takes care of 5 registers only by
default including the store of r4 to the stack, we need a wrapper to
store r5 to the stack as well. Because that wrapper was missing and was
always required this means that sys_fadvise64_64 never worked on ARM and
therefore we can safely reuse its syscall number for our new
sys_arm_fadvise64_64 interface.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This kills warnings when building drivers/ide/ide-iops.c
and puts us in-line with what other platforms do here.
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Sascha Hauer
This patch adds support for setting and getting RTS / CTS via
set_mtctrl / get_mctrl functions.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from David Vrabel
Correct the ioread* and iowrite* functions. In particular, add an offset to the cookie in ioport_map so we can map I/O port ranges starting from 0 (0 is for reporting errors).
Signed-off-by: David Vrabel <dvrabel@arcom.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The start_tx and stop_tx methods were passed a flag to indicate
whether the start/stop was from the tty start/stop callbacks, and
some drivers used this flag to decide whether to ask the UART to
immediately stop transmission (where the UART supports such a
feature.)
There are other cases when we wish this to occur - when CTS is
lowered, or if we change from soft to hard flow control and CTS
is inactive. In these cases, this flag was false, and we would
allow the transmitter to drain before stopping.
There is really only one case where we want to let the transmitter
drain before disabling, and that's when we run out of characters
to send.
Hence, re-jig the start_tx and stop_tx methods to eliminate this
flag, and introduce new functions for the special "disable and
allow transmitter to drain" case.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Peter Staubach pointed out that it is not correct to check
current->personality & PER_LINUX32 (this will have false
hits on several other personality values).
Signed-off-by: Tony Luck <tony.luck@intel.com>
ATAPI is getting close to being ready. To increase exposure, we enable
the code in the upstream kernel, but default it to off (present
behavior). Users must pass atapi_enabled=1 as a module option (if
module) or on the kernel command line (if built in) to turn on
discovery of their ATAPI devices.
Add register_sound_special_device() function to allow assignment of
device pointer to a specific OSS device for HAL.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
PCM Midlevel,ALSA<-OSS emulation,USB USX2Y
This patch removes open_flag from struct _snd_pcm_substream.
All of its uses are substituted by querying struct _snd_pcm_substream's
member ffile instead.
Signed-off-by: Karsten Wiese <annabellesgarden@yahoo.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
YMFPCI driver
Implements mixer controls for the volume of each playback substream of
the main PCM device.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Documentation,AD1816A driver
Added clockfreq module option for the card with a different clock frequency
than 33kHz.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
AC97 Codec,PCI drivers
I've made the review changes and as requested I've pasted the RFC by
Nicolas below:-
'I would like to know what people think of the following patch. It
allows for a codec on an AC97 bus to be shared with other drivers which
are completely unrelated to audio. It registers a new bus type, and
whenever a codec instance is created then a device for it is also
registered with the driver model using that bus type. This allows, for
example, to use the extra features of the UCB1400 like the touchscreen
interface and the additional GPIOs and ADCs available on that chip for
battery monitoring. I have a working UCB1400 touchscreen driver here
that simply registers with the driver model happily working alongside
with audio features using this.'
Changes over RFC:-
o Now matches codec name within codec group.
o Added ac97_dev_release() to stop kernel complaining about no release
method for device.
o Added 'config SND_AC97_BUS' to sound/pci/Kconfig and moved 'config
SND_AC97_CODEC' out with the PCI=n statement.
o module is now called snd-ac97-bus
Signed-off-by: Liam Girdwood <liam.girdwood@wolfsonmicro.com>
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Documentation,CS46xx driver,EMU10K1/EMU10K2 driver,AD1848 driver
SB16/AWE driver,CMIPCI driver,ENS1370/1+ driver,RME32 driver
RME96 driver,ICE1712 driver,ICE1724 driver,KORG1212 driver
RME HDSP driver,RME9652 driver
This patch changes .iface to SNDRV_CTL_ELEM_IFACE_MIXER whre _PCM or
_HWDEP was used in controls that are not associated with a specific PCM
(sub)stream or hwdep device, and changes some controls that got
inconsitent .iface values due to copy+paste errors. Furthermore, it
makes sure that all control that do use _PCM or _HWDEP use the correct
number in the .device field.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
AC97 Codec
o Enhanced current WM97xx support to provide additional controls and
use the kcontrol suffix naming convention.
o Added AC97_HAS_NO_MIC, AC97_HAS_NO_TONE and AC97_HAS_NO_STD_PCM.
o Cleaned up WM97xx related comments.
o Removed some wm9713 double mono controls and replaced with stereo
controls.
Signed-off-by: Liam Girdwood <liam.girdwood@wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Fix build problem found by compiling driver with DEBUG defined that used tcp.h.
Since pr_debug(arg) expands to printk("<7>" arg) the argument
needs to be string that can be concatenated.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can put the __softirq_pending mask in the cpudata,
no need for the silly NR_CPUS array in kernel/softirq.c
Signed-off-by: David S. Miller <davem@davemloft.net>
While ppc64 has the CONFIG_HZ Kconfig option, it wasnt actually being
used. Connect it up and set all platforms to 250Hz.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Here's the 970MP's PVR (processor version register) entry for oprofile.
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
They differed in either simple comments or in the protecting ifdefs.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Move the identical files from include/asm-ppc{,64}/ to
include/asm-powerpc/. Remove hdreg.h completely as it is unused in
the tree.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The ppc and ppc64 trees are hopefully going to merge over time, so this
patch begins the process by creating a place for the merging of the
header files.
Create include/asm-powerpc (and move linkage.h into it from
asm-{ppc,ppc64} since we don't like empty directories). Modify the
ppc and ppc64 Makefiles to cope.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Create vio_bus_ops so that we just pass a structure to vio_bus_init
instead of three separate function pointers.
Rearrange vio.h to avoid forward references. vio.h only needs
struct device_node from prom.h so remove the include and just
declare it.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Take some assignments out of vio_register_device_common and
rename it to vio_register_device.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
With CONFIG_HUGETLB_PAGE=n:
In file included from kernel/sysctl.c:37:
include/linux/hugetlb.h:104:1: warning: "hugetlb_free_pgd_range" redefined
In file included from include/linux/mm.h:36,
from kernel/sysctl.c:23:
include/asm/pgtable.h:492:1: warning: this is the location of the previous definition
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
So that applications can set dccp_sock->dccps_pkt_size, that in turn
is used in the CCID3 half connection init routines to set
ccid3hc[tr]x_s and use it in its rate calculations.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some shub2 changes were not in the tree when Greg cleaned up the sn2
region definitions in 1b66776da7, so this
one didn't get fixed.
Signed-off-by: Tony Luck <tony.luck@intel.com>
This target allows users to modify the hoplimit header field of the
IPv6 header.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This new iptables target allows manipulation of the TTL of an IPv4 packet.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rest of endian warnings now belongs to tr.c exclusively.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* RCU versions of hlist_***_rcu
* fib_alias partial rcu port just whats needed now.
Signed-off-by: Robert Olsson <Robert.Olsson@data.slu.se>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Originally written by Henrik Nordstrom <hno@marasystems.com>, taken
from netfilter patch-o-matic and added ip6_tables support.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Originally written by Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>,
taken from netfilter patch-o-matic and fixed up to work with current
kernels.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new field to net device to hold the permanent
hardware address, and adds a new generic ethtool_op function to
get that address.
Signed-off-by: Jon Wetzel <jon_wetzel@dell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This changes timestamp, timestamp echo, and elapsed time to use units of 10
usecs as per DCCP spec. This has been tested to verify that times are correct.
Also fixed up length and used hton/ntoh more.
Still to add in later patches:
- actually use elapsed time to adjust RTT
(commented out as was prior to this patch)
- send options at times more closely following the spec
(content is now correct)
Signed-off-by: Ian McDonald <iam4@cs.waikato.ac.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Protocols that make extensive use of SKB cloning,
for example TCP, eat at least 2 allocations per
packet sent as a result.
To cut the kmalloc() count in half, we implement
a pre-allocation scheme wherein we allocate
2 sk_buff objects in advance, then use a simple
reference count to free up the memory at the
correct time.
Based upon an initial patch by Thomas Graf and
suggestions from Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>
This variant is needed to satisfy sparse __user annotations.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Of this type, mostly:
CHECK net/ipv6/netfilter.c
net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static?
net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static?
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NETLINK_ADD_MEMBERSHIP/NETLINK_DROP_MEMBERSHIP are used to join/leave
groups, NETLINK_PKTINFO is used to enable nl_pktinfo control messages
for received packets to get the extended destination group number.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using the group number allows increasing the number of groups without
beeing limited by the size of the bitmask. It introduces one limitation
for netlink users: messages can't be broadcasted to multiple groups anymore,
however this feature was never used inside the kernel.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch ads a new "connbytes" match that utilizes the CONFIG_NF_CT_ACCT
per-connection byte and packet counters. Using it you can do things like
packet classification on average packet size within a connection.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As proposed by Andi Kleen, this is required esp. for x86_64 architecture,
where 64bit code needs 8byte aligned 64bit data types, but 32bit userspace
apps will only align to 4bytes.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>