include/asm-x86/bitops_32.h and include/asm-x86/bitops_64.h are now
almost identical. The 64-bit version sets ARCH_HAS_FAST_MULTIPLIER
and has an extra inline function set_bit_string. The define currently
has no influence on the generated code, but it can be argued that
setting it on i386 is the right thing to do anyhow. The addition
of the extra inline function on i386 does not hurt either.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86 has been switched to the generic versions of find_first_bit
and find_first_zero_bit, but the original versions were retained.
This patch just removes the now unused x86-specific versions.
also update UML.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Avoid a call to find_first_bit if the bitmap size is know at
compile time and small enough to fit in a single long integer.
Modeled after an optimization in the original x86_64-specific
code.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Switch x86_64 to generic find_first_bit. The x86_64-specific
implementation is not removed.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Generic versions of __find_first_bit and __find_first_zero_bit
are introduced as simplified versions of __find_next_bit and
__find_next_zero_bit. Their compilation and use are guarded by
a new config variable GENERIC_FIND_FIRST_BIT.
The generic versions of find_first_bit and find_first_zero_bit
are implemented in terms of the newly introduced __find_first_bit
and __find_first_zero_bit.
This patch does not remove the i386-specific implementation,
but it does switch i386 to use the generic functions by setting
GENERIC_FIND_FIRST_BIT=y for X86_32.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use __fls for fls64 on 64-bit archs. The implementation for
64-bit archs is moved from x86_64 to asm-generic.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Implement __fls on all 64-bit archs:
alpha has an implementation of fls64.
Added __fls(x) = fls64(x) - 1.
ia64 has fls, but not __fls.
Added __fls based on code of fls.
mips and powerpc have __ilog2, which is the same as __fls.
Added __fls = __ilog2.
parisc, s390, sh and sparc64:
Include generic __fls.
x86_64 already has __fls.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a generic __fls implementation in the same spirit as
the generic __ffs one. It finds the last (most significant)
set bit in the given long value.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some of those can be written in such a way that the same
inline assembly can be used to generate both 32 bit and
64 bit code.
For ffs and fls, x86_64 unconditionally used the cmov
instruction and i386 unconditionally used a conditional
branch over a mov instruction. In the current patch I
chose to select the version based on the availability
of the cmov instruction instead. A small detail here is
that x86_64 did not previously set CONFIG_X86_CMOV=y.
Improved comments for ffs, ffz, fls and variations.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This moves an optimization for searching constant-sized small
bitmaps form x86_64-specific to generic code.
On an i386 defconfig (the x86#testing one), the size of vmlinux hardly
changes with this applied. I have observed only four places where this
optimization avoids a call into find_next_bit:
In the functions return_unused_surplus_pages, alloc_fresh_huge_page,
and adjust_pool_surplus, this patch avoids a call for a 1-bit bitmap.
In __next_cpu a call is avoided for a 32-bit bitmap. That's it.
On x86_64, 52 locations are optimized with a minimal increase in
code size:
Current #testing defconfig:
146 x bsf, 27 x find_next_*bit
text data bss dec hex filename
5392637 846592 724424 6963653 6a41c5 vmlinux
After removing the x86_64 specific optimization for find_next_*bit:
94 x bsf, 79 x find_next_*bit
text data bss dec hex filename
5392358 846592 724424 6963374 6a40ae vmlinux
After this patch (making the optimization generic):
146 x bsf, 27 x find_next_*bit
text data bss dec hex filename
5392396 846592 724424 6963412 6a40d4 vmlinux
[ tglx@linutronix.de: build fixes ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The versions with inline assembly are in fact slower on the machines I
tested them on (in userspace) (Athlon XP 2800+, p4-like Xeon 2.8GHz, AMD
Opteron 270). The i386-version needed a fix similar to 06024f21 to avoid
crashing the benchmark.
Benchmark using: gcc -fomit-frame-pointer -Os. For each bitmap size
1...512, for each possible bitmap with one bit set, for each possible
offset: find the position of the first bit starting at offset. If you
follow ;). Times include setup of the bitmap and checking of the
results.
Athlon Xeon Opteron 32/64bit
x86-specific: 0m3.692s 0m2.820s 0m3.196s / 0m2.480s
generic: 0m2.622s 0m1.662s 0m2.100s / 0m1.572s
If the bitmap size is not a multiple of BITS_PER_LONG, and no set
(cleared) bit is found, find_next_bit (find_next_zero_bit) returns a
value outside of the range [0, size]. The generic version always returns
exactly size. The generic version also uses unsigned long everywhere,
while the x86 versions use a mishmash of int, unsigned (int), long and
unsigned long.
Using the generic version does give a slightly bigger kernel, though.
defconfig: text data bss dec hex filename
x86-specific: 4738555 481232 626688 5846475 5935cb vmlinux (32 bit)
generic: 4738621 481232 626688 5846541 59360d vmlinux (32 bit)
x86-specific: 5392395 846568 724424 6963387 6a40bb vmlinux (64 bit)
generic: 5392458 846568 724424 6963450 6a40fa vmlinux (64 bit)
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
pointed out by Linus: arch/um/Kconfig.x86_64 should
include arch/x86/Kconfig.cpu instead of defining those
symbols itself.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
arch/um/os-Linux/helper.c: In function 'run_helper':
arch/um/os-Linux/helper.c:73: error: 'PATH_MAX' undeclared (first use in this function)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes:
x86 PAT: decouple from nonpromisc devmem
x86 PAT: tone down debugging messages
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (7751): ir-kbd-i2c: Save a temporary memory allocation in ir_probe
V4L/DVB (7750): au0828/ cleanups and fixes
V4L/DVB (7748): tuner-core: some adjustments at tuner logs, if debug enabled
V4L/DVB (7746): pvrusb2: make signed one-bit bitfields unsigned
V4L/DVB (7744): pvrusb2-dvb: add atsc/qam support for Hauppauge pvrusb2 model 751xx
V4L/DVB (7742): cx88: Add support for the DViCO FusionHDTV_7_GOLD digital modes
V4L/DVB (7741): s5h1411: Adding support for this ATSC/QAM demodulator
V4L/DVB (7740): tuner-xc2028.c dubious !x & y
V4L/DVB (7739): mt312.h: dubious one-bit signed bitfield
V4L/DVB (7735): Fix compilation for au0828
V4L/DVB (7734): em28xx: copy and paste error in em28xx_init_isoc
V4L/DVB (7733): blackbird_find_mailbox negative return ignored in blackbird_initialize_codec()
V4L/DVB (7732): vivi: fix a warning
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (61 commits)
ide: sanitize handling of IDE_HFLAG_NO_SET_MODE host flag
sis5513: fail early for unsupported chipsets
it821x: fix kzalloc() failure handling
qd65xx: use IDE_HFLAG_SINGLE host flag
qd65xx: always use ->selectproc method
ide-cd: put proc-related functions together under single ifdef
ide-cd: Replace __FUNCTION__ with __func__
IDE: Coding Style fixes to drivers/ide/ide-cd.c
IDE: Coding Style fixes to drivers/ide/pci/cy82c693.c
IDE: Coding Style fixes to drivers/ide/pci/it8213.c
IDE: Coding Style fixes to drivers/ide/ide-floppy.c
IDE: Coding Style fixes to drivers/ide/legacy/ali14xx.c
IDE: Coding Style fixes to drivers/ide/legacy/hd.c
IDE: Coding Style fixes to drivers/ide/pci/cmd640.c
IDE: Coding Style fixes to drivers/ide/pci/opti621.c
IDE: Coding Style fixes to drivers/ide/ide-pnp.c
IDE: Coding Style fixes to drivers/ide/ide-proc.c
IDE: Coding Style fixes to drivers/ide/legacy/ide-4drives.c
IDE: Coding Style fixes to drivers/ide/legacy/umc8672.c
IDE: Coding Style fixes to drivers/ide/pci/generic.c
...
Linus pointed it out that PAT should not depend on NONPROMISC_DEVMEM.
Also make PAT non-default.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
agp: convert drivers/char/agp/frontend.c to use unlocked_ioctl
agp: fix shadowed variable warning in amd-k7-agp.c
* 'drm-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: _end is shadowing real _end, just rename it.
drm/vbl rework: rework how the drm deals with vblank.
drm: reorganise minor number handling using backported modesetting code.
drm/i915: Handle tiled buffers in vblank tasklet
drm/i965: On I965, use correct 3DSTATE_DRAWING_RECTANGLE command in vblank
drm: Remove unneeded dma sync in ATI pcigart alloc
drm: Fix mismerge of non-coherent DMA patch
Arrgghhh...
Sorry about that, I'd been sure I'd folded that one, but it actually got
lost. Please apply - that breaks execve().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stephen Rothwell reported that linux-next did not build on powerpc64.
make optimized inlining dependent on architecture opt-in.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
add CONFIG_OPTIMIZE_INLINING=y.
allow gcc to optimize the kernel image's size by uninlining
functions that have been marked 'inline'. Previously gcc was
forced by Linux to always-inline these functions via a gcc
attribute:
#define inline inline __attribute__((always_inline))
Especially when the user has already selected
CONFIG_OPTIMIZE_FOR_SIZE=y this can make a huge difference in
kernel image size (using a standard Fedora .config):
text data bss dec hex filename
5613924 562708 3854336 10030968 990f78 vmlinux.before
5486689 562708 3854336 9903733 971e75 vmlinux.after
that's a 2.3% text size reduction (!).
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Check for IDE_HFLAG_NO_SET_MODE host flag in ide_set_pio(),
ide_set_[pio,dma]_mode(), ide_set_xfer_rate() and set_pio_mode().
* Remove no longer needed IDE_HFLAG_NO_SET_MODE host flag checking
from ide_tune_dma().
* Remove superfluous ->set_pio_mode checking from do_special().
This is a part of preparations for adding 'struct ide_port_ops'.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Factor out chipset family detection from init_chipset_sis5513()
to sis_find_family().
* Use sis_find_family() in sis5513_init_one() to fail early if the
chipset is unsupported.
* Keep a local copy sis5513_chipset in sis5513_init_one()
and set .udma_mask according to chipset family.
* Remove no longer need ->ultra_mask setting from init_hwif_sis5513().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Allocate 'struct it821x_dev' objects for both ports in it821x_init_one().
Fixes potential OOPS in it821x_quirkproc() (uses 'itdev' unconditionally)
and other problems ('itdev' is needed for correct operation of the driver).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Set IDE_HFLAG_SINGLE host flag in qd_probe() for QD6500 and QD6580
with the second port disabled.
* Check for IDE_HFLAG_SINGLE in qd6580_port_init_devs() instead of
using cached value of QD6580 Control register.
* Don't cache QD6580 Control register value in hwif->config_data
(bits 8-15) and remove no longer needed QD_CONTROL() macro.
* Cache QD65xx base address in hwif->config_data (bits 8-15)
instead of hwif->select_data.
* Set hwif->config_data in qd_probe() and remove qd_setup() helper.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
qd_select() checks itself whether timings should be reprogrammed so
remove superfluous qd_timing_ok() and always use ->selectproc method
(rename qd_select() to qd65xx_select() while at it).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error free, only a few
WARNING: line over 80 characters
are left.
Compile tested.
[bart: md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error free.
Compile tested.
[bart: minor fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Fix a lot of errors and warnings.
Compile tested.
[bart: some fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Fix all the errors and a few warnings.
Compile tested.
[bart: some fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Lot of errors and warnings removed.
Compile tested.
[bart: minor fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error and warning free.
Compile tested.
[bart: md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error free.
Compile tested.
[bart: minor fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
About 300 errors and warnings fixed.
File is now error free.
Compile tested.
[bart: minor fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error and warning free.
Compile tested.
[bart: md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
File is now error free.
Compile tested.
[bart: minor fixes, md5sum checked]
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>