linux-stable-rt/include/linux
Ravikiran G Thirumalai e073ae1b34 [PATCH] x86-64: Set HASHDIST_DEFAULT to 1 for x86_64 NUMA
Enable system hashtable memory to be distributed among nodes on x86_64 NUMA

Forcing the kernel to use node interleaved vmalloc instead of bootmem for
the system hashtable memory (alloc_large_system_hash) reduces the memory
imbalance on node 0 by around 40MB on an 8-node x86_64 NUMA box:

Before the following patch, on bootup of an 8-node box:

Node 0 MemTotal:      3407488 kB
Node 0 MemFree:       3206296 kB
Node 0 MemUsed:        201192 kB
Node 0 Active:           7012 kB
Node 0 Inactive:          512 kB
Node 0 Dirty:               0 kB
Node 0 Writeback:           0 kB
Node 0 FilePages:        1912 kB
Node 0 Mapped:            420 kB
Node 0 AnonPages:        5612 kB
Node 0 PageTables:        468 kB
Node 0 NFS_Unstable:        0 kB
Node 0 Bounce:              0 kB
Node 0 Slab:             5408 kB
Node 0 SReclaimable:      644 kB
Node 0 SUnreclaim:       4764 kB

After the patch (or using hashdist=1 on the kernel command line):

Node 0 MemTotal:      3407488 kB
Node 0 MemFree:       3247608 kB
Node 0 MemUsed:        159880 kB
Node 0 Active:           3012 kB
Node 0 Inactive:          616 kB
Node 0 Dirty:               0 kB
Node 0 Writeback:           0 kB
Node 0 FilePages:        2424 kB
Node 0 Mapped:            380 kB
Node 0 AnonPages:        1200 kB
Node 0 PageTables:        396 kB
Node 0 NFS_Unstable:        0 kB
Node 0 Bounce:              0 kB
Node 0 Slab:             6304 kB
Node 0 SReclaimable:     1596 kB
Node 0 SUnreclaim:       4708 kB
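
As a quick check of the "around 40MB" figure, the difference in Node 0 MemUsed between the two listings above can be computed directly. This is only an arithmetic sketch; the macro and function names are illustrative and not part of the patch.

```c
#include <stdio.h>

/* Node 0 MemUsed values, in kB, copied from the two listings above. */
#define MEMUSED_BEFORE_KB 201192L
#define MEMUSED_AFTER_KB  159880L

/* Hypothetical helper: memory no longer charged to node 0, in kB. */
static long node0_saved_kb(void)
{
	return MEMUSED_BEFORE_KB - MEMUSED_AFTER_KB;
}

/* Print the saving in kB and in MB (taking 1 MB = 1024 kB). */
static void print_node0_saving(void)
{
	long kb = node0_saved_kb();

	printf("saved %ld kB (~%.1f MB)\n", kb, kb / 1024.0);
}
```

Calling print_node0_saving() from a main() reports 41312 kB, i.e. roughly 40 MB, which matches the figure quoted in the description.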

I guess it is a good idea to keep HASHDIST_DEFAULT "on" for x86_64 NUMA
since x86_64 has no dearth of vmalloc space?  Or maybe enable hash
distribution for all 64bit NUMA arches?  The following patch does it only
for x86_64.
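
The change discussed above amounts to a compile-time default. A minimal sketch of how such a default could be selected follows. The CONFIG_* symbols are defined by hand here only so the fragment compiles outside a kernel tree, and the exact condition (including the CONFIG_IA64 branch) is an assumption based on the description, not a copy of the patch. hashdist_default() is a hypothetical accessor added for illustration.

```c
/* Stand-ins for Kconfig output; in a real kernel build these come from
 * the generated configuration, not from hand-written defines. */
#define CONFIG_NUMA 1
#define CONFIG_X86_64 1

/* Assumed shape of the default: distribute the system hashes only on
 * NUMA builds of architectures with ample vmalloc space. */
#if defined(CONFIG_NUMA) && (defined(CONFIG_IA64) || defined(CONFIG_X86_64))
#define HASHDIST_DEFAULT 1
#else
#define HASHDIST_DEFAULT 0
#endif

/* Hypothetical accessor so the selected default can be inspected. */
static int hashdist_default(void)
{
	return HASHDIST_DEFAULT;
}
```

With both stand-in symbols defined, hashdist_default() returns 1. As noted above, the same behaviour can be requested at boot with hashdist=1 on the kernel command line.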

I ran an HPC MPI benchmark -- 'Ansys wingsolid', which takes up quite a bit of
memory and uses up TLB entries.  This was on a 4 way, 2 socket
Tyan AMD box (non vsmp), with 8G total memory (4G per node).

The results with and without hash distribution are:

1. Vanilla - runtime of 1188.000s
2. With hashdist=1 runtime of 1154.000s

Oprofile output for the duration of the run is:

1. Vanilla:
CPU: AMD64 processors, speed 2411.16 MHz (estimated)
Counted L1_AND_L2_DTLB_MISSES events (L1 and L2 DTLB misses) with a unit
mask of 0x00 (No unit mask) count 500
samples  %        app name                 symbol name
163054    6.5513  libansys1.so             MultiFront::decompose(int, int, Elemset *, int *, int, int, int)
162061    6.5114  libansys3.so             blockSaxpy6L_fd
162042    6.5107  libansys3.so             blockInnerProduct6L_fd
156286    6.2794  libansys3.so             maxb33_
87879     3.5309  libansys1.so             elmatrixmultpcg_
84857     3.4095  libansys4.so             saxpy_pcg
58637     2.3560  libansys4.so             .st4560
46612     1.8728  libansys4.so             .st4282
43043     1.7294  vmlinux-t                copy_user_generic_string
41326     1.6604  libansys3.so             blockSaxpyBackSolve6L_fd
41288     1.6589  libansys3.so             blockInnerProductBackSolve6L_fd

2. With hashdist=1
CPU: AMD64 processors, speed 2411.13 MHz (estimated)
Counted L1_AND_L2_DTLB_MISSES events (L1 and L2 DTLB misses) with a unit
mask of 0x00 (No unit mask) count 500
samples  %        app name                 symbol name
162993    6.9814  libansys1.so             MultiFront::decompose(int, int, Elemset *, int *, int, int, int)
160799    6.8874  libansys3.so             blockInnerProduct6L_fd
160459    6.8729  libansys3.so             blockSaxpy6L_fd
156018    6.6826  libansys3.so             maxb33_
84700     3.6279  libansys4.so             saxpy_pcg
83434     3.5737  libansys1.so             elmatrixmultpcg_
58074     2.4875  libansys4.so             .st4560
46000     1.9703  libansys4.so             .st4282
41166     1.7632  libansys3.so             blockSaxpyBackSolve6L_fd
41033     1.7575  libansys3.so             blockInnerProductBackSolve6L_fd
35762     1.5318  libansys1.so             inner_product_sub
35591     1.5245  libansys1.so             inner_product_sub2
28259     1.2104  libansys4.so             addVectors

Signed-off-by: Pravin B. Shelar <pravin.shelar@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Christoph Lameter <clameter@engr.sgi.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02 19:27:08 +02:00
amba
byteorder
dvb
hdlc
isdn
lockd
mmc
mtd
netfilter
netfilter_arp
netfilter_bridge
netfilter_ipv4
netfilter_ipv6
nfsd
raid
spi
sunrpc
tc_act
tc_ematch
usb
8250_pci.h
Kbuild
a.out.h
ac97_codec.h
acct.h
acpi.h
acpi_pmtmr.h
adb.h
adfs_fs.h
adfs_fs_i.h
adfs_fs_sb.h
aer.h
affs_hardblocks.h
agp_backend.h
agpgart.h
aio.h
aio_abi.h
amifd.h
amifdreg.h
amigaffs.h
apm-emulation.h
apm_bios.h
arcdevice.h
arcfb.h
ata.h libata: Handle drives that require a spin-up command before first access 2007-04-28 14:40:40 -04:00
atalk.h
atm.h
atm_eni.h
atm_he.h
atm_idt77105.h
atm_nicstar.h
atm_suni.h
atm_tcp.h
atm_zatm.h
atmapi.h
atmarp.h
atmbr2684.h
atmclip.h
atmdev.h
atmel_pdc.h
atmioc.h
atmlec.h
atmmpc.h
atmppp.h
atmsap.h
atmsvc.h
attribute_container.h
audit.h
auto_fs.h
auto_fs4.h
auxvec.h
awe_voice.h
ax25.h
b1lli.h
b1pcmcia.h
backing-dev.h
backlight.h
baycom.h
bcd.h
bfs_fs.h
binfmts.h
bio.h [BLOCK] Don't pin lots of memory in mempools 2007-04-30 09:08:17 +02:00
bit_spinlock.h
bitmap.h
bitops.h
bitrev.h
blkdev.h ll_rw_blk: add io_context private pointer 2007-04-30 09:01:23 +02:00
blkpg.h
blktrace_api.h
blockgroup_lock.h
bootmem.h [PATCH] x86-64: Set HASHDIST_DEFAULT to 1 for x86_64 NUMA 2007-05-02 19:27:08 +02:00
bottom_half.h
bpqether.h
buffer_head.h
bug.h
cache.h
calc64.h
capability.h
capi.h
cciss_ioctl.h
cd1400.h
cdev.h
cdk.h
cdrom.h
cfag12864b.h
chio.h
circ_buf.h
clk.h
clockchips.h
clocksource.h
cm4000_cs.h
cn_proc.h
cobalt-nvram.h
coda.h
coda_cache.h
coda_fs_i.h
coda_linux.h
coda_proc.h
coda_psdev.h
coff.h
com20020.h
compat.h
compat_ioctl.h
compiler-gcc.h
compiler-gcc3.h
compiler-gcc4.h
compiler-intel.h
compiler.h
completion.h
comstats.h
concap.h
configfs.h
connector.h
console.h
console_struct.h
consolemap.h
cpu.h
cpufreq.h [PATCH] x86-64: fix x86_64-mm-sched-clock-share 2007-05-02 19:27:08 +02:00
cpumask.h
cpuset.h
cramfs_fs.h
cramfs_fs_sb.h
crash_dump.h
crc-ccitt.h
crc16.h
crc32.h
crc32c.h
crypto.h
cryptohash.h
ctype.h
cuda.h
cyclades.h
cyclomx.h
cycx_cfm.h
cycx_drv.h
cycx_x25.h
dcache.h
dccp.h
dcookies.h
debug_locks.h
debugfs.h
delay.h
delayacct.h
device-mapper.h
device.h
devpts_fs.h
dio.h
dirent.h
dlm.h
dlm_device.h
dm-ioctl.h
dm9000.h
dma-mapping.h
dmaengine.h
dmapool.h
dmi.h
dn.h
dnotify.h
dqblk_v1.h
dqblk_v2.h
dqblk_xfs.h
ds1286.h
ds17287rtc.h
dtlk.h
edd.h
efi.h
efs_dir.h
efs_fs.h
efs_fs_i.h
efs_fs_sb.h
efs_vh.h
eisa.h
elevator.h
elf-em.h
elf-fdpic.h
elf.h
elfcore.h
elfnote.h
err.h
errno.h
errqueue.h
etherdevice.h
ethtool.h
eventpoll.h
ext2_fs.h
ext2_fs_sb.h
ext3_fs.h
ext3_fs_i.h
ext3_fs_sb.h
ext3_jbd.h
ext4_fs.h
ext4_fs_extents.h
ext4_fs_i.h
ext4_fs_sb.h
ext4_jbd2.h
fadvise.h
fault-inject.h
fb.h
fcdevice.h
fcntl.h
fd.h
fd1772.h
fddidevice.h
fdreg.h
fib_rules.h
file.h
filter.h
firmware.h
flat.h
font.h
freezer.h
fs.h
fs_enet_pd.h
fs_stack.h
fs_struct.h
fs_uart_pd.h
fsl_devices.h ucc_geth: migrate ucc_geth to phylib 2007-04-28 11:01:04 -04:00
fsnotify.h
fuse.h
futex.h
gameport.h
gen_stats.h
genalloc.h
generic_acl.h
generic_serial.h
genetlink.h
genhd.h
getcpu.h
gfp.h
gfs2_ondisk.h
gigaset_dev.h
gpio_keys.h
hardirq.h
harrier_defs.h
hash.h
hayesesp.h
hdlc.h Generic HDLC sparse annotations 2007-04-28 11:01:07 -04:00
hdlcdrv.h
hdpu_features.h
hdreg.h
hdsmart.h
hid-debug.h
hid.h
hiddev.h
highmem.h
highuid.h
hil.h
hil_mlc.h
hippidevice.h
hp_sdc.h
hpet.h
hrtimer.h
htirq.h
hugetlb.h
hw_random.h
hwmon-sysfs.h
hwmon-vid.h
hwmon.h
hysdn_if.h
i2c-algo-bit.h
i2c-algo-pca.h
i2c-algo-pcf.h
i2c-algo-sgi.h
i2c-dev.h
i2c-id.h
i2c-isa.h
i2c-ocores.h
i2c-pnx.h
i2c-pxa.h
i2c.h
i2o-dev.h
i2o.h
i8k.h
ibmtr.h
icmp.h
icmpv6.h
ide.h
idr.h
if.h
if_addr.h
if_arcnet.h
if_arp.h
if_bonding.h
if_bridge.h
if_cablemodem.h
if_ec.h
if_eql.h
if_ether.h
if_fc.h
if_fddi.h
if_frad.h
if_hippi.h
if_infiniband.h
if_link.h
if_ltalk.h
if_packet.h
if_plip.h
if_ppp.h
if_pppox.h
if_shaper.h
if_slip.h
if_strip.h
if_tr.h
if_tun.h
if_tunnel.h
if_vlan.h
if_wanpipe.h
igmp.h
in.h
in6.h
in_route.h
inet.h
inet_diag.h
inetdevice.h
init.h [PATCH] i386: Change sysenter_setup to __cpuinit & improve __INIT, __INITDATA 2007-05-02 19:27:05 +02:00
init_task.h
initrd.h
inotify.h
input.h
interrupt.h
io.h
ioc3.h
ioc4.h
ioctl.h
ioctl32.h
ioport.h libata/IDE: remove combined mode quirk 2007-04-28 14:15:59 -04:00
ioprio.h
ip.h
ip6_tunnel.h
ip_mp_alg.h
ipc.h
ipmi.h
ipmi_msgdefs.h
ipmi_smi.h
ipsec.h
ipv6.h
ipv6_route.h
ipx.h
irda.h
irq.h
irq_cpustat.h
irqflags.h
irqreturn.h
isa.h
isapnp.h
isdn.h
isdn_divertif.h
isdn_ppp.h
isdnif.h
isicom.h
iso_fs.h
istallion.h
ixjuser.h
jbd.h
jbd2.h
jffs2.h
jhash.h
jiffies.h
journal-head.h
joystick.h
kallsyms.h Extend print_symbol capability 2007-04-30 16:40:39 -07:00
kbd_diacr.h
kbd_kern.h
kd.h
kdev_t.h
kernel.h Add kvasprintf() 2007-04-30 16:40:40 -07:00
kernel_stat.h
kernelcapi.h
kexec.h
key-ui.h
key.h
keyboard.h
keyctl.h
kfifo.h
klist.h
kmalloc_sizes.h
kmod.h
kobj_map.h
kobject.h
kprobes.h
kref.h
ks0108.h
kthread.h
ktime.h
kvm.h
kvm_para.h
lapb.h
latency.h
lcd.h
leds.h
libata.h libata: separate ATA_EHI_DID_RESET into DID_SOFTRESET and DID_HARDRESET 2007-04-28 14:51:33 -04:00
libps2.h
license.h
limits.h
linkage.h
linux_logo.h
list.h
llc.h
lm_interface.h
lock_dlm_plock.h
lockdep.h
log2.h
loop.h
lp.h
m41t00.h
m48t86.h
magic.h
major.h
matroxfb.h
mbcache.h
mc6821.h
mc146818rtc.h
mca-legacy.h
mca.h
memory.h
memory_hotplug.h
mempolicy.h
mempool.h
meye.h
migrate.h
mii.h
minix_fs.h
miscdevice.h
mm.h
mm_inline.h
mm_types.h
mman.h
mmtimer.h
mmzone.h
mnt_namespace.h
mod_devicetable.h
module.h
moduleloader.h
moduleparam.h
mount.h
mpage.h
mqueue.h
mroute.h
msdos_fs.h
msg.h
msi.h
mtio.h
mutex-debug.h
mutex.h
mv643xx.h
n_r3964.h
namei.h
nbd.h
ncp.h
ncp_fs.h
ncp_fs_i.h
ncp_fs_sb.h
ncp_mount.h
ncp_no.h
neighbour.h
net.h
netdevice.h [NET]: Remove NETIF_F_INTERNAL_STATS, default to internal stats. 2007-04-28 21:04:03 -07:00
netfilter.h
netfilter_arp.h
netfilter_bridge.h
netfilter_decnet.h
netfilter_ipv4.h
netfilter_ipv6.h
netlink.h
netpoll.h
netrom.h
nfs.h
nfs2.h
nfs3.h
nfs4.h
nfs4_acl.h
nfs4_mount.h
nfs_fs.h
nfs_fs_i.h
nfs_fs_sb.h
nfs_idmap.h
nfs_mount.h
nfs_page.h
nfs_xdr.h
nfsacl.h
nfsd_idmap.h
nl80211.h
nls.h
nmi.h
node.h
nodemask.h
notifier.h
nsc_gpio.h
nsproxy.h
nubus.h
numa.h
nvram.h
oom.h
oprofile.h
page-flags.h
pagemap.h
pagevec.h
param.h
parport.h
parport_pc.h
parser.h
pata_platform.h
patchkey.h
pci-acpi.h
pci.h iomap: implement pcim_iounmap_regions() 2007-04-28 14:15:58 -04:00
pci_hotplug.h
pci_ids.h Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2007-04-29 10:48:48 -07:00
pci_regs.h
pcieport_if.h
percpu.h
percpu_counter.h
personality.h
pfkeyv2.h
pfn.h
pg.h
phonedev.h
phy.h phylib: add RGMII-ID interface mode definition 2007-04-28 11:01:04 -04:00
pid.h
pid_namespace.h
pipe_fs_i.h
pkt_cls.h
pkt_sched.h
pktcdvd.h
platform_device.h
plist.h
pm.h pm: include EIO from errno-base.h 2007-04-30 16:40:41 -07:00
pm_legacy.h
pmu.h
pnp.h
pnpbios.h
poison.h
poll.h
posix-timers.h
posix_acl.h
posix_acl_xattr.h
posix_types.h
ppdev.h
ppp-comp.h
ppp_channel.h
ppp_defs.h
prctl.h
preempt.h
prefetch.h
prio_tree.h
proc_fs.h
profile.h
ps2esdi.h
ptrace.h
qnx4_fs.h
qnxtypes.h
quota.h
quotaio_v1.h
quotaio_v2.h
quotaops.h
radeonfb.h
radix-tree.h
raid_class.h
ramfs.h
random.h
raw.h
rbtree.h
rcupdate.h
reboot.h
reciprocal_div.h
reiserfs_acl.h
reiserfs_fs.h
reiserfs_fs_i.h
reiserfs_fs_sb.h
reiserfs_xattr.h
relay.h
resource.h
resume-trace.h
rio.h
rio_drv.h
rio_ids.h
rio_regs.h
rmap.h
romfs_fs.h
root_dev.h
rose.h
route.h
rslib.h
rtc-v3020.h
rtc.h
rtmutex.h
rtnetlink.h
rwsem-spinlock.h
rwsem.h
rxrpc.h
sc26198.h
scatterlist.h
scc.h
sched.h
screen_info.h
sctp.h
scx200.h
scx200_gpio.h
sdla.h
seccomp.h
securebits.h
security.h
selection.h
selinux.h
selinux_netlink.h
sem.h
seq_file.h
seqlock.h
serial.h
serial167.h
serialP.h
serial_8250.h
serial_core.h
serial_pnx8xxx.h
serial_reg.h
serio.h
shm.h
shmem_fs.h
signal.h
skbuff.h [SKB]: Introduce skb_queue_walk_safe() 2007-04-30 00:07:31 -07:00
slab.h
slab_def.h
sm501-regs.h
sm501.h
smb.h
smb_fs.h
smb_fs_i.h
smb_fs_sb.h
smb_mount.h
smbno.h
smp.h
smp_lock.h
snmp.h [SNMP]: Add definitions for {In,Out}BcastPkts 2007-04-30 00:58:19 -07:00
socket.h
sockios.h
som.h
sonet.h
sony-laptop.h sony-laptop: add a meye-usable include file for camera ops 2007-04-28 22:06:01 -04:00
sonypi.h
sort.h
sound.h
soundcard.h
spinlock.h
spinlock_api_smp.h
spinlock_api_up.h
spinlock_types.h
spinlock_types_up.h
spinlock_up.h
srcu.h
stacktrace.h
stallion.h
start_kernel.h
stat.h
statfs.h
stddef.h
stop_machine.h
string.h
stringify.h
superhyway.h
suspend.h
svga.h
swap.h
swapops.h
synclink.h
sys.h
syscalls.h
sysctl.h
sysdev.h
sysfs.h
sysrq.h
sysv_fs.h
task_io_accounting.h
task_io_accounting_ops.h
taskstats.h
taskstats_kern.h
tc.h
tcp.h
telephony.h
termios.h
textsearch.h
textsearch_fsm.h
tfrc.h
thread_info.h
threads.h
ticable.h
tick.h
tifm.h
time.h
timer.h
times.h
timex.h
tiocl.h
tipc.h
tipc_config.h
topology.h
toshiba.h
transport_class.h
trdevice.h
tsacct_kern.h
tty.h
tty_driver.h
tty_flip.h
tty_ldisc.h
types.h
uaccess.h
udf_fs.h
udf_fs_i.h
udf_fs_sb.h
udp.h
ufs_fs.h
ufs_fs_i.h
ufs_fs_sb.h
uinput.h
uio.h
ultrasound.h
umem.h
un.h
unistd.h
unwind.h
usb.h
usb_gadget.h
usb_gadgetfs.h
usb_usual.h
usbdevice_fs.h
user.h
utime.h
uts.h
utsname.h
vermagic.h
vfs.h
via.h
video_decoder.h
video_encoder.h
video_output.h
videodev.h
videodev2.h
videotext.h
vmalloc.h
vmstat.h
vt.h
vt_buffer.h
vt_kern.h
wait.h
wanrouter.h
watchdog.h
wireless.h [PATCH] Update my email address from jkmaline@cc.hut.fi to j@w1.fi 2007-04-28 11:01:01 -04:00
workqueue.h
writeback.h
x25.h
xattr.h
xfrm.h [XFRM]: Export SPD info 2007-04-28 21:20:32 -07:00
yam.h
zconf.h
zlib.h
zorro.h
zorro_ids.h
zutil.h