linux-stable-rt/net/ipv4
Eric Dumazet 2f970d8357 [IPV4]: rt_cache_stat can be statically defined
Using __get_cpu_var(obj) is slightly faster than per_cpu_ptr(obj, 
raw_smp_processor_id()).

1) Smaller code and memory use
For small static objects, DEFINE_PER_CPU(type, object) is preferable to 
alloc_percpu(): the code that accesses them is smaller and faster, and no extra 
memory is needed (no stored pointer and no per-cpu array of pointers).

x86_64 code before patch

mov    1237577(%rip),%rax        # ffffffff803e5990 <rt_cache_stat>
not    %rax  # part of per_cpu machinery
mov    %gs:0x3c,%edx # get cpu number
movslq %edx,%rdx # extend 32 bits cpu number to 64 bits
mov    (%rax,%rdx,8),%rax # get the pointer for this cpu
incl   0x38(%rax)

x86_64 code after patch

mov    $per_cpu__rt_cache_stat,%rdx
mov    %gs:0x48,%rax # get percpu data offset
incl   0x38(%rax,%rdx,1)
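
The difference between the two listings can be modelled in plain userspace C. 
This is only a rough sketch, not kernel code: NR_CPUS, struct 
rt_cache_stat_model, and the dyn_* / static_* helpers are hypothetical names 
invented for the illustration, and an ordinary array stands in for the real 
per-cpu section. The dynamic variant loads a pointer from a per-cpu pointer 
array before touching the counter; the static variant only adds a per-cpu 
offset to a fixed symbol address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 4 /* hypothetical CPU count for this model */

/* Hypothetical 64-byte stand-in for the real statistics structure. */
struct rt_cache_stat_model {
    unsigned int in_hit;
    unsigned int pad[15];
};

/* alloc_percpu()-like layout: one pointer per CPU, each copy allocated separately. */
struct dyn_percpu {
    struct rt_cache_stat_model *ptrs[NR_CPUS];
};

static void dyn_free(struct dyn_percpu *p)
{
    if (!p)
        return;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        free(p->ptrs[cpu]); /* free(NULL) is a no-op */
    free(p);
}

static struct dyn_percpu *dyn_alloc(void)
{
    struct dyn_percpu *p = calloc(1, sizeof(*p));

    if (!p)
        return NULL;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        p->ptrs[cpu] = calloc(1, sizeof(*p->ptrs[cpu]));
        if (!p->ptrs[cpu]) {
            dyn_free(p);
            return NULL;
        }
    }
    return p;
}

/* Two dependent loads: the pointer array entry, then the counter itself. */
static void dyn_inc(struct dyn_percpu *p, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    p->ptrs[cpu]->in_hit++;
}

static unsigned int dyn_read(const struct dyn_percpu *p, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return p->ptrs[cpu]->in_hit;
}

/* DEFINE_PER_CPU()-like layout: one static area, one copy per CPU. */
static struct rt_cache_stat_model static_area[NR_CPUS];

/* Models the per-cpu data offset that the kernel keeps per CPU. */
static size_t percpu_offset(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (size_t)cpu * sizeof(struct rt_cache_stat_model);
}

/* Fixed symbol address plus offset: no pointer array is consulted. */
static struct rt_cache_stat_model *static_this(int cpu)
{
    return (struct rt_cache_stat_model *)((char *)static_area + percpu_offset(cpu));
}

static void static_inc(int cpu)
{
    static_this(cpu)->in_hit++;
}

static unsigned int static_read(int cpu)
{
    return static_this(cpu)->in_hit;
}
```

In this model dyn_inc() needs the extra pointer array and one more memory load 
than static_inc(), which mirrors the longer instruction sequence in the 
"before patch" listing.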

2) False sharing avoidance for SMP
For a small NR_CPUS, the array of per-cpu pointers allocated in alloc_percpu() 
can be <= 32 bytes, so the slab allocator hands out only part of a cache line. 
If the rest of that 64-byte (or 128-byte) cache line is used by a frequently 
written object, the result is false sharing and expensive per_cpu_ptr() 
operations.
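
The size argument above can be checked with a little arithmetic. The helpers 
below are a hypothetical userspace sketch (ptr_array_bytes and can_share_line 
are names made up here, not kernel functions), and they assume that slab 
objects of the given size are packed back to back starting on a cache line 
boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for an array holding one pointer per CPU. */
static size_t ptr_array_bytes(size_t nr_cpus)
{
    return nr_cpus * sizeof(void *);
}

/*
 * Returns 1 when an object of obj_bytes, packed back to back with its
 * neighbours, can end up in the same cache line as an unrelated object,
 * 0 when every object covers whole cache lines on its own.
 */
static int can_share_line(size_t obj_bytes, size_t line_bytes)
{
    assert(obj_bytes > 0 && line_bytes > 0);
    return obj_bytes % line_bytes != 0;
}
```

With 8-byte pointers, ptr_array_bytes(4) is 32, and can_share_line(32, 64) 
reports that such an allocation leaves half of a 64-byte line to some other 
object. A statically defined 64-byte per-cpu object does not need this pointer 
array at all.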

rt_cache_stat is 64 bytes, so this patch does not risk a large increase of bss 
(in UP mode) or of static per_cpu data on SMP 
(PERCPU_ENOUGH_ROOM is currently 32768 bytes).

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:54:36 -08:00
..
ipvs [PATCH] capable/capability.h (net/) 2006-01-11 18:42:14 -08:00
netfilter [NETFILTER]: ip_conntrack_proto_gre.c needs linux/interrupt.h 2006-01-17 02:42:02 -08:00
Kconfig
Makefile
af_inet.c [PATCH] capable/capability.h (net/) 2006-01-11 18:42:14 -08:00
ah4.c
arp.c [PATCH] capable/capability.h (net/) 2006-01-11 18:42:14 -08:00
datagram.c
devinet.c [PATCH] capable/capability.h (net/) 2006-01-11 18:42:14 -08:00
esp4.c
fib_frontend.c x86: Work around compiler code generation bug with -Os 2006-01-14 22:08:28 -08:00
fib_hash.c
fib_lookup.h
fib_rules.c
fib_semantics.c
fib_trie.c
icmp.c
igmp.c [NET]: Remove more unneeded typecasts on *malloc() 2006-01-11 16:32:14 -08:00
inet_connection_sock.c
inet_diag.c
inet_hashtables.c
inet_timewait_sock.c
inetpeer.c
ip_forward.c
ip_fragment.c
ip_gre.c [PATCH] capable/capability.h (net/) 2006-01-11 18:42:14 -08:00
ip_input.c
ip_options.c [PATCH] capable/capability.h (net/) 2006-01-11 18:42:14 -08:00
ip_output.c
ip_sockglue.c [NET]: Remove more unneeded typecasts on *malloc() 2006-01-11 16:32:14 -08:00
ipcomp.c
ipconfig.c
ipip.c [PATCH] capable/capability.h (net/) 2006-01-11 18:42:14 -08:00
ipmr.c [PATCH] capable/capability.h (net/) 2006-01-11 18:42:14 -08:00
multipath.c
multipath_drr.c
multipath_random.c
multipath_rr.c
multipath_wrandom.c
netfilter.c
proc.c
protocol.c
raw.c
route.c [IPV4]: rt_cache_stat can be statically defined 2006-01-17 02:54:36 -08:00
syncookies.c
sysctl_net_ipv4.c
tcp.c
tcp_bic.c
tcp_cong.c
tcp_cubic.c
tcp_diag.c
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_input.c
tcp_ipv4.c
tcp_minisocks.c
tcp_output.c
tcp_scalable.c
tcp_timer.c
tcp_vegas.c
tcp_westwood.c
udp.c
xfrm4_input.c
xfrm4_output.c
xfrm4_policy.c
xfrm4_state.c [XFRM]: IPsec tunnel wildcard address support 2006-01-13 14:34:36 -08:00
xfrm4_tunnel.c