original_kernel/drivers
Matt Carlson f65aac166f tg3: Improve small packet performance
smp_mb() inside tg3_tx_avail() is used twice in the normal
tg3_start_xmit() path (see illustration below).  The full memory
barrier is only necessary during race conditions with tx completion.
We can speed up the tx path by replacing smp_mb() in tg3_tx_avail()
with a compiler barrier.  The compiler barrier is to force the
compiler to fetch the tx_prod and tx_cons from memory.

In the race condition between tg3_start_xmit() and tg3_tx(),
we have the following situation:

tg3_start_xmit()                       tg3_tx()
    if (!tg3_tx_avail())
        BUG();

    ...

    if (!tg3_tx_avail())
        netif_tx_stop_queue();         update_tx_index();
        smp_mb();                      smp_mb();
        if (tg3_tx_avail())            if (netif_tx_queue_stopped() &&
            netif_tx_wake_queue();         tg3_tx_avail())

With smp_mb() removed from tg3_tx_avail(), we need to add smp_mb() to
tg3_start_xmit() as shown above to properly order netif_tx_stop_queue()
and tg3_tx_avail() to check the ring index.  If it is not strictly
ordered, the tx queue can be stopped forever.

This improves performance by about 3% with 2 ports running
bi-directional 64-byte packets.

Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-02 15:46:31 -07:00
..
accessibility
acpi
amba
ata
atm
auxdisplay
base
block
bluetooth
cdrom
char
clocksource
connector
cpufreq
cpuidle
crypto
dca
dio
dma
edac
eisa
firewire
firmware
gpio
gpu
hid
hwmon
i2c
ide
idle
ieee1394
ieee802154
infiniband
input
isdn
leds
lguest
macintosh
mca
md
media
memstick
message
mfd
misc
mmc
mtd
net tg3: Improve small packet performance 2010-08-02 15:46:31 -07:00
nubus
of
oprofile
parisc
parport
pci
pcmcia
platform
pnp
power
pps
ps3
rapidio
regulator
rtc
s390
sbus
scsi
serial
sfi
sh
sn
spi
ssb
staging
tc
telephony
thermal
uio
usb
uwb
vhost
video
virtio
vlynq
w1
watchdog
xen
zorro
Kconfig
Makefile