smp_mb() inside tg3_tx_avail() is used twice in the normal tg3_start_xmit() path (see illustration below). The full memory barrier is only necessary during race conditions with tx completion. We can speed up the tx path by replacing smp_mb() in tg3_tx_avail() with a compiler barrier. The compiler barrier is to force the compiler to fetch the tx_prod and tx_cons from memory. In the race condition between tg3_start_xmit() and tg3_tx(), we have the following situation: tg3_start_xmit() tg3_tx() if (!tg3_tx_avail()) BUG(); ... if (!tg3_tx_avail()) netif_tx_stop_queue(); update_tx_index(); smp_mb(); smp_mb(); if (tg3_tx_avail()) if (netif_tx_queue_stopped() && netif_tx_wake_queue(); tg3_tx_avail()) With smp_mb() removed from tg3_tx_avail(), we need to add smp_mb() to tg3_start_xmit() as shown above to properly order netif_tx_stop_queue() and tg3_tx_avail() to check the ring index. If it is not strictly ordered, the tx queue can be stopped forever. This improves performance by about 3% with 2 ports running bi-directional 64-byte packets. Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
---|---|---|
.. | ||
accessibility | ||
acpi | ||
amba | ||
ata | ||
atm | ||
auxdisplay | ||
base | ||
block | ||
bluetooth | ||
cdrom | ||
char | ||
clocksource | ||
connector | ||
cpufreq | ||
cpuidle | ||
crypto | ||
dca | ||
dio | ||
dma | ||
edac | ||
eisa | ||
firewire | ||
firmware | ||
gpio | ||
gpu | ||
hid | ||
hwmon | ||
i2c | ||
ide | ||
idle | ||
ieee1394 | ||
ieee802154 | ||
infiniband | ||
input | ||
isdn | ||
leds | ||
lguest | ||
macintosh | ||
mca | ||
md | ||
media | ||
memstick | ||
message | ||
mfd | ||
misc | ||
mmc | ||
mtd | ||
net | ||
nubus | ||
of | ||
oprofile | ||
parisc | ||
parport | ||
pci | ||
pcmcia | ||
platform | ||
pnp | ||
power | ||
pps | ||
ps3 | ||
rapidio | ||
regulator | ||
rtc | ||
s390 | ||
sbus | ||
scsi | ||
serial | ||
sfi | ||
sh | ||
sn | ||
spi | ||
ssb | ||
staging | ||
tc | ||
telephony | ||
thermal | ||
uio | ||
usb | ||
uwb | ||
vhost | ||
video | ||
virtio | ||
vlynq | ||
w1 | ||
watchdog | ||
xen | ||
zorro | ||
Kconfig | ||
Makefile |