WSL2-Linux-Kernel/net
Willem de Bruijn f2b3ee9e42 ipv6: Fix ip_gre lockless xmits.
Tunnel devices set NETIF_F_LLTX to bypass HARD_TX_LOCK.  Sit and
ipip set this unconditionally in ops->setup, but gre enables it
conditionally after parameter passing in ops->newlink. This is
not called during tunnel setup as below, however, so GRE tunnels are
still taking the lock.

modprobe ip_gre
ip tunnel add test0 mode gre remote 10.5.1.1 dev lo
ip link set test0 up
ip addr add 10.6.0.1 dev test0
 # cat /sys/class/net/test0/features
 # $DIR/test_tunnel_xmit 10 10.5.2.1
ip route add 10.5.2.0/24 dev test0
ip tunnel del test0

The newlink callback is only called in rtnl_netlink, and only if
the device is new, as it calls register_netdevice internally. Gre
tunnels are created at 'ip tunnel add' with ioctl SIOCADDTUNNEL,
which calls ipgre_tunnel_locate, which calls register_netdev.
rtnl_newlink is called at 'ip link set', but skips ops->newlink
and the device is up with locking still enabled. The equivalent
ipip tunnel works fine, btw (just substitute 'method gre' for
'method ipip').

On kernels before /sys/class/net/*/features was removed [1],
the first commented out line returns 0x6000 with method gre,
which indicates that NETIF_F_LLTX (0x1000) is not set. With ipip,
it reports 0x7000. This test cannot be used on recent kernels where
the sysfs file is removed (and ETHTOOL_GFEATURES does not currently
work for tunnel devices, because they lack dev->ethtool_ops).

The second commented out line calls a simple transmission test [2]
that sends on 24 cores at maximum rate. Results of a single run:

ipip:			19,372,306
gre before patch:	 4,839,753
gre after patch:	19,133,873

This patch replicates the condition check in ipgre_newlink to
ipgre_tunnel_locate. It works for me, both with oseq on and off.
This is the first time I looked at rtnetlink and iproute2 code,
though, so someone more knowledgeable should probably check the
patch. Thanks.

The tail of both functions is now identical, by the way. To avoid
code duplication, I'll be happy to rework this and merge the two.

[1] http://patchwork.ozlabs.org/patch/104610/
[2] http://kernel.googlecode.com/files/xmit_udp_parallel.c

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-26 16:34:08 -05:00
..
9p virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf 2012-01-12 15:44:42 +10:30
802
8021q
appletalk
atm
ax25 ax25: avoid overflows in ax25_setsockopt() 2011-12-28 14:08:08 -05:00
batman-adv
bluetooth bluetooth: hci: Fix type of "enable_hs" to bool. 2012-01-22 15:08:46 -05:00
bridge bridge: BH already disabled in br_fdb_cleanup() 2012-01-17 10:17:32 -05:00
caif caif: Remove bad WARN_ON in caif_dev 2012-01-17 10:46:55 -05:00
can
ceph libceph: remove useless return value for osd_client __send_request() 2012-01-10 08:57:03 -08:00
core netns: fix net_alloc_generic() 2012-01-26 13:36:19 -05:00
dcb
dccp inet_diag: Rename inet_diag_req into inet_diag_req_v2 2012-01-11 12:56:06 -08:00
decnet Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security 2012-01-14 18:36:33 -08:00
dns_resolver
dsa
econet
ethernet
ieee802154
ipv4 ipv6: Fix ip_gre lockless xmits. 2012-01-26 16:34:08 -05:00
ipv6 tcp: md5: using remote adress for md5 lookup in rst packet 2012-01-22 15:08:45 -05:00
ipx
irda irda: use msecs_to_jiffies() rather than manual calculation 2011-12-21 15:46:22 -05:00
iucv af_iucv: get rid of state IUCV_SEVERED 2011-12-20 14:05:03 -05:00
key
l2tp l2tp: l2tp_ip - fix possible oops on packet receive 2012-01-25 21:45:00 -05:00
lapb
llc llc: Fix race condition in llc_ui_recvmsg 2012-01-24 15:33:19 -05:00
mac80211 mac80211: fix work removal on deauth request 2012-01-18 14:38:06 -05:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-01-17 22:26:41 -08:00
netlabel net: reintroduce missing rcu_assign_pointer() calls 2012-01-12 12:26:56 -08:00
netlink Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security 2012-01-14 18:36:33 -08:00
netrom netrom: avoid overflows in nr_setsockopt() 2011-12-28 14:08:08 -05:00
nfc NFC: Export a new attribute nfcid1 in target info 2012-01-04 14:30:43 -05:00
openvswitch openvswitch: Fix multipart datapath dumps. 2012-01-17 23:56:19 -05:00
packet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2011-12-30 13:04:14 -05:00
phonet net: reintroduce missing rcu_assign_pointer() calls 2012-01-12 12:26:56 -08:00
rds rds: Make rds_sock_lock BH rather than IRQ safe. 2012-01-24 17:03:44 -05:00
rfkill Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2012-01-05 10:13:24 -05:00
rose
rxrpc net: fix assignment of 0/1 to bool variables. 2011-12-19 22:27:29 -05:00
sched netem: Fix off-by-one bug in reordering 2012-01-22 15:08:44 -05:00
sctp Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
sunrpc Merge branch 'for-3.3' of git://linux-nfs.org/~bfields/linux 2012-01-14 12:26:41 -08:00
tipc tipc: rename struct bearer_name to struct tipc_bearer_names 2011-12-29 21:53:30 -05:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-01-09 14:46:52 -08:00
wanrouter
wimax
wireless nl80211: fix old station flags compatibility 2012-01-11 15:14:50 -05:00
x25
xfrm Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security 2012-01-14 18:36:33 -08:00
Kconfig
Makefile
compat.c
nonet.c
socket.c net: reintroduce missing rcu_assign_pointer() calls 2012-01-12 12:26:56 -08:00
sysctl_net.c