Граф коммитов

31419 Коммитов

Автор SHA1 Сообщение Дата
FX Le Bail d94c1f92bb ipv6: icmp6_send: fix Oops when pinging a not set up IPv6 peer on a sit tunnel
The patch 446fab5933 ("ipv6: enable anycast addresses
as source addresses in ICMPv6 error messages") causes an Oops when pinging a not
set up IPv6 peer on a sit tunnel.

The problem is that ipv6_anycast_destination() uses unconditionally skb_dst(skb),
which is NULL in this case.

The solution is to use instead the ipv6_chk_acast_addr_src() function.

Here are the steps to reproduce it:
modprobe sit
ip link add sit1 type sit remote 10.16.0.121 local 10.16.0.249
ip l s sit1 up
ip -6 a a dev sit1 2001🔢:123 remote 2001🔢:121
ping6 2001🔢:121

Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 18:12:40 -08:00
Rashika Kheria e1d83ee673 net: Mark functions as static in net/sunrpc/svc_xprt.c
Mark functions as static in net/sunrpc/svc_xprt.c because they are not
used outside this file.

This eliminates the following warning in net/sunrpc/svc_xprt.c:
net/sunrpc/svc_xprt.c:574:5: warning: no previous prototype for ‘svc_alloc_arg’ [-Wmissing-prototypes]
net/sunrpc/svc_xprt.c:615:18: warning: no previous prototype for ‘svc_get_next_xprt’ [-Wmissing-prototypes]
net/sunrpc/svc_xprt.c:694:6: warning: no previous prototype for ‘svc_add_new_temp_xprt’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:50 -08:00
Rashika Kheria bd76ed36ba net: Include appropriate header file in netfilter/nft_lookup.c
Include appropriate header file net/netfilter/nf_tables_core.h in
net/netfilter/nft_lookup.c because it has prototype declaration of
functions defined in net/netfilter/nft_lookup.c.

This eliminates the following warning in net/netfilter/nft_lookup.c:
net/netfilter/nft_lookup.c:133:12: warning: no previous prototype for ‘nft_lookup_module_init’ [-Wmissing-prototypes]
net/netfilter/nft_lookup.c:138:6: warning: no previous prototype for ‘nft_lookup_module_exit’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:50 -08:00
Rashika Kheria 535d3ae9c8 net: Move prototype declaration to header file include/net/net_namespace.h from net/ipx/af_ipx.c
Move prototype declaration of function to header file
include/net/net_namespace.h from net/ipx/af_ipx.c because they are used
by more than one file.

This eliminates the following warning in net/ipx/sysctl_net_ipx.c:
net/ipx/sysctl_net_ipx.c:33:6: warning: no previous prototype for ‘ipx_register_sysctl’ [-Wmissing-prototypes]
net/ipx/sysctl_net_ipx.c:38:6: warning: no previous prototype for ‘ipx_unregister_sysctl’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:50 -08:00
Rashika Kheria 7780d8ae4a net: Move prototype declaration to header file include/net/datalink.h from net/ipx/af_ipx.c
Move prototype declarations of function to header file
include/net/datalink.h from net/ipx/af_ipx.c because they are used by
more than one file.

This eliminates the following warning in net/ipx/pe2.c:
net/ipx/pe2.c:20:24: warning: no previous prototype for ‘make_EII_client’ [-Wmissing-prototypes]
net/ipx/pe2.c:32:6: warning: no previous prototype for ‘destroy_EII_client’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:50 -08:00
Rashika Kheria 578efbc19f net: Move prototype declaration to header file include/net/ipx.h from net/ipx/af_ipx.c
Move prototype declaration of functions to header file include/net/ipx.h
from net/ipx/af_ipx.c because they are used by more than one file.

This eliminates the following warning in
net/ipx/ipx_route.c:33:19: warning: no previous prototype for ‘ipxrtr_lookup’ [-Wmissing-prototypes]
net/ipx/ipx_route.c:52:5: warning: no previous prototype for ‘ipxrtr_add_route’ [-Wmissing-prototypes]
net/ipx/ipx_route.c:94:6: warning: no previous prototype for ‘ipxrtr_del_routes’ [-Wmissing-prototypes]
net/ipx/ipx_route.c:149:5: warning: no previous prototype for ‘ipxrtr_route_skb’ [-Wmissing-prototypes]
net/ipx/ipx_route.c:171:5: warning: no previous prototype for ‘ipxrtr_route_packet’ [-Wmissing-prototypes]
net/ipx/ipx_route.c:261:5: warning: no previous prototype for ‘ipxrtr_ioctl’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:50 -08:00
Rashika Kheria 493cc5e5ba net: Move prototype declaration to include/net/ipx.h from net/ipx/ipx_route.c
Move prototype definition of function to header file include/net/ipx.h
from net/ipx/ipx_route.c because they are used by more than one file.

This eliminates the following warning from net/ipx/af_ipx.c:
net/ipx/af_ipx.c:193:23: warning: no previous prototype for ‘ipxitf_find_using_net’ [-Wmissing-prototypes]
net/ipx/af_ipx.c:577:5: warning: no previous prototype for ‘ipxitf_send’ [-Wmissing-prototypes]
net/ipx/af_ipx.c:1219:8: warning: no previous prototype for ‘ipx_cksum’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:49 -08:00
Rashika Kheria ab3301bd96 net: Move prototype declaration to header file include/net/dn.h from net/decnet/af_decnet.c
Move prototype declaration of functions to header file include/net/dn.h
from net/decnet/af_decnet.c because they are used by more than one file.

This eliminates the following warning in net/decnet/af_decnet.c:
net/decnet/sysctl_net_decnet.c:354:6: warning: no previous prototype for ‘dn_register_sysctl’ [-Wmissing-prototypes]
net/decnet/sysctl_net_decnet.c:359:6: warning: no previous prototype for ‘dn_unregister_sysctl’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:49 -08:00
Rashika Kheria f56b8bf6e4 net: Move prototype declaration to appropriate header file from decnet/af_decnet.c
Move prototype declaration of functions to header file include/net/dn_route.h
from net/decnet/af_decnet.c because it is used by more than one file.

This eliminates the following warning in net/decnet/dn_route.c:
net/decnet/dn_route.c:629:5: warning: no previous prototype for ‘dn_route_rcv’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:49 -08:00
Rashika Kheria 0a59f3a9fd net: Mark functions as static in core/dev.c
Mark functions as static in core/dev.c because they are not used outside
this file.

This eliminates the following warning in core/dev.c:
net/core/dev.c:2806:5: warning: no previous prototype for ‘__dev_queue_xmit’ [-Wmissing-prototypes]
net/core/dev.c:4640:5: warning: no previous prototype for ‘netdev_adjacent_sysfs_add’ [-Wmissing-prototypes]
net/core/dev.c:4650:6: warning: no previous prototype for ‘netdev_adjacent_sysfs_del’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:49 -08:00
Rashika Kheria 02fe72c9ed net: Include appropriate header file in caif/cfsrvl.c
Include appropriate header file net/caif/caif_dev.h in caif/cfsrvl.c
because it has prototype declaration of functions defined in
caif/cfsrvl.c.

This eliminates the following warning in caif/cfsrvl.c:
net/caif/cfsrvl.c:198:6: warning: no previous prototype for ‘caif_free_client’ [-Wmissing-prototypes]
net/caif/cfsrvl.c:208:6: warning: no previous prototype for ‘caif_client_register_refcnt’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:49 -08:00
Rashika Kheria 8203274e15 net: Include appropriate header file in caif/caif_dev.c
Include appropriate header file net/caif/caif_dev.h in caif/caif_dev.c
because it has prototype declarations of function defined in
caif/caif_dev.c.

This eliminates the following file in caif/caif_dev.c:
net/caif/caif_dev.c:303:6: warning: no previous prototype for ‘caif_enroll_dev’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:49 -08:00
Rashika Kheria fd944bded5 net: Mark function as static in 9p/client.c
Mark function as static in net/9p/client.c because it is not used
outside this file.

This eliminates the following warning in net/9p/client.c:
net/9p/client.c:207:18: warning: no previous prototype for ‘p9_fcall_alloc’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:49 -08:00
David S. Miller 872c7e6fd2 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
Please pull this batch of fixes intended for the 3.14 stream!

For the mac80211 bits, Johannes says:

"This is just a collection of small fixes, the commit logs explain the
details. The only thing that isn't strictly a fix is the 5/10 MHz
enabling, I had forgotten this and there's little point in waiting
longer. The patch simply removes the force-disable code that I put in
when there was a problem with the userspace API (that has long been
fixed.)"

For the iwlwifi bits, Emmanuel says:

"I have an important fix that disables A band in case the driver thought
it was enabled, and the firmware disagreed. We ended up making the
firmware unhappy. I also fix the station table in AP mode and fix the
scan while we have BT working.
Johannes removes a static variable that could potentially lead to to
issues on multi-device setups and disables scheduled scan to avoid
issues with old versions of wpa_supplicant.
A small fix from David on scan and a few new device IDs for 7265."

On top of that...

Oleksij Rempel adds a USB ID to the ar5523 driver and changes the
default powersave setting for ath9k_htc to "off", due to observed
stability issues (based on an equivalent ath9k patch).

Stanislaw Gruszka similarly disables powersave for a couple of rt2x00
drivers.  He also fixes a couple of scheduling while atomic issues
in ath9k_htc.

Sujith Manoharan rounds-out the powersave disables with one for ath9k.
He also fixes a build prolem with ath9k on ARM and fixes an ath9k Tx
power calculation.

Finally, Andrea Merello fixes a couple of lingering DMA mapping
problems in the rtl8180 driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 14:21:25 -08:00
David S. Miller f41f031960 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/nftables/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes, mostly nftables
fixes, most relevantly they are:

* Fix a crash in the h323 conntrack NAT helper due to expectation list
  corruption, from Alexey Dobriyan.

* A couple of RCU race fixes for conntrack, one manifests by hitting BUG_ON
  in nf_nat_setup_info() and the destroy path, patches from Andrey Vagin and
  me.

* Dump direction attribute in nft_ct only if it is set, from Arturo
  Borrero.

* Fix IPVS bug in its own connection tracking system that may lead to
  copying only 4 bytes of the IPv6 address when initializing the
  ip_vs_conn object, from Michal Kubecek.

* Fix -EBUSY errors in nftables when deleting the rules, chain and tables
  in a row due mixture of asynchronous and synchronous object releasing,
  from me.

* Three fixes for the nf_tables set infrastructure when using intervals and
  mappings, from me.

* Four patches to fixing the nf_tables log, reject and ct expressions from
  the new inet table, from Patrick McHardy.

* Fix memory overrun in the map that is used to dynamically allocate names
  from anonymous sets, also from Patrick.

* Fix a potential oops if you dump a set with NFPROTO_UNSPEC and a table
  name, from Patrick McHardy.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 14:20:00 -08:00
John W. Linville 0f96b860bc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2014-02-07 13:44:14 -05:00
Patrick McHardy 6d8c00d58e netfilter: nf_tables: unininline nft_trace_packet()
It makes no sense to inline a rarely used function meant for debugging
only that is called a total of five times in the main evaluation loop.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-07 17:50:27 +01:00
Pablo Neira Ayuso 62f9c8b40d netfilter: nf_tables: fix loop checking with end interval elements
Fix access to uninitialized data for end interval elements. The
element data part is uninitialized in interval end elements.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-07 17:21:45 +01:00
Pablo Neira Ayuso 2fb91ddbf8 netfilter: nft_rbtree: fix data handling of end interval elements
This patch fixes several things which related to the handling of
end interval elements:

* Chain use underflow with intervals and map: If you add a rule
  using intervals+map that introduces a loop, the error path of the
  rbtree set decrements the chain refcount for each side of the
  interval, leading to a chain use counter underflow.

* Don't copy the data part of the end interval element since, this
  area is uninitialized and this confuses the loop detection code.

* Don't allocate room for the data part of end interval elements
  since this is unused.

So, after this patch the idea is that end interval elements don't
have a data part.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
2014-02-07 14:22:06 +01:00
Pablo Neira Ayuso bd7fc645da netfilter: nf_tables: do not allow NFT_SET_ELEM_INTERVAL_END flag and data
This combination is not allowed since end interval elements cannot
contain data.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
2014-02-07 14:21:49 +01:00
Eric Dumazet 4a5ab4e224 tcp: remove 1ms offset in srtt computation
TCP pacing depends on an accurate srtt estimation.

Current srtt estimation is using jiffie resolution,
and has an artificial offset of at least 1 ms, which can produce
slowdowns when FQ/pacing is used, especially in DC world,
where typical rtt is below 1 ms.

We are planning a switch to usec resolution for linux-3.15,
but in the meantime, this patch removes the 1 ms offset.

All we need is to have tp->srtt minimal value of 1 to differentiate
the case of srtt being initialized or not, not 8.

The problematic behavior was observed on a 40Gbit testbed,
where 32 concurrent netperf were reaching 12Gbps of aggregate
speed, instead of line speed.

This patch also has the effect of reporting more accurate srtt and send
rates to iproute2 ss command as in :

$ ss -i dst cca2
Netid  State      Recv-Q Send-Q          Local Address:Port
Peer Address:Port
tcp    ESTAB      0      0                10.244.129.1:56984
10.244.129.2:12865
	 cubic wscale:6,6 rto:200 rtt:0.25/0.25 ato:40 mss:1448 cwnd:10 send
463.4Mbps rcv_rtt:1 rcv_space:29200
tcp    ESTAB      0      390960           10.244.129.1:60247
10.244.129.2:50204
	 cubic wscale:6,6 rto:200 rtt:0.875/0.75 mss:1448 cwnd:73 ssthresh:51
send 966.4Mbps unacked:73 retrans:0/121 rcv_space:29200

Reported-by: Vytautas Valancius <valas@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-06 21:28:06 -08:00
Cong Wang dbe173079a bridge: fix netconsole setup over bridge
Commit 93d8bf9fb8 ("bridge: cleanup netpoll code") introduced
a check in br_netpoll_enable(), but this check is incorrect for
br_netpoll_setup(). This patch moves the code after the check
into __br_netpoll_enable() and calls it in br_netpoll_setup().
For br_add_if(), the check is still needed.

Fixes: 93d8bf9fb8 ("bridge: cleanup netpoll code")
Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Tested-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-06 21:28:06 -08:00
Eric Dumazet ed98df3361 net: use __GFP_NORETRY for high order allocations
sock_alloc_send_pskb() & sk_page_frag_refill()
have a loop trying high order allocations to prepare
skb with low number of fragments as this increases performance.

Problem is that under memory pressure/fragmentation, this can
trigger OOM while the intent was only to try the high order
allocations, then fallback to order-0 allocations.

We had various reports from unexpected regressions.

According to David, setting __GFP_NORETRY should be fine,
as the asynchronous compaction is still enabled, and this
will prevent OOM from kicking as in :

CFSClientEventm invoked oom-killer: gfp_mask=0x42d0, order=3, oom_adj=0,
oom_score_adj=0, oom_score_badness=2 (enabled),memcg_scoring=disabled
CFSClientEventm

Call Trace:
 [<ffffffff8043766c>] dump_header+0xe1/0x23e
 [<ffffffff80437a02>] oom_kill_process+0x6a/0x323
 [<ffffffff80438443>] out_of_memory+0x4b3/0x50d
 [<ffffffff8043a4a6>] __alloc_pages_may_oom+0xa2/0xc7
 [<ffffffff80236f42>] __alloc_pages_nodemask+0x1002/0x17f0
 [<ffffffff8024bd23>] alloc_pages_current+0x103/0x2b0
 [<ffffffff8028567f>] sk_page_frag_refill+0x8f/0x160
 [<ffffffff80295fa0>] tcp_sendmsg+0x560/0xee0
 [<ffffffff802a5037>] inet_sendmsg+0x67/0x100
 [<ffffffff80283c9c>] __sock_sendmsg_nosec+0x6c/0x90
 [<ffffffff80283e85>] sock_sendmsg+0xc5/0xf0
 [<ffffffff802847b6>] __sys_sendmsg+0x136/0x430
 [<ffffffff80284ec8>] sys_sendmsg+0x88/0x110
 [<ffffffff80711472>] system_call_fastpath+0x16/0x1b
Out of Memory: Kill process 2856 (bash) score 9999 or sacrifice child

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-06 21:28:06 -08:00
Sabrina Dubroca 00fe11b3c6 netpoll: fix netconsole IPv6 setup
Currently, to make netconsole start over IPv6, the source address
needs to be specified. Without a source address, netpoll_parse_options
assumes we're setting up over IPv4 and the destination IPv6 address is
rejected.

Check if the IP version has been forced by a source address before
checking for a version mismatch when parsing the destination address.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-06 21:28:06 -08:00
Matija Glavinic Pecotic 661dbf3412 net: sctp: fix initialization of local source address on accepted ipv6 sockets
commit 	efe4208f47f907b86f528788da711e8ab9dea44d:
'ipv6: make lookups simpler and faster' broke initialization of local source
address on accepted ipv6 sockets. Before the mentioned commit receive address
was copied along with the contents of ipv6_pinfo in sctp_v6_create_accept_sk.
Now when it is moved, it has to be copied separately.

This also fixes lksctp's ipv6 regression in a sense that test_getname_v6, TC5 -
'getsockname on a connected server socket' now passes.

Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-06 21:18:06 -08:00
Geert Uytterhoeven 63b5f152eb ipv4: Fix runtime WARNING in rtmsg_ifa()
On m68k/ARAnyM:

WARNING: CPU: 0 PID: 407 at net/ipv4/devinet.c:1599 0x316a99()
Modules linked in:
CPU: 0 PID: 407 Comm: ifconfig Not tainted
3.13.0-atari-09263-g0c71d68014d1 #1378
Stack from 10c4fdf0:
        10c4fdf0 002ffabb 000243e8 00000000 008ced6c 00024416 00316a99 0000063f
        00316a99 00000009 00000000 002501b4 00316a99 0000063f c0a86117 00000080
        c0a86117 00ad0c90 00250a5a 00000014 00ad0c90 00000000 00000000 00000001
        00b02dd0 00356594 00000000 00356594 c0a86117 eff6c9e4 008ced6c 00000002
        008ced60 0024f9b4 00250b52 00ad0c90 00000000 00000000 00252390 00ad0c90
        eff6c9e4 0000004f 00000000 00000000 eff6c9e4 8000e25c eff6c9e4 80001020
Call Trace: [<000243e8>] warn_slowpath_common+0x52/0x6c
 [<00024416>] warn_slowpath_null+0x14/0x1a
 [<002501b4>] rtmsg_ifa+0xdc/0xf0
 [<00250a5a>] __inet_insert_ifa+0xd6/0x1c2
 [<0024f9b4>] inet_abc_len+0x0/0x42
 [<00250b52>] inet_insert_ifa+0xc/0x12
 [<00252390>] devinet_ioctl+0x2ae/0x5d6

Adding some debugging code reveals that net_fill_ifaddr() fails in

    put_cacheinfo(skb, ifa->ifa_cstamp, ifa->ifa_tstamp,
                              preferred, valid))

nla_put complains:

    lib/nlattr.c:454: skb_tailroom(skb) = 12, nla_total_size(attrlen) = 20

Apparently commit 5c766d642b ("ipv4:
introduce address lifetime") forgot to take into account the addition of
struct ifa_cacheinfo in inet_nlmsg_size(). Hence add it, like is already
done for ipv6.

Suggested-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-06 20:02:15 -08:00
Pablo Neira Ayuso 0165d9325d netfilter: nf_tables: fix racy rule deletion
We may lost race if we flush the rule-set (which happens asynchronously
via call_rcu) and we try to remove the table (that userspace assumes
to be empty).

Fix this by recovering synchronous rule and chain deletion. This was
introduced time ago before we had no batch support, and synchronous
rule deletion performance was not good. Now that we have the batch
support, we can just postpone the purge of old rule in a second step
in the commit phase. All object deletions are synchronous after this
patch.

As a side effect, we save memory as we don't need rcu_head per rule
anymore.

Cc: Patrick McHardy <kaber@trash.net>
Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 11:46:06 +01:00
Patrick McHardy b8ecbee67c netfilter: nf_tables: fix log/queue expressions for NFPROTO_INET
The log and queue expressions both store the family during ->init() and
use it to deliver packets. This is wrong when used in NFPROTO_INET since
they should both deliver to the actual AF of the packet, not the dummy
NFPROTO_INET.

Use the family from the hook ops to fix this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 11:41:38 +01:00
Johannes Berg fab57a6cc2 mac80211: fix virtual monitor interface iteration
During channel context assignment, the interface should
be found by interface iteration, so we need to assign the
pointer before the channel context.

Reported-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Tested-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-06 09:55:22 +01:00
Johannes Berg 338f977f4e mac80211: fix fragmentation code, particularly for encryption
The "new" fragmentation code (since my rewrite almost 5 years ago)
erroneously sets skb->len rather than using skb_trim() to adjust
the length of the first fragment after copying out all the others.
This leaves the skb tail pointer pointing to after where the data
originally ended, and thus causes the encryption MIC to be written
at that point, rather than where it belongs: immediately after the
data.

The impact of this is that if software encryption is done, then
 a) encryption doesn't work for the first fragment, the connection
    becomes unusable as the first fragment will never be properly
    verified at the receiver, the MIC is practically guaranteed to
    be wrong
 b) we leak up to 8 bytes of plaintext (!) of the packet out into
    the air

This is only mitigated by the fact that many devices are capable
of doing encryption in hardware, in which case this can't happen
as the tail pointer is irrelevant in that case. Additionally,
fragmentation is not used very frequently and would normally have
to be configured manually.

Fix this by using skb_trim() properly.

Cc: stable@vger.kernel.org
Fixes: 2de8e0d999 ("mac80211: rewrite fragmentation")
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-06 09:55:21 +01:00
Sujith Manoharan d4c80d9df6 mac80211: Fix IBSS disconnect
Currently, when a station leaves an IBSS network, the
corresponding BSS is not dropped from cfg80211 if there are
other active stations in the network. But, the small
window that is present when trying to determine a station's
status based on IEEE80211_IBSS_MERGE_INTERVAL introduces
a race.

Instead of trying to keep the BSS, always remove it when
leaving an IBSS network. There is not much benefit to retain
the BSS entry since it will be added with a subsequent join
operation.

This fixes an issue where a dangling BSS entry causes ath9k
to wait for a beacon indefinitely.

Cc: <stable@vger.kernel.org>
Reported-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-06 09:55:20 +01:00
Emmanuel Grumbach 0297ea17bf mac80211: release the channel in error path in start_ap
When the driver cannot start the AP or when the assignement
of the beacon goes wrong, we need to unassign the vif.

Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-06 09:55:20 +01:00
Johannes Berg f9d15d162b cfg80211: send scan results from work queue
Due to the previous commit, when a scan finishes, it is in theory
possible to hit the following sequence:
 1. interface starts being removed
 2. scan is cancelled by driver and cfg80211 is notified
 3. scan done work is scheduled
 4. interface is removed completely, rdev->scan_req is freed,
    event sent to userspace but scan done work remains pending
 5. new scan is requested on another virtual interface
 6. scan done work runs, freeing the still-running scan

To fix this situation, hang on to the scan done message and block
new scans while that is the case, and only send the message from
the work function, regardless of whether the scan_req is already
freed from interface removal. This makes step 5 above impossible
and changes step 6 to be
 5. scan done work runs, sending the scan done message

As this can't work for wext, so we send the message immediately,
but this shouldn't be an issue since we still return -EBUSY.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-06 09:55:19 +01:00
Johannes Berg a617302c53 cfg80211: fix scan done race
When an interface/wdev is removed, any ongoing scan should be
cancelled by the driver. This will make it call cfg80211, which
only queues a work struct. If interface/wdev removal is quick
enough, this can leave the scan request pending and processed
only after the interface is gone, causing a use-after-free.

Fix this by making sure the scan request is not pending after
the interface is destroyed. We can't flush or cancel the work
item due to locking concerns, but when it'll run it shouldn't
find anything to do. This leaves a potential issue, if a new
scan gets requested before the work runs, it prematurely stops
the running scan, potentially causing another crash. I'll fix
that in the next patch.

This was particularly observed with P2P_DEVICE wdevs, likely
because freeing them is quicker than freeing netdevs.

Reported-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Fixes: 4a58e7c384 ("cfg80211: don't "leak" uncompleted scans")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-06 09:55:19 +01:00
Emmanuel Grumbach 8ffcc704c9 mac80211: avoid deadlock revealed by lockdep
sdata->u.ap.request_smps_work can’t be flushed synchronously
under wdev_lock(wdev) since ieee80211_request_smps_ap_work
itself locks the same lock.
While at it, reset the driver_smps_mode when the ap is
stopped to its default: OFF.

This solves:

======================================================
[ INFO: possible circular locking dependency detected ]
3.12.0-ipeer+ #2 Tainted: G           O
-------------------------------------------------------
rmmod/2867 is trying to acquire lock:
  ((&sdata->u.ap.request_smps_work)){+.+...}, at: [<c105b8d0>] flush_work+0x0/0x90

but task is already holding lock:
  (&wdev->mtx){+.+.+.}, at: [<f9b32626>] cfg80211_stop_ap+0x26/0x230 [cfg80211]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&wdev->mtx){+.+.+.}:
        [<c10aefa9>] lock_acquire+0x79/0xe0
        [<c1607a1a>] mutex_lock_nested+0x4a/0x360
        [<fb06288b>] ieee80211_request_smps_ap_work+0x2b/0x50 [mac80211]
        [<c105cdd8>] process_one_work+0x198/0x450
        [<c105d469>] worker_thread+0xf9/0x320
        [<c10669ff>] kthread+0x9f/0xb0
        [<c1613397>] ret_from_kernel_thread+0x1b/0x28

-> #0 ((&sdata->u.ap.request_smps_work)){+.+...}:
        [<c10ae9df>] __lock_acquire+0x183f/0x1910
        [<c10aefa9>] lock_acquire+0x79/0xe0
        [<c105b917>] flush_work+0x47/0x90
        [<c105d867>] __cancel_work_timer+0x67/0xe0
        [<c105d90f>] cancel_work_sync+0xf/0x20
        [<fb0765cc>] ieee80211_stop_ap+0x8c/0x340 [mac80211]
        [<f9b3268c>] cfg80211_stop_ap+0x8c/0x230 [cfg80211]
        [<f9b0d8f9>] cfg80211_leave+0x79/0x100 [cfg80211]
        [<f9b0da72>] cfg80211_netdev_notifier_call+0xf2/0x4f0 [cfg80211]
        [<c160f2c9>] notifier_call_chain+0x59/0x130
        [<c106c6de>] __raw_notifier_call_chain+0x1e/0x30
        [<c106c70f>] raw_notifier_call_chain+0x1f/0x30
        [<c14f8213>] call_netdevice_notifiers_info+0x33/0x70
        [<c14f8263>] call_netdevice_notifiers+0x13/0x20
        [<c14f82a4>] __dev_close_many+0x34/0xb0
        [<c14f83fe>] dev_close_many+0x6e/0xc0
        [<c14f9c77>] rollback_registered_many+0xa7/0x1f0
        [<c14f9dd4>] unregister_netdevice_many+0x14/0x60
        [<fb06f4d9>] ieee80211_remove_interfaces+0xe9/0x170 [mac80211]
        [<fb055116>] ieee80211_unregister_hw+0x56/0x110 [mac80211]
        [<fa3e9396>] iwl_op_mode_mvm_stop+0x26/0xe0 [iwlmvm]
        [<f9b9d8ca>] _iwl_op_mode_stop+0x3a/0x70 [iwlwifi]
        [<f9b9d96f>] iwl_opmode_deregister+0x6f/0x90 [iwlwifi]
        [<fa405179>] __exit_compat+0xd/0x19 [iwlmvm]
        [<c10b8bf9>] SyS_delete_module+0x179/0x2b0
        [<c1613421>] sysenter_do_call+0x12/0x32

Fixes: 687da13223 ("mac80211: implement SMPS for AP")
Cc: <stable@vger.kernel.org> [3.13]
Reported-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-06 09:55:18 +01:00
Johannes Berg 5a6aa705ff cfg80211: re-enable 5/10 MHz support
Unfortunately I forgot this during the merge window, but the
patch seems small enough to go in as a fix. The userspace API
bug that was the reason for disabling it has long been fixed.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-06 09:55:18 +01:00
Pontus Fuchs f12cb28930 nl80211: Reset split_start when netlink skb is exhausted
When the netlink skb is exhausted split_start is left set. In the
subsequent retry, with a larger buffer, the dump is continued from the
failing point instead of from the beginning.

This was causing my rt28xx based USB dongle to now show up when
running "iw list" with an old iw version without split dump support.

Cc: stable@vger.kernel.org
Fixes: 3713b4e364 ("nl80211: allow splitting wiphy information in dumps")
Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
[avoid the entire workaround when state->split is set]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-06 09:55:17 +01:00
Eliad Peller 2f617435c3 mac80211: move roc cookie assignment earlier
ieee80211_start_roc_work() might add a new roc
to existing roc, and tell cfg80211 it has already
started.

However, this might happen before the roc cookie
was set, resulting in REMAIN_ON_CHANNEL (started)
event with null cookie. Consequently, it can make
wpa_supplicant go out of sync.

Fix it by setting the roc cookie earlier.

Cc: stable@vger.kernel.org
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-06 09:55:17 +01:00
Patrick McHardy 05513e9e33 netfilter: nf_tables: add reject module for NFPROTO_INET
Add a reject module for NFPROTO_INET. It does nothing but dispatch
to the AF-specific modules based on the hook family.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 09:44:18 +01:00
Patrick McHardy cc4723ca31 netfilter: nft_reject: split up reject module into IPv4 and IPv6 specifc parts
Currently the nft_reject module depends on symbols from ipv6. This is
wrong since no generic module should force IPv6 support to be loaded.
Split up the module into AF-specific and a generic part.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 09:44:10 +01:00
Patrick McHardy 64d46806b6 netfilter: nf_tables: add AF specific expression support
For the reject module, we need to add AF-specific implementations to
get rid of incorrect module dependencies. Try to load an AF-specific
module first and fall back to generic modules.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 00:05:36 +01:00
Patrick McHardy 51292c0735 netfilter: nft_ct: fix missing NFT_CT_L3PROTOCOL key in validity checks
The key was missing in the list of valid keys, add it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 00:05:33 +01:00
Patrick McHardy ec2c993568 netfilter: nf_tables: fix potential oops when dumping sets
Commit c9c8e48597 (netfilter: nf_tables: dump sets in all existing families)
changed nft_ctx_init_from_setattr() to only look up the address family if it
is not NFPROTO_UNSPEC. However if it is NFPROTO_UNSPEC and a table attribute
is given, nftables_afinfo_lookup() will dereference the NULL afi pointer.

Fix by checking for non-NULL afi and also move a check added by that commit
to the proper position.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 00:04:15 +01:00
Patrick McHardy 53b70287dd netfilter: nf_tables: fix overrun in nf_tables_set_alloc_name()
The map that is used to allocate anonymous sets is indeed
BITS_PER_BYTE * PAGE_SIZE long.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-05 17:46:07 +01:00
Pablo Neira Ayuso e53376bef2 netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt
With this patch, the conntrack refcount is initially set to zero and
it is bumped once it is added to any of the list, so we fulfill
Eric's golden rule which is that all released objects always have a
refcount that equals zero.

Andrey Vagin reports that nf_conntrack_free can't be called for a
conntrack with non-zero ref-counter, because it can race with
nf_conntrack_find_get().

A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero
ref-counter says that this conntrack is used. So when we release
a conntrack with non-zero counter, we break this assumption.

CPU1                                    CPU2
____nf_conntrack_find()
                                        nf_ct_put()
                                         destroy_conntrack()
                                        ...
                                        init_conntrack
                                         __nf_conntrack_alloc (set use = 1)
atomic_inc_not_zero(&ct->use) (use = 2)
                                         if (!l4proto->new(ct, skb, dataoff, timeouts))
                                          nf_conntrack_free(ct); (use = 2 !!!)
                                        ...
                                        __nf_conntrack_alloc (set use = 1)
 if (!nf_ct_key_equal(h, tuple, zone))
  nf_ct_put(ct); (use = 0)
   destroy_conntrack()
                                        /* continue to work with CT */

After applying the path "[PATCH] netfilter: nf_conntrack: fix RCU
race in nf_conntrack_find_get" another bug was triggered in
destroy_conntrack():

<4>[67096.759334] ------------[ cut here ]------------
<2>[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211!
...
<4>[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G         C ---------------    2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB
<4>[67096.759932] RIP: 0010:[<ffffffffa03d99ac>]  [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack]
<4>[67096.760255] Call Trace:
<4>[67096.760255]  [<ffffffff814844a7>] nf_conntrack_destroy+0x17/0x30
<4>[67096.760255]  [<ffffffffa03d9bb5>] nf_conntrack_find_get+0x85/0x130 [nf_conntrack]
<4>[67096.760255]  [<ffffffffa03d9fb2>] nf_conntrack_in+0x352/0xb60 [nf_conntrack]
<4>[67096.760255]  [<ffffffffa048c771>] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4]
<4>[67096.760255]  [<ffffffff81484419>] nf_iterate+0x69/0xb0
<4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255]  [<ffffffff814845d4>] nf_hook_slow+0x74/0x110
<4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255]  [<ffffffff814b66d5>] raw_sendmsg+0x775/0x910
<4>[67096.760255]  [<ffffffff8104c5a8>] ? flush_tlb_others_ipi+0x128/0x130
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff814c136a>] inet_sendmsg+0x4a/0xb0
<4>[67096.760255]  [<ffffffff81444e93>] ? sock_sendmsg+0x13/0x140
<4>[67096.760255]  [<ffffffff81444f97>] sock_sendmsg+0x117/0x140
<4>[67096.760255]  [<ffffffff8102e299>] ? native_smp_send_reschedule+0x49/0x60
<4>[67096.760255]  [<ffffffff81519beb>] ? _spin_unlock_bh+0x1b/0x20
<4>[67096.760255]  [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40
<4>[67096.760255]  [<ffffffff814960f0>] ? do_ip_setsockopt+0x90/0xd80
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff814457c9>] sys_sendto+0x139/0x190
<4>[67096.760255]  [<ffffffff810efa77>] ? audit_syscall_entry+0x1d7/0x200
<4>[67096.760255]  [<ffffffff810ef7c5>] ? __audit_syscall_exit+0x265/0x290
<4>[67096.760255]  [<ffffffff81474daf>] compat_sys_socketcall+0x13f/0x210
<4>[67096.760255]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5

I have reused the original title for the RFC patch that Andrey posted and
most of the original patch description.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Andrew Vagin <avagin@parallels.com>
Cc: Florian Westphal <fw@strlen.de>
Reported-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-02-05 17:46:06 +01:00
Alexey Dobriyan 829d9315c4 netfilter: nf_nat_h323: fix crash in nf_ct_unlink_expect_report()
Similar bug fixed in SIP module in 3f509c6 ("netfilter: nf_nat_sip: fix
incorrect handling of EBUSY for RTCP expectation").

BUG: unable to handle kernel paging request at 00100104
IP: [<f8214f07>] nf_ct_unlink_expect_report+0x57/0xf0 [nf_conntrack]
...
Call Trace:
  [<c0244bd8>] ? del_timer+0x48/0x70
  [<f8215687>] nf_ct_remove_expectations+0x47/0x60 [nf_conntrack]
  [<f8211c99>] nf_ct_delete_from_lists+0x59/0x90 [nf_conntrack]
  [<f8212e5e>] death_by_timeout+0x14e/0x1c0 [nf_conntrack]
  [<f8212d10>] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack]
  [<c024442d>] call_timer_fn+0x1d/0x80
  [<c024461e>] run_timer_softirq+0x18e/0x1a0
  [<f8212d10>] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack]
  [<c023e6f3>] __do_softirq+0xa3/0x170
  [<c023e650>] ? __local_bh_enable+0x70/0x70
  <IRQ>
  [<c023e587>] ? irq_exit+0x67/0xa0
  [<c0202af6>] ? do_IRQ+0x46/0xb0
  [<c027ad05>] ? clockevents_notify+0x35/0x110
  [<c066ac6c>] ? common_interrupt+0x2c/0x40
  [<c056e3c1>] ? cpuidle_enter_state+0x41/0xf0
  [<c056e6fb>] ? cpuidle_idle_call+0x8b/0x100
  [<c02085f8>] ? arch_cpu_idle+0x8/0x30
  [<c027314b>] ? cpu_idle_loop+0x4b/0x140
  [<c0273258>] ? cpu_startup_entry+0x18/0x20
  [<c066056d>] ? rest_init+0x5d/0x70
  [<c0813ac8>] ? start_kernel+0x2ec/0x2f2
  [<c081364f>] ? repair_env_string+0x5b/0x5b
  [<c0813269>] ? i386_start_kernel+0x33/0x35

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-05 17:46:05 +01:00
Andrey Vagin c6825c0976 netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
Lets look at destroy_conntrack:

hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);

net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but in this moment a few readers
still can use the conntrack. Then this conntrack is released and another
thread creates conntrack with the same address and the equal tuple.
After this a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.

But this conntrack is not initialized yet, so it can not be used by two
threads concurrently. In this case BUG_ON may be triggered from
nf_nat_setup_info().

Florian Westphal suggested to check the confirm bit too. I think it's
right.

task 1			task 2			task 3
			nf_conntrack_find_get
			 ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
						__nf_conntrack_alloc
						 kmem_cache_alloc
						 memset(&ct->tuplehash[IP_CT_DIR_MAX],
			 if (nf_ct_is_dying(ct))
			 if (!nf_ct_tuple_equal()

I'm not sure, that I have ever seen this race condition in a real life.
Currently we are investigating a bug, which is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently,
we don't have any other explanation for this.

<2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
<4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>]  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
<4>[46267.085549] Call Trace:
<4>[46267.085622]  [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
<4>[46267.085697]  [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
<4>[46267.085770]  [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
<4>[46267.085843]  [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
<4>[46267.085919]  [<ffffffff814841b9>] nf_iterate+0x69/0xb0
<4>[46267.085991]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086063]  [<ffffffff81484374>] nf_hook_slow+0x74/0x110
<4>[46267.086133]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086207]  [<ffffffff814b5890>] ? dst_output+0x0/0x20
<4>[46267.086277]  [<ffffffff81495204>] ip_output+0xa4/0xc0
<4>[46267.086346]  [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
<4>[46267.086419]  [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
<4>[46267.086491]  [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
<4>[46267.086562]  [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
<4>[46267.086638]  [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
<4>[46267.086712]  [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
<4>[46267.086785]  [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
<4>[46267.086858]  [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
<4>[46267.086936]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087006]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087081]  [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
<4>[46267.087151]  [<ffffffff81445599>] sys_sendto+0x139/0x190
<4>[46267.087229]  [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
<4>[46267.087303]  [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
<4>[46267.087378]  [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
<4>[46267.087454]  [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
<4>[46267.087531]  [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
<4>[46267.087607]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
<4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
<1>[46267.088023] RIP  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-05 13:16:18 +01:00
Patrick McHardy 3dd7279fb6 netfilter: nf_tables: fix oops when deleting a chain with references
The following commands trigger an oops:

 # nft -i
 nft> add table filter
 nft> add chain filter input { type filter hook input priority 0; }
 nft> add chain filter test
 nft> add rule filter input jump test
 nft> delete chain filter test

We need to check the chain use counter before allowing destruction since
we might have references from sets or jump rules.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=69341
Reported-by: Matthew Ife <deleriux1@gmail.com>
Tested-by: Matthew Ife <deleriux1@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-05 13:16:17 +01:00
Arturo Borrero 2a53bfb3e0 netfilter: nft_ct: fix unconditional dump of 'dir' attr
We want to make sure that the information that we get from the kernel can
be reinjected without troubles. The kernel shouldn't return an attribute
that is not required, or even prohibited.

Dumping unconditionally NFTA_CT_DIRECTION could lead an application in
userspace to interpret that the attribute was originally set, while it
was not.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-05 13:16:17 +01:00
Andy Zhou c14e0953ca openvswitch: Suppress error messages on megaflow updates
With subfacets, we'd expect megaflow updates message to carry
the original micro flow. If not, EINVAL is returned and kernel
logs an error message.  Now that the user space subfacet layer is
removed, it is expected that flow updates can arrive with a
micro flow other than the original. Change the return code to
EEXIST and remove the kernel error log message.

Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-02-04 22:32:38 -08:00