All handler->err() routines expect that we've done a pskb_may_pull()
test to make sure that IP header length + 8 bytes can be safely
pulled.
Reported-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Can be used to match packets against netfilter ip sets created via ipset(8).
skb->sk_iif is used as 'incoming interface', skb->dev is 'outgoing interface'.
Since ipset is usually called from netfilter, the ematch
initializes a fake xt_action_param, pulls the ip header into the
linear area and also sets skb->data to the IP header (otherwise
matching Layer 4 set types doesn't work).
Tested-by: Mr Dash Four <mr.dash.four@googlemail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6lowpan module starts collecting incomming frames and fragments
right after lowpan_module_init() therefor it will be better to
clean unfinished fragments in lowpan_cleanup_module() function
instead of doing it when link goes down.
Changed spinlocks type to prevent deadlock with expired timer event
and removed unused one.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Function lowpan_alloc_new_frame() takes u8 tag as an argument. However,
its only caller, lowpan_process_data() passes down a u16. Hence,
the tag value can get corrupted. This prevent 6lowpan fragment reassembly of a
message when the fragment tag value is over 256.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make symbols static to avoid the following warning shown up
by sparse:
warning: symbol ... was not declared. Should it be static?
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_alloc_skb_ip_align() instead of alloc_skb() to get some
extra headroom in case we need to forward this frame in a tunnel or
something else.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add method to get the device short 802.15.4 address. This call
needed by ieee802154 layer to satisfy 'iz list' request from
the user space.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Revert the commit 768f7c7c12 to initialize
spinlock in the more preferable way and make it static to avoid sparse
warning.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the default name table size of 1024, it is possible that
the sanity check in tipc_nametbl_stop could spam out 1024
essentially identical error messages if memory was corrupted
or similar. Limit it to issuing no more than a single message.
The actual chain number (i.e. 0 --> 1023) wouldn't provide any
useful insight if/when such an instance happened, so don't
bother printing out that value.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
This is done to improve readability, and so that we can give
the struct a name that will allow us to declare a local
pointer to it in code, instead of having to always redirect
through the link struct to get to it.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
This sets things up so that we can have the protocol error handlers
call down into the ipv6 route code for redirects just as ipv4 already
does.
Signed-off-by: David S. Miller <davem@davemloft.net>
No longer needed, as the protocol handlers now all properly
propagate the redirect back into the routing code.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass in the SKB rather than just the IP addresses, so that policy
and other aspects can reside in ip_rt_redirect() rather then
icmp_redirect().
Signed-off-by: David S. Miller <davem@davemloft.net>
This introduce TSQ (TCP Small Queues)
TSQ goal is to reduce number of TCP packets in xmit queues (qdisc &
device queues), to reduce RTT and cwnd bias, part of the bufferbloat
problem.
sk->sk_wmem_alloc not allowed to grow above a given limit,
allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a
given time.
TSO packets are sized/capped to half the limit, so that we have two
TSO packets in flight, allowing better bandwidth use.
As a side effect, setting the limit to 40000 automatically reduces the
standard gso max limit (65536) to 40000/2 : It can help to reduce
latencies of high prio packets, having smaller TSO packets.
This means we divert sock_wfree() to a tcp_wfree() handler, to
queue/send following frames when skb_orphan() [2] is called for the
already queued skbs.
Results on my dev machines (tg3/ixgbe nics) are really impressive,
using standard pfifo_fast, and with or without TSO/GSO.
Without reduction of nominal bandwidth, we have reduction of buffering
per bulk sender :
< 1ms on Gbit (instead of 50ms with TSO)
< 8ms on 100Mbit (instead of 132 ms)
I no longer have 4 MBytes backlogged in qdisc by a single netperf
session, and both side socket autotuning no longer use 4 Mbytes.
As skb destructor cannot restart xmit itself ( as qdisc lock might be
taken at this point ), we delegate the work to a tasklet. We use one
tasklest per cpu for performance reasons.
If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag.
This flag is tested in a new protocol method called from release_sock(),
to eventually send new segments.
[1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
[2] skb_orphan() is usually called at TX completion time,
but some drivers call it in their start_xmit() handler.
These drivers should at least use BQL, or else a single TCP
session can still fill the whole NIC TX ring, since TSQ will
have no effect.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent patch "tcp: Maintain dynamic metrics in local cache." introduced
an out of bounds access due to what appears to be a typo. I believe this
change should resolve the issue by replacing the access to RTAX_CWND with
TCP_METRIC_CWND.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mld->mld_maxdelay is net endian, so we should use ntohs, not htons
CC: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
net/batman-adv/bridge_loop_avoidance.c
net/batman-adv/bridge_loop_avoidance.h
net/batman-adv/soft-interface.c
net/mac80211/mlme.c
With merge help from Antonio Quartulli (batman-adv) and
Stephen Rothwell (drivers/net/usb/qmi_wwan.c).
The net/mac80211/mlme.c conflict seemed easy enough, accounting for a
conversion to some new tracing macros.
Signed-off-by: David S. Miller <davem@davemloft.net>
In driver reload test there is a memory leak.
The structure vlan_info was not freed when the driver was removed.
It was not released since the nr_vids var is one after last vlan was removed.
The nr_vids is one, since vlan zero is added to the interface when the interface
is being set, but the vlan zero is not deleted at unregister.
Fix - delete vlan zero when we unregister the device.
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- fix a bug generated by the wrong interaction between the GW feature and the
Bridge Loop Avoidance
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAk/2EMQACgkQpGgxIkP9cweoqgCeNGrHU9HxBnKXSylNcqhQBzqr
9jMAni+gJX+lzmrA2j1w/rCaamuNpbJG
=mXZq
-----END PGP SIGNATURE-----
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
Included changes:
- fix a bug generated by the wrong interaction between the GW feature and the
Bridge Loop Avoidance
Fix incorrect start markers, wrapped summary lines, missing section
breaks, incorrect separators, and some name mismatches.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Defining a function with no parameters as 'T foo()' is the deprecated
K&R style, and is not strictly equivalent to defining it as 'T foo(void)'.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nothing every writes to ipv4 metrics any longer.
PMTU is stored in rt->rt_pmtu.
Dynamic TCP metrics are stored in a special TCP metrics cache,
completely outside of the routes.
Therefore ->cow_metrics() can simply nothing more than a WARN_ON
trigger so we can catch anyone who tries to add new writes to
ipv4 route metrics.
Signed-off-by: David S. Miller <davem@davemloft.net>
Blackhole routes have a COW metrics operation that returns NULL
always, therefore this dst_copy_metrics() call did absolutely
nothing.
Signed-off-by: David S. Miller <davem@davemloft.net>
No longer needed. TCP writes metrics, but now in it's own special
cache that does not dirty the route metrics. Therefore there is no
longer any reason to pre-cow metrics in this way.
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't maintain it dynamically any longer, so reporting it would
be extremely misleading. Report zero instead.
Signed-off-by: David S. Miller <davem@davemloft.net>
Maintain a local hash table of TCP dynamic metrics blobs.
Computed TCP metrics are no longer maintained in the route metrics.
The table uses RCU and an extremely simple hash so that it has low
latency and low overhead. A simple hash is legitimate because we only
make metrics blobs for fully established connections.
Some tweaking of the default hash table sizes, metric timeouts, and
the hash chain length limit certainly could use some tweaking. But
the basic design seems sound.
With help from Eric Dumazet and Joe Perches.
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to cast return value of nla_data()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
- set message type to RTM_NEWROUTE
- relate to original request by inheriting the sequence and port number.
- set NLM_F_MULTI because it's a dump and more messages will follow
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Also use nla_get_u32() instead of nla_memcpy() to access u32 attribtues.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
nlmsg_end() will take care of this when we finalize the message.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
If list_for_each_entry, etc complete a traversal of the list, the iterator
variable ends up pointing to an address at an offset from the list head,
and not a meaningful structure. Thus this value should not be used after
the end of the iterator. This seems to be a copy-paste bug from a previous
debugging message, and so the meaningless value is just deleted.
This problem was found using Coccinelle (http://coccinelle.lip6.fr/).
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev->priomap is allocated by extend_netdev_table() called from
update_netdev_tables().
And this is only called if write_priomap() is called.
But if write_priomap() is not called, it seems we can have out of bounds
accesses in cgrp_destroy(), read_priomap() & skb_update_prio()
With help from Gao Feng
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices (e.g. Sony's PaSoRi) can not do type B polling, so we have
to make a distinction between ISO14443 type A and B poll modes.
Cc: Eric Lapuyade <eric.lapuyade@intel.com>
Cc: Ilan Elias <ilane@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The socket local pointer can be NULL when a socket is created but never
bound or connected.
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
When receiving such frame, the sockets waiting for a connection to finish
should be woken up. Connecting to an unbound LLCP service will trigger a
DM as a response.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
With the LLCP 16 local SAPs we can potentially quickly run out of source
SAPs for non well known services.
With the so called late binding we will reserve an SAP only when we actually
get a client connection for a local service. The SAP will be released once
the last client is gone, leaving it available to other services.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
With not Well Known Services there is no guarantees as to which
SSAP the server will be listening on, so there is no reason to
support binding to a specific source SAP.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This patch fixes a typo and return the correct error when trying to
bind 2 sockets to the same service name.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The LLCP SAP should only be freed when the socket owning it is released.
As long as the socket is alive, the SAP should be reserved in order to
e.g. send the right wks array when bringing the MAC up.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
When the MAC link goes down, we should only keep the bound sockets
alive. They will be closed by sock_release or when the underlying
NFC device is moving away.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Drivers will need them before starting a poll or when being activated
as targets. Mostly WKS can have changed between device registration and
then so we need to re-build the whole array.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Some NFC chips will statically create and open pipes for both standard
and proprietary gates. The driver can now pass this information to HCI
such that HCI will not attempt to create and open them, but will instead
directly use the passed pipe ids.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
If the device is polling we sent a 0 target found event.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The semantics for a zero target found event is that the polling operation
could not complete.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
There can ever be only one call to nfc_targets_found() after polling
has been engaged. This could be from a target discovered event from
the driver, or from an error handler to notify poll will never complete.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
If there is an ongoing HCI command executing, it will be completed,
thereby pushing the error up to the core. Otherwise, HCI will directly
notify the core with the error.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
HCI cmd can be completed either from an HCI response or from an
internal driver or HCI error. This requires to factorize the
completion code outside of the device lock.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This API should be used by drivers, HCI, SHDLC or NCI stacks to report an
unrecoverable error.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
An HCI command can complete either from an HCI response
(with an HCI result) or as a consequence of any other system
error during processing. The completion therefore needs to take
a standard errno code. The HCI response will convert its result
to a standard errno before calling the completion.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We can now report an ENOMEM error up to the HCI layer.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
nfc_hci_recv_frame can not be called with a NULL skb.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
shdlc reset may leave HCI in an inconsistent state by loosing parts of
HCI frames. Handle this case by reporting an unrecoverable error to HCI.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The questions asked in the comments have been answered and addressed.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
If association failed due to internal error (e.g. no
supported rates IE), we call ieee80211_destroy_assoc_data()
with assoc=true, while we actually reject the association.
This results in the BSSID not being zeroed out.
After passing assoc=false, we no longer have to call
sta_info_destroy_addr() explicitly. While on it, move
the "associated" message after the assoc_success check.
Cc: stable@vger.kernel.org [3.4+]
Signed-off-by: Eliad Peller <eliad@wizery.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
msp has type struct minstrel_ht_sta_priv not struct minstrel_ht_sta.
(This incorporates the fixup originally posted as "mac80211: fix kzalloc
memory corruption introduced in minstrel_ht". -- JWL)
Reported-by: Fengguang Wu <wfg@linux.intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Pablo Neira Ayuso says:
====================
* One to get the timeout special parameter for the SET target back working
(this was introduced while trying to fix another bug in 3.4) from
Jozsef Kadlecsik.
* One crash fix if containers and nf_conntrack are used reported by Hans
Schillstrom by myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch "127f559 netfilter: ipset: fix timeout value overflow bug"
broke the SET target when no timeout was specified.
Reported-by: Jean-Philippe Menil <jean-philippe.menil@univ-nantes.fr>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
we set max_prioidx to the first zero bit index of prioidx_map in
function get_prioidx.
So when we delete the low index netprio cgroup and adding a new
netprio cgroup again,the max_prioidx will be set to the low index.
when we set the high index cgroup's net_prio.ifpriomap,the function
write_priomap will call update_netdev_tables to alloc memory which
size is sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1),
so the size of array that map->priomap point to is max_prioidx +1,
which is low than what we actually need.
fix this by adding check in get_prioidx,only set max_prioidx when
max_prioidx low than the new prioidx.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The comments were wrong here because "AX25_MAX_DIGIS" is 8 but the
comments say 6. Also I've changed the "7" to "AX25_ADDR_LEN".
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix two netem bugs :
1) When a frame was dropped by tfifo_enqueue(), drop counter
was incremented twice.
2) When reordering is triggered, we enqueue a packet without
checking queue limit. This can OOM pretty fast when this
is repeated enough, since skbs are orphaned, no socket limit
can help in this situation.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mark Gordon <msg@google.com>
Cc: Andreas Terzis <aterzis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
While doing some recent work on sctp sack bundling I noted that
sctp_packet_append_chunk was pretty inefficient. Specifially, it was called
recursively while trying to bundle auth and sack chunks. Because of that we
call sctp_packet_bundle_sack and sctp_packet_bundle_auth a total of 4 times for
every call to sctp_packet_append_chunk, knowing that at least 3 of those calls
will do nothing.
So lets refactor sctp_packet_bundle_auth to have an outer part that does the
attempted bundling, and an inner part that just does the chunk appends. This
saves us several calls per iteration that we just don't need.
Also, noticed that the auth and sack bundling fail to free the chunks they
allocate if the append fails, so make sure we add that in
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently when sending data over datagram, the send function will attempt to
allocate any size passed on from the userspace.
We should make sure that this size is checked and limited. We'll limit it
to the MTU of the device, which is checked later anyway.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>