This implements a socket release callback function to check
if the socket cached route got invalid during the time
we owned the socket. The function is used from udp, raw
and ping sockets.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The route lookup in ipv4_sk_update_pmtu() might return a route
different from the route we cached at the socket. This is because
standart routes are per cpu, so each cpu has it's own struct rtable.
This means that we do not invalidate the socket cached route if the
NET_RX_SOFTIRQ is not served by the same cpu that the sending socket
uses. As a result, the cached route reused until we disconnect.
With this patch we invalidate the socket cached route if possible.
If the socket is owened by the user, we can't update the cached
route directly. A followup patch will implement socket release
callback functions for datagram sockets to handle this case.
Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On IPsec pmtu events we can't access the transport headers of
the original packet, so we can't find the socket that sent
the packet. The only chance to notify the socket about the
pmtu change is to force a relookup for all routes. This
patch implenents this for the IPsec protocols.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Missing multiplication of block size by sizeof(struct hlist_head)
can cause xfrm_hash_free() to be called with wrong second argument
so that kfree() is called on a block allocated with vzalloc() or
__get_free_pages() or free_pages() is called with wrong order when
a namespace with enough policies is removed.
Bug introduced by commit a35f6c5d, i.e. versions >= 2.6.29 are
affected.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
commit 9ca1b22d6d (net: splice: avoid high order page splitting)
forgot that skb->head could need a copy into several page frags.
This could be the case for loopback traffic mostly.
Also remove now useless skb argument from linear_to_page()
and __splice_segment() prototypes.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
splice() can handle pages of any order, but network code tries hard to
split them in PAGE_SIZE units. Not quite successfully anyway, as
__splice_segment() assumed poff < PAGE_SIZE. This is true for
the skb->data part, not necessarily for the fragments.
This patch removes this logic to give the pages as they are in the skb.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 563d34d057 (tcp: dont drop MTU reduction indications)
added an error leading to incorrect accounting of
LINUX_MIB_LOCKDROPPEDICMPS
If socket is owned by the user, we want to increment
this SNMP counter, unless the message is a
(ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED) one.
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pmtu and redirect events are now handled in the protocols error handler,
so add an error handler for icmp6 to do this. It is needed in the case
when we have no socket context. Based on a patch by Duan Jiong.
Reported-by: Duan Jiong <djduanjiong@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All of the xfrm_replay->advance functions in xfrm_replay.c check if
x->replay_esn->replay_window is zero (and return if so). However,
one of them, xfrm_replay_advance_bmp(), divides by that value (in the
'%' operator) before doing the check, which can potentially trigger
a divide-by-zero exception. Some compilers will also assume that the
earlier division means the value cannot be zero later, and thus will
eliminate the subsequent zero check as dead code.
This patch moves the division to after the check.
Signed-off-by: Nickolai Zeldovich <nickolai@csail.mit.edu>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Jamie Parsons reported a problem recently, in which the re-initalization of an
association (The duplicate init case), resulted in a loss of receive window
space. He tracked down the root cause to sctp_outq_teardown, which discarded
all the data on an outq during a re-initalization of the corresponding
association, but never reset the outq->outstanding_data field to zero. I wrote,
and he tested this fix, which does a proper full re-initalization of the outq,
fixing this problem, and hopefully future proofing us from simmilar issues down
the road.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Jamie Parsons <Jamie.Parsons@metaswitch.com>
Tested-by: Jamie Parsons <Jamie.Parsons@metaswitch.com>
CC: Jamie Parsons <Jamie.Parsons@metaswitch.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both ceph_osdc_alloc_request() and ceph_osdc_build_request() are
provided an array of ceph osd request operations. Rather than just
passing the number of operations in the array, the caller is
required append an additional zeroed operation structure to signal
the end of the array.
All callers know the number of operations at the time these
functions are called, so drop the silly zero entry and supply that
number directly. As a result, get_num_ops() is no longer needed.
This also means that ceph_osdc_alloc_request() never uses its ops
argument, so that can be dropped.
Also rbd_create_rw_ops() no longer needs to add one to reserve room
for the additional op.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Only one of the two callers of ceph_osdc_alloc_request() provides
page or bio data for its payload. And essentially all that function
was doing with those arguments was assigning them to fields in the
osd request structure.
Simplify ceph_osdc_alloc_request() by having the caller take care of
making those assignments
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The only thing ceph_osdc_alloc_request() really does with the
flags value it is passed is assign it to the newly-created
osd request structure. Do that in the caller instead.
Both callers subsequently call ceph_osdc_build_request(), so have
that function (instead of ceph_osdc_alloc_request()) issue a warning
if a request comes through with neither the read nor write flags set.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The osdc parameter to ceph_calc_raw_layout() is not used, so get rid
of it. Consequently, the corresponding parameter in calc_layout()
becomes unused, so get rid of that as well.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
A snapshot id must be provided to ceph_calc_raw_layout() even though
it is not needed at all for calculating the layout.
Where the snapshot id *is* needed is when building the request
message for an osd operation.
Drop the snapid parameter from ceph_calc_raw_layout() and pass
that value instead in ceph_osdc_build_request().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
ceph_calc_file_object_mapping() takes (among other things) a "file"
offset and length, and based on the layout, determines the object
number ("bno") backing the affected portion of the file's data and
the offset into that object where the desired range begins. It also
computes the size that should be used for the request--either the
amount requested or something less if that would exceed the end of
the object.
This patch changes the input length parameter in this function so it
is used only for input. That is, the argument will be passed by
value rather than by address, so the value provided won't get
updated by the function.
The value would only get updated if the length would surpass the
current object, and in that case the value it got updated to would
be exactly that returned in *oxlen.
Only one of the two callers is affected by this change. Update
ceph_calc_raw_layout() so it records any updated value.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The len argument to ceph_osdc_build_request() is set up to be
passed by address, but that function never updates its value
so there's no need to do this. Tighten up the interface by
passing the length directly.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Since every osd message is now prepared to include trailing data,
there's no need to check ahead of time whether any operations will
make use of the trail portion of the message.
We can drop the second argument to get_num_ops(), and as a result we
can also get rid of op_needs_trail() which is no longer used.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
An osd request structure contains an optional trail portion, which
if present will contain data to be passed in the payload portion of
the message containing the request. The trail field is a
ceph_pagelist pointer, and if null it indicates there is no trail.
A ceph_pagelist structure contains a length field, and it can
legitimately hold value 0. Make use of this to change the
interpretation of the "trail" of an osd request so that every osd
request has trailing data, it just might have length 0.
This means we change the r_trail field in a ceph_osd_request
structure from a pointer to a structure that is always initialized.
Note that in ceph_osdc_start_request(), the trail pointer (or now
address of that structure) is assigned to a ceph message's trail
field. Here's why that's still OK (looking at net/ceph/messenger.c):
- What would have resulted in a null pointer previously will now
refer to a 0-length page list. That message trail pointer
is used in two functions, write_partial_msg_pages() and
out_msg_pos_next().
- In write_partial_msg_pages(), a null page list pointer is
handled the same as a message with 0-length trail, and both
result in a "in_trail" variable set to false. The trail
pointer is only used if in_trail is true.
- The only other place the message trail pointer is used is
out_msg_pos_next(). That function is only called by
write_partial_msg_pages() and only touches the trail pointer
if the in_trail value it is passed is true.
Therefore a null ceph_msg->trail pointer is equivalent to a non-null
pointer referring to a 0-length page list structure.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The last two parameters to ceph_osd_build_request() describe the
object id, but the values passed always come from the osd request
structure whose address is also provided. Get rid of those last
two parameters.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reformat __reset_osd() into three distinct blocks of code
handling the three return cases.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
This saves us some cycles, but does not affect the placement result at
all.
This corresponds to ceph.git commit 4abb53d4f.
Signed-off-by: Sage Weil <sage@inktank.com>
Add libceph support for a new CRUSH tunable recently added to Ceph servers.
Consider the CRUSH rule
step chooseleaf firstn 0 type <node_type>
This rule means that <n> replicas will be chosen in a manner such that
each chosen leaf's branch will contain a unique instance of <node_type>.
When an object is re-replicated after a leaf failure, if the CRUSH map uses
a chooseleaf rule the remapped replica ends up under the <node_type> bucket
that held the failed leaf. This causes uneven data distribution across the
storage cluster, to the point that when all the leaves but one fail under a
particular <node_type> bucket, that remaining leaf holds all the data from
its failed peers.
This behavior also limits the number of peers that can participate in the
re-replication of the data held by the failed leaf, which increases the
time required to re-replicate after a failure.
For a chooseleaf CRUSH rule, the tree descent has two steps: call them the
inner and outer descents.
If the tree descent down to <node_type> is the outer descent, and the descent
from <node_type> down to a leaf is the inner descent, the issue is that a
down leaf is detected on the inner descent, so only the inner descent is
retried.
In order to disperse re-replicated data as widely as possible across a
storage cluster after a failure, we want to retry the outer descent. So,
fix up crush_choose() to allow the inner descent to return immediately on
choosing a failed leaf. Wire this up as a new CRUSH tunable.
Note that after this change, for a chooseleaf rule, if the primary OSD
in a placement group has failed, choosing a replacement may result in
one of the other OSDs in the PG colliding with the new primary. This
requires that OSD's data for that PG to need moving as well. This
seems unavoidable but should be relatively rare.
This corresponds to ceph.git commit 88f218181a9e6d2292e2697fc93797d0f6d6e5dc.
Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Reviewed-by: Sage Weil <sage@inktank.com>
Routes with locked mtu should not use learned pmtu informations,
so do not update the pmtu on these routes.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The output route check was introduced with git commit 261663b0
(ipv4: Don't use the cached pmtu informations for input routes)
during times when we cached the pmtu informations on the
inetpeer. Now the pmtu informations are back in the routes,
so this check is obsolete. It also had some unwanted side effects,
as reported by Timo Teras and Lukas Tribus.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 299b0767 (ipv6: Fix IPsec slowpath fragmentation problem)
has introduced a error in the header length calculation that
provokes corrupted packets when non-fragmentable extensions
headers (Destination Option or Routing Header Type 2) are used.
rt->rt6i_nfheader_len is the length of the non-fragmentable
extension header, and it should be substracted to
rt->dst.header_len, and not to exthdrlen, as it was done before
commit 299b0767.
This patch reverts to the original and correct behavior. It has
been successfully tested with and without IPsec on packets
that include non-fragmentable extensions headers.
Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mesh PERR action frames are robust and thus may be encrypted, so add
proper head/tailroom to allow this. Fixes this warning when operating
a Mesh STA on ath5k:
WARNING: at net/mac80211/wpa.c:427 ccmp_encrypt_skb.isra.5+0x7b/0x1a0 [mac80211]()
Call Trace:
[<c011c5e7>] warn_slowpath_common+0x63/0x78
[<c011c60b>] warn_slowpath_null+0xf/0x13
[<e090621d>] ccmp_encrypt_skb.isra.5+0x7b/0x1a0 [mac80211]
[<e090685c>] ieee80211_crypto_ccmp_encrypt+0x1f/0x37 [mac80211]
[<e0917113>] invoke_tx_handlers+0xcad/0x10bd [mac80211]
[<e0917665>] ieee80211_tx+0x87/0xb3 [mac80211]
[<e0918932>] ieee80211_tx_pending+0xcc/0x170 [mac80211]
[<c0121c43>] tasklet_action+0x3e/0x65
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A user reported warnings in ath5k due to transmitting frames with no
rates set up. The frames were Mesh PERR frames, and some debugging
showed an empty control block with just the vif pointer:
> [ 562.522682] XXX txinfo: 00000000: 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 ................
> [ 562.522688] XXX txinfo: 00000010: 00 00 00 00 00 00 00 00 54 b8 f2
> db 00 00 00 00 ........T.......
> [ 562.522693] XXX txinfo: 00000020: 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 ................
Set the IEEE80211_TX_INTFL_NEED_TXPROCESSING flag to ensure that
rate control gets run before the frame is sent.
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Channel contexts are not always used with monitor interfaces. If no channel
context is set, use the oper channel, otherwise tx fails.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
[check local->use_chanctx]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since:
commit b23b025fe2
Author: Ben Greear <greearb@candelatech.com>
Date: Fri Feb 4 11:54:17 2011 -0800
mac80211: Optimize scans on current operating channel.
we do not disable PS while going back to operational channel (on
ieee80211_scan_state_suspend) and deffer that until scan finish.
But since we are allowed to send frames, we can send a frame to AP
without PM bit set, so disable PS on AP side. Then when we switch
to off-channel (in ieee80211_scan_state_resume) we do not enable PS.
Hence we are off-channel with PS disabled, frames are not buffered
by AP.
To fix remove offchannel_ps_disable argument and always enable PS when
going off-channel and disable it when going on-channel, like it was
before.
Cc: stable@vger.kernel.org # 2.6.39+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
During FT roaming, wpa_supplicant attempts to set the
key before association. This used to be rejected, but
as a side effect of my commit 66e67e4189
("mac80211: redesign auth/assoc") the key was accepted
causing hardware crypto to not be used for it as the
station isn't added to the driver yet.
It would be possible to accept the key and then add it
to the driver when the station has been added. However,
this may run into issues with drivers using the state-
based station adding if they accept the key only after
association like it used to be.
For now, revert to the behaviour from before the auth
and assoc change.
Cc: stable@vger.kernel.org
Reported-by: Cédric Debarge <cedric.debarge@acksys.fr>
Tested-by: Cédric Debarge <cedric.debarge@acksys.fr>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pablo Neira Ayuso says:
====================
The following patchset contains netfilter fixes for 3.8-rc3,
they are:
* fix possible BUG_ON if several netns are in use and the nf_conntrack
module is removed, initial patch from Gao feng, final patch from myself.
* fix unset return value if conntrack zone are disabled at
compile-time, reported by Borislav Petkov, fix from myself.
* fix display error message via dmesg for arp_tables, from Jan Engelhardt.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
spin_is_locked() on a non !SMP build is kind of useless.
BUG_ON(!spin_is_locked(xx)) is guaranteed to crash.
Just remove this check in reqsk_fastopen_remove() as
the callers do hold the socket lock.
Reported-by: Ketan Kulkarni <ketkulka@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jerry Chu <hkchu@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Dave Taht <dave.taht@gmail.com>
Acked-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix regression allowing IP_TTL setting of zero, fix from Cong Wang.
2) Fix leak regressions in tunap, from Jason Wang.
3) be2net driver always returns IRQ_HANDLED in INTx handler, fix from
Sathya Perla.
4) qlge doesn't really support NETIF_F_TSO6, don't set that flag. Fix
from Amerigo Wang.
5) Add 802.11ad Atheros wil6210 driver, from Vladimir Kondratiev.
6) Fix MTU calculations in mac80211 layer, from T Krishna Chaitanya.
7) Station info layer of mac80211 needs to use del_timer_sync(), from
Johannes Berg.
8) tcp_read_sock() can loop forever, because we don't immediately stop
when recv_actor() returns zero. Fix from Eric Dumazet.
9) Fix WARN_ON() in tcp_cleanup_rbuf(). We have to use sk_eat_skb() in
tcp_recv_skb() to handle the case where a large GRO packet is split
up while it is use by a splice() operation. Fix also from Eric
Dumazet.
10) addrconf_get_prefix_route() in ipv6 tests flags incorrectly, it
does:
if (X && (p->flags & Y) != 0)
when it really meant to go:
if (X && (p->flags & X) != 0)
fix from Romain Kuntz.
11) Fix lost Kconfig dependency for bfin_mac driver hardware
timestamping. From Lars-Peter Clausen.
12) Fix regression in handling of RST without ACK in TCP, from Eric
Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits)
be2net: fix unconditionally returning IRQ_HANDLED in INTx
tuntap: fix leaking reference count
tuntap: forbid calling TUNSETIFF when detached
tuntap: switch to use rtnl_dereference()
net, wireless: overwrite default_ethtool_ops
qlge: remove NETIF_F_TSO6 flag
tcp: accept RST without ACK flag
net: ethernet: xilinx: Do not use NO_IRQ in axienet
net: ethernet: xilinx: Do not use axienet on PPC
bnx2x: Allow management traffic after boot from SAN
bnx2x: Fix fastpath structures when memory allocation fails
bfin_mac: Restore hardware time-stamping dependency on BF518
tun: avoid owner checks on IFF_ATTACH_QUEUE
bnx2x: move debugging code before the return
tuntap: refuse to re-attach to different tun_struct
ipv6: use addrconf_get_prefix_route for prefix route lookup [v2]
ipv6: fix the noflags test in addrconf_get_prefix_route
tcp: fix splice() and tcp collapsing interaction
tcp: splice: fix an infinite loop in tcp_read_sock()
net: prevent setting ttl=0 via IP_TTL
...
arptables 0.0.4 (released on 10th Jan 2013) supports calling the
CLASSIFY target, but on adding a rule to the wrong chain, the
diagnostic is as follows:
# arptables -A INPUT -j CLASSIFY --set-class 0:0
arptables: Invalid argument
# dmesg | tail -n1
x_tables: arp_tables: CLASSIFY target: used from hooks
PREROUTING, but only usable from INPUT/FORWARD
This is incorrect, since xt_CLASSIFY.c does specify
(1 << NF_ARP_OUT) | (1 << NF_ARP_FORWARD).
This patch corrects the x_tables diagnostic message to print the
proper hook names for the NFPROTO_ARP case.
Affects all kernels down to and including v2.6.31.
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
canqun zhang reported that we're hitting BUG_ON in the
nf_conntrack_destroy path when calling kfree_skb while
rmmod'ing the nf_conntrack module.
Currently, the nf_ct_destroy hook is being set to NULL in the
destroy path of conntrack.init_net. However, this is a problem
since init_net may be destroyed before any other existing netns
(we cannot assume any specific ordering while releasing existing
netns according to what I read in recent emails).
Thanks to Gao feng for initial patch to address this issue.
Reported-by: canqun zhang <canqunzhang@gmail.com>
Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Since:
commit 2c60db0370
Author: Eric Dumazet <edumazet@google.com>
Date: Sun Sep 16 09:17:26 2012 +0000
net: provide a default dev->ethtool_ops
wireless core does not correctly assign ethtool_ops.
After alloc_netdev*() call, some cfg80211 drivers provide they own
ethtool_ops, but some do not. For them, wireless core provide generic
cfg80211_ethtool_ops, which is assigned in NETDEV_REGISTER notify call:
if (!dev->ethtool_ops)
dev->ethtool_ops = &cfg80211_ethtool_ops;
But after Eric's commit, dev->ethtool_ops is no longer NULL (on cfg80211
drivers without custom ethtool_ops), but points to &default_ethtool_ops.
In order to fix the problem, provide function which will overwrite
default_ethtool_ops and use it by wireless core.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix a socket lock leak in net/sunrpc/xprt.c
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQ8FNyAAoJEGcL54qWCgDy1EAP/jetZgUmOLCV37TVAFDPkaDy
ADjeIshsJt7T2/2zKWBoDQ4sKSNO3wRbuSQ9gaMPglfdf8j3PV38+2MOyL3L4yTp
2L5RqVrbzs+xgIRN7uu6pajVNeZpZb4PqphO+2SnM8uSz6XMVpYRoDtVBiEhgF16
F9csoBEX5HMC4AFhbkDoKOUoIb13cutYdd+0ijKnAwBrc31YUrcQDwUtZfcp8h2P
xk4q/k5uj0ilHGafu0BkkMqyQLVocvp/FJXDQ5CjCI73J55hE7lcfM2LMavrJ0gA
ACxE5+kr0vVOaasvpyu3nkntQ4Td6Z2PYbXCyIIlGvsyqCM8QgqUrfTU9zZauxRa
mrRWgw0c/mqJ2o41Jl2GxWXCPIoDMX9izdZad3wZ9ct0OTTk6RumHTvnGo1XoZBI
i5UTVgmnZoOFBQ+gWsxBay9rBjEoG2IBxsew7eEDPCXM0nIG0NztvGK7psFbjR1y
+wPAgB9+NghOzTwH3GrC1zEK5tpGq1DAbyciT5HC7gk/1ZmfVcvT0iAqO6nkyeyX
MArMSS6TAgR4IH+gr/qdybnwI6AezGVLiRwCScNPWyHq/gJ9tMCpZ+iodQKxMkoW
PGHaldLdMWtL+PEEYAmqWclMTaEnnsgMbbqmU1PucWYZ9Ovq2Kktzucczd/2GwdO
Gh2Utpg0vfAJSZkxy1yK
=ukG7
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.8-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfix from Trond Myklebust:
- Fix a socket lock leak in net/sunrpc/xprt.c
* tag 'nfs-for-3.8-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Ensure we release the socket write lock if the rpc_task exits early
commit c3ae62af8e (tcp: should drop incoming frames without ACK flag
set) added a regression on the handling of RST messages.
RST should be allowed to come even without ACK bit set. We validate
the RST by checking the exact sequence, as requested by RFC 793 and
5961 3.2, in tcp_validate_incoming()
Reported-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix new kernel-doc warnings in clnt.c:
Warning(net/sunrpc/clnt.c:561): No description found for parameter 'flavor'
Warning(net/sunrpc/clnt.c:561): Excess function parameter 'auth' description in 'rpc_clone_client_set_auth'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace ip6_route_lookup() with addrconf_get_prefix_route() when
looking up for a prefix route. This ensures that the connected prefix
is looked up in the main table, and avoids the selection of other
matching routes located in different tables as well as blackhole
or prohibited entries.
In addition, this fixes an Opps introduced by commit 64c6d08e (ipv6:
del unreachable route when an addr is deleted on lo), that would occur
when a blackhole or prohibited entry is selected by ip6_route_lookup().
Such entries have a NULL rt6i_table argument, which is accessed by
__ip6_del_rt() when trying to lock rt6i_table->tb6_lock.
The function addrconf_is_prefix_route() is not used anymore and is
removed.
[v2] Minor indentation cleanup and log updates.
Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tests on the flags in addrconf_get_prefix_route() does no make
much sense: the 'noflags' parameter contains the set of flags that
must not match with the route flags, so the test must be done
against 'noflags', and not against 'flags'.
Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under unusual circumstances, TCP collapse can split a big GRO TCP packet
while its being used in a splice(socket->pipe) operation.
skb_splice_bits() releases the socket lock before calling
splice_to_pipe().
[ 1081.353685] WARNING: at net/ipv4/tcp.c:1330 tcp_cleanup_rbuf+0x4d/0xfc()
[ 1081.371956] Hardware name: System x3690 X5 -[7148Z68]-
[ 1081.391820] cleanup rbuf bug: copied AD3BCF1 seq AD370AF rcvnxt AD3CF13
To fix this problem, we must eat skbs in tcp_recv_skb().
Remove the inline keyword from tcp_recv_skb() definition since
it has three call sites.
Reported-by: Christian Becker <c.becker@traviangames.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull s390 patches from Martin Schwidefsky:
"Add the finit_module system call, fix the irq statistics in
/proc/stat, fix a s390dbf lockdep problem, a patch revert for a
problem that is not 100% understood yet, and a few patches to
fix warnings."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: define read*_relaxed functions
s390/topology: export cpu_topology
s390/pm: export pm_power_off
s390/pci: define isa_dma_bridge_buggy
s390/3215: partially revert tty close handling fix
s390/irq: count cpu restart events
s390/irq: remove split irq fields from /proc/stat
s390/irq: enable irq sum accounting for /proc/stat again
s390/syscalls: wire up finit_module syscall
s390/pci: remove dead code
s390/smp: fix section mismatch for smp_add_present_cpu()
s390/debug: Fix s390dbf lockdep problem in debug_(un)register_view()
net/netfilter/xt_CT.c: In function ‘xt_ct_tg_check_v1’:
net/netfilter/xt_CT.c:250:6: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
net/netfilter/xt_CT.c: In function ‘xt_ct_tg_check_v0’:
net/netfilter/xt_CT.c:112:6: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
Reported-by: Borislav Petkov <bp@alien8.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The length parameter should be sizeof(req->name) - 1 because there is no
guarantee that string provided by userspace will contain the trailing
'\0'.
Can be easily reproduced by manually setting req->name to 128 non-zero
bytes prior to ioctl(HIDPCONNADD) and checking the device name setup on
input subsystem:
$ cat /sys/devices/pnp0/00\:04/tty/ttyS0/hci0/hci0\:1/input8/name
AAAAAA[...]AAAAAAAAf0:af:f0:af:f0:af
("f0:af:f0:af:f0:af" is the device bluetooth address, taken from "phys"
field in struct hid_device due to overflow.)
Cc: stable@vger.kernel.org
Signed-off-by: Anderson Lizardo <anderson.lizardo@openbossa.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
A regression is introduced by the following commit:
commit 4d52cfbef6
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue Jun 2 00:42:16 2009 -0700
net: ipv4/ip_sockglue.c cleanups
Pure cleanups
but it is not a pure cleanup...
- if (val != -1 && (val < 1 || val>255))
+ if (val != -1 && (val < 0 || val > 255))
Since there is no reason provided to allow ttl=0, change it back.
Reported-by: nitin padalia <padalia.nitin@gmail.com>
Cc: nitin padalia <padalia.nitin@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the rpc_task exits while holding the socket write lock before it has
allocated an rpc slot, then the usual mechanism for releasing the write
lock in xprt_release() is defeated.
The problem occurs if the call to xprt_lock_write() initially fails, so
that the rpc_task is put on the xprt->sending wait queue. If the task
exits after being assigned the lock by __xprt_lock_write_func, but
before it has retried the call to xprt_lock_and_alloc_slot(), then
it calls xprt_release() while holding the write lock, but will
immediately exit due to the test for task->tk_rqstp != NULL.
Reported-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>= 3.1]
Pull networking fixes from David Miller:
1) New sysctl ndisc_notify needs some documentation, from Hanns
Frederic Sowa.
2) Netfilter REJECT target doesn't set transport header of SKB
correctly, from Mukund Jampala.
3) Forcedeth driver needs to check for DMA mapping failures, from Larry
Finger.
4) brcmsmac driver can't use usleep_range while holding locks, use
udelay instead. From Niels Ole Salscheider.
5) Fix unregister of netlink bridge multicast database handlers, from
Vlad Yasevich and Rami Rosen.
6) Fix checksum calculations in netfilter's ipv6 network prefix
translation module.
7) Fix high order page allocation failures in netfilter xt_recent, from
Eric Dumazet.
8) mac802154 needs to use netif_rx_ni() instead of netif_rx() because
mac802154_process_data() can execute in process rather than
interrupt context. From Alexander Aring.
9) Fix splice handling of MSG_SENDPAGE_NOTLAST, otherwise we elide one
tcp_push() too many. From Eric Dumazet and Willy Tarreau.
10) Fix skb->truesize tracking in XEN netfront driver, from Ian
Campbell.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
xen/netfront: improve truesize tracking
ipv4: fix NULL checking in devinet_ioctl()
tcp: fix MSG_SENDPAGE_NOTLAST logic
net/ipv4/ipconfig: really display the BOOTP/DHCP server's address.
ip-sysctl: fix spelling errors
mac802154: fix NOHZ local_softirq_pending 08 warning
ipv6: document ndisc_notify in networking/ip-sysctl.txt
ath9k: Fix Kconfig for ATH9K_HTC
netfilter: xt_recent: avoid high order page allocations
netfilter: fix missing dependencies for the NOTRACK target
netfilter: ip6t_NPT: fix IPv6 NTP checksum calculation
bridge: add empty br_mdb_init() and br_mdb_uninit() definitions.
vxlan: allow live mac address change
bridge: Correctly unregister MDB rtnetlink handlers
brcmfmac: fix parsing rsn ie for ap mode.
brcmsmac: add copyright information for Canonical
rtlwifi: rtl8723ae: Fix warning for unchecked pci_map_single() call
rtlwifi: rtl8192se: Fix warning for unchecked pci_map_single() call
rtlwifi: rtl8192de: Fix warning for unchecked pci_map_single() call
rtlwifi: rtl8192ce: Fix warning for unchecked pci_map_single() call
...
IPsec tunnel does not set ECN field to CE in inner header when
the ECN field in the outer header is CE, and the ECN field in
the inner header is ECT(0) or ECT(1).
The cause is ipip6_hdr() does not return the correct address of
inner header since skb->transport-header is not the inner header
after esp6_input_done2(), or ah6_input().
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
IPsec tunnel does not set ECN field to CE in inner header when
the ECN field in the outer header is CE, and the ECN field in
the inner header is ECT(0) or ECT(1).
The cause is ipip_hdr() does not return the correct address of
inner header since skb->transport-header is not the inner header
after esp_input_done2(), or ah_input().
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Now that irq sum accounting for /proc/stat's "intr" line works again we
have the oddity that the sum field (first field) contains only the sum
of the second (external irqs) and third field (I/O interrupts).
The reason for that is that these two fields are already sums of all other
fields. So if we would sum up everything we would count every interrupt
twice.
This is broken since the split interrupt accounting was merged two years
ago: 052ff461c8 "[S390] irq: have detailed
statistics for interrupt types".
To fix this remove the split interrupt fields from /proc/stat's "intr"
line again and only have them in /proc/interrupts.
This restores the old behaviour, seems to be the only sane fix and mimics
a behaviour from other architectures where /proc/interrupts also contains
more than /proc/stat's "intr" line does.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pablo Neira Ayuso says:
====================
The following batch contains Netfilter fixes for 3.8-rc2, they are:
* Fix IPv6 stateless network/port translation (NPT) checksum
calculation, from Ulrich Weber.
* Fix for xt_recent to avoid memory allocation failures if large
hashtables are used, from Eric Dumazet.
* Fix missing dependencies in Kconfig for the deprecated NOTRACK,
from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 0d0863b020 ("sctp: Change defaults on cookie hmac selection")
added a "choice" to the sctp Kconfig file. It introduced a bug which
led to an infinite loop when while running "make oldconfig".
The problem is that the wrong symbol was defined as the default value
for the choice. Using the correct value gets rid of the infinite loop.
Note: if CONFIG_SCTP_COOKIE_HMAC_SHA1=y was present in the input
config file, both that and CONFIG_SCTP_COOKIE_HMAC_MD5=y be present
in the generated config file.
Signed-off-by: Alex Elder <elder@inktank.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The NULL pointer check `!ifa' should come before its first use.
[ Bug origin : commit fd23c3b311
(ipv4: Add hash table of interface addresses) in linux-2.6.39 ]
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Up to now, the debug and info messages from the ipconfig subsytem
claim to display the IP address of the DHCP/BOOTP server but
display instead the IP address of the bootserver. Fix that.
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using nanosleep() in an userspace application we get a
ratelimit warning
NOHZ: local_softirq_pending 08
for 10 times.
This patch replaces netif_rx() with netif_rx_ni() which has
to be used from process/softirq context.
The process/softirq context will be called from fakelb driver.
See linux-kernel commit 481a819 for similar fix.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
xt_recent can try high order page allocations and this can fail.
iptables: page allocation failure: order:9, mode:0xc0d0
It also wastes about half the allocated space because of kmalloc()
power-of-two roundups and struct recent_table layout.
Use vmalloc() instead to save space and be less prone to allocation
errors when memory is fragmented.
Reported-by: Miroslav Kratochvil <exa.exa@gmail.com>
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Harald Reindl <h.reindl@thelounge.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
csum16_add() has a broken carry detection, should be:
sum += sum < (__force u16)b;
Instead of fixing csum16_add, remove the custom checksum
functions and use the generic csum_add/csum_sub ones.
Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Partially revert commit (SUNRPC: add WARN_ON_ONCE for potential deadlock).
The looping behaviour has been tracked down to a knownn issue with
workqueues, and a workaround has now been implemented.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Weston Andros Adamson <dros@netapp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org [>= 3.7]
This patch ensures that we free the rpc_task after the cleanup callbacks
are done in order to avoid a deadlock problem that can be triggered if
the callback needs to wait for another workqueue item to complete.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Weston Andros Adamson <dros@netapp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org
The maximum MTU shouldn't take the headers into account,
the maximum MSDU size is exactly the maximum MTU.
Signed-off-by: T Krishna Chaitanya <chaitanyatk@posedge.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When AP's SSID is hidden the BSS can appear several times in
cfg80211's BSS list: once with a zero-length SSID that comes
from the beacon, and once for each SSID from probe reponses.
Since the mac80211 stores its data in ieee80211_bss which
is embedded into cfg80211_bss, mac80211's data will be
duplicated too.
This becomes a problem when a driver needs the dtim_period
since this data exists only in the beacon's instance in
cfg80211 bss table which isn't the instance that is used
when associating.
Remove the DTIM period from the BSS table and track it
explicitly to avoid this problem.
Cc: stable@vger.kernel.org
Tested-by: Efi Tubul <efi.tubul@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This is a very old bug, but there's nothing that prevents the
timer from running while the module is being removed when we
only do del_timer() instead of del_timer_sync().
The timer should normally not be running at this point, but
it's not clearly impossible (or we could just remove this.)
Cc: stable@vger.kernel.org
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Unfortunately, commit b22cfcfcae, intended to speed up roaming
by avoiding the synchronize_rcu() broke AP/mesh modes as it moved
some code into that work item that will still call into the driver
at a time where it's no longer expected to handle this: after the
AP or mesh has been stopped.
To fix this problem remove the per-station work struct, maintain a
station cleanup list instead and flush this list when stations are
flushed. To keep this patch smaller for stable, do this when the
stations are flushed (sta_info_flush()). This unfortunately brings
back the original roaming delay; I'll fix that again in a separate
patch.
Also, Ben reported that the original commit could sometimes (with
many interfaces) cause long delays when an interface is set down,
due to blocking on flush_workqueue(). Since we now maintain the
cleanup list, this particular change of the original patch can be
reverted.
Cc: stable@vger.kernel.org [3.7]
Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The array of rmc_entrys is redundant since only the
list_head is used. Make this an array of list_heads
instead and save ~6k per vif at runtime :D
Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Make AP_VLAN type interfaces track the AP master channel
context so they have one assigned for the various lookups.
Don't give them their own refcount etc. since they're just
slaves to the AP master.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
[change to flush stations with AP flush in second loop]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Do not scan on no-IBSS and disabled channels in IBSS mode. Doing this
can trigger Microcode errors on iwlwifi and iwlegacy drivers.
Also rename ieee80211_request_internal_scan() function since it is only
used in IBSS mode and simplify calling it from ieee80211_sta_find_ibss().
This patch should address:
https://bugzilla.redhat.com/show_bug.cgi?id=883414https://bugzilla.kernel.org/show_bug.cgi?id=49411
Reported-by: Jesse Kahtava <jesse_kahtava@f-m.fm>
Reported-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch adds empty br_mdb_init() and br_mdb_uninit() definitions in
br_private.h to avoid build failure when CONFIG_BRIDGE_IGMP_SNOOPING is not set.
These methods were moved from br_multicast.c to br_netlink.c by
commit 3ec8e9f085
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 63233159fd4e596568f5f168ecb0879b61631d47:
bridge: Do not unregister all PF_BRIDGE rtnl operations
introduced a bug where a removal of a single bridge from a
multi-bridge system would remove MDB netlink handlers.
The handlers should only be removed once all bridges are gone, but
since we don't keep track of the number of bridge interfaces, it's
simpler to do it when the bridge module is unloaded. To make it
consistent, move the registration code into module initialization
code path.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull Ceph fixes from Sage Weil:
"Two of Alex's patches deal with a race when reseting server
connections for open RBD images, one demotes some non-fatal BUGs to
WARNs, and my patch fixes a protocol feature bit failure path."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: fix protocol feature mismatch failure path
libceph: WARN, don't BUG on unexpected connection states
libceph: always reset osds when kicking
libceph: move linger requests sooner in kick_requests()
Pablo Neira Ayuso says:
====================
The following batch contains Netfilter fixes for 3.8-rc1. They are
a mixture of old bugs that have passed unnoticed (I'll pass these to
stable) and more fresh ones from the previous merge window, they are:
* Fix for MAC address in 6in4 tunnels via NFLOG that results in ulogd
showing up wrong address, from Bob Hockney.
* Fix a comment in nf_conntrack_ipv6, from Florent Fourcot.
* Fix a leak an error path in ctnetlink while creating an expectation,
from Jesper Juhl.
* Fix missing ICMP time exceeded in the IPv6 defragmentation code, from
Haibo Xi.
* Fix inconsistent handling of routing changes in MASQUERADE for the
new connections case, from Andrew Collins.
* Fix a missing skb_reset_transport in ip[6]t_REJECT that leads to
crashes in the ixgbe driver (since it seems to access the transport
header with TSO enabled), from Mukund Jampala.
* Recover obsoleted NOTRACK target by including it into the CT and spot
a warning via printk about being obsoleted. Many people don't check the
scheduled to be removal file under Documentation, so we follow some
less agressive approach to kill this in a year or so. Spotted by Florian
Westphal, patch from myself.
* Fix race condition in xt_hashlimit that allows to create two or more
entries, from myself.
* Fix crash if the CT is used due to the recently added facilities to
consult the dying and unconfirmed conntrack lists, from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We should not set con->state to CLOSED here; that happens in
ceph_fault() in the caller, where it first asserts that the state
is not yet CLOSED. Avoids a BUG when the features don't match.
Since the fail_protocol() has become a trivial wrapper, replace
calls to it with direct calls to reset_connection().
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
A number of assertions in the ceph messenger are implemented with
BUG_ON(), killing the system if connection's state doesn't match
what's expected. At this point our state model is (evidently) not
well understood enough for these assertions to trigger a BUG().
Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
so we learn about these issues without killing the machine.
We now recognize that a connection fault can occur due to a socket
closure at any time, regardless of the state of the connection. So
there is really nothing we can assert about the state of the
connection at that point so eliminate that assertion.
Reported-by: Ugis <ugis22@gmail.com>
Tested-by: Ugis <ugis22@gmail.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
When ceph_osdc_handle_map() is called to process a new osd map,
kick_requests() is called to ensure all affected requests are
updated if necessary to reflect changes in the osd map. This
happens in two cases: whenever an incremental map update is
processed; and when a full map update (or the last one if there is
more than one) gets processed.
In the former case, the kick_requests() call is followed immediately
by a call to reset_changed_osds() to ensure any connections to osds
affected by the map change are reset. But for full map updates
this isn't done.
Both cases should be doing this osd reset.
Rather than duplicating the reset_changed_osds() call, move it into
the end of kick_requests().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
The kick_requests() function is called by ceph_osdc_handle_map()
when an osd map change has been indicated. Its purpose is to
re-queue any request whose target osd is different from what it
was when it was originally sent.
It is structured as two loops, one for incomplete but registered
requests, and a second for handling completed linger requests.
As a special case, in the first loop if a request marked to linger
has not yet completed, it is moved from the request list to the
linger list. This is as a quick and dirty way to have the second
loop handle sending the request along with all the other linger
requests.
Because of the way it's done now, however, this quick and dirty
solution can result in these incomplete linger requests never
getting re-sent as desired. The problem lies in the fact that
the second loop only arranges for a linger request to be sent
if it appears its target osd has changed. This is the proper
handling for *completed* linger requests (it avoids issuing
the same linger request twice to the same osd).
But although the linger requests added to the list in the first loop
may have been sent, they have not yet completed, so they need to be
re-sent regardless of whether their target osd has changed.
The first required fix is we need to avoid calling __map_request()
on any incomplete linger request. Otherwise the subsequent
__map_request() call in the second loop will find the target osd
has not changed and will therefore not re-send the request.
Second, we need to be sure that a sent but incomplete linger request
gets re-sent. If the target osd is the same with the new osd map as
it was when the request was originally sent, this won't happen.
This can be fixed through careful handling when we move these
requests from the request list to the linger list, by unregistering
the request *before* it is registered as a linger request. This
works because a side-effect of unregistering the request is to make
the request's r_osd pointer be NULL, and *that* will ensure the
second loop actually re-sends the linger request.
Processing of such a request is done at that point, so continue with
the next one once it's been moved.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
ip6gre_xmit2() incorrectly sets transport header to inner payload
instead of GRE header. It seems copy-and-pasted from ipip.c.
Set transport header to gre header.
(In ipip case the transport header is the inner ip header, so that's
correct.)
Found by inspection. In practice the incorrect transport header
doesn't matter because the skb usually is sent to another net_device
or socket, so the transport header isn't referenced.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipgre_tunnel_xmit() incorrectly sets transport header to inner payload
instead of GRE header. It seems copy-and-pasted from ipip.c.
So set transport header to gre header.
(In ipip case the transport header is the inner ip header, so that's
correct.)
Found by inspection. In practice the incorrect transport header
doesn't matter because the skb usually is sent to another net_device
or socket, so the transport header isn't referenced.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an else to only print the incompatible protocol message
when version hasn't been established.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0b088e00 ("RDS: Use page_remainder_alloc() for recv bufs")
added uses of sg_dma_len() and sg_dma_address(). This makes
RDS DOA with the qib driver.
IB ulps should use ib_sg_dma_len() and ib_sg_dma_address
respectively since some HCAs overload ib_sg_dma* operations.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 96e0bf4b51 (tcp: Discard segments that ack data not yet
sent) John Dykstra enforced a check against ack sequences.
In commit 354e4aa391 (tcp: RFC 5961 5.2 Blind Data Injection Attack
Mitigation) I added more safety tests.
But we missed fact that these tests are not performed if ACK bit is
not set.
RFC 793 3.9 mandates TCP should drop a frame without ACK flag set.
" fifth check the ACK field,
if the ACK bit is off drop the segment and return"
Not doing so permits an attacker to only guess an acceptable sequence
number, evading stronger checks.
Many thanks to Zhiyun Qian for bringing this issue to our attention.
See :
http://web.eecs.umich.edu/~zhiyunq/pub/ccs12_TCP_sequence_number_inference.pdf
Reported-by: Zhiyun Qian <zhiyunq@umich.edu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
batadv_iv_ogm_emit_send_time() attempts to calculates a random integer
in the range of 'orig_interval +- BATADV_JITTER' by the below lines.
msecs = atomic_read(&bat_priv->orig_interval) - BATADV_JITTER;
msecs += (random32() % 2 * BATADV_JITTER);
But it actually gets 'orig_interval' or 'orig_interval - BATADV_JITTER'
because '%' and '*' have same precedence and associativity is
left-to-right.
This adds the parentheses at the appropriate position so that it matches
original intension.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Cc: Marek Lindner <lindner_marek@yahoo.de>
Cc: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Cc: Antonio Quartulli <ordex@autistici.org>
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a leak in one of the error paths of
ctnetlink_create_expect if no helper and no timeout is specified.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
recent_net_exit() is called before recent_mt_destroy() in the
destroy path of network namespaces. Make sure there are no entries
in the parent proc entry xt_recent before removing it.
Signed-off-by: Vitaly E. Lavrov <lve@guap.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
recent_net_exit() is called before recent_mt_destroy() in the
destroy path of network namespaces. Make sure there are no entries
in the parent proc entry xt_recent before removing it.
Signed-off-by: Vitaly E. Lavrov <lve@guap.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Two packets may race to create the same entry in the hashtable,
double check if this packet lost race. This double checking only
happens in the path of the packet that creates the hashtable for
first time.
Note that, with this patch, no packet drops occur if the race happens.
Reported-by: Feng Gao <gfree.wind@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Sedat reported the following commit caused a regression:
commit 9650388b5c
Author: Eric Dumazet <edumazet@google.com>
Date: Fri Dec 21 07:32:10 2012 +0000
ipv4: arp: fix a lockdep splat in arp_solicit
This is due to the 6th parameter of arp_send() needs to be NULL
for the broadcast case, the above commit changed it to an all-zero
array by mistake.
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal reported that the removal of the NOTRACK target
(9655050 netfilter: remove xt_NOTRACK) is breaking some existing
setups.
That removal was scheduled for removal since long time ago as
described in Documentation/feature-removal-schedule.txt
What: xt_NOTRACK
Files: net/netfilter/xt_NOTRACK.c
When: April 2011
Why: Superseded by xt_CT
Still, people may have not notice / may have decided to stick to an
old iptables version. I agree with him in that some more conservative
approach by spotting some printk to warn users for some time is less
agressive.
Current iptables 1.4.16.3 already contains the aliasing support
that makes it point to the CT target, so upgrading would fix it.
Still, the policy so far has been to avoid pushing our users to
upgrade.
As a solution, this patch recovers the NOTRACK target inside the CT
target and it now spots a warning.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixed integer overflow in function htb_dequeue
Signed-off-by: Stefan Hasko <hasko.stevo@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CONFIG_HOTPLUG is always enabled now, so remove the unused code that was
trying to be compiled out when this option was disabled, in the
networking core.
Cc: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When netdev_set_master faild in br_add_if, we should
call br_netpoll_disable to do some cleanup jobs,such
as free the memory of struct netpoll which allocated
in br_netpoll_enable.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using a seqlock for devnet_rename_seq is not a good idea,
as device_rename() can sleep.
As we hold RTNL, we dont need a protection for writers,
and only need a seqcount so that readers can catch a change done
by a writer.
Bug added in commit c91f6df2db (sockopt: Change getsockopt() of
SO_BINDTODEVICE to return an interface name)
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once skb_realloc_headroom() is called, tiph might point to freed memory.
Cache tiph->ttl value before the reallocation, to avoid unexpected
behavior.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipgre_tunnel_xmit() parses network header as IP unconditionally.
But transmitting packets are not always IP packet. For example such packet
can be sent by packet socket with sockaddr_ll.sll_protocol set.
So make the function check if skb->protocol is IP.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull nfsd update from Bruce Fields:
"Included this time:
- more nfsd containerization work from Stanislav Kinsbursky: we're
not quite there yet, but should be by 3.9.
- NFSv4.1 progress: implementation of basic backchannel security
negotiation and the mandatory BACKCHANNEL_CTL operation. See
http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues
for remaining TODO's
- Fixes for some bugs that could be triggered by unusual compounds.
Our xdr code wasn't designed with v4 compounds in mind, and it
shows. A more thorough rewrite is still a todo.
- If you've ever seen "RPC: multiple fragments per record not
supported" logged while using some sort of odd userland NFS client,
that should now be fixed.
- Further work from Jeff Layton on our mechanism for storing
information about NFSv4 clients across reboots.
- Further work from Bryan Schumaker on his fault-injection mechanism
(which allows us to discard selective NFSv4 state, to excercise
rarely-taken recovery code paths in the client.)
- The usual mix of miscellaneous bugs and cleanup.
Thanks to everyone who tested or contributed this cycle."
* 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits)
nfsd4: don't leave freed stateid hashed
nfsd4: free_stateid can use the current stateid
nfsd4: cleanup: replace rq_resused count by rq_next_page pointer
nfsd: warn on odd reply state in nfsd_vfs_read
nfsd4: fix oops on unusual readlike compound
nfsd4: disable zero-copy on non-final read ops
svcrpc: fix some printks
NFSD: Correct the size calculation in fault_inject_write
NFSD: Pass correct buffer size to rpc_ntop
nfsd: pass proper net to nfsd_destroy() from NFSd kthreads
nfsd: simplify service shutdown
nfsd: replace boolean nfsd_up flag by users counter
nfsd: simplify NFSv4 state init and shutdown
nfsd: introduce helpers for generic resources init and shutdown
nfsd: make NFSd service structure allocated per net
nfsd: make NFSd service boot time per-net
nfsd: per-net NFSd up flag introduced
nfsd: move per-net startup code to separated function
nfsd: pass net to __write_ports() and down
nfsd: pass net to nfsd_set_nrthreads()
...
Pull Ceph update from Sage Weil:
"There are a few different groups of commits here. The largest is
Alex's ongoing work to enable the coming RBD features (cloning,
striping). There is some cleanup in libceph that goes along with it.
Cyril and David have fixed some problems with NFS reexport (leaking
dentries and page locks), and there is a batch of patches from Yan
fixing problems with the fs client when running against a clustered
MDS. There are a few bug fixes mixed in for good measure, many of
which will be going to the stable trees once they're upstream.
My apologies for the late pull. There is still a gremlin in the rbd
map/unmap code and I was hoping to include the fix for that as well,
but we haven't been able to confirm the fix is correct yet; I'll send
that in a separate pull once it's nailed down."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits)
rbd: get rid of rbd_{get,put}_dev()
libceph: register request before unregister linger
libceph: don't use rb_init_node() in ceph_osdc_alloc_request()
libceph: init event->node in ceph_osdc_create_event()
libceph: init osd->o_node in create_osd()
libceph: report connection fault with warning
libceph: socket can close in any connection state
rbd: don't use ENOTSUPP
rbd: remove linger unconditionally
rbd: get rid of RBD_MAX_SEG_NAME_LEN
libceph: avoid using freed osd in __kick_osd_requests()
ceph: don't reference req after put
rbd: do not allow remove of mounted-on image
libceph: Unlock unprocessed pages in start_read() error path
ceph: call handle_cap_grant() for cap import message
ceph: Fix __ceph_do_pending_vmtruncate
ceph: Don't add dirty inode to dirty list if caps is in migration
ceph: Fix infinite loop in __wake_requests
ceph: Don't update i_max_size when handling non-auth cap
bdi_register: add __printf verification, fix arg mismatch
...
In kick_requests(), we need to register the request before we
unregister the linger request. Otherwise the unregister will
reset the request's osd pointer to NULL.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
The red-black node in the ceph osd request structure is initialized
in ceph_osdc_alloc_request() using rbd_init_node(). We do need to
initialize this, because in __unregister_request() we call
RB_EMPTY_NODE(), which expects the node it's checking to have
been initialized. But rb_init_node() is apparently overkill, and
may in fact be on its way out. So use RB_CLEAR_NODE() instead.
For a little more background, see this commit:
4c199a93 rbtree: empty nodes have no color"
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
The red-black node node in the ceph osd event structure is not
initialized in create_osdc_create_event(). Because this node can
be the subject of a RB_EMPTY_NODE() call later on, we should ensure
the node is initialized properly for that.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
The red-black node node in the ceph osd structure is not initialized
in create_osd(). Because this node can be the subject of a
RB_EMPTY_NODE() call later on, we should ensure the node is
initialized properly for that. Add a call to RB_CLEAR_NODE()
initialize it.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
When a connection's socket disconnects, or if there's a protocol
error of some kind on the connection, a fault is signaled and
the connection is reset (closed and reopened, basically). We
currently get an error message on the log whenever this occurs.
A ceph connection will attempt to reestablish a socket connection
repeatedly if a fault occurs. This means that these error messages
will get repeatedly added to the log, which is undesirable.
Change the error message to be a warning, so they don't get
logged by default.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Latinoware 2012.
There's a slightly non-trivial merge in virtio-net, as we cleaned up the
virtio add_buf interface while DaveM accepted the mq virtio-net patches.
You can see my solution in my pending-rebases branch, if that helps, but I
know you love merging:
https://git.kernel.org/?p=linux/kernel/git/rusty/linux.git;a=commit;h=12e4e64fa66a4c812e4855de32abdb4d819526fe
Cheers,
Rusty.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQz/vKAAoJENkgDmzRrbjx+eYQAK/egj9T8Nnth6mkzdbCFSO7
Bciga2hDiudGCiGojTRGPRSc0VP9LgfvPbY2pxX+R9CfEqR+a8q/rRQhCS79ZwPB
/mJy3HNiCx418HZxgwNtk6vPe0PjJm6SsjbXeB9hB+PQLCbdwA0BjpG6xjF/jitP
noPqhhXreeQgYVxAKoFPvff/Byu2GlNnDdVMQxWRmo8hTKlTCzl0T/7BHRxthhJj
iOrXTFzrT/osPT0zyqlngT03T4wlBvL2Bfw8d/kuRPEZ71dpIctWeH2KzdwXVCrz
hFQGxAz4OWvW3xrNwj7c6O3SWj4VemUMjQqeA/PtRiOEI5gM0Y/Bit47dWL4wM/O
OWUKFHzq4DFs8MmwXBgDDXl5xOjOBH9Ik4FZayn3Y7COT/B8CjFdOC2MdDGmZ9yd
NInumg7FqP+u12g+9Vq8S/b0cfoQm4qFe8VHiPJu+jRmCZglyvLjk7oq/QwW8Gaq
Pkzit1Ey0DWo2KvZ4D/nuXJCuhmzN/AJ10M48lLYZhtOIVg9gsa0xjhfgq4FnvSK
xFCf3rcWnlGIXcOYh/hKU25WaCLzBuqMuSK35A72IujrQOL7OJTk4Oqote3Z3H9B
08XJmyW6SOZdfw17X4Im1jbyuLek///xQJ9Jw/tya7j9lBt8zjJ+FmLPs4mLGEOm
WJv9uZPs+QbIMNky2Lcb
=myDR
-----END PGP SIGNATURE-----
Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio update from Rusty Russell:
"Some nice cleanups, and even a patch my wife did as a "live" demo for
Latinoware 2012.
There's a slightly non-trivial merge in virtio-net, as we cleaned up
the virtio add_buf interface while DaveM accepted the mq virtio-net
patches."
* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits)
virtio_console: Add support for remoteproc serial
virtio_console: Merge struct buffer_token into struct port_buffer
virtio: add drv_to_virtio to make code clearly
virtio: use dev_to_virtio wrapper in virtio
virtio-mmio: Fix irq parsing in command line parameter
virtio_console: Free buffers from out-queue upon close
virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
virtio_console: Use kmalloc instead of kzalloc
virtio_console: Free buffer if splice fails
virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: make virtqueue_add_buf() returning 0 on success, not capacity.
virtio: console: don't rely on virtqueue_add_buf() returning capacity.
virtio_net: don't rely on virtqueue_add_buf() returning capacity.
virtio-net: remove unused skb_vnet_hdr->num_sg field
virtio-net: correct capacity math on ring full
virtio: move queue_index and num_free fields into core struct virtqueue.
...
Pull networking fixes from David Miller:
1) Really fix tuntap SKB use after free bug, from Eric Dumazet.
2) Adjust SKB data pointer to point past the transport header before
calling icmpv6_notify() so that the headers are in the state which
that function expects. From Duan Jiong.
3) Fix ambiguities in the new tuntap multi-queue APIs. From Jason
Wang.
4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov.
5) Don't destroy mutex after freeing up device private in mac802154,
fix also from Konstantin Khlebnikov.
6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch.
7) SCTP HMAC kconfig rework, from Neil Horman.
8) Fix SCTP jprobes function signature, otherwise things explode, from
Daniel Borkmann.
9) Fix typo in ipv6-offload Makefile variable reference, from Simon
Arlott.
10) Don't fail USBNET open just because remote wakeup isn't supported,
from Oliver Neukum.
11) be2net driver bug fixes from Sathya Perla.
12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David
Woodhouse.
13) Fix MTU changing regression in 8139cp driver, from John Greene.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
solos-pci: ensure all TX packets are aligned to 4 bytes
solos-pci: add firmware upgrade support for new models
solos-pci: remove superfluous debug output
solos-pci: add GPIO support for newer versions on Geos board
8139cp: Prevent dev_close/cp_interrupt race on MTU change
net: qmi_wwan: add ZTE MF880
drivers/net: Use of_match_ptr() macro in smsc911x.c
drivers/net: Use of_match_ptr() macro in smc91x.c
ipv6: addrconf.c: remove unnecessary "if"
bridge: Correctly encode addresses when dumping mdb entries
bridge: Do not unregister all PF_BRIDGE rtnl operations
use generic usbnet_manage_power()
usbnet: generic manage_power()
usbnet: handle PM failure gracefully
ksz884x: fix receive polling race condition
qlcnic: update driver version
qlcnic: fix unused variable warnings
net: fec: forbid FEC_PTP on SoCs that do not support
be2net: fix wrong frag_idx reported by RX CQ
be2net: fix be_close() to ensure all events are ack'ed
...
the value of err is always negative if it goes to errout, so we don't need to
check the value of err.
Signed-off-by: Cong Ding <dinggnu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When dumping mdb table, set the addresses the kernel returns
based on the address protocol type.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bridge fdb and link rtnl operations are registered in
core/rtnetlink. Bridge mdb operations are registred
in bridge/mdb. When removing bridge module, do not
unregister ALL PF_BRIDGE ops since that would remove
the ops from rtnetlink as well. Do remove mdb ops when
bridge is destroyed.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull (again) user namespace infrastructure changes from Eric Biederman:
"Those bugs, those darn embarrasing bugs just want don't want to get
fixed.
Linus I just updated my mirror of your kernel.org tree and it appears
you successfully pulled everything except the last 4 commits that fix
those embarrasing bugs.
When you get a chance can you please repull my branch"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
userns: Fix typo in description of the limitation of userns_install
userns: Add a more complete capability subset test to commit_creds
userns: Require CAP_SYS_ADMIN for most uses of setns.
Fix cap_capable to only allow owners in the parent user namespace to have caps.
Features include:
- Full audit of BUG_ON asserts in the NFS, SUNRPC and lockd client code
Remove altogether where possible, and replace with WARN_ON_ONCE and
appropriate error returns where not.
- NFSv4.1 client adds session dynamic slot table management. There is
matching server side code that has been submitted to Bruce for
consideration. Together, this code allows the server to dynamically
manage the amount of memory it allocates to the duplicate request
cache for each client. It will constantly resize those caches to
reserve more memory for clients that are hot while shrinking caches
for those that are quiescent.
In addition, there are assorted bugfixes for the generic NFS write code,
fixes to deal with the drop_nlink() warnings, and yet another fix for
NFSv4 getacl.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQz8VNAAoJEGcL54qWCgDy7iYQAKbr7AAZOcZPoJigzakZ7nMi
UKYulGbFais2Llwzw1e+U5RzmorTSbvl7/m8eS7pDf3auYw/t4xtXjKSGZUNxaE1
q2hNKgVwodMbScYdkZXvKKNckS93oPDttrmEyzjKanqey+1E3HSklvOvikN0ihte
B/G1OtA7Qpcr92bPrLK+PjDqarCBUI4g42dYbZOBrZnXKTRtzUqsuKPu7WjpPiof
SHE5b1Emt7oUxgcijWGcvYCQ8voZdeSCnSksH3DgvORlutwdhUD3Yg8KyEfFZdyc
6C59ozXRLiHkV3c+jMhJzDkQXR9bYHrnK3tlq4G8v1NdJxRktQliZeqecRvip/Wz
rAxfE6fnPDEvKsCpZb3+5yTAt+aZwzEhRg1fFC9qfGOp+oRa+CWw5kJCyIFHwJu6
4LOlubQAf6rnIsja1L8D0FdeqHUa1+wy61On5kgVYS5JGtoBsQHpa1zTwdOxPmsR
2XTMYGNCEabvpKpO9+5xQbUzkFExPTesw47ygXiUuDT/snaarpV3/f05SSCaWZkX
R8QsGEOXTIh8/S+UxARGpc7H6xi1PdBM5nBziHVzjEdHgZRF4wGFaJe2CirMjSJO
Df5GEd5Z/8VCGWs+1w7HD5EaQ2n0wbt5daCE80Y2jRBr7NMYnY+ciF8/GktLpHsn
Zq1bXGOdr3UZ92LXuzL9
=G3N9
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Features include:
- Full audit of BUG_ON asserts in the NFS, SUNRPC and lockd client
code. Remove altogether where possible, and replace with
WARN_ON_ONCE and appropriate error returns where not.
- NFSv4.1 client adds session dynamic slot table management. There
is matching server side code that has been submitted to Bruce for
consideration.
Together, this code allows the server to dynamically manage the
amount of memory it allocates to the duplicate request cache for
each client. It will constantly resize those caches to reserve
more memory for clients that are hot while shrinking caches for
those that are quiescent.
In addition, there are assorted bugfixes for the generic NFS write
code, fixes to deal with the drop_nlink() warnings, and yet another
fix for NFSv4 getacl."
* tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (106 commits)
SUNRPC: continue run over clients list on PipeFS event instead of break
NFS: Don't use SetPageError in the NFS writeback code
SUNRPC: variable 'svsk' is unused in function bc_send_request
SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket
NFSv4.1: Deal effectively with interrupted RPC calls.
NFSv4.1: Move the RPC timestamp out of the slot.
NFSv4.1: Try to deal with NFS4ERR_SEQ_MISORDERED.
NFS: nfs_lookup_revalidate should not trust an inode with i_nlink == 0
NFS: Fix calls to drop_nlink()
NFS: Ensure that we always drop inodes that have been marked as stale
nfs: Remove unused list nfs4_clientid_list
nfs: Remove duplicate function declaration in internal.h
NFS: avoid NULL dereference in nfs_destroy_server
SUNRPC handle EKEYEXPIRED in call_refreshresult
SUNRPC set gss gc_expiry to full lifetime
nfs: fix page dirtying in NFS DIO read codepath
nfs: don't zero out the rest of the page if we hit the EOF on a DIO READ
NFSv4.1: Be conservative about the client highest slotid
NFSv4.1: Handle NFS4ERR_BADSLOT errors correctly
nfs: don't extend writes to cover entire page if pagecache is invalid
...
As reported by Chen Gang <gang.chen@asianux.com>, we should ensure there
is enough space when formatting the sysfs buffers.
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise an out of bounds read could happen.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull user namespace changes from Eric Biederman:
"While small this set of changes is very significant with respect to
containers in general and user namespaces in particular. The user
space interface is now complete.
This set of changes adds support for unprivileged users to create user
namespaces and as a user namespace root to create other namespaces.
The tyranny of supporting suid root preventing unprivileged users from
using cool new kernel features is broken.
This set of changes completes the work on setns, adding support for
the pid, user, mount namespaces.
This set of changes includes a bunch of basic pid namespace
cleanups/simplifications. Of particular significance is the rework of
the pid namespace cleanup so it no longer requires sending out
tendrils into all kinds of unexpected cleanup paths for operation. At
least one case of broken error handling is fixed by this cleanup.
The files under /proc/<pid>/ns/ have been converted from regular files
to magic symlinks which prevents incorrect caching by the VFS,
ensuring the files always refer to the namespace the process is
currently using and ensuring that the ptrace_mayaccess permission
checks are always applied.
The files under /proc/<pid>/ns/ have been given stable inode numbers
so it is now possible to see if different processes share the same
namespaces.
Through the David Miller's net tree are changes to relax many of the
permission checks in the networking stack to allowing the user
namespace root to usefully use the networking stack. Similar changes
for the mount namespace and the pid namespace are coming through my
tree.
Two small changes to add user namespace support were commited here adn
in David Miller's -net tree so that I could complete the work on the
/proc/<pid>/ns/ files in this tree.
Work remains to make it safe to build user namespaces and 9p, afs,
ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the
Kconfig guard remains in place preventing that user namespaces from
being built when any of those filesystems are enabled.
Future design work remains to allow root users outside of the initial
user namespace to mount more than just /proc and /sys."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits)
proc: Usable inode numbers for the namespace file descriptors.
proc: Fix the namespace inode permission checks.
proc: Generalize proc inode allocation
userns: Allow unprivilged mounts of proc and sysfs
userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file
procfs: Print task uids and gids in the userns that opened the proc file
userns: Implement unshare of the user namespace
userns: Implent proc namespace operations
userns: Kill task_user_ns
userns: Make create_new_namespaces take a user_ns parameter
userns: Allow unprivileged use of setns.
userns: Allow unprivileged users to create new namespaces
userns: Allow setting a userns mapping to your current uid.
userns: Allow chown and setgid preservation
userns: Allow unprivileged users to create user namespaces.
userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped
userns: fix return value on mntns_install() failure
vfs: Allow unprivileged manipulation of the mount namespace.
vfs: Only support slave subtrees across different user namespaces
vfs: Add a user namespace reference from struct mnt_namespace
...
A connection's socket can close for any reason, independent of the
state of the connection (and without irrespective of the connection
mutex). As a result, the connectino can be in pretty much any state
at the time its socket is closed.
Handle those other cases at the top of con_work(). Pull this whole
block of code into a separate function to reduce the clutter.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
In __unregister_linger_request(), the request is being removed
from the osd client's req_linger list only when the request
has a non-null osd pointer. It should be done whether or not
the request currently has an osd.
This is most likely a non-issue because I believe the request
will always have an osd when this function is called.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
There are SUNRPC clients, which program doesn't have pipe_dir_name. These
clients can be skipped on PipeFS events, because nothing have to be created or
destroyed. But instead of breaking in case of such a client was found, search
for suitable client over clients list have to be continued. Otherwise some
clients could not be covered by PipeFS event handler.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org [>= v3.4]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If an osd has no requests and no linger requests, __reset_osd()
will just remove it with a call to __remove_osd(). That drops
a reference to the osd, and therefore the osd may have been free
by the time __reset_osd() returns. That function offers no
indication this may have occurred, and as a result the osd will
continue to be used even when it's no longer valid.
Change__reset_osd() so it returns an error (ENODEV) when it
deletes the osd being reset. And change __kick_osd_requests() so it
returns immediately (before referencing osd again) if __reset_osd()
returns *any* error.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
In __unregister_request(), there is a call to list_del_init()
referencing a request that was the subject of a call to
ceph_osdc_put_request() on the previous line. This is not
safe, because the request structure could have been freed
by the time we reach the list_del_init().
Fix this by reversing the order of these lines.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-off-by: Sage Weil <sage@inktank.com>
In (0c36b48 netfilter: nfnetlink_log: fix mac address for 6in4 tunnels)
the include file that defines ARPD_SIT was missing. This passed unnoticed
during my tests (I did not hit this problem here).
net/netfilter/nfnetlink_log.c: In function '__build_packet_message':
net/netfilter/nfnetlink_log.c:494:25: error: 'ARPHRD_SIT' undeclared (first use in this function)
net/netfilter/nfnetlink_log.c:494:25: note: each undeclared identifier is reported only once for
+each function it appears in
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull security subsystem updates from James Morris:
"A quiet cycle for the security subsystem with just a few maintenance
updates."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: create a sysfs mount point for smackfs
Smack: use select not depends in Kconfig
Yama: remove locking from delete path
Yama: add RCU to drop read locking
drivers/char/tpm: remove tasklet and cleanup
KEYS: Use keyring_alloc() to create special keyrings
KEYS: Reduce initial permissions on keys
KEYS: Make the session and process keyrings per-thread
seccomp: Make syscall skipping and nr changes more consistent
key: Fix resource leak
keys: Fix unreachable code
KEYS: Add payload preparsing opportunity prior to key instantiate or update
In (d871bef netfilter: ctnetlink: dump entries from the dying and
unconfirmed lists), we assume that all conntrack objects are
inserted in any of the existing lists. However, template conntrack
objects were not. This results in hitting BUG_ON in the
destroy_conntrack path while removing a rule that uses the CT target.
This patch fixes the situation by adding the template lists, which
is where template conntrack objects reside now.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
For tunnelled ipv6in4 packets, the LOG target (xt_LOG.c) adjusts
the start of the mac field to start at the ethernet header instead
of the ipv4 header for the tunnel. This patch conforms what is
passed by the NFLOG target through nfnetlink to what the LOG target
does. Code borrowed from xt_LOG.c.
Signed-off-by: Bob Hockney <bhockney@ix.netcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Commit b836c99fd6 (ipv6: unify conntrack reassembly expire
code with standard one) use the standard IPv6 reassembly
code(ip6_expire_frag_queue) to handle conntrack reassembly expire.
In ip6_expire_frag_queue, it invoke dev_get_by_index_rcu to get
which device received this expired packet.so we must save ifindex
when NF_conntrack get this packet.
With this patch applied, I can see ICMP Time Exceeded sent
from the receiver when the sender sent out 1/2 fragmented
IPv6 packet.
Signed-off-by: Haibo Xi <haibbo@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Remove ambiguity of double negation.
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Acked-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Since (a0ecb85 netfilter: nf_nat: Handle routing changes in MASQUERADE
target), the MASQUERADE target handles routing changes which affect
the output interface of a connection, but only for ESTABLISHED
connections. It is also possible for NEW connections which
already have a conntrack entry to be affected by routing changes.
This adds a check to drop entries in the NEW+conntrack state
when the oif has changed.
Signed-off-by: Andrew Collins <bsderandrew@gmail.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The following commit breaks IPv6 TCP transmission for me:
Commit 75fe83c322
Author: Vlad Yasevich <vyasevic@redhat.com>
Date: Fri Nov 16 09:41:21 2012 +0000
ipv6: Preserve ipv6 functionality needed by NET
This patch fixes the typo "ipv6_offload" which should be
"ipv6-offload".
I don't know why not including the offload modules should
break TCP. Disabling all offload options on the NIC didn't
help. Outgoing pulseaudio traffic kept stalling.
Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 24cb81a6a (sctp: Push struct net down into all of the
state machine functions) introduced the net structure into all
state machine functions, but jsctp_sf_eat_sack was not updated,
hence when SCTP association probing is enabled in the kernel,
any simple SCTP client/server program from userspace will panic
the kernel.
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a flag to each mdb entry, so that we can distinguish
permanent entries with temporary entries.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently I posted commit 3c68198e75 which made selection of the cookie hmac
algorithm selectable. This is all well and good, but Linus noted that it
changes the default config:
http://marc.info/?l=linux-netdev&m=135536629004808&w=2
I've modified the sctp Kconfig file to reflect the recommended way of making
this choice, using the thermal driver example specified, and brought the
defaults back into line with the way they were prior to my origional patch
Also, on Linus' suggestion, re-adding ability to select default 'none' hmac
algorithm, so we don't needlessly bloat the kernel by forcing a non-none
default. This also led me to note that we won't honor the default none
condition properly because of how sctp_net_init is encoded. Fix that up as
well.
Tested by myself (allbeit fairly quickly). All configuration combinations seems
to work soundly.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: David Miller <davem@davemloft.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: linux-sctp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Silence the unnecessary warning "unhandled error (111) connecting to..."
and convert it to a dprintk for debugging purposes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Andy Lutomirski <luto@amacapital.net> found a nasty little bug in
the permissions of setns. With unprivileged user namespaces it
became possible to create new namespaces without privilege.
However the setns calls were relaxed to only require CAP_SYS_ADMIN in
the user nameapce of the targed namespace.
Which made the following nasty sequence possible.
pid = clone(CLONE_NEWUSER | CLONE_NEWNS);
if (pid == 0) { /* child */
system("mount --bind /home/me/passwd /etc/passwd");
}
else if (pid != 0) { /* parent */
char path[PATH_MAX];
snprintf(path, sizeof(path), "/proc/%u/ns/mnt");
fd = open(path, O_RDONLY);
setns(fd, 0);
system("su -");
}
Prevent this possibility by requiring CAP_SYS_ADMIN
in the current user namespace when joing all but the user namespace.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
If in either of the above functions inet_csk_route_child_sock() or
__inet_inherit_port() fails, the newsk will not be freed:
unreferenced object 0xffff88022e8a92c0 (size 1592):
comm "softirq", pid 0, jiffies 4294946244 (age 726.160s)
hex dump (first 32 bytes):
0a 01 01 01 0a 01 01 02 00 00 00 00 a7 cc 16 00 ................
02 00 03 01 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff8153d190>] kmemleak_alloc+0x21/0x3e
[<ffffffff810ab3e7>] kmem_cache_alloc+0xb5/0xc5
[<ffffffff8149b65b>] sk_prot_alloc.isra.53+0x2b/0xcd
[<ffffffff8149b784>] sk_clone_lock+0x16/0x21e
[<ffffffff814d711a>] inet_csk_clone_lock+0x10/0x7b
[<ffffffff814ebbc3>] tcp_create_openreq_child+0x21/0x481
[<ffffffff814e8fa5>] tcp_v4_syn_recv_sock+0x3a/0x23b
[<ffffffff814ec5ba>] tcp_check_req+0x29f/0x416
[<ffffffff814e8e10>] tcp_v4_do_rcv+0x161/0x2bc
[<ffffffff814eb917>] tcp_v4_rcv+0x6c9/0x701
[<ffffffff814cea9f>] ip_local_deliver_finish+0x70/0xc4
[<ffffffff814cec20>] ip_local_deliver+0x4e/0x7f
[<ffffffff814ce9f8>] ip_rcv_finish+0x1fc/0x233
[<ffffffff814cee68>] ip_rcv+0x217/0x267
[<ffffffff814a7bbe>] __netif_receive_skb+0x49e/0x553
[<ffffffff814a7cc3>] netif_receive_skb+0x50/0x82
This happens, because sk_clone_lock initializes sk_refcnt to 2, and thus
a single sock_put() is not enough to free the memory. Additionally, things
like xfrm, memcg, cookie_values,... may have been initialized.
We have to free them properly.
This is fixed by forcing a call to tcp_done(), ending up in
inet_csk_destroy_sock, doing the final sock_put(). tcp_done() is necessary,
because it ends up doing all the cleanup on xfrm, memcg, cookie_values,
xfrm,...
Before calling tcp_done, we have to set the socket to SOCK_DEAD, to
force it entering inet_csk_destroy_sock. To avoid the warning in
inet_csk_destroy_sock, inet_num has to be set to 0.
As inet_csk_destroy_sock does a dec on orphan_count, we first have to
increase it.
Calling tcp_done() allows us to remove the calls to
tcp_clear_xmit_timer() and tcp_cleanup_congestion_control().
A similar approach is taken for dccp by calling dccp_done().
This is in the kernel since 093d282321 (tproxy: fix hash locking issue
when using port redirection in __inet_inherit_port()), thus since
version >= 2.6.37.
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
In function ndisc_redirect_rcv(), the skb->data points to the transport
header, but function icmpv6_notify() need the skb->data points to the
inner IP packet. So before using icmpv6_notify() to propagate redirect,
change skb->data to point the inner IP packet that triggered the sending
of the Redirect, and introduce struct rd_msg to make it easy.
Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mutex_destroy() must be called before wpan_phy_free(), because it puts the last
reference and frees memory. Catched as overwritten poison in kmalloc-2048.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: linux-zigbee-devel@lists.sourceforge.net
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
As suggested by Stephen Hemminger, this remove the temporary variable
introduced in commit eca2a43bb0
("bridge: fix icmpv6 endian bug and other sparse warnings")
Signed-off-by: Ang Way Chuang <wcang@sfc.wide.ad.jp>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"A pile of fixes in response to yesterday's big merge. The SCTP HMAC
thing hasn't been addressed yet, I'll take care of that myself if Neil
and Vlad don't show signs of life by tomorrow.
1) Use after free of SKB in tuntap code. Fix by Eric Dumazet,
reported by Dave Jones.
2) NFC LLCP code emits annoying kernel log message, triggerable by
the user. From Dave Jones.
3) Fix several endianness bugs noticed by sparse in the bridging
code, from Stephen Hemminger.
4) Ipv6 NDISC code doesn't take padding into account properly, fix
from YOSHIFUJI Hideaki.
5) Add missing docs to ethtool_flow_ext struct, from Yan Burman."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
bridge: fix icmpv6 endian bug and other sparse warnings
net: ethool: Document struct ethtool_flow_ext
ndisc: Fix padding error in link-layer address option.
tuntap: dont use skb after netif_rx_ni(skb)
nfc: remove noisy message from llcp_sock_sendmsg
Pull HID subsystem updates from Jiri Kosina:
1) Support for HID over I2C bus has been added by Benjamin Tissoires.
ACPI device discovery is still in the works.
2) Support for Win8 Multitiouch protocol is being added, most work done
by Benjamin Tissoires as well
3) EIO/ERESTARTSYS is fixed in hiddev/hidraw, fixes by Andrew Duggan
and Jiri Kosina
4) ION iCade driver added by Bastien Nocera
5) Support for a couple new Roccat devices has been added by Stefan
Achatz
6) HID sensor hubs are now auto-detected instead of having to list all
the VID/PID combinations in the blacklist array
7) other random fixes and support for new device IDs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (65 commits)
HID: i2c-hid: add mutex protecting open/close race
Revert "HID: sensors: add to special driver list"
HID: sensors: autodetect USB HID sensor hubs
HID: hidp: fallback to input session properly if hid is blacklisted
HID: i2c-hid: fix ret_count check
HID: i2c-hid: fix i2c_hid_get_raw_report count mismatches
HID: i2c-hid: remove extra .irq field in struct i2c_hid
HID: i2c-hid: reorder allocation/free of buffers
HID: i2c-hid: fix memory corruption due to missing hid declaration
HID: i2c-hid: remove superfluous include
HID: i2c-hid: remove unneeded test in i2c_hid_remove
HID: i2c-hid: i2c_hid_get_report may fail
HID: i2c-hid: also call i2c_hid_free_buffers in i2c_hid_remove
HID: i2c-hid: fix error messages
HID: i2c-hid: fix return paths
HID: i2c-hid: remove unused static declarations
HID: i2c-hid: fix i2c_hid_dbg macro
HID: i2c-hid: fix checkpatch.pl warning
HID: i2c-hid: enhance Kconfig
HID: i2c-hid: change I2C name
...