This makes "iw wlan0 dump survey" work again with
mac80211-based drivers that support it, e.g. ath5k.
Signed-off-by: Holger Schurig <holgerschurig@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ipmr_rules_exit() and ip6mr_rules_exit() free a list of items, but
forget to properly remove these items from list. List head is not
changed and still points to freed memory.
This can trigger a fault later when icmpv6_sk_exit() is called.
Fix is to either reinit list, or use list_del() to properly remove items
from list before freeing them.
bugzilla report : https://bugzilla.kernel.org/show_bug.cgi?id=16120
Introduced by commit d1db275dd3 (ipv6: ip6mr: support multiple
tables) and commit f0ad0860d0 (ipv4: ipmr: support multiple tables)
Reported-by: Alex Zhavnerchik <alex.vizor@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
With mtu=9000, mld_newpack() use order-2 GFP_ATOMIC allocations, that
are very unreliable, on machines where PAGE_SIZE=4K
Limit allocated skbs to be at most one page. (order-0 allocations)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Its better to make a route lookup in appropriate namespace.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I believe a moderate SYN flood attack can corrupt RFS flow table
(rps_sock_flow_table), making RPS/RFS much less effective.
Even in a normal situation, server handling short lived sessions suffer
from bad steering for the first data packet of a session, if another SYN
packet is received for another session.
We do following action in tcp_v4_rcv() :
sock_rps_save_rxhash(sk, skb->rxhash);
We should _not_ do this if sk is a LISTEN socket, as about each
packet received on a LISTEN socket has a different rxhash than
previous one.
-> RPS_NO_CPU markers are spread all over rps_sock_flow_table.
Also, it makes sense to protect sk->rxhash field changes with socket
lock (We currently can change it even if user thread owns the lock
and might use rxhash)
This patch moves sock_rps_save_rxhash() to a sock locked section,
and only for non LISTEN sockets.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syncookies default to on since
e994b7c901
(tcp: Don't make syn cookies initial setting depend on CONFIG_SYSCTL).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
xfrm triggers a warning if dst_pop() drops a refcount
on a noref dst. This patch changes dst_pop() to
skb_dst_pop(). skb_dst_pop() drops the refcnt only
on a refcounted dst. Also we don't clone the child
dst_entry, so it is not refcounted and we can use
skb_dst_set_noref() in xfrm_output_one().
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Processing an association response could take a bit
of time while we set up the hardware etc. During that
time, the AP might already send a blockack request.
If this happens very quickly on a fairly slow machine,
we can end up processing the blockack request before
the association processing has finished. Since the
blockack processing cannot sleep right now, we also
cannot make it wait in the driver.
As a result, sometimes on slow machines the iwlagn
driver gets totally confused, and no traffic can pass
when the aggregation setup was done before the assoc
setup completed.
I'm working on a proper fix for this, which involves
queuing all blockack category action frames from a
work struct, and also allowing the ampdu_action driver
callback to sleep, which will generally clean up the
code and make things easier.
However, this is a very involved and complex change.
To fix the problem at hand in a way that can also be
backported to stable, I've come up with this patch.
Here, I simply process all aggregation action frames
from the managed interface skb queue, which means
their processing will be serialized with processing
the association response, thereby fixing the problem.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
access skb->data safely
we should use skb_header_pointer() and skb_store_bits() to access skb->data to
handle small or non-linear skbs.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
net/sched/act_pedit.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
use skb_header_pointer() to dereference data safely
the original skb->data dereference isn't safe, as there isn't any skb->len or
skb_is_nonlinear() check. skb_header_pointer() is used instead in this patch.
And when the skb isn't long enough, we terminate the function u32_classify()
immediately with -1.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For large values of rtt, 2^rho operation may overflow u32. Clamp down the increment to 2^16.
Signed-off-by: Daniele Lacamera <root@danielinux.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
fix the wrong checksum when addr isn't in old_addr/mask
For TCP and UDP packets, when addr isn't in old_addr/mask we don't do SNAT or
DNAT, and we should not update layer 4 checksum.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
net/sched/act_nat.c | 4 ++++
1 file changed, 4 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
If a skb is received on an inactive bond that does not meet
the special cases checked for by skb_bond_should_drop it should
only be delivered to exact matches as the comment in
netif_receive_skb() says.
However because null_or_bond could also be null this is not
always true. This patch renames null_or_bond to orig_or_bond
and initializes it to orig_dev. This keeps the intent of
null_or_bond to pass frames received on VLAN interfaces stacked
on bonding interfaces without invalidating the statement for
null_or_orig.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The vlan device should not copy the slave or master flags from
the real device. It is not in the bond until added nor is it
a master.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets going through __xfrm_route_forward() have a not refcounted dst
entry, since we enabled a noref forwarding path.
xfrm_lookup() might incorrectly release this dst entry.
It's a bit late to make invasive changes in xfrm_lookup(), so lets force
a refcount in this path.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dialog token allocator has apparently been broken
since b83f4e15 ("mac80211: fix deadlock in sta->lock")
because it got moved out under the spinlock. Fix it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Daniel reported that the paged RX changes had
broken blockack request frame processing due
to using data that wasn't really part of the
skb data.
Fix this using skb_copy_bits() for the needed
data. As a side effect, this adds a check on
processing too short frames, which previously
this code could do.
Reported-by: Daniel Halperin <dhalperi@cs.washington.edu>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Daniel Halperin <dhalperi@cs.washington.edu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Commit: c720c7e838 missed these.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Correct sk_forward_alloc handling for error_queue would need to use a
backlog of frames that softirq handler could not deliver because socket
is owned by user thread. Or extend backlog processing to be able to
process normal and error packets.
Another possibility is to not use mem charge for error queue, this is
what I implemented in this patch.
Note: this reverts commit 29030374
(net: fix sk_forward_alloc corruptions), since we dont need to lock
socket anymore.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit f3c5c1bfd4 (netfilter: xtables: make ip_tables reentrant)
introduced a performance regression, because stackptr array is shared by
all cpus, adding cache line ping pongs. (16 cpus share a 64 bytes cache
line)
Fix this using alloc_percpu()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-By: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
In xt_register_table, xt_jumpstack_alloc is called first, later
xt_replace_table is used. But in xt_replace_table, xt_jumpstack_alloc
will be used again. Then the memory allocated by previous xt_jumpstack_alloc
will be leaked. We can simply remove the previous xt_jumpstack_alloc because
there aren't any users of newinfo between xt_jumpstack_alloc and
xt_replace_table.
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jan Engelhardt <jengelh@medozas.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Acked-By: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
As David found out, sock_queue_err_skb() should be called with socket
lock hold, or we risk sk_forward_alloc corruption, since we use non
atomic operations to update this field.
This patch adds bh_lock_sock()/bh_unlock_sock() pair to three spots.
(BH already disabled)
1) skb_tstamp_tx()
2) Before calling ip_icmp_error(), in __udp4_lib_err()
3) Before calling ipv6_icmp_error(), in __udp6_lib_err()
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The accept()'d socket need to be unhashed while the (listen()'ing)
socket lock is held. This fixes a race condition that could lead to an
OOPS.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was an spin_unlock missing on the error path. The spin_lock was
tucked in with the declarations so it was hard to spot. I added a new
line.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Sjur Brændeland <sjurbren@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a mutex_unlock missing on the error path. In each case, whenever the
label out is reached from elsewhere in the function, mutex is not locked.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E1;
@@
* mutex_lock(E1);
<+... when != E1
if (...) {
... when != E1
* return ...;
}
...+>
* mutex_unlock(E1);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Reviewed-by: Zach Brown <zach.brown@oracle.com>
Acked-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit f4f914b5 (net: ipv6 bind to device issue) caused
a regression with Mobile IPv6 when it changed the meaning
of fl->oif to become a strict requirement of the route
lookup. Instead, only force strict mode when
sk->sk_bound_dev_if is set on the calling socket, getting
the intended behavior and fixing the regression.
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sparse correctly complains that
__ieee80211_get_channel_mode is not static.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits)
netlink: bug fix: wrong size was calculated for vfinfo list blob
netlink: bug fix: don't overrun skbs on vf_port dump
xt_tee: use skb_dst_drop()
netdev/fec: fix ifconfig eth0 down hang issue
cnic: Fix context memory init. on 5709.
drivers/net: Eliminate a NULL pointer dereference
drivers/net/hamradio: Eliminate a NULL pointer dereference
be2net: Patch removes redundant while statement in loop.
ipv6: Add GSO support on forwarding path
net: fix __neigh_event_send()
vhost: fix the memory leak which will happen when memory_access_ok fails
vhost-net: fix to check the return value of copy_to/from_user() correctly
vhost: fix to check the return value of copy_to/from_user() correctly
vhost: Fix host panic if ioctl called with wrong index
net: fix lock_sock_bh/unlock_sock_bh
net/iucv: Add missing spin_unlock
net: ll_temac: fix checksum offload logic
net: ll_temac: fix interrupt bug when interrupt 0 is used
sctp: dubious bitfields in sctp_transport
ipmr: off by one in __ipmr_fill_mroute()
...
The wrong size was being calculated for vfinfo. In one case, it was over-
calculating using nlmsg_total_size on attrs, in another case, it was
under-calculating by assuming ifla_vf_* structs are packed together, but
each struct is it's own attr w/ hdr (and padding).
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Noticed by Patrick McHardy: was continuing to fill skb after a
nla_put_failure, ignoring the size calculated by upper layer. Now,
return -EMSGSIZE on any overruns, but also allow netdev to
fail ndo_get_vf_port with error other than -EMSGSIZE, thus unwinding
nest.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 7fee226a (net: add a noref bit on skb dst), its wrong to
use : dst_release(skb_dst(skb)), since we could decrement a refcount
while skb dst was not refcounted.
We should use skb_dst_drop(skb) instead.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we disallow GSO packets on the IPv6 forward path.
This patch fixes this.
Note that I discovered that our existing GSO MTU checks (e.g.,
IPv4 forwarding) are buggy in that they skip the check altogether,
when they really should be checking gso_size + header instead.
I have also been lazy here in that I haven't bothered to segment
the GSO packet by hand before generating an ICMP message. Someone
should add that to be 100% correct.
Reported-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 7fee226ad2 (net: add a noref bit on skb dst) missed one spot
where an skb is enqueued, with a possibly not refcounted dst entry.
__neigh_event_send() inserts skb into arp_queue, so we must make sure
dst entry is refcounted, or dst entry can be freed by garbage collector
after caller exits from rcu protected section.
Reported-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (61 commits)
tracing: Add __used annotation to event variable
perf, trace: Fix !x86 build bug
perf report: Support multiple events on the TUI
perf annotate: Fix up usage of the build id cache
x86/mmiotrace: Remove redundant instruction prefix checks
perf annotate: Add TUI interface
perf tui: Remove annotate from popup menu after failure
perf report: Don't start the TUI if -D is used
perf: Fix getline undeclared
perf: Optimize perf_tp_event_match()
perf: Remove more code from the fastpath
perf: Optimize the !vmalloc backed buffer
perf: Optimize perf_output_copy()
perf: Fix wakeup storm for RO mmap()s
perf-record: Share per-cpu buffers
perf-record: Remove -M
perf: Ensure that IOC_OUTPUT isn't used to create multi-writer buffers
perf, trace: Optimize tracepoints by using per-tracepoint-per-cpu hlist to track events
perf, trace: Optimize tracepoints by removing IRQ-disable from perf/tracepoint interaction
perf tui: Allow disabling the TUI on a per command basis in ~/.perfconfig
...
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Fix another nfs_wb_page() deadlock
NFS: Ensure that we mark the inode as dirty if we exit early from commit
NFS: Fix a lock imbalance typo in nfs_access_cache_shrinker
sunrpc: fix leak on error on socket xprt setup
By the previous modification, the cpu notifier can return encapsulate
errno value. This converts the cpu notifiers for iucv.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This new sock lock primitive was introduced to speedup some user context
socket manipulation. But it is unsafe to protect two threads, one using
regular lock_sock/release_sock, one using lock_sock_bh/unlock_sock_bh
This patch changes lock_sock_bh to be careful against 'owned' state.
If owned is found to be set, we must take the slow path.
lock_sock_bh() now returns a boolean to say if the slow path was taken,
and this boolean is used at unlock_sock_bh time to call the appropriate
unlock function.
After this change, BH are either disabled or enabled during the
lock_sock_bh/unlock_sock_bh protected section. This might be misleading,
so we rename these functions to lock_sock_fast()/unlock_sock_fast().
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a spin_unlock missing on the error path. There seems like no reason
why the lock should continue to be held if the kzalloc fail.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E1;
@@
* spin_lock(E1,...);
<+... when != E1
if (...) {
... when != E1
* return ...;
}
...+>
* spin_unlock(E1,...);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also collect exit code together while we're at it.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This fixes a smatch warning:
net/ipv4/ipmr.c +1917 __ipmr_fill_mroute(12) error: buffer overflow
'(mrt)->vif_table' 32 <= 32
The ipv6 version had the same issue.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (63 commits)
drivers/net/usb/asix.c: Fix pointer cast.
be2net: Bug fix to avoid disabling bottom half during firmware upgrade.
proc_dointvec: write a single value
hso: add support for new products
Phonet: fix potential use-after-free in pep_sock_close()
ath9k: remove VEOL support for ad-hoc
ath9k: change beacon allocation to prefer the first beacon slot
sock.h: fix kernel-doc warning
cls_cgroup: Fix build error when built-in
macvlan: do proper cleanup in macvlan_common_newlink() V2
be2net: Bug fix in init code in probe
net/dccp: expansion of error code size
ath9k: Fix rx of mcast/bcast frames in PS mode with auto sleep
wireless: fix sta_info.h kernel-doc warnings
wireless: fix mac80211.h kernel-doc warnings
iwlwifi: testing the wrong variable in iwl_add_bssid_station()
ath9k_htc: rare leak in ath9k_hif_usb_alloc_tx_urbs()
ath9k_htc: dereferencing before check in hif_usb_tx_cb()
rt2x00: Fix rt2800usb TX descriptor writing.
rt2x00: Fix failed SLEEP->AWAKE and AWAKE->SLEEP transitions.
...
sk_common_release() might destroy our last reference to the socket.
So an extra temporary reference is needed during cleanup.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gcc-4.3.3 produces the warning:
"format not a string literal and no format arguments"
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Tom Talpey <tmtalpey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not
USHORT_MAX/SHORT_MAX/SHORT_MIN.
- Make SHRT_MIN of type s16, not int, for consistency.
[akpm@linux-foundation.org: fix drivers/dma/timb_dma.c]
[akpm@linux-foundation.org: fix security/keys/keyring.c]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Because MIPS's EDQUOT value is 1133(0x46d).
It's larger than u8.
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sta_info.h kernel-doc warnings:
Warning(net/mac80211/sta_info.h:164): No description found for parameter 'tid_active_rx[STA_TID_NUM]'
Warning(net/mac80211/sta_info.h:164): Excess struct/union/enum/typedef member 'tid_state_rx' description in 'sta_ampdu_mlme'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This reverts commit 03ceedea97.
This patch was reported to cause a regression in which connectivity is
lost and cannot be reestablished after a suspend/resume cycle.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* 'bkl/ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
uml: Pushdown the bkl from harddog_kern ioctl
sunrpc: Pushdown the bkl from sunrpc cache ioctl
sunrpc: Pushdown the bkl from ioctl
autofs4: Pushdown the bkl from ioctl
uml: Convert to unlocked_ioctls to remove implicit BKL
ncpfs: BKL ioctl pushdown
coda: Clean-up whitespace problems in pioctl.c
coda: BKL ioctl pushdown
drivers: Push down BKL into various drivers
isdn: Push down BKL into ioctl functions
scsi: Push down BKL into ioctl functions
dvb: Push down BKL into ioctl functions
smbfs: Push down BKL into ioctl function
coda/psdev: Remove BKL from ioctl function
um/mmapper: Remove BKL usage
sn_hwperf: Kill BKL usage
hfsplus: Push down BKL into ioctl function
This reverts commit 03ceedea97, since it
breaks resume from suspend-to-ram on Rafael's Acer Ferrari One.
NetworkManager thinks everything is ok, but it can't connect to the AP
to get an IP address after the resume.
In fact, it even breaks resume for non-ath9k chipsets: reverting it also
fixes Rafael's Toshiba Protege R500 with the iwlagn driver. As Johannes
says:
"Indeed, this patch needs to be reverted. That mac80211 change is wrong
and completely unnecessary."
Reported-and-requested-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Daniel Yingqiang Ma <yma.cool@gmail.com>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch makes tun update its socket classid every time we
inject a packet into the network stack. This is so that any
updates made by the admin to the process writing packets to
tun is effected.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Up until now cls_cgroup has relied on fetching the classid out of
the current executing thread. This runs into trouble when a packet
processing is delayed in which case it may execute out of another
thread's context.
Furthermore, even when a packet is not delayed we may fail to
classify it if soft IRQs have been disabled, because this scenario
is indistinguishable from one where a packet unrelated to the
current thread is processed by a real soft IRQ.
In fact, the current semantics is inherently broken, as a single
skb may be constructed out of the writes of two different tasks.
A different manifestation of this problem is when the TCP stack
transmits in response of an incoming ACK. This is currently
unclassified.
As we already have a concept of packet ownership for accounting
purposes in the skb->sk pointer, this is a natural place to store
the classid in a persistent manner.
This patch adds the cls_cgroup classid in struct sock, filling up
an existing hole on 64-bit :)
The value is set at socket creation time. So all sockets created
via socket(2) automatically gains the ID of the thread creating it.
Whenever another process touches the socket by either reading or
writing to it, we will change the socket classid to that of the
process if it has a valid (non-zero) classid.
For sockets created on inbound connections through accept(2), we
inherit the classid of the original listening socket through
sk_clone, possibly preceding the actual accept(2) call.
In order to minimise risks, I have not made this the authoritative
classid. For now it is only used as a backup when we execute
with soft IRQs disabled. Once we're completely happy with its
semantics we can use it as the sole classid.
Footnote: I have rearranged the error path on cls_group module
creation. If we didn't do this, then there is a window where
someone could create a tc rule using cls_group before the cgroup
subsystem has been registered.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed handling when skb don't fit in user buffer,
instead of returning -EMSGSIZE, the buffer is truncated (just
as unix seqpakcet does).
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Splint found missing spin_unlock.
Corrected this an some other trivial split warnings.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Discovered bug when testing async connect.
While connecting poll should not return POLLHUP,
but POLLOUT when connected.
Also fixed the sysfs flow-control-counters.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Discovered bugs when injecting slab allocation failures.
Add checks on all memory allocation.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Discovered bug when running high number of parallel connect requests.
Replace buggy home brewed list with linux/list.h.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Discovered bug when testing on 64bit architecture.
Fixed by using long to store result from wait_event_interruptible_timeout.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the commit:
commit d90310243f
Author: Octavian Purdila <opurdila@ixiacom.com>
Date: Wed Nov 18 02:36:59 2009 +0000
net: device name allocation cleanups
introduced a bug when there is a hash collision making impossible
to rename a device with eth%d. This bug is very hard to reproduce
and appears rarely.
The problem is coming from we don't pass a temporary buffer to
__dev_alloc_name but 'dev->name' which is modified by the function.
A detailed explanation is here:
http://marc.info/?l=linux-netdev&m=127417784011987&w=2
Changelog:
V2 : replaced strings comparison by pointers comparison
Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Reviewed-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit c02db8c6290bb992442fec1407643c94cc414375:
Author: Chris Wright <chrisw@sous-sol.org>
Date: Sun May 16 01:05:45 2010 -0700
Subject: rtnetlink: make SR-IOV VF interface symmetric
adds broken error handling to do_setlink() in net/core/rtnetlink.c. The
problem is the following chunk of code:
if (tb[IFLA_VFINFO_LIST]) {
struct nlattr *attr;
int rem;
nla_for_each_nested(attr, tb[IFLA_VFINFO_LIST], rem) {
if (nla_type(attr) != IFLA_VF_INFO)
----> goto errout;
err = do_setvfinfo(dev, attr);
if (err < 0)
goto errout;
modified = 1;
}
}
which can get to errout without setting err, resulting in the following error:
net/core/rtnetlink.c: In function 'do_setlink':
net/core/rtnetlink.c:904: warning: 'err' may be used uninitialized in this function
Change the code to return -EINVAL in this case. Note that this might not be
the appropriate error though.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Chris Wright <chrisw@sous-sol.org>
cc: David S. Miller <davem@davemloft.net>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is already a submenu entry that is always displayed, so there is
no need to also show a dedicated CAIF comment.
Drop dead commented code while we're here, and change the submenu text
to better match the style everyone else is using.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Pfaff reported a kernel oops and provided a test program to
reproduce it.
https://kerneltrap.org/mailarchive/linux-netdev/2010/5/21/6277805
tc_fill_qdisc() should not be called for builtin qdisc, or it
dereference a NULL pointer to get device ifindex.
Fix is to always use tc_qdisc_dump_ignore() before calling
tc_fill_qdisc().
Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pushdown the bkl to cache_ioctl_pipefs.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Nfs <linux-nfs@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Pushdown the bkl to rpc_pipe_ioctl.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Nfs <linux-nfs@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
* 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (27 commits)
drivers/char: Eliminate use after free
virtio: console: Accept console size along with resize control message
virtio: console: Store each console's size in the console structure
virtio: console: Resize console port 0 on config intr only if multiport is off
virtio: console: Add support for nonblocking write()s
virtio: console: Rename wait_is_over() to will_read_block()
virtio: console: Don't always create a port 0 if using multiport
virtio: console: Use a control message to add ports
virtio: console: Move code around for future patches
virtio: console: Remove config work handler
virtio: console: Don't call hvc_remove() on unplugging console ports
virtio: console: Return -EPIPE to hvc_console if we lost the connection
virtio: console: Let host know of port or device add failures
virtio: console: Add a __send_control_msg() that can send messages without a valid port
virtio: Revert "virtio: disable multiport console support."
virtio: add_buf_gfp
trans_virtio: use virtqueue_xxx wrappers
virtio-rng: use virtqueue_xxx wrappers
virtio_ring: remove a level of indirection
virtio_net: use virtqueue_xxx wrappers
...
Fix up conflicts in drivers/net/virtio_net.c due to new virtqueue_xxx
wrappers changes conflicting with some other cleanups.
I made a V2 of this patch on top of my patches for VFS switches.
All the changes were due to change in some offsets.
rename - change name of file or directory
size[4] Trename tag[2] fid[4] newdirfid[4] name[s]
size[4] Rrename tag[2]
The rename message is used to change the name of a file, possibly moving it
to a new directory. The 9P wstat message can only rename a file within the
same directory.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
I made a V2 of this patch on top of my patches for VFS switches. The
change was adding v9fs_statfs pointer to v9fs_super_ops_dotl
instead of v9fs_super_ops.
statfs - get file system statistics
size[4] Tstatfs tag[2] fid[4]
size[4] Rstatfs tag[2] type[4] bsize[4] blocks[8] bfree[8] bavail[8]
files[8] ffree[8] fsid[8] namelen[4]
The statfs message is used to request file system information returned
by the statfs(2) system call, which is used by df(1) to report file
system and disk space usage.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
This patch adds F_GETPIPE_SZ and F_SETPIPE_SZ fcntl() actions for
growing and shrinking the size of a pipe and adjusts pipe.c and splice.c
(and relay and network splice) usage to work with these larger (or smaller)
pipes.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Specifying a valid channel type will get
goto out rather than continuing, due to
missing braces. This affects both remain
on channel and action frame TX commands.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Since wdev can be NULL, check it before dereferencing it
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
signal_type is enum cfg80211_signal_type.
This fixes the gcc warning:
scan.c: In function `cfg80211_inform_bss':
scan.c:518:6: warning: comparison between `enum cfg80211_signal_type' and `enum nl80211_bss'
scan.c: In function `cfg80211_inform_bss_frame':
scan.c:574:6: warning: comparison between `enum cfg80211_signal_type' and `enum nl80211_bss'
Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This reverts commit aaf8cdc34d.
Drivers like the ipw2100 call device_create_group when they
are initialized and device_remove_group when they are shutdown.
Moving them between namespaces deletes their sysfs groups early.
In particular the following call chain results.
netdev_unregister_kobject -> device_del -> kobject_del -> sysfs_remove_dir
With sysfs_remove_dir recursively deleting all of it's subdirectories,
and nothing adding them back.
Ouch!
Therefore we need to call something that ultimate calls sysfs_mv_dir
as that sysfs function can move sysfs directories between namespaces
without deleting their subdirectories or their contents. Allowing
us to avoid placing extra boiler plate into every driver that does
something interesting with sysfs.
Currently the function that provides that capability is device_rename.
That is the code works without nasty side effects as originally written.
So remove the misguided fix for moving devices between namespaces. The
bug in the kobject layer that inspired it has now been recognized and
fixed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When netlink sockets are used to convey data that is in a namespace
we need a way to select a subset of the listening sockets to deliver
the packet to. For the network namespace we have been doing this
by only transmitting packets in the correct network namespace.
For data belonging to other namespaces netlink_bradcast_filtered
provides a mechanism that allows us to examine the destination
socket and to decide if we should transmit the specified packet
to it.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I had a couple of stupid bugs in:
netns: Teach network device kobjects which namespace they are in.
- I duplicated the Kconfig for the NET_NS
- The build was broken when sysfs was not compiled in
The sysfs breakage is because after I moved the operations
for the sysfs to the kobject layer, to make things cleaner
I forgot to move the ifdefs. Opps.
I'm not quite certain how I got introduced a second NET_NS Kconfig,
but it was probably a 3 way merge somewhere along the way that
did not notice that the NET_NS Kconfig option had mvoed and thout
that was a bug. It probably slipped in because it used to be the
sysfs patches were the first patches in my network namespace patches.
Some things just don't go like you would expect.
Neither of these bugs actually affect anything in the common case
but they should be fixed.
Thanks to Serge for noticing they were present.
Reported-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Acked-by: David S. Miller <davem@davemloft.net>
The problem. Network devices show up in sysfs and with the network
namespace active multiple devices with the same name can show up in
the same directory, ouch!
To avoid that problem and allow existing applications in network namespaces
to see the same interface that is currently presented in sysfs, this
patch enables the tagging directory support in sysfs.
By using the network namespace pointers as tags to separate out the
the sysfs directory entries we ensure that we don't have conflicts
in the directories and applications only see a limited set of
the network devices.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This allows bin_attr->read,write,mmap callbacks to check file specific data
(such as inode owner) as part of any privilege validation.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix some issues introduced in batch skb dequeuing for input_pkt_queue.
The primary issue it that the queue head must be incremented only
after a packet has been processed, that is only after
__netif_receive_skb has been called. This is needed for the mechanism
to prevent OOO packet in RFS. Also when flushing the input_pkt_queue
and process_queue, the process queue should be done first to prevent
OOO packets.
Because the input_pkt_queue has been effectively split into two queues,
the calculation of the tail ptr is no longer correct. The correct value
would be head+input_pkt_queue->len+process_queue->len. To avoid
this calculation we added an explict input_queue_tail in softnet_data.
The tail value is simply incremented when queuing to input_pkt_queue.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When GRO produces fraglist entries, and the resulting skb hits
an interface that is incapable of TSO but capable of FRAGLIST,
we end up producing a bogus packet with gso_size non-zero.
This was reported in the field with older versions of KVM that
did not set the TSO bits on tuntap.
This patch fixes that.
Reported-by: Igor Zhang <yugzhang@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1674 commits)
qlcnic: adding co maintainer
ixgbe: add support for active DA cables
ixgbe: dcb, do not tag tc_prio_control frames
ixgbe: fix ixgbe_tx_is_paused logic
ixgbe: always enable vlan strip/insert when DCB is enabled
ixgbe: remove some redundant code in setting FCoE FIP filter
ixgbe: fix wrong offset to fc_frame_header in ixgbe_fcoe_ddp
ixgbe: fix header len when unsplit packet overflows to data buffer
ipv6: Never schedule DAD timer on dead address
ipv6: Use POSTDAD state
ipv6: Use state_lock to protect ifa state
ipv6: Replace inet6_ifaddr->dead with state
cxgb4: notify upper drivers if the device is already up when they load
cxgb4: keep interrupts available when the ports are brought down
cxgb4: fix initial addition of MAC address
cnic: Return SPQ credit to bnx2x after ring setup and shutdown.
cnic: Convert cnic_local_flags to atomic ops.
can: Fix SJA1000 command register writes on SMP systems
bridge: fix build for CONFIG_SYSFS disabled
ARCNET: Limit com20020 PCI ID matches for SOHARD cards
...
Fix up various conflicts with pcmcia tree drivers/net/
{pcmcia/3c589_cs.c, wireless/orinoco/orinoco_cs.c and
wireless/orinoco/spectrum_cs.c} and feature removal
(Documentation/feature-removal-schedule.txt).
Also fix a non-content conflict due to pm_qos_requirement getting
renamed in the PM tree (now pm_qos_request) in net/mac80211/scan.c
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (44 commits)
vlynq: make whole Kconfig-menu dependant on architecture
add descriptive comment for TIF_MEMDIE task flag declaration.
EEPROM: max6875: Header file cleanup
EEPROM: 93cx6: Header file cleanup
EEPROM: Header file cleanup
agp: use NULL instead of 0 when pointer is needed
rtc-v3020: make bitfield unsigned
PCI: make bitfield unsigned
jbd2: use NULL instead of 0 when pointer is needed
cciss: fix shadows sparse warning
doc: inode uses a mutex instead of a semaphore.
uml: i386: Avoid redefinition of NR_syscalls
fix "seperate" typos in comments
cocbalt_lcdfb: correct sections
doc: Change urls for sparse
Powerpc: wii: Fix typo in comment
i2o: cleanup some exit paths
Documentation/: it's -> its where appropriate
UML: Fix compiler warning due to missing task_struct declaration
UML: add kernel.h include to signal.c
...
This race was triggered by a 'conntrack -F' command running in parallel
to the insertion of a hash for a new connection. Losing this race led to
a dead conntrack entry effectively blocking traffic for a particular
connection until timeout or flushing the conntrack hashes again.
Now the check for an already dying connection is done inside the lock.
Signed-off-by: Joerg Marx <joerg.marx@secunet.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
* 'for-2.6.35' of git://linux-nfs.org/~bfields/linux: (45 commits)
Revert "nfsd4: distinguish expired from stale stateids"
nfsd: safer initialization order in find_file()
nfs4: minor callback code simplification, comment
NFSD: don't report compiled-out versions as present
nfsd4: implement reclaim_complete
nfsd4: nfsd4_destroy_session must set callback client under the state lock
nfsd4: keep a reference count on client while in use
nfsd4: mark_client_expired
nfsd4: introduce nfs4_client.cl_refcount
nfsd4: refactor expire_client
nfsd4: extend the client_lock to cover cl_lru
nfsd4: use list_move in move_to_confirmed
nfsd4: fold release_session into expire_client
nfsd4: rename sessionid_lock to client_lock
nfsd4: fix bare destroy_session null dereference
nfsd4: use local variable in nfs4svc_encode_compoundres
nfsd: further comment typos
sunrpc: centralise most calls to svc_xprt_received
nfsd4: fix unlikely race in session replay case
nfsd4: fix filehandle comment
...
* 'nfs-for-2.6.35' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (78 commits)
SUNRPC: Don't spam gssd with upcall requests when the kerberos key expired
SUNRPC: Reorder the struct rpc_task fields
SUNRPC: Remove the 'tk_magic' debugging field
SUNRPC: Move the task->tk_bytes_sent and tk_rtt to struct rpc_rqst
NFS: Don't call iput() in nfs_access_cache_shrinker
NFS: Clean up nfs_access_zap_cache()
NFS: Don't run nfs_access_cache_shrinker() when the mask is GFP_NOFS
SUNRPC: Ensure rpcauth_prune_expired() respects the nr_to_scan parameter
SUNRPC: Ensure memory shrinker doesn't waste time in rpcauth_prune_expired()
SUNRPC: Dont run rpcauth_cache_shrinker() when gfp_mask is GFP_NOFS
NFS: Read requests can use GFP_KERNEL.
NFS: Clean up nfs_create_request()
NFS: Don't use GFP_KERNEL in rpcsec_gss downcalls
NFSv4: Don't use GFP_KERNEL allocations in state recovery
SUNRPC: Fix xs_setup_bc_tcp()
SUNRPC: Replace jiffies-based metrics with ktime-based metrics
ktime: introduce ktime_to_ms()
SUNRPC: RPC metrics and RTT estimator should use same RTT value
NFS: Calldata for nfs4_renew_done()
NFS: Squelch compiler warning in nfs_add_server_stats()
...
* 'bkl/procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
sunrpc: Include missing smp_lock.h
procfs: Kill the bkl in ioctl
procfs: Push down the bkl from ioctl
procfs: Use generic_file_llseek in /proc/vmcore
procfs: Use generic_file_llseek in /proc/kmsg
procfs: Use generic_file_llseek in /proc/kcore
procfs: Kill BKL in llseek on proc base
Switch trans_virtio to new virtqueue_xxx wrappers.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This patch ensures that all places that schedule the DAD timer
look at the address state in a safe manner before scheduling the
timer. This ensures that we don't end up with pending timers
after deleting an address.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes use of the new POSTDAD state. This prevents
a race between DAD completion and failure.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>