WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
David S. Miller	77148625e1	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2010-11-29 11:19:09 -08:00
Eric Dumazet	25888e3031	af_unix: limit recursion level Its easy to eat all kernel memory and trigger NMI watchdog, using an exploit program that queues unix sockets on top of others. lkml ref : http://lkml.org/lkml/2010/11/25/8 This mechanism is used in applications, one choice we have is to have a recursion limit. Other limits might be needed as well (if we queue other types of files), since the passfd mechanism is currently limited by socket receive queue sizes only. Add a recursion_level to unix socket, allowing up to 4 levels. Each time we send an unix socket through sendfd mechanism, we copy its recursion level (plus one) to receiver. This recursion level is cleared when socket receive queue is emptied. Reported-by: Марк Коренберг <socketpair@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-29 09:45:15 -08:00
Shan Wei	49b4a6546f	sctp: kill unused macros in head file 1. SCTP_CMD_NUM_VERBS,SCTP_CMD_MAX These two macros have never been used for several years since v2.6.12-rc2. 2.sctp_port_rover,sctp_port_alloc_lock The commit 063930 abandoned global variables of port_rover and port_alloc_lock, but still keep two macros to refer to them. So, remove them now. commit `0639300900` Author: Stephen Hemminger <shemminger@linux-foundation.org> Date: Wed Oct 10 17:30:18 2007 -0700 [SCTP]: port randomization Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-29 09:41:12 -08:00
Timo Teräs	aa285b1740	xfrm: fix gre key endianess fl->fl_gre_key is network byte order contrary to fl->fl_icmp_*. Make xfrm_flowi_{s\|d}port return network byte order values for gre key too. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-28 11:22:17 -08:00
andrew hendry	5595a1a599	X25 remove bkl in subscription ioctls Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-28 11:12:20 -08:00
Shan Wei	5584b8078a	sctp: kill unused macro definition These macros have been existed for several years since v2.6.12-rc2. But they never be used. So remove them now. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-28 10:47:15 -08:00
Thomas Graf	cf7afbfeb8	rtnl: make link af-specific updates atomic As David pointed out correctly, updates to af-specific attributes are currently not atomic. If multiple changes are requested and one of them fails, previous updates may have been applied already leaving the link behind in a undefined state. This patch splits the function parse_link_af() into two functions validate_link_af() and set_link_at(). validate_link_af() is placed to validate_linkmsg() check for errors as early as possible before any changes to the link have been made. set_link_af() is called to commit the changes later. This method is not fail proof, while it is currently sufficient to make set_link_af() inerrable and thus 100% atomic, the validation function method will not be able to detect all error scenarios in the future, there will likely always be errors depending on states which are f.e. not protected by rtnl_mutex and thus may change between validation and setting. Also, instead of silently ignoring unknown address families and config blocks for address families which did not register a set function the errors EAFNOSUPPORT respectively EOPNOSUPPORT are returned to avoid comitting 4 out of 5 update requests without notifying the user. Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-27 22:56:08 -08:00
John W. Linville	51cce8a590	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem	2010-11-24 16:49:20 -05:00
Johannes Berg	c063dbf52b	cfg80211: allow using CQM event to notify packet loss This adds the ability for drivers to use CQM events to notify about packet loss for specific stations (which could be the AP for the managed mode case). Since the threshold might be determined by the driver (it isn't passed in right now) it will be passed out of the driver to userspace in the event. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-24 16:19:36 -05:00
Bruno Randolf	79b1c460a0	cfg80211: Add documentation for antenna ops Signed-off-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-24 16:19:36 -05:00
Felix Fietkau	dd5b4cc71c	cfg80211/mac80211: improve ad-hoc multicast rate handling - store the multicast rate as an index instead of the rate value (reduces cpu overhead in a hotpath) - validate the rate values (must match a bitrate in at least one sband) Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-24 16:19:35 -05:00
John W. Linville	d7a066c923	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2010-11-24 16:19:24 -05:00
John W. Linville	ccb1435401	Revert "nl80211/mac80211: Report signal average" This reverts commit `86107fd170`. This patch inadvertantly changed the userland ABI. Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-24 16:18:36 -05:00
Eric Dumazet	bba14de987	scm: lower SCM_MAX_FD Lower SCM_MAX_FD from 255 to 253 so that allocations for scm_fp_list are halved. (commit `f8d570a4` added two pointers in this structure) scm_fp_dup() should not copy whole structure (and trigger kmemcheck warnings), but only the used part. While we are at it, only allocate needed size. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-24 11:16:43 -08:00
Eric Dumazet	456b61bca8	ipv6: mcast: RCU conversion ipv6_sk_mc_lock rwlock becomes a spinlock. readers (inet6_mc_check()) now takes rcu_read_lock() instead of read lock. Writers dont need to disable BH anymore. struct ipv6_mc_socklist objects are reclaimed after one RCU grace period. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-24 11:16:42 -08:00
Luis R. Rodriguez	b2e253cf30	cfg80211: Fix regulatory bug with multiple cards and delays When two cards are connected with the same regulatory domain if CRDA had a delayed response then cfg80211's own set regulatory domain would still be the world regulatory domain. There was a bug on cfg80211's logic such that it assumed that once you pegged a request as the last request it was already the currently set regulatory domain. This would mean we would race setting a stale regulatory domain to secondary cards which had the same regulatory domain since the alpha2 would match. We fix this by processing each regulatory request atomically, and only move on to the next one once we get it fully processed. In the case CRDA is not present we will simply world roam. This issue is only present when you have a slow system and the CRDA processing is delayed. Because of this it is not a known regression. Without this fix when a delay is present with CRDA the second card would end up with an intersected regulatory domain and not allow it to use the channels it really is designed for. When two cards with two different regulatory domains were inserted you'd end up rejecting the second card's regulatory domain request. This fails with mac80211_hswim's regtest=2 (two requests, same alpha2) and regtest=3 (two requests, different alpha2) module parameter options. This was reproduced and tested against mac80211_hwsim using this CRDA delayer: #!/bin/bash echo $COUNTRY >> /tmp/log sleep 2 /sbin/crda.orig And these regulatory tests: modprobe mac80211_hwsim regtest=2 modprobe mac80211_hwsim regtest=3 Reported-by: Mark Mentovai <mark@moxienet.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Tested-by: Mark Mentovai <mark@moxienet.com> Tested-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-22 15:48:51 -05:00
Jan Engelhardt	20a95a2169	netns: let net_generic take pointer-to-const args This commit is same in nature as v2.6.37-rc1-755-g3654654; the network namespace itself is not modified when calling net_generic, so the parameter can be const. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-21 10:05:10 -08:00
David S. Miller	24912420e9	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bonding/bond_main.c net/core/net-sysfs.c net/ipv6/addrconf.c	2010-11-19 13:13:47 -08:00
David S. Miller	07bfa524d4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2010-11-18 11:56:09 -08:00
Bruno Randolf	86107fd170	nl80211/mac80211: Report signal average Extend nl80211 to report an exponential weighted moving average (EWMA) of the signal value. Since the signal value usually fluctuates between different packets, an average can be more useful than the value of the last packet. This uses the recently added generic EWMA library function. Signed-off-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-18 14:22:20 -05:00
Tetsuo Handa	ef22b7b65f	net: Fix duplicate volatile warning. jiffies is defined as "volatile". extern unsigned long volatile __jiffy_data jiffies; ACCESS_ONCE() uses "volatile". As a result, some compilers warn duplicate `volatile' for ACCESS_ONCE(jiffies). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-18 09:40:04 -08:00
Johannes Berg	4bce22b9b8	mac80211: defines for AC numbers In many places we've just hardcoded the AC numbers -- which is a relic from the original mac80211 (d80211). Add constants for them so we know what we're talking about. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-17 16:19:31 -05:00
Changli Gao	5811662b15	net: use the macros defined for the members of flowi Use the macros defined for the members of flowi to clean the code up. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-17 12:27:45 -08:00
Thomas Graf	f8ff182c71	rtnetlink: Link address family API Each net_device contains address family specific data such as per device settings and statistics. We already expose this data via procfs/sysfs and partially netlink. The netlink method requires the requester to send one RTM_GETLINK request for each address family it wishes to receive data of and then merge this data itself. This patch implements a new API which combines all address family specific link data in a new netlink attribute IFLA_AF_SPEC. IFLA_AF_SPEC contains a sequence of nested attributes, one for each address family which in turn defines the structure of its own attribute. Example: [IFLA_AF_SPEC] = { [AF_INET] = { [IFLA_INET_CONF] = ..., }, [AF_INET6] = { [IFLA_INET6_FLAGS] = ..., [IFLA_INET6_CONF] = ..., } } The API also allows for address families to implement a function which parses the IFLA_AF_SPEC attribute sent by userspace to implement address family specific link options. Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-17 11:28:24 -08:00
Felix Fietkau	8f0729b16a	mac80211: add support for setting the ad-hoc multicast rate Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-16 16:39:08 -05:00
Felix Fietkau	885a46d0f7	cfg80211: add support for setting the ad-hoc multicast rate Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-16 16:39:08 -05:00
Juuso Oikarinen	a619a4c0e1	mac80211: Add function to get probe request template for current AP Chipsets with hardware based connection monitoring need to autonomically send directed probe-request frames to the AP (in the event of beacon loss, for example.) For the hardware to be able to do this, it requires a template for the frame to transmit to the AP, filled in with the BSSID and SSID of the AP, but also the supported rate IE's. This patch adds a function to mac80211, which allows the hardware driver to fetch this template after association, so it can be configured to the hardware. Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-16 16:37:08 -05:00
Bruno Randolf	15d9675321	mac80211: Add antenna configuration Allow antenna configuration by calling driver's function for it. We disallow antenna configuration if the wiphy is already running, mainly to make life easier for 802.11n drivers which need to recalculate HT capabilites. Signed-off-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-16 16:37:05 -05:00
Bruno Randolf	afe0cbf875	cfg80211: Add nl80211 antenna configuration Allow setting of TX and RX antennas configuration via nl80211. The antenna configuration is defined as a bitmap of allowed antennas to use. This API can be used to mask out antennas which are not attached or should not be used for other reasons like regulatory concerns or special setups. Separate bitmaps are used for RX and TX to allow configuring different antennas for receiving and transmitting. Each bitmap is 32 bit long, each bit representing one antenna, starting with antenna 1 at the first bit. If an antenna bit is set, this means the driver is allowed to use this antenna for RX or TX respectively; if the bit is not set the hardware is not allowed to use this antenna. Using bitmaps has the benefit of allowing for a flexible configuration interface which can support many different configurations and which can be used for 802.11n as well as non-802.11n devices. Instead of relying on some hardware specific assumptions, drivers can use this information to know which antennas are actually attached to the system and derive their capabilities based on that. 802.11n devices should enable or disable chains, based on which antennas are present (If all antennas belonging to a particular chain are disabled, the entire chain should be disabled). HT capabilities (like STBC, TX Beamforming, Antenna selection) should be calculated based on the available chains after applying the antenna masks. Should a 802.11n device have diversity antennas attached to one of their chains, diversity can be enabled or disabled based on the antenna information. Non-802.11n drivers can use the antenna masks to select RX and TX antennas and to enable or disable antenna diversity. While covering chainmasks for 802.11n and the standard "legacy" modes "fixed antenna 1", "fixed antenna 2" and "diversity" this API also allows more rare, but useful configurations as follows: 1) Send on antenna 1, receive on antenna 2 (or vice versa). This can be used to have a low gain antenna for TX in order to keep within the regulatory constraints and a high gain antenna for RX in order to receive weaker signals ("speak softly, but listen harder"). This can be useful for building long-shot outdoor links. Another usage of this setup is having a low-noise pre-amplifier on antenna 1 and a power amplifier on the other antenna. This way transmit noise is mostly kept out of the low noise receive channel. (This would be bitmaps: tx 1 rx 2). 2) Another similar setup is: Use RX diversity on both antennas, but always send on antenna 1. Again that would allow us to benefit from a higher gain RX antenna, while staying within the legal limits. (This would be: tx 0 rx 3). 3) And finally there can be special experimental setups in research and development even with pre 802.11n hardware where more than 2 antennas are available. It's good to keep the API simple, yet flexible. Signed-off-by: Bruno Randolf <br1@einfach.org> -- v7: Made bitmasks 32 bit wide and rebased to latest wireless-testing. Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-16 16:37:05 -05:00
Arik Nemtsov	f23a478075	mac80211: support hardware TX fragmentation offload The lower driver is notified when the fragmentation threshold changes and upon a reconfig of the interface. If the driver supports hardware TX fragmentation, don't fragment packets in the stack. Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-16 16:37:04 -05:00
Eric Dumazet	b178bb3dfc	net: reorder struct sock fields Right now, fields in struct sock are not optimally ordered, because each path (RX softirq, TX completion, RX user, TX user) has to touch fields that are contained in many different cache lines. The really critical thing is to shrink number of cache lines that are used at RX softirq time : CPU handling softirqs for a device can receive many frames per second for many sockets. If load is too big, we can drop frames at NIC level. RPS or multiqueue cards can help, but better reduce latency if possible. This patch starts with UDP protocol, then additional patches will try to reduce latencies of other ones as well. At RX softirq time, fields of interest for UDP protocol are : (not counting ones in inet struct for the lookup) Read/Written: sk_refcnt (atomic increment/decrement) sk_rmem_alloc & sk_backlog.len (to check if there is room in queues) sk_receive_queue sk_backlog (if socket locked by user program) sk_rxhash sk_forward_alloc sk_drops Read only: sk_rcvbuf (sk_rcvqueues_full()) sk_filter sk_wq sk_policy[0] sk_flags Additional notes : - sk_backlog has one hole on 64bit arches. We can fill it to save 8 bytes. - sk_backlog is used only if RX sofirq handler finds the socket while locked by user. - sk_rxhash is written only once per flow. - sk_drops is written only if queues are full Final layout : [1] One section grouping all read/write fields, but placing rxhash and sk_backlog at the end of this section. [2] One section grouping all read fields in RX handler (sk_filter, sk_rcv_buf, sk_wq) [3] Section used by other paths I'll post a patch on its own to put sk_refcnt at the end of struct sock_common so that it shares same cache line than section [1] New offsets on 64bit arch : sizeof(struct sock)=0x268 offsetof(struct sock, sk_refcnt) =0x10 offsetof(struct sock, sk_lock) =0x48 offsetof(struct sock, sk_receive_queue)=0x68 offsetof(struct sock, sk_backlog)=0x80 offsetof(struct sock, sk_rmem_alloc)=0x80 offsetof(struct sock, sk_forward_alloc)=0x98 offsetof(struct sock, sk_rxhash)=0x9c offsetof(struct sock, sk_rcvbuf)=0xa4 offsetof(struct sock, sk_drops) =0xa0 offsetof(struct sock, sk_filter)=0xa8 offsetof(struct sock, sk_wq)=0xb0 offsetof(struct sock, sk_policy)=0xd0 offsetof(struct sock, sk_flags) =0xe0 Instead of : sizeof(struct sock)=0x270 offsetof(struct sock, sk_refcnt) =0x10 offsetof(struct sock, sk_lock) =0x50 offsetof(struct sock, sk_receive_queue)=0xc0 offsetof(struct sock, sk_backlog)=0x70 offsetof(struct sock, sk_rmem_alloc)=0xac offsetof(struct sock, sk_forward_alloc)=0x10c offsetof(struct sock, sk_rxhash)=0x128 offsetof(struct sock, sk_rcvbuf)=0x4c offsetof(struct sock, sk_drops) =0x16c offsetof(struct sock, sk_filter)=0x198 offsetof(struct sock, sk_wq)=0x88 offsetof(struct sock, sk_policy)=0x98 offsetof(struct sock, sk_flags) =0x130 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-16 11:17:43 -08:00
Eric Dumazet	c31504dc0d	udp: use atomic_inc_not_zero_hint UDP sockets refcount is usually 2, unless an incoming frame is going to be queued in receive or backlog queue. Using atomic_inc_not_zero_hint() permits to reduce latency, because processor issues less memory transactions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-16 11:17:43 -08:00
Jan Engelhardt	3654654f7a	netlink: let nlmsg and nla functions take pointer-to-const args The changed functions do not modify the NL messages and/or attributes at all. They should use const (similar to strchr), so that callers which have a const nlmsg/nlattr around can make use of them without casting. While at it, constify a data array. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-16 09:52:32 -08:00
David S. Miller	b5e4156743	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2010-11-16 09:17:12 -08:00
Jussi Kivilinna	309075cf08	cfg80211: fix WIPHY_FLAG_IBSS_RSN bit WIPHY_FLAG_IBSS_RSN is BIT(7) as is WIPHY_FLAG_CONTROL_PORT_PROTOCOL. Change to BIT(8). Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-15 15:00:42 -05:00
Arnd Hannemann	62370e2b93	b43legacy: Fix compile on ARM architecture When b43legacy is compiled on the arm platform, the following errors are seen: CC [M] drivers/net/wireless/b43legacy/xmit.o In file included from include/net/dst.h:11, from drivers/net/wireless/b43legacy/xmit.c:31: include/net/dst_ops.h:28: error: expected ':', ',', ';', '}' or '__attribute__' before '____cacheline_aligned_in_smp' include/net/dst_ops.h: In function 'dst_entries_get_fast': include/net/dst_ops.h:33: error: 'struct dst_ops' has no member named 'pcpuc_entries' include/net/dst_ops.h: In function 'dst_entries_get_slow': include/net/dst_ops.h:41: error: 'struct dst_ops' has no member named 'pcpuc_entries' include/net/dst_ops.h: In function 'dst_entries_add': include/net/dst_ops.h:49: error: 'struct dst_ops' has no member named 'pcpuc_entries' include/net/dst_ops.h: In function 'dst_entries_init': include/net/dst_ops.h:55: error: 'struct dst_ops' has no member named 'pcpuc_entries' include/net/dst_ops.h: In function 'dst_entries_destroy': include/net/dst_ops.h:60: error: 'struct dst_ops' has no member named 'pcpuc_entries' make[4]: * [drivers/net/wireless/b43legacy/xmit.o] Error 1 make[3]: * [drivers/net/wireless/b43legacy] Error 2 make[2]: * [drivers/net/wireless] Error 2 make[1]: * [drivers/net] Error 2 make: *** [drivers] Error 2 The cause is a missing include of <linux/cache.h>, which is present for i386 and x86_64 architectures, but not for arm. Signed-off-by: Arnd Hannemann <arnd@arndnet.de> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Stable <stable@kernel.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-15 15:00:42 -05:00
Joe Perches	d577f1ccdd	include/net/caif/cfctrl.h: Remove unnecessary semicolons Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-15 11:07:16 -08:00
Timo Teräs	cc9ff19da9	xfrm: use gre key as flow upper protocol info The GRE Key field is intended to be used for identifying an individual traffic flow within a tunnel. It is useful to be able to have XFRM policy selector matches to have different policies for different GRE tunnels. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-15 10:44:04 -08:00
Luis R. Rodriguez	749b527b21	cfg80211: fix allowing country IEs for WIPHY_FLAG_STRICT_REGULATORY We should be enabling country IE hints for WIPHY_FLAG_STRICT_REGULATORY even if we haven't yet recieved regulatory domain hint for the driver if it needed one. Without this Country IEs are not passed on to drivers that have set WIPHY_FLAG_STRICT_REGULATORY, today this is just all Atheros chipset drivers: ath5k, ath9k, ar9170, carl9170. This was part of the original design, however it was completely overlooked... Cc: Easwar Krishnan <easwar.krishnan@atheros.com> Cc: stable@kernel.org Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-11-15 13:24:09 -05:00
David S. Miller	c25ecd0a21	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2010-11-14 11:57:05 -08:00
Linus Torvalds	9457b24a09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (66 commits) can-bcm: fix minor heap overflow gianfar: Do not call device_set_wakeup_enable() under a spinlock ipv6: Warn users if maximum number of routes is reached. docs: Add neigh/gc_thresh3 and route/max_size documentation. axnet_cs: fix resume problem for some Ax88790 chip ipv6: addrconf: don't remove address state on ifdown if the address is being kept tcp: Don't change unlocked socket state in tcp_v4_err(). x25: Prevent crashing when parsing bad X.25 facilities cxgb4vf: add call to Firmware to reset VF State. cxgb4vf: Fail open if link_start() fails. cxgb4vf: flesh out PCI Device ID Table ... cxgb4vf: fix some errors in Gather List to skb conversion cxgb4vf: fix bug in Generic Receive Offload cxgb4vf: don't implement trivial (and incorrect) ndo_select_queue() ixgbe: Look inside vlan when determining offload protocol. bnx2x: Look inside vlan when determining checksum proto. vlan: Add function to retrieve EtherType from vlan packets. virtio-net: init link state correctly ucc_geth: Fix deadlock ucc_geth: Do not bring the whole IF down when TX failure. ...	2010-11-12 17:17:55 -08:00
Eric Dumazet	1d7138de87	igmp: RCU conversion of in_dev->mc_list in_dev->mc_list is protected by one rwlock (in_dev->mc_list_lock). This can easily be converted to a RCU protection. Writers hold RTNL, so mc_list_lock is removed, not replaced by a spinlock. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Cypher Wu <cypher.w@gmail.com> Cc: Américo Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-12 13:18:57 -08:00
David S. Miller	c753796769	ipv4: Make rt->fl.iif tests lest obscure. When we test rt->fl.iif against zero, we're seeing if it's an output or an input route. Make that explicit with some helper functions. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-11 17:07:48 -08:00
Eric Dumazet	72cdd1d971	net: get rid of rtable->idev It seems idev field in struct rtable has no special purpose, but adding extra atomic ops. We hold refcounts on the device itself (using percpu data, so pretty cheap in current kernel). infiniband case is solved using dst.dev instead of idev->dev Removal of this field means routing without route cache is now using shared data, percpu data, and only potential contention is a pair of atomic ops on struct neighbour per forwarded packet. About 5% speedup on routing test. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-11 10:29:40 -08:00
Eric Dumazet	46b13fc5c0	neigh: reorder struct neighbour It is important to move nud_state outside of the often modified cache line (because of refcnt), to reduce false sharing in neigh_event_send() This is a followup of commit `0ed8ddf404` (neigh: Protect neigh->ha[] with a seqlock) This gives a 7% speedup on routing test with IP route cache disabled. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-11 10:29:40 -08:00
Eric Dumazet	8d987e5c75	net: avoid limits overflow Robin Holt tried to boot a 16TB machine and found some limits were reached : sysctl_tcp_mem[2], sysctl_udp_mem[2] We can switch infrastructure to use long "instead" of "int", now atomic_long_t primitives are available for free. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Robin Holt <holt@sgi.com> Reviewed-by: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-10 12:12:00 -08:00
Eric Dumazet	fc766e4c49	decnet: RCU conversion and get rid of dev_base_lock While tracking dev_base_lock users, I found decnet used it in dnet_select_source(), but for a wrong purpose: Writers only hold RTNL, not dev_base_lock, so readers must use RCU if they cannot use RTNL. Adds an rcu_head in struct dn_ifaddr and handle proper RCU management. Adds __rcu annotation in dn_route as well. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-08 13:50:08 -08:00
David S. Miller	d0eaeec8e8	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2010-11-08 12:38:28 -08:00
Paul Mundt	43b81f85eb	net dst: need linux/cache.h for ____cacheline_aligned_in_smp. Presently the b43legacy build fails on an sh randconfig: In file included from include/net/dst.h:12, from drivers/net/wireless/b43legacy/xmit.c:32: include/net/dst_ops.h:28: error: expected ':', ',', ';', '}' or '__attribute__' before '____cacheline_aligned_in_smp' include/net/dst_ops.h: In function 'dst_entries_get_fast': include/net/dst_ops.h:33: error: 'struct dst_ops' has no member named 'pcpuc_entries' include/net/dst_ops.h: In function 'dst_entries_get_slow': include/net/dst_ops.h:41: error: 'struct dst_ops' has no member named 'pcpuc_entries' include/net/dst_ops.h: In function 'dst_entries_add': include/net/dst_ops.h:49: error: 'struct dst_ops' has no member named 'pcpuc_entries' include/net/dst_ops.h: In function 'dst_entries_init': include/net/dst_ops.h:55: error: 'struct dst_ops' has no member named 'pcpuc_entries' include/net/dst_ops.h: In function 'dst_entries_destroy': include/net/dst_ops.h:60: error: 'struct dst_ops' has no member named 'pcpuc_entries' make[5]: * [drivers/net/wireless/b43legacy/xmit.o] Error 1 make[5]: * Waiting for unfinished jobs.... Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-07 19:58:05 -08:00
Linus Torvalds	4b4a2700f4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (41 commits) inet_diag: Make sure we actually run the same bytecode we audited. netlink: Make nlmsg_find_attr take a const nlmsghdr*. fib: fib_result_assign() should not change fib refcounts netfilter: ip6_tables: fix information leak to userspace cls_cgroup: Fix crash on module unload memory corruption in X.25 facilities parsing net dst: fix percpu_counter list corruption and poison overwritten rds: Remove kfreed tcp conn from list rds: Lost locking in loop connection freeing de2104x: fix panic on load atl1 : fix panic on load netxen: remove unused firmware exports caif: Remove noisy printout when disconnecting caif socket caif: SPI-driver bugfix - incorrect padding. caif: Bugfix for socket priority, bindtodev and dbg channel. smsc911x: Set Ethernet EEPROM size to supported device's size ipv4: netfilter: ip_tables: fix information leak to userland ipv4: netfilter: arp_tables: fix information leak to userland cxgb4vf: remove call to stop TX queues at load time. cxgb4: remove call to stop TX queues at load time. ...	2010-11-05 15:25:48 -07:00
Nelson Elhage	6b8c92ba07	netlink: Make nlmsg_find_attr take a const nlmsghdr*. This will let us use it on a nlmsghdr stored inside a netlink_callback. Signed-off-by: Nelson Elhage <nelhage@ksplice.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-04 12:26:34 -07:00
Sjur Brændeland	2c24a5d1b4	caif: SPI-driver bugfix - incorrect padding. Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-03 18:50:03 -07:00
André Carvalho de Matos	f2527ec436	caif: Bugfix for socket priority, bindtodev and dbg channel. Changes: o Bugfix: SO_PRIORITY for SOL_SOCKET could not be handled in caif's setsockopt, using the struct sock attribute priority instead. o Bugfix: SO_BINDTODEVICE for SOL_SOCKET could not be handled in caif's setsockopt, using the struct sock attribute ifindex instead. o Wrong assert statement for RFM layer segmentation. o CAIF Debug channels was not working over SPI, caif_payload_info containing padding info must be initialized. o Check on pointer before dereferencing when unregister dev in caif_dev.c Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-03 18:50:03 -07:00
Uwe Kleine-König	b595076a18	tree-wide: fix comment/printk typos "gadget", "through", "command", "maintain", "maintain", "controller", "address", "between", "initiali[zs]e", "instead", "function", "select", "already", "equal", "access", "management", "hierarchy", "registration", "interest", "relative", "memory", "offset", "already", Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-11-01 15:38:34 -04:00
Linus Torvalds	1840897ab5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (34 commits) b43: Fix warning at drivers/mmc/core/core.c:237 in mmc_wait_for_cmd mac80211: fix failure to check kmalloc return value in key_key_read libertas: Fix sd8686 firmware reload ath9k: Fix incorrect access of rate flags in RC netfilter: xt_socket: Make tproto signed in socket_mt6_v1(). stmmac: enable/disable rx/tx in the core with a single write. net: atarilance - flags should be unsigned long netxen: fix kdump pktgen: Limit how much data we copy onto the stack. net: Limit socket I/O iovec total length to INT_MAX. USB: gadget: fix ethernet gadget crash in gether_setup fib: Fix fib zone and its hash leak on namespace stop cxgb3: Fix panic in free_tx_desc() cxgb3: fix crash due to manipulating queues before registration 8390: Don't oops on starting dev queue dccp ccid-2: Stop polling dccp: Refine the wait-for-ccid mechanism dccp: Extend CCID packet dequeueing interface dccp: Return-value convention of hc_tx_send_packet() igbvf: fix panic on load ...	2010-10-29 14:17:12 -07:00
Pavel Emelyanov	4aa2c466a7	fib: Fix fib zone and its hash leak on namespace stop When we stop a namespace we flush the table and free one, but the added fn_zone-s (and their hashes if grown) are leaked. Need to free. Tries releases all its stuff in the flushing code. Shame on us - this bug exists since the very first make-fib-per-net patches in 2.6.27 :( Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-28 10:27:03 -07:00
Venkateswararao Jujjuri (JV)	b165d60145	9p: Add datasync to client side TFSYNC/RFSYNC for dotl SYNOPSIS size[4] Tfsync tag[2] fid[4] datasync[4] size[4] Rfsync tag[2] DESCRIPTION The Tfsync transaction transfers ("flushes") all modified in-core data of file identified by fid to the disk device (or other permanent storage device) where that file resides. If datasync flag is specified data will be fleshed but does not flush modified metadata unless that metadata is needed in order to allow a subsequent data retrieval to be correctly handled. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2010-10-28 09:08:49 -05:00
M. Mohan Kumar	329176cc2c	9p: Implement TREADLINK operation for 9p2000.L Synopsis size[4] TReadlink tag[2] fid[4] size[4] RReadlink tag[2] target[s] Description Readlink is used to return the contents of the symoblic link referred by fid. Contents of symboic link is returned as a response. target[s] - Contents of the symbolic link referred by fid. Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2010-10-28 09:08:48 -05:00
M. Mohan Kumar	1d769cd192	9p: Implement TGETLOCK Synopsis size[4] TGetlock tag[2] fid[4] getlock[n] size[4] RGetlock tag[2] getlock[n] Description TGetlock is used to test for the existence of byte range posix locks on a file identified by given fid. The reply contains getlock structure. If the lock could be placed it returns F_UNLCK in type field of getlock structure. Otherwise it returns the details of the conflicting locks in the getlock structure getlock structure: type[1] - Type of lock: F_RDLCK, F_WRLCK start[8] - Starting offset for lock length[8] - Number of bytes to check for the lock If length is 0, check for lock in all bytes starting at the location 'start' through to the end of file pid[4] - PID of the process that wants to take lock/owns the task in case of reply client[4] - Client id of the system that owns the process which has the conflicting lock Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2010-10-28 09:08:47 -05:00
M. Mohan Kumar	a099027c77	9p: Implement TLOCK Synopsis size[4] TLock tag[2] fid[4] flock[n] size[4] RLock tag[2] status[1] Description Tlock is used to acquire/release byte range posix locks on a file identified by given fid. The reply contains status of the lock request flock structure: type[1] - Type of lock: F_RDLCK, F_WRLCK, F_UNLCK flags[4] - Flags could be either of P9_LOCK_FLAGS_BLOCK - Blocked lock request, if there is a conflicting lock exists, wait for that lock to be released. P9_LOCK_FLAGS_RECLAIM - Reclaim lock request, used when client is trying to reclaim a lock after a server restrart (due to crash) start[8] - Starting offset for lock length[8] - Number of bytes to lock If length is 0, lock all bytes starting at the location 'start' through to the end of file pid[4] - PID of the process that wants to take lock client_id[4] - Unique client id status[1] - Status of the lock request, can be P9_LOCK_SUCCESS(0), P9_LOCK_BLOCKED(1), P9_LOCK_ERROR(2) or P9_LOCK_GRACE(3) P9_LOCK_SUCCESS - Request was successful P9_LOCK_BLOCKED - A conflicting lock is held by another process P9_LOCK_ERROR - Error while processing the lock request P9_LOCK_GRACE - Server is in grace period, it can't accept new lock requests in this period (except locks with P9_LOCK_FLAGS_RECLAIM flag set) Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2010-10-28 09:08:47 -05:00
Venkateswararao Jujjuri (JV)	920e65dc69	[9p] Introduce client side TFSYNC/RFSYNC for dotl. SYNOPSIS size[4] Tfsync tag[2] fid[4] size[4] Rfsync tag[2] DESCRIPTION The Tfsync transaction transfers ("flushes") all modified in-core data of file identified by fid to the disk device (or other permanent storage device) where that file resides. Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2010-10-28 09:08:47 -05:00
Arun R Bharadwaj	4f7ebe8072	net/9p: This patch implements TLERROR/RLERROR on the 9P client. Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com> Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2010-10-28 09:08:45 -05:00
Amarnath Revanna	a10c02036f	caif-u5500: Adding shared memory include Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-27 12:29:51 -07:00
Eric Dumazet	b914c4ea92	inetpeer: __rcu annotations Adds __rcu annotations to inetpeer (struct inet_peer)->avl_left (struct inet_peer)->avl_right This is a tedious cleanup, but removes one smp_wmb() from link_to_pool() since we now use more self documenting rcu_assign_pointer(). Note the use of RCU_INIT_POINTER() instead of rcu_assign_pointer() in all cases we dont need a memory barrier. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-27 11:37:33 -07:00
Eric Dumazet	7a2b03c517	fib_rules: __rcu annotates ctarget Adds __rcu annotation to (struct fib_rule)->ctarget Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-27 11:37:32 -07:00
Eric Dumazet	b33eab0844	tunnels: add __rcu annotations Add __rcu annotations to : (struct ip_tunnel)->prl (struct ip_tunnel_prl_entry)->next (struct xfrm_tunnel)->next struct xfrm_tunnel tunnel4_handlers struct xfrm_tunnel tunnel64_handlers And use appropriate rcu primitives to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-27 11:37:32 -07:00
Eric Dumazet	e0ad61ec86	net: add __rcu annotations to protocol Add __rcu annotations to : struct net_protocol inet_protos struct net_protocol inet6_protos And use appropriate casts to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-27 11:37:31 -07:00
Eric Dumazet	1c31720a74	ipv4: add __rcu annotations to routes.c Add __rcu annotations to : (struct dst_entry)->rt_next (struct rt_hash_bucket)->chain And use appropriate rcu primitives to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-27 11:37:31 -07:00
Eric Dumazet	43a951e999	ipv4: add __rcu annotations to ip_ra_chain Add __rcu annotations to : (struct ip_ra_chain)->next struct ip_ra_chain *ip_ra_chain; And use appropriate rcu primitives. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-25 14:18:28 -07:00
Eric Dumazet	0d7da9ddd9	net: add __rcu annotation to sk_filter Add __rcu annotation to : (struct sock)->sk_filter And use appropriate rcu primitives to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-25 14:18:28 -07:00
Eric Dumazet	1c87733d06	net_ns: add __rcu annotations add __rcu annotation to (struct net)->gen, and use rcu_dereference_protected() in net_assign_generic() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-25 14:18:27 -07:00
Eric Dumazet	6f0bcf1525	tunnels: add _rcu annotations (struct ip6_tnl)->next is rcu protected : (struct ip_tunnel)->next is rcu protected : (struct xfrm6_tunnel)->next is rcu protected : add __rcu annotation and proper rcu primitives. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-25 13:09:45 -07:00
Eric Dumazet	3cc77ec74e	net/802: add __rcu annotations (struct net_device)->garp_port is rcu protected : (struct garp_port)->applicants is rcu protected : add __rcu annotation and proper rcu primitives. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-25 13:09:44 -07:00
Linus Torvalds	5f05647dd8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1699 commits) bnx2/bnx2x: Unsupported Ethtool operations should return -EINVAL. vlan: Calling vlan_hwaccel_do_receive() is always valid. tproxy: use the interface primary IP address as a default value for --on-ip tproxy: added IPv6 support to the socket match cxgb3: function namespace cleanup tproxy: added IPv6 support to the TPROXY target tproxy: added IPv6 socket lookup function to nf_tproxy_core be2net: Changes to use only priority codes allowed by f/w tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled tproxy: added tproxy sockopt interface in the IPV6 layer tproxy: added udp6_lib_lookup function tproxy: added const specifiers to udp lookup functions tproxy: split off ipv6 defragmentation to a separate module l2tp: small cleanup nf_nat: restrict ICMP translation for embedded header can: mcp251x: fix generation of error frames can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set can-raw: add msg_flags to distinguish local traffic 9p: client code cleanup rds: make local functions/variables static ... Fix up conflicts in net/core/dev.c, drivers/net/pcmcia/smc91c92_cs.c and drivers/net/wireless/ath/ath9k/debug.c as per David	2010-10-23 11:47:02 -07:00
Linus Torvalds	888a6f77e0	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (52 commits) sched: fix RCU lockdep splat from task_group() rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value sched: suppress RCU lockdep splat in task_fork_fair net: suppress RCU lockdep false positive in sock_update_classid rcu: move check from rcu_dereference_bh to rcu_read_lock_bh_held rcu: Add advice to PROVE_RCU_REPEATEDLY kernel config parameter rcu: Add tracing data to support queueing models rcu: fix sparse errors in rcutorture.c rcu: only one evaluation of arg in rcu_dereference_check() unless sparse kernel: Remove undead ifdef CONFIG_DEBUG_LOCK_ALLOC rcu: fix _oddness handling of verbose stall warnings rcu: performance fixes to TINY_PREEMPT_RCU callback checking rcu: upgrade stallwarn.txt documentation for CPU-bound RT processes vhost: add __rcu annotations rcu: add comment stating that list_empty() applies to RCU-protected lists rcu: apply TINY_PREEMPT_RCU read-side speedup to TREE_PREEMPT_RCU rcu: combine duplicate code, courtesy of CONFIG_PREEMPT_RCU rcu: Upgrade srcu_read_lock() docbook about SRCU grace periods rcu: document ways of stalling updates in low-memory situations rcu: repair code-duplication FIXMEs ...	2010-10-21 12:54:12 -07:00
David S. Miller	9941fb6276	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	2010-10-21 08:21:34 -07:00
Patrick McHardy	3b1a1ce6f4	Merge branch 'for-patrick' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6	2010-10-21 16:25:51 +02:00
Balazs Scheidler	3b9afb2991	tproxy: added IPv6 socket lookup function to nf_tproxy_core Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-21 16:12:14 +02:00
Balazs Scheidler	aa976fc011	tproxy: added udp6_lib_lookup function Just like with IPv4, we need access to the UDP hash table to look up local sockets, but instead of exporting the global udp_table, export a lookup function. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-21 16:05:41 +02:00
Balazs Scheidler	e97c3e278e	tproxy: split off ipv6 defragmentation to a separate module Like with IPv4, TProxy needs IPv6 defragmentation but does not require connection tracking. Since defragmentation was coupled with conntrack, I split off the two, creating an nf_defrag_ipv6 module, similar to the already existing nf_defrag_ipv4. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-21 16:03:43 +02:00
stephen hemminger	32a875adcd	9p: client code cleanup Make p9_client_version static since only used in one file. Remove p9_client_auth because it is defined but never used. Compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-21 04:26:39 -07:00
Balazs Scheidler	093d282321	tproxy: fix hash locking issue when using port redirection in __inet_inherit_port() When __inet_inherit_port() is called on a tproxy connection the wrong locks are held for the inet_bind_bucket it is added to. __inet_inherit_port() made an implicit assumption that the listener's port number (and thus its bind bucket). Unfortunately, if you're using the TPROXY target to redirect skbs to a transparent proxy that assumption is not true anymore and things break. This patch adds code to __inet_inherit_port() so that it can handle this case by looking up or creating a new bind bucket for the child socket and updates callers of __inet_inherit_port() to gracefully handle __inet_inherit_port() failing. Reported by and original patch from Stephen Buck <stephen.buck@exinda.com>. See http://marc.info/?t=128169268200001&r=1&w=2 for the original discussion. Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-21 13:06:43 +02:00
Balazs Scheidler	6006db84a9	tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4() Also, inline this function as the lookup_type is always a literal and inlining removes branches performed at runtime. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-21 12:47:34 +02:00
Balazs Scheidler	106e4c26b1	tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple Without tproxy redirections an incoming SYN kicks out conflicting TIME_WAIT sockets, in order to handle clients that reuse ports within the TIME_WAIT period. The same mechanism didn't work in case TProxy is involved in finding the proper socket, as the time_wait processing code looked up the listening socket assuming that the listener addr/port matches those of the established connection. This is not the case with TProxy as the listener addr/port is possibly changed with the tproxy rule. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-21 12:45:14 +02:00
Changli Gao	3511c9132f	net_sched: remove the unused parameter of qdisc_create_dflt() The first parameter dev isn't in use in qdisc_create_dflt(). Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-21 03:09:47 -07:00
stephen hemminger	1c4c40c42d	xfrm: make xfrm_bundle_ok local Only used in one place. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-21 03:09:46 -07:00
stephen hemminger	8d8a0b1cc2	rtnetlink: remove rtnl_kill_links The function rtnl_kill_links is defined but never used. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-21 03:09:45 -07:00
stephen hemminger	6f747aca5e	xfrm6: make xfrm6_tunnel_free_spi local Function only defined and used in one file. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-21 03:09:45 -07:00
Julian Anastasov	0d79641a96	ipvs: provide address family for debugging As skb->protocol is not valid in LOCAL_OUT add parameter for address family in packet debugging functions. Even if ports are not present in AH and ESP change them to use ip_vs_tcpudp_debug_packet to show at least valid addresses as before. This patch removes the last user of skb->protocol in IPVS. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2010-10-21 11:04:43 +02:00
Julian Anastasov	fc60476761	ipvs: changes for local real server This patch deals with local real servers: - Add support for DNAT to local address (different real server port). It needs ip_vs_out hook in LOCAL_OUT for both families because skb->protocol is not set for locally generated packets and can not be used to set 'af'. - Skip packets in ip_vs_in marked with skb->ipvs_property because ip_vs_out processing can be executed in LOCAL_OUT but we still have the conn_out_get check in ip_vs_in. - Ignore packets with inet->nodefrag from local stack - Require skb_dst(skb) != NULL because we use it to get struct net - Add support for changing the route to local IPv4 stack after DNAT depending on the source address type. Local client sets output route and the remote client sets input route. It looks like IPv6 does not need such rerouting because the replies use addresses from initial incoming header, not from skb route. - All transmitters now have strict checks for the destination address type: redirect from non-local address to local real server requires NAT method, local address can not be used as source address when talking to remote real server. - Now LOCALNODE is not set explicitly as forwarding method in real server to allow the connections to provide correct forwarding method to the backup server. Not sure if this breaks tools that expect to see 'Local' real server type. If needed, this can be supported with new flag IP_VS_DEST_F_LOCAL. Now it should be possible connections in backup that lost their fwmark information during sync to be forwarded properly to their daddr, even if it is local address in the backup server. By this way backup could be used as real server for DR or TUN, for NAT there are some restrictions because tuple collisions in conntracks can create problems for the traffic. - Call ip_vs_dst_reset when destination is updated in case some real server IP type is changed between local and remote. [ horms@verge.net.au: removed trailing whitespace ] Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2010-10-21 11:03:46 +02:00
Julian Anastasov	190ecd27cd	ipvs: do not schedule conns from real servers This patch is needed to avoid scheduling of packets from local real server when we add ip_vs_in in LOCAL_OUT hook to support local client. Currently, when ip_vs_in can not find existing connection it tries to create new one by calling ip_vs_schedule. The default indication from ip_vs_schedule was if connection was scheduled to real server. If real server is not available we try to use the bypass forwarding method or to send ICMP error. But in some cases we do not want to use the bypass feature. So, add flag 'ignored' to indicate if the scheduler ignores this packet. Make sure we do not create new connections from replies. We can hit this problem for persistent services and local real server when ip_vs_in is added to LOCAL_OUT hook to handle local clients. Also, make sure ip_vs_schedule ignores SYN packets for Active FTP DATA from local real server. The FTP DATA connection should be created on SYN+ACK from client to assign correct connection daddr. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2010-10-21 10:50:41 +02:00
Julian Anastasov	cf356d69db	ipvs: switch to notrack mode Change skb->ipvs_property semantic. This is preparation to support ip_vs_out processing in LOCAL_OUT. ipvs_property=1 will be used to avoid expensive lookups for traffic sent by transmitters. Now when conntrack support is not used we call ip_vs_notrack method to avoid problems in OUTPUT and POST_ROUTING hooks instead of exiting POST_ROUTING as before. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2010-10-21 10:50:20 +02:00
Julian Anastasov	8b27b10f58	ipvs: optimize checksums for apps Avoid full checksum calculation for apps that can provide info whether csum was broken after payload mangling. For now only ip_vs_ftp mangles payload and it updates the csum, so the full recalculation is avoided for all packets. Add CHECKSUM_UNNECESSARY for snat_handler (TCP and UDP). It is needed to support SNAT from local address for the case when csum is fully recalculated. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2010-10-21 10:50:02 +02:00
David S. Miller	5eeaa2db16	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2010-10-20 01:59:48 -07:00
Hans Schillstrom	714f095f74	ipvs: IPv6 tunnel mode IPv6 encapsulation uses a bad source address for the tunnel. i.e. VIP will be used as local-addr and encap. dst addr. Decapsulation will not accept this. Example LVS (eth1 2003::2:0:1/96, VIP 2003::2:0:100) (eth0 2003::1:0:1/96) RS (ethX 2003::1:0:5/96) tcpdump 2003::2:0:100 > 2003::1:0:5: IP6 (hlim 63, next-header TCP (6) payload length: 40) 2003::3:0:10.50991 > 2003::2:0:100.http: Flags [S], cksum 0x7312 (correct), seq 3006460279, win 5760, options [mss 1440,sackOK,TS val 1904932 ecr 0,nop,wscale 3], length 0 In Linux IPv6 impl. you can't have a tunnel with an any cast address receiving packets (I have not tried to interpret RFC 2473) To have receive capabilities the tunnel must have: - Local address set as multicast addr or an unicast addr - Remote address set as an unicast addr. - Loop back addres or Link local address are not allowed. This causes us to setup a tunnel in the Real Server with the LVS as the remote address, here you can't use the VIP address since it's used inside the tunnel. Solution Use outgoing interface IPv6 address (match against the destination). i.e. use ip6_route_output() to look up the route cache and then use ipv6_dev_get_saddr(...) to set the source address of the encapsulated packet. Additionally, cache the results in new destination fields: dst_cookie and dst_saddr and properly check the returned dst from ip6_route_output. We now add xfrm_lookup call only for the tunneling method where the source address is a local one. Signed-off-by:Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-19 10:38:48 +02:00
Pablo Neira Ayuso	ebbf41df4a	netfilter: ctnetlink: add expectation deletion events This patch allows to listen to events that inform about expectations destroyed. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-19 10:19:06 +02:00
Eric Dumazet	8e602ce298	netns: reorder fields in struct net In a network bench, I noticed an unfortunate false sharing between 'loopback_dev' and 'count' fields in "struct net". 'count' is written each time a socket is created or destroyed, while loopback_dev might be often read in routing code. Move loopback_dev in a read mostly section of "struct net" Note: struct netns_xfrm is cache line aligned on SMP. (It contains a "struct dst_ops") Move it at the end to avoid holes, and reduce sizeof(struct net) by 128 bytes on ia32. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-17 13:49:14 -07:00
stephen hemminger	31e3c3f6f1	tipc: cleanup function namespace Do some cleanups of TIPC based on make namespacecheck 1. Don't export unused symbols 2. Eliminate dead code 3. Make functions and variables local 4. Rename buf_acquire to tipc_buf_acquire since it is used in several files Compile tested only. This make break out of tree kernel modules that depend on TIPC routines. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-16 11:13:24 -07:00
John W. Linville	c64557d666	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem	2010-10-15 16:11:56 -04:00
John W. Linville	1a63c353c8	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-next-2.6 into for-davem	2010-10-15 16:00:02 -04:00
Kumar Sanghvi	b3d6255388	Phonet: 'connect' socket implementation for Pipe controller Based on suggestion by Rémi Denis-Courmont to implement 'connect' for Pipe controller logic, this patch implements 'connect' socket call for the Pipe controller logic. The patch does following:- - Removes setsockopts for PNPIPE_CREATE and PNPIPE_DESTROY - Adds setsockopt for setting the Pipe handle value - Implements connect socket call - Updates the Pipe controller logic User-space should now follow below sequence with Pipe controller:- -socket -bind -setsockopt for PNPIPE_PIPE_HANDLE -connect -setsockopt for PNPIPE_ENCAP_IP -setsockopt for PNPIPE_ENABLE GPRS/3G data has been tested working fine with this. Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com> Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-13 14:40:34 -07:00
Johannes Berg	7be5086d4c	mac80211: add probe request filter flag Using the frame registration notification, we can see when probe requests are requested and notify the low-level driver via filtering. The flag is also set in AP and IBSS modes. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-13 15:45:22 -04:00
Johannes Berg	271733cf84	cfg80211: notify drivers about frame registrations Drivers may need to adjust their filters according to frame registrations, so notify them about them. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-13 15:45:22 -04:00
Andrei Emeltchenko	534c92fde7	Bluetooth: clean up rfcomm code Remove dead code and unused rfcomm thread events Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>	2010-10-12 12:44:53 -03:00
Mat Martineau	796c86eec8	Bluetooth: Add common code for stream-oriented recvmsg() This commit adds a bt_sock_stream_recvmsg() function for use by any Bluetooth code that uses SOCK_STREAM sockets. This code is copied from rfcomm_sock_recvmsg() with minimal modifications to remove RFCOMM-specific functionality and improve readability. L2CAP (with the SOCK_STREAM socket type) and RFCOMM have common needs when it comes to reading data. Proper stream read semantics require that applications can read from a stream one byte at a time and not lose any data. The RFCOMM code already operated on and pulled data from the underlying L2CAP socket, so very few changes were required to make the code more generic for use with non-RFCOMM data over L2CAP. Applications that need more awareness of L2CAP frame boundaries are still free to use SOCK_SEQPACKET sockets, and may verify that they connection did not fall back to basic mode by calling getsockopt(). Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>	2010-10-12 12:44:51 -03:00
David Vrabel	8f1e174223	Bluetooth: HCI devices are either BR/EDR or AMP radios HCI transport drivers may not know what type of radio an AMP device has so only say whether they're BR/EDR or AMP devices. Signed-off-by: David Vrabel <david.vrabel@csr.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>	2010-10-12 12:44:51 -03:00
Eric Dumazet	e37ef961e5	neigh: reorder struct neighbour fields Le mardi 12 octobre 2010 à 00:02 +0200, Eric Dumazet a écrit : > Here is the followup patch. > > Thanks ! > Oops, this was an old version, the up2date ones also took care of "used" field. I guess its time for a sleep, sorry again. [PATCH net-next V2] neigh: reorder struct neighbour fields (refcnt) and (ha_lock, ha, used, dev, output, ops, primary_key) should be placed on a separate cache lines. refcnt can be often written, while other fields are mostly read. This gave me good result on stress test : before: real 0m45.570s user 0m15.525s sys 9m56.669s After: real 0m41.841s user 0m15.261s sys 8m45.949s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-11 16:09:14 -07:00
Eric Dumazet	fc66f95c68	net dst: use a percpu_counter to track entries struct dst_ops tracks number of allocated dst in an atomic_t field, subject to high cache line contention in stress workload. Switch to a percpu_counter, to reduce number of time we need to dirty a central location. Place it on a separate cache line to avoid dirtying read only fields. Stress test : (Sending 160.000.000 UDP frames, IP route cache disabled, dual E5540 @2.53GHz, 32bit kernel, FIB_TRIE, SLUB/NUMA) Before: real 0m51.179s user 0m15.329s sys 10m15.942s After: real 0m45.570s user 0m15.525s sys 9m56.669s With a small reordering of struct neighbour fields, subject of a following patch, (to separate refcnt from other read mostly fields) real 0m41.841s user 0m15.261s sys 8m45.949s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-11 13:06:53 -07:00
Eric Dumazet	0ed8ddf404	neigh: Protect neigh->ha[] with a seqlock Add a seqlock in struct neighbour to protect neigh->ha[], and avoid dirtying neighbour in stress situation (many different flows / dsts) Dirtying takes place because of read_lock(&n->lock) and n->used writes. Switching to a seqlock, and writing n->used only on jiffies changes permits less dirtying. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-11 12:54:04 -07:00
David S. Miller	d122179a3c	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/core/ethtool.c	2010-10-11 12:30:34 -07:00
Felix Fietkau	8610c29a2c	cfg80211: add channel utilization stats to the survey command Using these, user space can calculate a relative channel utilization with arbitrary intervals by regularly taking snapshots of the survey results. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-11 15:04:20 -04:00
Ben Greear	5a5c731aa5	wireless: Set some stats used by /proc/net/wireless (wext) Some stats for /proc/net/wireless (and wext in general) are not being set. This patch addresses a few of those with values easily obtained from mac80211 core. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-11 15:04:19 -04:00
John W. Linville	e9a68707d7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem Conflicts: Documentation/feature-removal-schedule.txt drivers/net/wireless/ipw2x00/ipw2200.c	2010-10-08 15:39:28 -04:00
Johannes Berg	388ac775be	cfg80211: constify WDS address There's no need for the WDS peer address to not be const, so make it const. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-07 14:41:28 -04:00
Ingo Molnar	556ef63255	Merge commit 'v2.6.36-rc7' into core/rcu Merge reason: Update from -rc3 to -rc7. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-10-07 09:43:45 +02:00
Ingo Molnar	d4f8f217b8	Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu	2010-10-07 09:43:11 +02:00
Eric Dumazet	767e97e1e0	neigh: RCU conversion of struct neighbour This is the second step for neighbour RCU conversion. (first was commit `d6bf7817` : RCU conversion of neigh hash table) neigh_lookup() becomes lockless, but still take a reference on found neighbour. (no more read_lock()/read_unlock() on tbl->lock) struct neighbour gets an additional rcu_head field and is freed after an RCU grace period. Future work would need to eventually not take a reference on neighbour for temporary dst (DST_NOCACHE), but this would need dst->_neighbour to use a noref bit like we did for skb->_dst. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-06 18:01:33 -07:00
Bruno Randolf	b206b4ef06	nl80211/mac80211: Add retry and failed transmission count to station info This information is already available in mac80211, we just need to export it via cfg80211 and nl80211. Signed-off-by: Bruno Randolf <br1@einfach.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-06 16:30:43 -04:00
Johannes Berg	e31b82136d	cfg80211/mac80211: allow per-station GTKs This adds API to allow adding per-station GTKs, updates mac80211 to support it, and also allows drivers to remove a key from hwaccel again when this may be necessary due to multiple GTKs. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-06 16:30:40 -04:00
Eric Dumazet	ebc0ffae5d	fib: RCU conversion of fib_lookup() fib_lookup() converted to be called in RCU protected context, no reference taken and released on a contended cache line (fib_clntref) fib_table_lookup() and fib_semantic_match() get an additional parameter. struct fib_info gets an rcu_head field, and is freed after an rcu grace period. Stress test : (Sending 160.000.000 UDP frames on same neighbour, IP route cache disabled, dual E5540 @2.53GHz, 32bit kernel, FIB_HASH) (about same results for FIB_TRIE) Before patch : real 1m31.199s user 0m13.761s sys 23m24.780s After patch: real 1m5.375s user 0m14.997s sys 15m50.115s Before patch Profile : 13044.00 15.4% __ip_route_output_key vmlinux 8438.00 10.0% dst_destroy vmlinux 5983.00 7.1% fib_semantic_match vmlinux 5410.00 6.4% fib_rules_lookup vmlinux 4803.00 5.7% neigh_lookup vmlinux 4420.00 5.2% _raw_spin_lock vmlinux 3883.00 4.6% rt_set_nexthop vmlinux 3261.00 3.9% _raw_read_lock vmlinux 2794.00 3.3% fib_table_lookup vmlinux 2374.00 2.8% neigh_resolve_output vmlinux 2153.00 2.5% dst_alloc vmlinux 1502.00 1.8% _raw_read_lock_bh vmlinux 1484.00 1.8% kmem_cache_alloc vmlinux 1407.00 1.7% eth_header vmlinux 1406.00 1.7% ipv4_dst_destroy vmlinux 1298.00 1.5% __copy_from_user_ll vmlinux 1174.00 1.4% dev_queue_xmit vmlinux 1000.00 1.2% ip_output vmlinux After patch Profile : 13712.00 15.8% dst_destroy vmlinux 8548.00 9.9% __ip_route_output_key vmlinux 7017.00 8.1% neigh_lookup vmlinux 4554.00 5.3% fib_semantic_match vmlinux 4067.00 4.7% _raw_read_lock vmlinux 3491.00 4.0% dst_alloc vmlinux 3186.00 3.7% neigh_resolve_output vmlinux 3103.00 3.6% fib_table_lookup vmlinux 2098.00 2.4% _raw_read_lock_bh vmlinux 2081.00 2.4% kmem_cache_alloc vmlinux 2013.00 2.3% _raw_spin_lock vmlinux 1763.00 2.0% __copy_from_user_ll vmlinux 1763.00 2.0% ip_output vmlinux 1761.00 2.0% ipv4_dst_destroy vmlinux 1631.00 1.9% eth_header vmlinux 1440.00 1.7% _raw_read_unlock_bh vmlinux Reference results, if IP route cache is enabled : real 0m29.718s user 0m10.845s sys 7m37.341s 25213.00 29.5% __ip_route_output_key vmlinux 9011.00 10.5% dst_release vmlinux 4817.00 5.6% ip_push_pending_frames vmlinux 4232.00 5.0% ip_finish_output vmlinux 3940.00 4.6% udp_sendmsg vmlinux 3730.00 4.4% __copy_from_user_ll vmlinux 3716.00 4.4% ip_route_output_flow vmlinux 2451.00 2.9% __xfrm_lookup vmlinux 2221.00 2.6% ip_append_data vmlinux 1718.00 2.0% _raw_spin_lock_bh vmlinux 1655.00 1.9% __alloc_skb vmlinux 1572.00 1.8% sock_wfree vmlinux 1345.00 1.6% kfree vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-05 20:39:38 -07:00
Eric Dumazet	d6bf781712	net neigh: RCU conversion of neigh hash table David This is the first step for RCU conversion of neigh code. Next patches will convert hash_buckets[] and "struct neighbour" to RCU protected objects. Thanks [PATCH net-next] net neigh: RCU conversion of neigh hash table Instead of storing hash_buckets, hash_mask and hash_rnd in "struct neigh_table", a new structure is defined : struct neigh_hash_table { struct neighbour *hash_buckets; unsigned int hash_mask; __u32 hash_rnd; struct rcu_head rcu; }; And "struct neigh_table" has an RCU protected pointer to such a neigh_hash_table. This means the signature of (hash)() function changed: We need to add a third parameter with the actual hash_rnd value, since this is not anymore a neigh_table field. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-05 14:54:36 -07:00
Johannes Berg	ff4c92d85c	genetlink: introduce pre_doit/post_doit hooks Each family may have some amount of boilerplate locking code that applies to most, or even all, commands. This allows a family to handle such things in a more generic way, by allowing it to a) include private flags in each operation b) specify a pre_doit hook that is called, before an operation's doit() callback and may return an error directly, c) specify a post_doit hook that can undo locking or similar things done by pre_doit, and finally d) include two private pointers in each info struct passed between all these operations including doit(). (It's two because I'll need two in nl80211 -- can be extended.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-05 13:35:30 -04:00
Helmut Schaa	78be49ec2a	mac80211: distinct between max rates and the number of rates the hw can report Some drivers cannot handle multiple retry rates specified by the rc algorithm but instead use their own retry table (for example rt2800). However, if such a device registers itself with a max_rates value of 1 the rc algorithm cannot make use of the extended information the device can provide about retried rates. On the other hand, if a device registers itself with a max_rates value > 1 the rc algorithm assumes that the device can handle multi rate retries. Fix this issue by introducing another hw parameter max_report_rates that can be set to a different value then max_rates to indicate if a device is capable of reporting more rates then specified in max_rates. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-05 13:35:28 -04:00
Johannes Berg	ea229e6826	cfg80211: remove spurious __KERNEL__ ifdef The net/cfg80211.h header file isn't exported to userspace, so there's no need for any kind of __KERNEL__ protection in it. If it was exported, everything else in it would need protection as well, not just the logging stuff ... Cc:Joe Perches <joe@perches.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-05 13:35:23 -04:00
Felix Fietkau	17e5a80828	nl80211: allow drivers to indicate whether the survey data channel is in use Some user space applications only want to display survey data for the operating channel, however there is no API to get that yet. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-10-05 13:35:22 -04:00
stephen hemminger	c61393ea83	ipv6: make __ipv6_isatap_ifid static Another exported symbol only used in one file Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-05 00:47:39 -07:00
stephen hemminger	1df9916e46	fib: fib_rules_cleanup can be static fib_rules_cleanup_ups is only defined and used in one place. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-05 00:47:39 -07:00
Patrick McHardy	eecc545856	netfilter: add missing xt_log.h file Forgot to add xt_log.h in commit `a8defca0` (netfilter: ipt_LOG: add bufferisation to call printk() once) Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-04 23:24:21 +02:00
David S. Miller	21a180cda0	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/Kconfig net/ipv4/tcp_timer.c	2010-10-04 11:56:38 -07:00
Stephen Hemminger	0c200d9353	netfilter: nf_nat: make find/put static The functions nf_nat_proto_find_get and nf_nat_proto_put are only used internally in nf_nat_core. This might break some out of tree NAT module. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-04 20:53:18 +02:00
Simon Horman	0d1e71b04a	IPVS: Allow configuration of persistence engines Allow the persistence engine of a virtual service to be set, edited and unset. This feature only works with the netlink user-space interface. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>	2010-10-04 22:45:24 +09:00
Simon Horman	8be67a6617	IPVS: management of persistence engine modules This is based heavily on the scheduler management code Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>	2010-10-04 22:45:24 +09:00
Simon Horman	a3c918acd2	IPVS: Add persistence engine data to /proc/net/ip_vs_conn This shouldn't break compatibility with userspace as the new data is at the end of the line. I have confirmed that this doesn't break ipvsadm, the main (only?) user-space user of this data. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>	2010-10-04 22:45:24 +09:00
Simon Horman	85999283a2	IPVS: Add struct ip_vs_pe Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>	2010-10-04 22:45:24 +09:00
Simon Horman	f11017ec2d	IPVS: Add struct ip_vs_conn_param Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>	2010-10-04 22:45:24 +09:00
Eric Dumazet	c7d4426a98	net: introduce DST_NOCACHE flag While doing stress tests with IP route cache disabled, and multi queue devices, I noticed a very high contention on one rwlock used in neighbour code. When many cpus are trying to send frames (possibly using a high performance multiqueue device) to the same neighbour, they fight for the neigh->lock rwlock in order to call neigh_hh_init(), and fight on hh->hh_refcnt (a pair of atomic_inc/atomic_dec_and_test()) But we dont need to call neigh_hh_init() for dst that are used only once. It costs four atomic operations at least, on two contended cache lines, plus the high contention on neigh->lock rwlock. Introduce a new dst flag, DST_NOCACHE, that is set when dst was not inserted in route cache. With the stress test bench, sending 160000000 frames on one neighbour, results are : Before patch: real 2m28.406s user 0m11.781s sys 36m17.964s After patch: real 1m26.532s user 0m12.185s sys 20m3.903s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-03 22:17:54 -07:00
John W. Linville	41f4a6f71f	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem	2010-10-01 11:12:36 -04:00
Eric Dumazet	367e5e3769	neigh: reorder fields in struct neighbour On 64bit arches, there are two 32bit holes that we can remove. sizeof(struct neighbour) shrinks from 0xf8 to 0xf0 bytes Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-01 00:36:51 -07:00
Gustavo F. Padovan	e454c84464	Bluetooth: Fix deadlock in the ERTM logic The Enhanced Retransmission Mode(ERTM) is a realiable mode of operation of the Bluetooth L2CAP layer. Think on it like a simplified version of TCP. The problem we were facing here was a deadlock. ERTM uses a backlog queue to queue incomimg packets while the user is helding the lock. At some moment the sk_sndbuf can be exceeded and we can't alloc new skbs then the code sleep with the lock to wait for memory, that stalls the ERTM connection once we can't read the acknowledgements packets in the backlog queue to free memory and make the allocation of outcoming skb successful. This patch actually affect all users of bt_skb_send_alloc(), i.e., all L2CAP modes and SCO. We are safe against socket states changes or channels deletion while the we are sleeping wait memory. Checking for the sk->sk_err and sk->sk_shutdown make the code safe, since any action that can leave the socket or the channel in a not usable state set one of the struct members at least. Then we can check both of them when getting the lock again and return with the proper error if something unexpected happens. Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi> Signed-off-by: Ulisses Furquim <ulisses@profusion.mobi>	2010-09-30 12:19:35 -03:00
stephen hemminger	1b9f409293	tcp: tcp_enter_quickack_mode can be static Function only used in tcp_input.c Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-29 19:45:36 -07:00
stephen hemminger	a64de47c09	arp: remove unnecessary export of arp_broken_ops arp_broken_ops is only used in arp.c Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-29 19:45:35 -07:00
Tom Herbert	4465b46900	ipv4: Allow configuring subnets as local addresses This patch allows a host to be configured to respond to any address in a specified range as if it were local, without actually needing to configure the address on an interface. This is done through routing table configuration. For instance, to configure a host to respond to any address in 10.1/16 received on eth0 as a local address we can do: ip rule add from all iif eth0 lookup 200 ip route add local 10.1/16 dev lo proto kernel scope host src 127.0.0.1 table 200 This host is now reachable by any 10.1/16 address (route lookup on input for packets received on eth0 can find the route). On output, the rule will not be matched so that this host can still send packets to 10.1/16 (not sent on loopback). Presumably, external routing can be configured to make sense out of this. To make this work, we needed to modify the logic in finding the interface which is assigned a given source address for output (dev_ip_find). We perform a normal fib_lookup instead of just a lookup on the local table, and in the lookup we ignore the input interface for matching. This patch is useful to implement IP-anycast for subnets of virtual addresses. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-28 23:38:15 -07:00
Pablo Neira Ayuso	bc01befdcf	netfilter: ctnetlink: add support for user-space expectation helpers This patch adds the basic infrastructure to support user-space expectation helpers via ctnetlink and the netfilter queuing infrastructure NFQUEUE. Basically, this patch: * adds NF_CT_EXPECT_USERSPACE flag to identify user-space created expectations. I have also added a sanity check in __nf_ct_expect_check() to avoid that kernel-space helpers may create an expectation if the master conntrack has no helper assigned. * adds some branches to check if the master conntrack helper exists, otherwise we skip the code that refers to kernel-space helper such as the local expectation list and the expectation policy. * allows to set the timeout for user-space expectations with no helper assigned. * a list of expectations created from user-space that depends on ctnetlink (if this module is removed, they are deleted). * includes USERSPACE in the /proc output for expectations that have been created by a user-space helper. This patch also modifies ctnetlink to skip including the helper name in the Netlink messages if no kernel-space helper is set (since no user-space expectation has not kernel-space kernel assigned). You can access an example user-space FTP conntrack helper at: http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-userspace-POC.tar.bz Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-09-28 21:06:34 +02:00
Eric Dumazet	290b895e0b	tunnels: prepare percpu accounting Tunnels are going to use percpu for their accounting. They are going to use a new tstats field in net_device. skb_tunnel_rx() is changed to be a wrapper around __skb_tunnel_rx() IPTUNNEL_XMIT() is changed to be a wrapper around __IPTUNNEL_XMIT() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-27 21:30:42 -07:00
Kumar Sanghvi	8d98efa84b	Phonet: Implement Pipe Controller to support Nokia Slim Modems Phonet stack assumes the presence of Pipe Controller, either in Modem or on Application Processing Engine user-space for the Pipe data. Nokia Slim Modems like WG2.5 used in ST-Ericsson U8500 platform do not implement Pipe controller in them. This patch adds Pipe Controller implemenation to Phonet stack to support Pipe data over Phonet stack for Nokia Slim Modems. Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-27 21:30:41 -07:00
Ulrich Weber	fb0c5f0bc8	tproxy: check for transparent flag in ip_route_newports as done in ip_route_connect() Signed-off-by: Ulrich Weber <uweber@astaro.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-27 15:03:33 -07:00
Johannes Berg	554891e63a	mac80211: move packet flags into packet commit `8c0c709eea` Author: Johannes Berg <johannes@sipsolutions.net> Date: Wed Nov 25 17:46:15 2009 +0100 mac80211: move cmntr flag out of rx flags moved the CMNTR flag into the skb RX flags for some aggregation cleanups, but this was wrong since the optimisation this flag tried to make requires that it is kept across the processing of multiple interfaces -- which isn't true for flags in the skb. The patch not only broke the optimisation, it also introduced a bug: under some (common!) circumstances the flag will be set on an already freed skb! However, investigating this in more detail, I found that most of the flags that we set should be per packet, _except_ for this one, due to a-MPDU processing. Additionally, the flags used for processing (currently just this one) need to be reset before processing a new packet. Since we haven't actually seen bugs reported as a result of the wrong flags handling (which is not too surprising -- the only real bug case I can come up with is an a-MSDU contained in an a-MPDU), I'll make a different fix for rc. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-09-27 15:57:54 -04:00
Ben Greear	686b9cb994	mac80211/ath9k: Support AMPDU with multiple VIFs. The old ieee80211_find_sta_by_hw method didn't properly find VIFS when there was more than one per AP. This caused AMPDU logic in ath9k to get the wrong VIF when trying to account for transmitted SKBs. This patch changes ieee80211_find_sta_by_hw to take a localaddr argument to distinguish between VIFs with the same AP but different local addresses. The method name is changed to ieee80211_find_sta_by_ifaddr. Signed-off-by: Ben Greear <greearb@candelatech.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-09-27 15:57:45 -04:00
David S. Miller	e40051d134	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/qlcnic/qlcnic_init.c net/ipv4/ip_output.c	2010-09-27 01:03:03 -07:00
Neil Horman	2cc6d2bf3d	ipv6: add a missing unregister_pernet_subsys call Clean up a missing exit path in the ipv6 module init routines. In addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys for the ipv6_addr_label_ops structure. But if module loading fails, or if the ipv6 module is removed, there is no corresponding unregister_pernet_subsys call, which leaves a now-bogus address on the pernet_list, leading to oopses in subsequent registrations. This patch cleans up both the failed load path and the unload path. Tested by myself with good results. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/net/addrconf.h \| 1 + net/ipv6/addrconf.c \| 11 ++++++++--- net/ipv6/addrlabel.c \| 5 +++++ 3 files changed, 14 insertions(+), 3 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-26 19:09:25 -07:00

1 2 3 4 5 ...

3703 Коммитов