WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Johannes Berg	3832c287f1	mac80211: fix RX path My previous patch ("mac80211: remove mixed-cell and userspace MLME code") was too obvious to me, so obvious that a stupid bug crept in. The IBSS RX function must be invoked for IBSS, of course, not anything != IBSS. Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:19 -04:00
Johannes Berg	2b874e83c9	mac80211: rate control status only for controlled packets This patch changes mac80211 to not notify the rate control algorithm's tx_status() method when reporting status for a packet that didn't go through the rate control algorithm's get_rate() method. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:15 -04:00
Kalle Valo	04de838159	mac80211: add beacon filtering support Add IEEE80211_HW_BEACON_FILTERING flag so that driver inform that it supports beacon filtering. Drivers need to call the new function ieee80211_beacon_loss() to notify about beacon loss. Signed-off-by: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:13 -04:00
Kalle Valo	a08c1c1ac0	cfg80211: add feature to hold bss In beacon filtering there needs to be a way to not expire the BSS even when no beacons are received. Add an interface to cfg80211 to hold BSS and make sure that it's not expired. Signed-off-by: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:13 -04:00
Kalle Valo	9050bdd858	mac80211: disable power save when scanning When software scanning we need to disable power save so that all possible probe responses and beacons are received. For hardware scanning assume that hardware will take care of that and document that assumption. Signed-off-by: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:12 -04:00
Kalle Valo	15b7b0629c	mac80211: track beacons separately from the rx path activity Separate beacon and rx path tracking in preparation for the beacon filtering support. At the same time change ieee80211_associated() to look a bit simpler. Probe requests are now sent only after IEEE80211_PROBE_IDLE_TIME, which is now set to 60 seconds. Signed-off-by: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:12 -04:00
Kalle Valo	3cf335d527	mac80211: decrease execution of the associated timer Currently the timer is triggering every two seconds (IEEE80211_MONITORING_INTERVAL). Decrease the timer to only trigger during data idle periods to avoid waking up CPU unnecessary. The timer will still trigger during idle periods, that needs to be fixed later. There's also a functional change that probe requests are sent only when the data path is idle, earlier they were sent also while there was activity on the data path. This is also preparation for the beacon filtering support. Thanks to Johannes Berg for the idea. Signed-off-by: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:12 -04:00
Johannes Berg	7986cf9581	mac80211: remove mixed-cell and userspace MLME code Neither can currently be set from userspace, so there's no regression potential, and neither will be supported from userspace since the new userspace APIs allow the SME, which is in userspace, to control all we need. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:08 -04:00
Johannes Berg	ac7f9cfa2c	cfg80211: accept no-op interface mode changes When somebody tries to set the interface mode to the existing mode, don't ask the driver but silently accept the setting. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:08 -04:00
Luis R. Rodriguez	86f04680df	cfg80211: remove code about country IE support with OLD_REG We had left in code to allow interested developers to add support for parsing country IEs when OLD_REG was enabled. This never happened and since we're going to remove OLD_REG lets just remove these comments and code for it. This code path was never being entered so this has no functional change. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:07 -04:00
Luis R. Rodriguez	6ee7d33056	cfg80211: make regdom module parameter available oustide of OLD_REG It seems a few users are using this module parameter although its not recommended. People are finding it useful despite there being utilities for setting this in userspace. I'm not aware of any distribution using this though. Until userspace and distributions catch up with a default userspace automatic replacement (GeoClue integration would be nirvana) we copy the ieee80211_regdom module parameter from OLD_REG to the new reg code to help these users migrate. Users who are using the non-valid ISO / IEC 3166 alpha "EU" in their ieee80211_regdom module parameter and migrate to non-OLD_REG enabled system will world roam. This also schedules removal of this same ieee80211_regdom module parameter circa March 2010. Hope is by then nirvana is reached and users will abandoned the module parameter completely. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:07 -04:00
Luis R. Rodriguez	cc0b6fe88e	cfg80211: fix incorrect assumption on last_request for 11d The incorrect assumption is the last regulatory request (last_request) is always a country IE when processing country IEs. Although this is true 99% of the time the first time this happens this could not be true. This fixes an oops in the branch check for the last_request when accessing drv_last_ie. The access was done under the assumption the struct won't be null. Note to stable: to port to 29 replace as follows, only 29 has country IE code: s\|NL80211_REGDOM_SET_BY_COUNTRY_IE\|REGDOM_SET_BY_COUNTRY_IE Cc: stable@kernel.org Reported-by: Quentin Armitage <Quentin@armitage.org.uk> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:07 -04:00
Luis R. Rodriguez	2e097dc656	cfg80211: force last_request to be set for OLD_REG if regdom is EU Although EU is a bogus alpha2 we need to process the send request as our code depends on last_request being set. Cc: stable@kernel.org Reported-by: Quentin Armitage <Quentin@armitage.org.uk> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:06 -04:00
Jouni Malinen	eec60b037a	nl80211: Check iftype in cfg80211 code We do not want to require all the drivers using cfg80211 to need to do this. In addition, make the error values consistent by using EOPNOTSUPP instead of semi-random assortment of errno values. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:05 -04:00
Jouni Malinen	35a8efe1a6	nl80211: Check that netif_runnin is true in cfg80211 code We do not want to require all the drivers using cfg80211 to need to do this or to be prepared to handle these commands when the interface is down. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:05 -04:00
Jouni Malinen	255e737eab	nl80211: Add more through validation of MLME command parameters Check that the used authentication type and reason code are valid here so that drivers/mac80211 do not need to care about this. In addition, remove the unnecessary validation of SSID attribute length which is taken care of by netlink policy. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:04 -04:00
Jouni Malinen	65fc73ac4a	nl80211: Remove NL80211_CMD_SET_MGMT_EXTRA_IE The functionality that NL80211_CMD_SET_MGMT_EXTRA_IE provided can now be achieved with cleaner design by adding IE(s) into NL80211_CMD_TRIGGER_SCAN, NL80211_CMD_AUTHENTICATE, NL80211_CMD_ASSOCIATE, NL80211_CMD_DEAUTHENTICATE, and NL80211_CMD_DISASSOCIATE. Since this is a very recently added command and there are no known (or known planned) applications using NL80211_CMD_SET_MGMT_EXTRA_IE and taken into account how much extra complexity it adds to the IE processing we have now (and need to add in the future to fix IE order in couple of frames), it looks like the best option is to just remove the implementation of this command for now. The enum values themselves are left to avoid changing the nl80211 command or attribute numbers. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:04 -04:00
Jouni Malinen	d7873cb9ab	mac80211: Fix memleak in nl80211 authentication on deinit This file was forgotten from the quilt patch that added MLME primitives, so the kfree on interface removal is missing. Fix this potential memleak by freeing the temporary Authentication frame IEs from SME when the interface is being removed. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:04 -04:00
Johannes Berg	827b1fb44b	mac80211: resume properly, add suspend/resume test When mac80211 resumes, it currently doesn't reconfigure the interfaces entirely and also doesn't reconfigure BSS information -- fix this. Also, to be able to test this, add a debugfs file that just calls the suspend/resume code to see what happens when we go through that, without needing the time-consuming suspend/resume cycle. (Original version broke the build for CONFIG_PM=n. Define alternative functions for that situation. -- JWL) Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:03 -04:00
Jouni Malinen	636a5d3625	nl80211: Add MLME primitives to support external SME This patch adds new nl80211 commands to allow user space to request authentication and association (and also deauthentication and disassociation). The commands are structured to allow separate authentication and association steps, i.e., the interface between kernel and user space is similar to the MLME SAP interface in IEEE 802.11 standard and an user space application takes the role of the SME. The patch introduces MLME-AUTHENTICATE.request, MLME-{,RE}ASSOCIATE.request, MLME-DEAUTHENTICATE.request, and MLME-DISASSOCIATE.request primitives. The authentication and association commands request the actual operations in two steps (assuming the driver supports this; if not, separate authentication step is skipped; this could end up being a separate "connect" command). The initial implementation for mac80211 uses the current net/mac80211/mlme.c for actual sending and processing of management frames and the new nl80211 commands will just stop the current state machine from moving automatically from authentication to association. Future cleanup may move more of the MLME operations into cfg80211. The goal of this design is to provide more control of authentication and association process to user space without having to move the full MLME implementation. This should be enough to allow IEEE 802.11r FT protocol and 802.11s SAE authentication to be implemented. Obviously, this will also bring the extra benefit of not having to use WEXT for association requests with mac80211. An example implementation of a user space SME using the new nl80211 commands is available for wpa_supplicant. This patch is enough to get IEEE 802.11r FT protocol working with over-the-air mechanism (over-the-DS will need additional MLME primitives for handling the FT Action frames). Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:02 -04:00
Jouni Malinen	6039f6d23f	nl80211: Event notifications for MLME events Add new nl80211 event notifications (and a new multicast group, "mlme") for informing user space about received and processed Authentication, (Re)Association Response, Deauthentication, and Disassociation frames in station and IBSS modes (i.e., MLME SAP interface primitives MLME-AUTHENTICATE.confirm, MLME-ASSOCIATE.confirm, MLME-REASSOCIATE.confirm, MLME-DEAUTHENTICATE.indicate, and MLME-DISASSOCIATE.indication). The event data is encapsulated as the 802.11 management frame since we already have the frame in that format and it includes all the needed information. This is the initial step in providing MLME SAP interface for authentication and association with nl80211. In other words, kernel code will act as the MLME and a user space application can control it as the SME. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:02 -04:00
Jouni Malinen	a299542e97	mac80211: Fix reassociation by not clearing previous BSSID We must not clear the previous BSSID when roaming to another AP within the same ESS for reassociation to be used properly. It is fine to clear this when the SSID changes, so let's move the code into ieee80211_sta_set_ssid(). Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:01 -04:00
Jouni Malinen	4b4698c443	mac80211: Fix a typo in assoc vs. reassoc check Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:01 -04:00
Kalle Valo	a9a6ffffd0	mac80211: don't drop nullfunc frames during software scan ieee80211_tx_h_check_assoc() was dropping everything else than probe requests during software scan. So the nullfunc frame with the power save bit was dropped and AP never received it. This meant that AP never buffered any frames for the station during software scan. Fix this by allowing to transmit both probe request and nullfunc frames during software scan. Tested with stlc45xx. Signed-off-by: Kalle Valo <kalle.valo@nokia.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:13:00 -04:00
Johannes Berg	3b85875a25	nl80211: rework locking When I added scanning to cfg80211, we got a lock dependency like this: rtnl --> cfg80211_mtx nl80211, on the other hand, has the reverse lock dependency: cfg80211_mtx --> rtnl which clearly is a bad idea. This patch reworks nl80211 to take these two locks in the other order to fix the possible, and easily triggerable, deadlock. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:56 -04:00
Sujith	722f069a6d	mac80211: Tear down aggregation sessions for suspend/resume When the driver has been notified with a STA_REMOVE, it tears down the internal ADDBA state. On resume, trying to initiate aggregation would fail because mac80211 has not cleared the operational state for that <TID,STA>. This can be fixed by tearing down the existing sessions on a suspend. Also, the driver can initiate a new BA session when suspend is in progress. This is fixed by marking the station as being in suspend state and denying ADDBA requests for such STAs. Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:55 -04:00
Johannes Berg	7f0216a49b	mac80211: acquire sta_lock for station suspend/resume To avoid concurrent manipulations of the sta list (which shouldn't be possible at this point, but anyway) we need to hold the sta_lock around iterating the list. At the same time, we do not need to iterate the list at all if the driver doesn't want to be notified. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:53 -04:00
Johannes Berg	8fdc621dc7	nl80211: export supported commands This makes nl80211 export the supported commands (command groups) per wiphy so userspace has an idea what it can do -- this will be required reading for userspace when we introduce auth/assoc /or/ connect for older hardware that cannot separate auth and assoc. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:53 -04:00
Vasanthakumar Thiagarajan	ec30415f79	mac80211: Populate HT limitation with TKIP/WEP to the handler for SIOCSIWENCODE too Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:52 -04:00
Johannes Berg	aae89831df	wireless: radiotap updates Radiotap was updated to include a "bad PLCP" flag and standardise the "bad FCS" flag in the "flags" rather than "RX flags" field, this patch updates Linux to that standard. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:52 -04:00
Johannes Berg	25420604c8	mac80211: stop queues across suspend/resume Even though userland probably cannot submit packets, there might still be some coming, and that's no good when the driver doesn't expect them. Stop the queues across suspend/resume. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:52 -04:00
Johannes Berg	b5bde374f0	mac80211: fix warnings in ieee80211_if_config The last warning can never trigger, and the explicit AP_VLAN check is pointless if we move the config_interface check down, in practice config_interface is required anyway. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:52 -04:00
Helmut Schaa	11432379fd	mac80211: start pending scan after probe/auth/assoc timed out If a scan is queued in STA mode while the interface is in state direct probe, authenticate or associate the scan is delayed until the interface enters disabled or associated state. But in case of direct probe-, authentication- or association- timeout sta_work will not be scheduled anymore (without external trigger) and thus the pending scan is not executed and prevents a new scan from being triggered (-EBUSY). Fix this by queueing the sta work again after direct probe-, authentication- and association- timeout. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:45 -04:00
Johannes Berg	176be728ee	mac80211: remove ieee80211_num_regular_queues This inline is useless and actually makes the code _longer_ rather than shorter. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:42 -04:00
Reinette Chatre	633e24ed95	cfg80211/nl80211: remove usage of CONFIG_NL80211 The scan capability added to cfg80211/nl80211 introduced a dependency on nl80211 by cfg80211. We can thus no longer have just cfg80211 without nl80211. Specifically, cfg80211_scan_done() calls nl80211_send_scan_aborted() or nl80211_send_scan_done(). Now we remove the option for user to select nl80211. It will always be compiled if user selects cfg80211. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:42 -04:00
Alina Friedrichsen	fa56dddd67	mac80211: ieee80211_ibss_commit() cleanup Don't call ieee80211_sta_find_ibss() directly, like it's done in STA mode, so that the commit() call is more harmless respectively has less site-effects. Signed-off-by: Alina Friedrichsen <x-alina@gmx.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-27 20:12:41 -04:00
Linus Torvalds	3ae5080f4c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (37 commits) fs: avoid I_NEW inodes Merge code for single and multiple-instance mounts Remove get_init_pts_sb() Move common mknod_ptmx() calls into caller Parse mount options just once and copy them to super block Unroll essentials of do_remount_sb() into devpts vfs: simple_set_mnt() should return void fs: move bdev code out of buffer.c constify dentry_operations: rest constify dentry_operations: configfs constify dentry_operations: sysfs constify dentry_operations: JFS constify dentry_operations: OCFS2 constify dentry_operations: GFS2 constify dentry_operations: FAT constify dentry_operations: FUSE constify dentry_operations: procfs constify dentry_operations: ecryptfs constify dentry_operations: CIFS constify dentry_operations: AFS ...	2009-03-27 16:23:12 -07:00
ideawu	abd91ee979	sunrpc/svc.c: Remove unused line 'rqstp->rq_server = serv;' in svc_process There is no need to set rqstp->rq_server to serv, while serv is initialized as rqstp->rq_server at previous line. And between these two lines, there is no change to rqstp->rq_server. Signed-off-by: ideawu <ideawu@163.com> Reviewed-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-27 19:15:21 -04:00
Al Viro	3ba13d179e	constify dentry_operations: rest Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-03-27 14:44:03 -04:00
Ingo Molnar	6e15cf0486	Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2 Conflicts: arch/parisc/kernel/irq.c arch/x86/include/asm/fixmap_64.h arch/x86/include/asm/setup.h kernel/irq/handle.c Semantic merge: arch/x86/include/asm/fixmap.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-27 17:28:43 +01:00
Alan Cox	83e0bbcbe2	af_rose/x25: Sanity check the maximum user frame size Otherwise we can wrap the sizes and end up sending garbage. Closes #10423 Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-27 00:28:21 -07:00
Alan Cox	03ba999117	appletalk: this warning can go I think Its past 2.2 ... Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-27 00:27:18 -07:00
Chuck Ebbert	7d0b591c65	xfrm: spin_lock() should be spin_unlock() in xfrm_state.c spin_lock() should be spin_unlock() in xfrm_state_walk_done(). caused by: commit `12a169e7d8` "ipsec: Put dumpers on the dump list" Reported-by: Marc Milgram <mmilgram@redhat.com> Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-27 00:23:04 -07:00
Jesper Nilsson	71f6f6dfdf	ipv6: Plug sk_buff leak in ipv6_rcv (net/ipv6/ip6_input.c) Commit `778d80be52` (ipv6: Add disable_ipv6 sysctl to disable IPv6 operaion on specific interface) seems to have introduced a leak of sk_buff's for ipv6 traffic, at least in some configurations where idev is NULL, or when ipv6 is disabled via sysctl. The problem is that if the first condition of the if-statement returns non-NULL, it returns an skb with only one reference, and when the other conditions apply, execution jumps to the "out" label, which does not call kfree_skb for it. To plug this leak, change to use the "drop" label instead. (this relies on it being ok to call kfree_skb on NULL) This also allows us to avoid calling rcu_read_unlock here, and removes the only user of the "out" label. Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-27 00:17:45 -07:00
David S. Miller	01e6de64d9	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	2009-03-26 22:45:23 -07:00
Herbert Xu	8f1ead2d1a	GRO: Disable GRO on legacy netif_rx path When I fixed the GRO crash in the legacy receive path I used napi_complete to replace __napi_complete. Unfortunately they're not the same when NETPOLL is enabled, which may result in us not calling __napi_complete at all. What's more, we really do need to keep the __napi_complete call within the IRQ-off section since in theory an IRQ can occur in between and fill up the backlog to the maximum, causing us to lock up. Since we can't seem to find a fix that works properly right now, this patch reverts all the GRO support from the netif_rx path. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-26 22:24:28 -07:00
Linus Torvalds	8e9d208972	Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6 * 'bkl-removal' of git://git.lwn.net/linux-2.6: Rationalize fasync return values Move FASYNC bit handling to f_op->fasync() Use f_lock to protect f_flags Rename struct file->f_ep_lock	2009-03-26 16:14:02 -07:00
David S. Miller	08abe18af1	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ Conflicts: drivers/net/wimax/i2400m/usb-notif.c	2009-03-26 15:23:24 -07:00
Linus Torvalds	0c93ea4064	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (61 commits) Dynamic debug: fix pr_fmt() build error Dynamic debug: allow simple quoting of words dynamic debug: update docs dynamic debug: combine dprintk and dynamic printk sysfs: fix some bin_vm_ops errors kobject: don't block for each kobject_uevent sysfs: only allow one scheduled removal callback per kobj Driver core: Fix device_move() vs. dpm list ordering, v2 Driver core: some cleanup on drivers/base/sys.c Driver core: implement uevent suppress in kobject vcs: hook sysfs devices into object lifetime instead of "binding" driver core: fix passing platform_data driver core: move platform_data into platform_device sysfs: don't block indefinitely for unmapped files. driver core: move knode_bus into private structure driver core: move knode_driver into private structure driver core: move klist_children into private structure driver core: create a private portion of struct device driver core: remove polling for driver_probe_done(v5) sysfs: reference sysfs_dirent from sysfs inodes ... Fixed conflicts in drivers/sh/maple/maple.c manually	2009-03-26 11:17:04 -07:00
Linus Torvalds	562f477a54	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (29 commits) crypto: sha512-s390 - Add missing block size hwrng: timeriomem - Breaks an allyesconfig build on s390: nlattr: Fix build error with NET off crypto: testmgr - add zlib test crypto: zlib - New zlib crypto module, using pcomp crypto: testmgr - Add support for the pcomp interface crypto: compress - Add pcomp interface netlink: Move netlink attribute parsing support to lib crypto: Fix dead links hwrng: timeriomem - New driver crypto: chainiv - Use kcrypto_wq instead of keventd_wq crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq crypto: api - Use dedicated workqueue for crypto subsystem crypto: testmgr - Test skciphers with no IVs crypto: aead - Avoid infinite loop when nivaead fails selftest crypto: skcipher - Avoid infinite loop when cipher fails selftest crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention crypto: api - crypto_alg_mod_lookup either tested or untested crypto: amcc - Add crypt4xx driver crypto: ansi_cprng - Add maintainer ...	2009-03-26 11:04:34 -07:00
Holger Eitzenberger	d271e8bd8c	ctnetlink: compute generic part of event more acurately On a box with most of the optional Netfilter switches turned off some of the NLAs are never send, e. g. secmark, mark or the conntrack byte/packet counters. As a worst case scenario this may possibly still lead to ctnetlink skbs being reallocated in netlink_trim() later, loosing all the nice effects from the previous patches. I try to solve that (at least partly) by correctly #ifdef'ing the NLAs in the computation. Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-26 13:37:14 +01:00
David S. Miller	f0de70f8bb	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2009-03-26 01:22:01 -07:00
Rami Rosen	ede5ad0e29	net: core: remove unneeded include in net/core/utils.c. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-26 01:11:48 -07:00
Eric Leblond	7249dee5bd	netfilter: fix nf_logger name in ebt_ulog. This patch renames the ebt_ulog nf_logger from "ulog" to "ebt_ulog" to be in sync with other modules naming. As this name was currently only used for informational purpose, the renaming should be harmless. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-26 01:04:28 -07:00
Eric Leblond	3b334d427c	netfilter: fix warning in ebt_ulog init function. The ebt_ulog module does not follow the fixed convention about function return. Loading the module is triggering the following message: sys_init_module: 'ebt_ulog'->init suspiciously returned 1, it should follow 0/-E convention sys_init_module: loading module anyway... Pid: 2334, comm: modprobe Not tainted 2.6.29-rc5edenwall0-00883-g199e57b #146 Call Trace: [<c0441b81>] ? printk+0xf/0x16 [<c02311af>] sys_init_module+0x107/0x186 [<c0202cfa>] syscall_call+0x7/0xb The following patch fixes the return treatment in ebt_ulog_init() function. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-26 01:04:02 -07:00
Eric Leblond	704b3ea3b9	netfilter: fix warning about invalid const usage This patch fixes the declaration of the logger structure in ebt_log and ebt_ulog: I forgot to remove the const option from their declaration in the commit `ca735b3aaa` ("netfilter: use a linked list of loggers"). Pointed-out-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-26 01:03:23 -07:00
Stephen Hemminger	cda6d377ec	bridge: bad error handling when adding invalid ether address This fixes an crash when empty bond device is added to a bridge. If an interface with invalid ethernet address (all zero) is added to a bridge, then bridge code detects it when setting up the forward databas entry. But the error unwind is broken, the bridge port object can get freed twice: once when ref count went to zeo, and once by kfree. Since object is never really accessible, just free it. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-25 21:01:47 -07:00
Holger Eitzenberger	a400c30edb	netfilter: nf_conntrack: calculate per-protocol nlattr size Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 21:53:39 +01:00
Holger Eitzenberger	5c0de29d06	netfilter: nf_conntrack: add generic function to get len of generic policy Usefull for all protocols which do not add additional data, such as GRE or UDPlite. Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 21:52:17 +01:00
Holger Eitzenberger	2732c4e45b	netfilter: ctnetlink: allocate right-sized ctnetlink skb Try to allocate a Netlink skb roughly the size of the actual message, with the help from the l3 and l4 protocol helpers. This is all to prevent a reallocation in netlink_trim() later. The overhead of allocating the right-sized skb is rather small, with ctnetlink_alloc_skb() actually being inlined away on my x86_64 box. The size of the per-proto space is determined at registration time of the protocol helper. Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 21:50:59 +01:00
Eric Dumazet	ea781f197d	netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu() Use "hlist_nulls" infrastructure we added in 2.6.29 for RCUification of UDP & TCP. This permits an easy conversion from call_rcu() based hash lists to a SLAB_DESTROY_BY_RCU one. Avoiding call_rcu() delay at nf_conn freeing time has numerous gains. First, it doesnt fill RCU queues (up to 10000 elements per cpu). This reduces OOM possibility, if queued elements are not taken into account This reduces latency problems when RCU queue size hits hilimit and triggers emergency mode. - It allows fast reuse of just freed elements, permitting better use of CPU cache. - We delete rcu_head from "struct nf_conn", shrinking size of this structure by 8 or 16 bytes. This patch only takes care of "struct nf_conn". call_rcu() is still used for less critical conntrack parts, that may be converted later if necessary. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 21:05:46 +01:00
Patrick McHardy	1f9352ae22	netfilter: {ip,ip6,arp}_tables: fix incorrect loop detection Commit `e1b4b9f` ([NETFILTER]: {ip,ip6,arp}_tables: fix exponential worst-case search for loops) introduced a regression in the loop detection algorithm, causing sporadic incorrectly detected loops. When a chain has already been visited during the check, it is treated as having a standard target containing a RETURN verdict directly at the beginning in order to not check it again. The real target of the first rule is then incorrectly treated as STANDARD target and checked not to contain invalid verdicts. Fix by making sure the rule does actually contain a standard target. Based on patch by Francis Dupont <Francis_Dupont@isc.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 19:26:35 +01:00
Holger Eitzenberger	af9d32ad67	netfilter: limit the length of the helper name This is necessary in order to have an upper bound for Netlink message calculation, which is not a problem at all, as there are no helpers with a longer name. Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 18:44:01 +01:00
Holger Eitzenberger	e487eb99cf	netlink: add nla_policy_len() It calculates the max. length of a Netlink policy, which is usefull for allocating Netlink buffers roughly the size of the actual message. Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 18:26:30 +01:00
Holger Eitzenberger	d0dba7255b	netfilter: ctnetlink: add callbacks to the per-proto nlattrs There is added a single callback for the l3 proto helper. The two callbacks for the l4 protos are necessary because of the general structure of a ctnetlink event, which is in short: CTA_TUPLE_ORIG <l3/l4-proto-attributes> CTA_TUPLE_REPLY <l3/l4-proto-attributes> CTA_ID ... CTA_PROTOINFO <l4-proto-attributes> CTA_TUPLE_MASTER <l3/l4-proto-attributes> Therefore the formular is size := sizeof(generic-nlas) + 3 * sizeof(tuple_nlas) + sizeof(protoinfo_nlas) Some of the NLAs are optional, e. g. CTA_TUPLE_MASTER, which is only set if it's an expected connection. But the number of optional NLAs is small enough to prevent netlink_trim() from reallocating if calculated properly. Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 18:24:48 +01:00
Eric Dumazet	b8dfe49877	netfilter: factorize ifname_compare() We use same not trivial helper function in four places. We can factorize it. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 17:31:52 +01:00
Eric Dumazet	78f3648601	netfilter: nf_conntrack: use hlist_add_head_rcu() in nf_conntrack_set_hashsize() Using hlist_add_head() in nf_conntrack_set_hashsize() is quite dangerous. Without any barrier, one CPU could see a loop while doing its lookup. Its true new table cannot be seen by another cpu, but previous table is still readable. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 17:24:34 +01:00
Patrick McHardy	a9a9adfe2f	netfilter: fix xt_LED build failure net/netfilter/xt_LED.c:40: error: field netfilter_led_trigger has incomplete type net/netfilter/xt_LED.c: In function led_timeout_callback: net/netfilter/xt_LED.c:78: warning: unused variable ledinternal net/netfilter/xt_LED.c: In function led_tg_check: net/netfilter/xt_LED.c:102: error: implicit declaration of function led_trigger_register net/netfilter/xt_LED.c: In function led_tg_destroy: net/netfilter/xt_LED.c:135: error: implicit declaration of function led_trigger_unregister Fix by adding a dependency on LED_TRIGGERS. Reported-by: Sachin Sant <sachinp@in.ibm.com> Tested-by: Subrata Modak <tosubrata@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 17:21:34 +01:00
Vlad Yasevich	b2f5e7cd3d	ipv6: Fix conflict resolutions during ipv6 binding The ipv6 version of bind_conflict code calls ipv6_rcv_saddr_equal() which at times wrongly identified intersections between addresses. It particularly broke down under a few instances and caused erroneous bind conflicts. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 19:49:11 -07:00
Vlad Yasevich	63d9950b08	ipv6: Make v4-mapped bindings consistent with IPv4 Binding to a v4-mapped address on an AF_INET6 socket should produce the same result as binding to an IPv4 address on AF_INET socket. The two are interchangable as v4-mapped address is really a portability aid. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 19:49:10 -07:00
Vlad Yasevich	0f8d3c7ac3	ipv6: Allow ipv4 wildcard binds after ipv6 address binds The IPv4 wildcard (0.0.0.0) address does not intersect in any way with explicit IPv6 addresses. These two should be permitted, but the IPv4 conflict code checks the ipv6only bit as part of the test. Since binding to an explicit IPv6 address restricts the socket to only that IPv6 address, the side-effect is that the socket behaves as v6-only. By explicitely setting ipv6only in this case, allows the 2 binds to succeed. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 19:49:10 -07:00
Vlad Yasevich	783ed5a783	ipv6: Disallow binding to v4-mapped address on v6-only socket. A socket marked v6-only, can not receive or send traffic to v4-mapped addresses. Thus allowing binding to v4-mapped address on such a socket makes no sense. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 19:49:09 -07:00
David S. Miller	c80dd2da73	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2009-03-24 16:38:53 -07:00
Jason Baron	e9d376f0fa	dynamic debug: combine dprintk and dynamic printk This patch combines Greg Bank's dprintk() work with the existing dynamic printk patchset, we are now calling it 'dynamic debug'. The new feature of this patchset is a richer /debugfs control file interface, (an example output from my system is at the bottom), which allows fined grained control over the the debug output. The output can be controlled by function, file, module, format string, and line number. for example, enabled all debug messages in module 'nf_conntrack': echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control to disable them: echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control A further explanation can be found in the documentation patch. Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-03-24 16:38:26 -07:00
Cornelia Huck	ffa6a7054d	Driver core: Fix device_move() vs. dpm list ordering, v2 dpm_list currently relies on the fact that child devices will be registered after their parents to get a correct suspend order. Using device_move() however destroys this assumption, as an already registered device may be moved under a newly registered one. This patch adds a new argument to device_move(), allowing callers to specify how dpm_list should be adapted. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-03-24 16:38:26 -07:00
Pablo Neira Ayuso	38938bfe34	netlink: add NETLINK_NO_ENOBUFS socket flag This patch adds the NETLINK_NO_ENOBUFS socket flag. This flag can be used by unicast and broadcast listeners to avoid receiving ENOBUFS errors. Generally speaking, ENOBUFS errors are useful to notify two things to the listener: a) You may increase the receiver buffer size via setsockopt(). b) You have lost messages, you may be out of sync. In some cases, ignoring ENOBUFS errors can be useful. For example: a) nfnetlink_queue: this subsystem does not have any sort of resync method and you can decide to ignore ENOBUFS once you have set a given buffer size. b) ctnetlink: you can use this together with the socket flag NETLINK_BROADCAST_SEND_ERROR to stop getting ENOBUFS errors as you do not need to resync (packets whose event are not delivered are drop to provide reliable logging and state-synchronization). Moreover, the use of NETLINK_NO_ENOBUFS also reduces a "go up, go down" effect in terms of performance which is due to the netlink congestion control when the listener cannot back off. The effect is the following: 1) throughput rate goes up and netlink messages are inserted in the receiver buffer. 2) Then, netlink buffer fills and overruns (set on nlk->state bit 0). 3) While the listener empties the receiver buffer, netlink keeps dropping messages. Thus, throughput goes dramatically down. 4) Then, once the listener has emptied the buffer (nlk->state bit 0 is set off), goto step 1. This effect is easy to trigger with netlink broadcast under heavy load, and it is more noticeable when using a big receiver buffer. You can find some results in [1] that show this problem. [1] http://1984.lsi.us.es/linux/netlink/ This patch also includes the use of sk_drop to account the number of netlink messages drop due to overrun. This value is shown in /proc/net/netlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 16:37:55 -07:00
Eric Dumazet	35c7f6de73	arp_tables: ifname_compare() can assume 16bit alignment Arches without efficient unaligned access can still perform a loop assuming 16bit alignment in ifname_compare() Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 14:15:22 -07:00
Jan Engelhardt	8dd1d0471b	netfilter: trivial Kconfig spelling fixes Supplements commit `67c0d57930`. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 13:35:27 -07:00
David S. Miller	b5bb14386e	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	2009-03-24 13:24:36 -07:00
Eric Dumazet	1d45209d89	netfilter: nf_conntrack: Reduce conntrack count in nf_conntrack_free() We use RCU to defer freeing of conntrack structures. In DOS situation, RCU might accumulate about 10.000 elements per CPU in its internal queues. To get accurate conntrack counts (at the expense of slightly more RAM used), we might consider conntrack counter not taking into account "about to be freed elements, waiting in RCU queues". We thus decrement it in nf_conntrack_free(), not in the RCU callback. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Tested-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-24 14:26:50 +01:00
Vitaly Mayatskikh	30842f2989	udp: Wrong locking code in udp seq_file infrastructure Reading zero bytes from /proc/net/udp or other similar files which use the same seq_file udp infrastructure panics kernel in that way: ===================================== [ BUG: bad unlock balance detected! ] ------------------------------------- read/1985 is trying to release lock (&table->hash[i].lock) at: [<ffffffff81321d83>] udp_seq_stop+0x27/0x29 but there are no more locks to release! other info that might help us debug this: 1 lock held by read/1985: #0: (&p->lock){--..}, at: [<ffffffff810eefb6>] seq_read+0x38/0x348 stack backtrace: Pid: 1985, comm: read Not tainted 2.6.29-rc8 #9 Call Trace: [<ffffffff81321d83>] ? udp_seq_stop+0x27/0x29 [<ffffffff8106dab9>] print_unlock_inbalance_bug+0xd6/0xe1 [<ffffffff8106db62>] lock_release_non_nested+0x9e/0x1c6 [<ffffffff810ef030>] ? seq_read+0xb2/0x348 [<ffffffff8106bdba>] ? mark_held_locks+0x68/0x86 [<ffffffff81321d83>] ? udp_seq_stop+0x27/0x29 [<ffffffff8106dde7>] lock_release+0x15d/0x189 [<ffffffff8137163c>] _spin_unlock_bh+0x1e/0x34 [<ffffffff81321d83>] udp_seq_stop+0x27/0x29 [<ffffffff810ef239>] seq_read+0x2bb/0x348 [<ffffffff810eef7e>] ? seq_read+0x0/0x348 [<ffffffff8111aedd>] proc_reg_read+0x90/0xaf [<ffffffff810d878f>] vfs_read+0xa6/0x103 [<ffffffff8106bfac>] ? trace_hardirqs_on_caller+0x12f/0x153 [<ffffffff810d88a2>] sys_read+0x45/0x69 [<ffffffff8101123a>] system_call_fastpath+0x16/0x1b BUG: scheduling while atomic: read/1985/0xffffff00 INFO: lockdep is turned off. Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table dm_multipath kvm ppdev snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event arc4 snd_s eq ecb thinkpad_acpi snd_seq_device iwl3945 hwmon sdhci_pci snd_pcm_oss sdhci rfkill mmc_core snd_mixer_oss i2c_i801 mac80211 yenta_socket ricoh_mmc i2c_core iTCO_wdt snd_pcm iTCO_vendor_support rs rc_nonstatic snd_timer snd lib80211 cfg80211 soundcore snd_page_alloc video parport_pc output parport e1000e [last unloaded: scsi_wait_scan] Pid: 1985, comm: read Not tainted 2.6.29-rc8 #9 Call Trace: [<ffffffff8106b456>] ? __debug_show_held_locks+0x1b/0x24 [<ffffffff81043660>] __schedule_bug+0x7e/0x83 [<ffffffff8136ede9>] schedule+0xce/0x838 [<ffffffff810d7972>] ? fsnotify_access+0x5f/0x67 [<ffffffff810112d0>] ? sysret_careful+0xb/0x37 [<ffffffff8106be9c>] ? trace_hardirqs_on_caller+0x1f/0x153 [<ffffffff8137127b>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff810112f6>] sysret_careful+0x31/0x37 read[1985]: segfault at 7fffc479bfe8 ip 0000003e7420a180 sp 00007fffc479bfa0 error 6 Kernel panic - not syncing: Aiee, killing interrupt handler! udp_seq_stop() tries to unlock not yet locked spinlock. The lock was lost during splitting global udp_hash_lock to subsequent spinlocks. Signed-off by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-23 15:22:33 -07:00
David S. Miller	8be7cdccac	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ucc_geth.c	2009-03-23 13:35:04 -07:00
Linus Torvalds	d56ffd38a9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits) ucc_geth: Fix oops when using fixed-link support dm9000: locking bugfix net: update dnet.c for bus_id removal dnet: DNET should depend on HAS_IOMEM dca: add missing copyright/license headers nl80211: Check that function pointer != NULL before using it sungem: missing net_device_ops be2net: fix to restore vlan ids into BE2 during a IF DOWN->UP cycle be2net: replenish when posting to rx-queue is starved in out of mem conditions bas_gigaset: correctly allocate USB interrupt transfer buffer smsc911x: reset last known duplex and carrier on open sh_eth: Fix mistake of the address of SH7763 sh_eth: Change handling of IRQ netns: oops in ip[6]_frag_reasm incrementing stats net: kfree(napi->skb) => kfree_skb net: fix sctp breakage ipv6: fix display of local and remote sit endpoints net: Document /proc/sys/net/core/netdev_budget tulip: fix crash on iface up with shirq debug virtio_net: Make virtio_net support carrier detection ...	2009-03-23 09:25:58 -07:00
Mark H. Weaver	534f81a506	netfilter: nf_conntrack_tcp: fix unaligned memory access in tcp_sack This patch fixes an unaligned memory access in tcp_sack while reading sequence numbers from TCP selective acknowledgement options. Prior to applying this patch, upstream linux-2.6.27.20 was occasionally generating messages like this on my sparc64 system: [54678.532071] Kernel unaligned access at TPC[6b17d4] tcp_packet+0xcd4/0xd00 Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-23 13:46:12 +01:00
Pablo Neira Ayuso	dd5b6ce6fd	nefilter: nfnetlink: add nfnetlink_set_err and use it in ctnetlink This patch adds nfnetlink_set_err() to propagate the error to netlink broadcast listener in case of memory allocation errors in the message building. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-23 13:21:06 +01:00
Eric Leblond	176252746e	netfilter: sysctl support of logger choice This patchs adds support of modification of the used logger via sysctl. It can be used to change the logger to module that can not use the bind operation (ipt_LOG and ipt_ULOG). For this purpose, it creates a directory /proc/sys/net/netfilter/nf_log which contains a file per-protocol. The content of the file is the name current logger (NONE if not set) and a logger can be setup by simply echoing its name to the file. By echoing "NONE" to a /proc/sys/net/netfilter/nf_log/PROTO file, the logger corresponding to this PROTO is set to NULL. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-23 13:16:53 +01:00
John Dykstra	96e0bf4b51	tcp: Discard segments that ack data not yet sent Discard incoming packets whose ack field iincludes data not yet sent. This is consistent with RFC 793 Section 3.9. Change tcp_ack() to distinguish between too-small and too-large ack field values. Keep segments with too-large ack fields out of the fast path, and change slow path to discard them. Reported-by: Oliver Zheng <mailinglists+netdev@oliverzheng.com> Signed-off-by: John Dykstra <john.dykstra1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-22 21:49:57 -07:00
Stephen Hemminger	d44c3a2e0e	netdev: expose net_device_ops compat as config option Now that most network device drivers in (all but one in x86_64 allmodconfig) support net_device_ops. Expose it as a configuration parameter. Still need to address even older 32 bit drivers, and other arch before compatiablity can be scheduled for removal in some future release. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 22:55:36 -07:00
Stephen Hemminger	9cc8ba783d	irlan: convert to net_device_ops Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:19:16 -07:00
Stephen Hemminger	92bcd4fe9a	irda: net_device_ops ioctl fix Need to reference net_device_ops not old pointer. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:19:14 -07:00
Stephen Hemminger	dde0975855	atm: convert clip driver to net_device_ops Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:19:12 -07:00
Stephen Hemminger	788dee0a95	atm: convert mpc device to using netdev_ops This converts the mpc device to using new netdevice_ops. Compile tested only, needs more than usual review since device was swaping pointers around etc. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:19:12 -07:00
Lennert Buytenhek	e84665c9cb	dsa: add switch chip cascading support The initial version of the DSA driver only supported a single switch chip per network interface, while DSA-capable switch chips can be interconnected to form a tree of switch chips. This patch adds support for multiple switch chips on a network interface. An example topology for a 16-port device with an embedded CPU is as follows: +-----+ +--------+ +--------+ \| \|eth0 10\| switch \|9 10\| switch \| \| CPU +----------+ +-------+ \| \| \| \| chip 0 \| \| chip 1 \| +-----+ +---++---+ +---++---+ \|\| \|\| \|\| \|\| \|\|1000baseT \|\|1000baseT \|\|ports 1-8 \|\|ports 9-16 This requires a couple of interdependent changes in the DSA layer: - The dsa platform driver data needs to be extended: there is still only one netdevice per DSA driver instance (eth0 in the example above), but each of the switch chips in the tree needs its own mii_bus device pointer, MII management bus address, and port name array. (include/net/dsa.h) The existing in-tree dsa users need some small changes to deal with this. (arch/arm) - The DSA and Ethertype DSA tagging modules need to be extended to use the DSA device ID field on receive and demultiplex the packet accordingly, and fill in the DSA device ID field on transmit according to which switch chip the packet is heading to. (net/dsa/tag_{dsa,edsa}.c) - The concept of "CPU port", which is the switch chip port that the CPU is connected to (port 10 on switch chip 0 in the example), needs to be extended with the concept of "upstream port", which is the port on the switch chip that will bring us one hop closer to the CPU (port 10 for both switch chips in the example above). - The dsa platform data needs to specify which ports on which switch chips are links to other switch chips, so that we can enable DSA tagging mode on them. (For inter-switch links, we always use non-EtherType DSA tagging, since it has lower overhead. The CPU link uses dsa or edsa tagging depending on what the 'root' switch chip supports.) This is done by specifying "dsa" for the given port in the port array. - The dsa platform data needs to be extended with information on via which port to reach any given switch chip from any given switch chip. This info is specified via the per-switch chip data struct ->rtable[] array, which gives the nexthop ports for each of the other switches in the tree. For the example topology above, the dsa platform data would look something like this: static struct dsa_chip_data sw[2] = { { .mii_bus = &foo, .sw_addr = 1, .port_names[0] = "p1", .port_names[1] = "p2", .port_names[2] = "p3", .port_names[3] = "p4", .port_names[4] = "p5", .port_names[5] = "p6", .port_names[6] = "p7", .port_names[7] = "p8", .port_names[9] = "dsa", .port_names[10] = "cpu", .rtable = (s8 []){ -1, 9, }, }, { .mii_bus = &foo, .sw_addr = 2, .port_names[0] = "p9", .port_names[1] = "p10", .port_names[2] = "p11", .port_names[3] = "p12", .port_names[4] = "p13", .port_names[5] = "p14", .port_names[6] = "p15", .port_names[7] = "p16", .port_names[10] = "dsa", .rtable = (s8 []){ 10, -1, }, }, }, static struct dsa_platform_data pd = { .netdev = &foo, .nr_switches = 2, .sw = sw, }; Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Gary Thomas <gary@mlbassoc.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:06:54 -07:00
Lennert Buytenhek	076d3e10a5	dsa: add support for the Marvell 88E6095/6095F switch chips Add support for the Marvell 88E6095/6095F switch chips. These chips are similar to the 88e6131, so we can add the support to mv88e6131.c easily. Thanks to Gary Thomas <gary@mlbassoc.com> and Jesper Dangaard Brouer <hawk@diku.dk> for testing various patches. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Gary Thomas <gary@mlbassoc.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:06:54 -07:00
Lennert Buytenhek	c084080151	dsa: set ->iflink on slave interfaces to the ifindex of the parent ..so that we can parse the DSA topology from 'ip link' output: 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000 4: lan1@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue 5: lan2@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue 6: lan3@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue 7: lan4@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:06:53 -07:00
Stephen Hemminger	fa665ccf01	ipx: use constant for strings and desciptor Fix compiler warning about non-const format string. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:06:51 -07:00
Stephen Hemminger	7ca98fa234	snap: use const for descriptor Protocols should be able to use constant value for the descriptor. Minor whitespace cleanup as well Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:06:50 -07:00
Eric Dumazet	ed734a97c6	net: remove useless prefetch() call There is no gain using prefetch() in dev_hard_start_xmit(), since we already had to read ops->ndo_select_queue pointer in dev_pick_tx(), and both pointers are probably located in the same cache line. This prefetch call slows down fast path because of a stall in address computation. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:42:55 -07:00
Vlad Yasevich	8d2f9e8116	sctp: Clean up TEST_FRAME hacks. Remove 2 TEST_FRAME hacks that are no longer needed. These allowed sctp regression tests to compile before, but are no longer needed. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:41:09 -07:00
Stephen Hemminger	9247744e5e	skb: expose and constify hash primitives Some minor changes to queue hashing: 1. Use const on accessor functions 2. Export skb_tx_hash for use in drivers (see ixgbe) Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:39:26 -07:00
Stephen Hemminger	1f1900f935	atm: lec use dev_change_mtu Rather than calling device pointer directly (which is incorrect with net_device_ops), use the standard dev_change_mtu. Compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:37:28 -07:00
Ilpo Järvinen	a0bffffc14	net/*: use linux/kernel.h swap() tcp_sack_swap seems unnecessary so I pushed swap to the caller. Also removed comment that seemed then pointless, and added include when not already there. Compile tested. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:36:17 -07:00
Bernard Pidoux	a3ac80a130	netrom: zero length frame filtering in NetRom A zero length frame filter was recently introduced in ROSE protocole. Previous commit makes the same at AX25 protocole level. This patch has the same purpose for NetRom protocole. The reason is that empty frames have no meaning in NetRom protocole. Signed-off-by: Bernard Pidoux <f6bvp@amsat.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:34:20 -07:00
Bernard Pidoux	f99bcff7a2	ax25: zero length frame filtering in AX25 In previous commit `244f46ae6e` was introduced a zero length frame filter for ROSE protocole. This patch has the same purpose at AX25 frame level for the same reason. Empty frames have no meaning in AX25 protocole. Signed-off-by: Bernard Pidoux <f6bvp@amsat.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:33:55 -07:00
Bernard Pidoux	60784427ab	ax25: SOCK_DEBUG message simplification This patch condenses two debug messages in one. Signed-off-by: Bernard Pidoux <f6bvp@amsat.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:33:18 -07:00
Jouni Malinen	f3f9258678	nl80211: Check that function pointer != NULL before using it NL80211_CMD_GET_MESH_PARAMS and NL80211_CMD_SET_MESH_PARAMS handlers did not verify whether a function pointer is NULL (not supported by the driver) before trying to call the function. The former nl80211 command is available for unprivileged users, too, so this can potentially allow normal users to kill networking (or worse..) if mac80211 is built without CONFIG_MAC80211_MESH=y. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-20 16:01:57 -04:00
David S. Miller	2b1c4354de	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/virtio_net.c	2009-03-20 02:27:41 -07:00
Tom Talpey	2e3c230bc7	SVCRDMA: fix recent printk format warnings. printk formats in prior commit were reversed/incorrect. Compiled without warning on x86 and x86_64, but detected on ppc. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:37 -04:00
Trond Myklebust	55420c24a0	SUNRPC: Ensure we close the socket on EPIPE errors too... As long as one task is holding the socket lock, then calls to xprt_force_disconnect(xprt) will not succeed in shutting down the socket. In particular, this would mean that a server initiated shutdown will not succeed until the lock is relinquished. In order to avoid the deadlock, we should ensure that xs_tcp_send_request() closes the socket on EPIPE errors too. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:36 -04:00
Trond Myklebust	b61d59fffd	SUNRPC: xs_tcp_connect_worker{4,6}: merge common code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:35 -04:00
Trond Myklebust	25fe6142a5	SUNRPC: Add a sysctl to control the duration of the socket linger timeout Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:34 -04:00
Trond Myklebust	7d1e8255cf	SUNRPC: Add the equivalent of the linger and linger2 timeouts to RPC sockets This fixes a regression against FreeBSD servers as reported by Tomas Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers don't ever react to the client closing the socket, and so commit `e06799f958` (SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket) causes the setup to hang forever whenever the client attempts to close and then reconnect. We break the deadlock by adding a 'linger2' style timeout to the socket, after which, the client will abort the connection using a TCP 'RST'. The default timeout is set to 15 seconds. A subsequent patch will put it under user control by means of a systctl. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:34 -04:00
Jorge Boncompte [DTI2]	2bad35b7c9	netns: oops in ip[6]_frag_reasm incrementing stats dev can be NULL in ip[6]_frag_reasm for skb's coming from RAW sockets. Quagga's OSPFD sends fragmented packets on a RAW socket, when netfilter conntrack reassembles them on the OUTPUT path you hit this code path. You can test it with something like "hping2 -0 -d 2000 -f AA.BB.CC.DD" With help from Jarek Poplawski. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 23:26:11 -07:00
Roel Kluin	e4a389a9b5	net: kfree(napi->skb) => kfree_skb struct sk_buff pointers should be freed with kfree_skb. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 23:12:13 -07:00
Al Viro	cb0dc77de0	net: fix sctp breakage broken by commit 5e739d1752aca4e8f3e794d431503bfca3162df4; AFAICS should be -stable fodder as well... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Aced-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 19:12:42 -07:00
Stephen Hemminger	4b704d59d6	tipc: fix non-const printf format arguments Fix warnings from current gcc about using non-const strings as printf args in TIPC. Compile tested only (not a TIPC user). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 19:11:29 -07:00
Bjørn Mork	1b1d8f73a4	ipv6: fix display of local and remote sit endpoints This fixes the regressions cause by commit `1326c3d5a4` (v2.6.28-rc6-461-g23a12b1) broke the display of local and remote addresses of an SIT tunnel in iproute2. nt->parms is used by ipip6_tunnel_init() and therefore need to be initialized first. Tracked as http://bugzilla.kernel.org/show_bug.cgi?id=12868 Reported-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 18:56:54 -07:00
Rami Rosen	beedad923a	tcp: remove parameter from tcp_recv_urg(). This patch removes an unused parameter (addr_len) from tcp_recv_urg() method in net/ipv4/tcp.c. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 18:50:09 -07:00
Brian Haley	9bdd8d40c8	ipv6: Fix incorrect disable_ipv6 behavior Fix the behavior of allowing both sysctl and addrconf_dad_failure() to set the disable_ipv6 parameter without any bad side-effects. If DAD fails and accept_dad > 1, we will still set disable_ipv6=1, but then instead of allowing an RA to add an address then immediately fail DAD, we simply don't allow the address to be added in the first place. This also lets the user set this flag and disable all IPv6 addresses on the interface, or on the entire system. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 18:22:48 -07:00
Olga Kornievskaia	47a14ef1af	svcrpc: take advantage of tcp autotuning Allow the NFSv4 server to make use of TCP autotuning behaviour, which was previously disabled by setting the sk_userlocks variable. Set the receive buffers to be big enough to receive the whole RPC request, and set this for the listening socket, not the accept socket. Remove the code that readjusts the receive/send buffer sizes for the accepted socket. Previously this code was used to influence the TCP window management behaviour, which is no longer needed when autotuning is enabled. This can improve IO bandwidth on networks with high bandwidth-delay products, where a large tcp window is required. It also simplifies performance tuning, since getting adequate tcp buffers previously required increasing the number of nfsd threads. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Cc: Jim Rees <rees@umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:46:59 -04:00
Greg Banks	03cf6c9f49	knfsd: add file to export stats about nfsd pools Add /proc/fs/nfsd/pool_stats to export to userspace various statistics about the operation of rpc server thread pools. This patch is based on a forward-ported version of knfsd-add-pool-thread-stats which has been shipping in the SGI "Enhanced NFS" product since 2006 and which was previously posted: http://article.gmane.org/gmane.linux.nfs/10375 It has also been updated thus: * moved EXPORT_SYMBOL() to near the function it exports * made the new struct struct seq_operations const * used SEQ_START_TOKEN instead of ((void )1) merged fix from SGI PV 990526 "sunrpc: use dprintk instead of printk in svc_pool_stats_()" by Harshula Jayasuriya. merged fix from SGI PV 964001 "Crash reading pool_stats before nfsds are started". Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: Harshula Jayasuriya <harshula@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:42 -04:00
Greg Banks	59a252ff8c	knfsd: avoid overloading the CPU scheduler with enormous load averages Avoid overloading the CPU scheduler with enormous load averages when handling high call-rate NFS loads. When the knfsd bottom half is made aware of an incoming call by the socket layer, it tries to choose an nfsd thread and wake it up. As long as there are idle threads, one will be woken up. If there are lot of nfsd threads (a sensible configuration when the server is disk-bound or is running an HSM), there will be many more nfsd threads than CPUs to run them. Under a high call-rate low service-time workload, the result is that almost every nfsd is runnable, but only a handful are actually able to run. This situation causes two significant problems: 1. The CPU scheduler takes over 10% of each CPU, which is robbing the nfsd threads of valuable CPU time. 2. At a high enough load, the nfsd threads starve userspace threads of CPU time, to the point where daemons like portmap and rpc.mountd do not schedule for tens of seconds at a time. Clients attempting to mount an NFS filesystem timeout at the very first step (opening a TCP connection to portmap) because portmap cannot wake up from select() and call accept() in time. Disclaimer: these effects were observed on a SLES9 kernel, modern kernels' schedulers may behave more gracefully. The solution is simple: keep in each svc_pool a counter of the number of threads which have been woken but have not yet run, and do not wake any more if that count reaches an arbitrary small threshold. Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16 synthetic client threads simulating an rsync (i.e. recursive directory listing) workload reading from an i386 RH9 install image (161480 regular files in 10841 directories) on the server. That tree is small enough to fill in the server's RAM so no disk traffic was involved. This setup gives a sustained call rate in excess of 60000 calls/sec before being CPU-bound on the server. The server was running 128 nfsds. Profiling showed schedule() taking 6.7% of every CPU, and __wake_up() taking 5.2%. This patch drops those contributions to 3.0% and 2.2%. Load average was over 120 before the patch, and 20.9 after. This patch is a forward-ported version of knfsd-avoid-nfsd-overload which has been shipping in the SGI "Enhanced NFS" product since 2006. It has been posted before: http://article.gmane.org/gmane.linux.nfs/10374 Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:41 -04:00
Patrick McHardy	0f5b3e85a3	netfilter: ctnetlink: fix rcu context imbalance Introduced by `7ec47496` (netfilter: ctnetlink: cleanup master conntrack assignation): net/netfilter/nf_conntrack_netlink.c:1275:2: warning: context imbalance in 'ctnetlink_create_conntrack' - different lock contexts for basic block Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-18 17:36:40 +01:00
Florian Westphal	711d60a9e7	netfilter: remove nf_ct_l4proto_find_get/nf_ct_l4proto_put users have been moved to __nf_ct_l4proto_find. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-18 17:30:50 +01:00
Florian Westphal	cd91566e4b	netfilter: ctnetlink: remove remaining module refcounting Convert the remaining refcount users. As pointed out by Patrick McHardy, the protocols can be accessed safely using RCU. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-18 17:28:37 +01:00
David S. Miller	af4330631c	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2009-03-17 15:04:31 -07:00
David S. Miller	2d6a5e9500	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/igb/igb_main.c drivers/net/qlge/qlge_main.c drivers/net/wireless/ath9k/ath9k.h drivers/net/wireless/ath9k/core.h drivers/net/wireless/ath9k/hw.c	2009-03-17 15:01:30 -07:00
David S. Miller	f10023a4ef	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2009-03-17 14:29:22 -07:00
David S. Miller	4ada8107f4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2009-03-17 13:12:47 -07:00
Herbert Xu	303c6a0251	gro: Fix legacy path napi_complete crash On the legacy netif_rx path, I incorrectly tried to optimise the napi_complete call by using __napi_complete before we reenable IRQs. This simply doesn't work since we need to flush the held GRO packets first. This patch fixes it by doing the obvious thing of reenabling IRQs first and then calling napi_complete. Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-17 13:11:29 -07:00
Herbert Xu	2ffb455819	gro: Fix vlan/netpoll check again Jarek Poplawski pointed out that my previous fix is broken for VLAN+netpoll as if netpoll is enabled we'd end up in the normal receive path instead of the VLAN receive path. This patch fixes it by calling the VLAN receive hook. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-17 13:10:52 -07:00
Luis R. Rodriguez	73d54c9e74	cfg80211: add regulatory netlink multicast group This allows us to send to userspace "regulatory" events. For now we just send an event when we change regulatory domains. We also notify userspace when devices are using their own custom world roaming regulatory domains. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:40 -04:00
Luis R. Rodriguez	7db90f4a25	cfg80211: move enum reg_set_by to nl80211.h We do this so we can later inform userspace who set the regulatory domain and provide details of the request. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:40 -04:00
Luis R. Rodriguez	0fee54cab7	cfg80211: remove REGDOM_SET_BY_INIT This is not used as we can always just assume the first regulatory domain set will _always_ be a static regulatory domain. REGDOM_SET_BY_CORE will be the first request from cfg80211 for a regdomain and that then populates the first regulatory request. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:39 -04:00
Herton Ronaldo Krzesinski	1a28c78b46	mac80211: deauth before flushing STA information Even after commit "mac80211: deauth when interface is marked down" (`e327b847` on Linus tree), userspace still isn't notified when interface goes down. There isn't a problem with this commit, but because of other code changes it doesn't work on kernels >= 2.6.28 (works if same/similar change applied on 2.6.27 for example). The issue is as follows: after commit "mac80211: restructure disassoc/deauth flows" in 2.6.28, the call to ieee80211_sta_deauthenticate added by commit `e327b847` will not work: because we do sta_info_flush(local, sdata) inside ieee80211_stop (iface.c), all stations in interface are cleared, so when calling ieee80211_sta_deauthenticate->ieee80211_set_disassoc (mlme.c), inside ieee80211_set_disassoc we have this in the beginning: sta = sta_info_get(local, ifsta->bssid); if (!sta) { The !sta check triggers, thus the function returns early and ieee80211_sta_send_apinfo(sdata, ifsta) later isn't called, so wpa_supplicant/userspace isn't notified with SIOCGIWAP. This commit moves deauthentication to before flushing STA info (sta_info_flush), thus the above can't happen and userspace is really notified when interface goes down. Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:39 -04:00
Helmut Schaa	af88b9078d	mac80211: handle failed scan requests in STA mode If cfg80211 requests a scan it awaits either a return code != 0 from the scan function or the cfg80211_scan_done to be called. In case of a STA mac80211's scan function ever returns 0 and queues the scan request. If ieee80211_sta_work is executed and ieee80211_start_scan fails for some reason cfg80211_scan_done will never be called but cfg80211 still thinks the scan was triggered successfully and will refuse any future scan requests due to drv->scan_req not being cleaned up. If a scan is triggered from within the MLME a similar problem appears. If ieee80211_start_scan returns an error, local->scan_req will not be reset and mac80211 will refuse any future scan requests. Hence, in both cases call ieee80211_scan_failed (which notifies cfg80211 and resets local->scan_req) if ieee80211_start_scan returns an error. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:38 -04:00
Luis R. Rodriguez	ec329acef9	cfg80211: fix max tx power for world regdom on 5 GHz to 20dBm This is the lowest value amongst countries which do enable 5 GHz operation. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:29 -04:00
Luis R. Rodriguez	611b6a82aa	cfg80211: Enable passive scan on channels 12-14 for world roaming Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:29 -04:00
Jouni Malinen	0eeb59fe2c	mac80211: Fix WMM ACM parsing and AC downgrade operation Incorrect local->wmm_acm bits were set for AC_BK and AC_BE. Fix this and add some comments to make it easier to understand the AC-to-UP(pair) mapping. Set the wmm_acm bits (and show WMM debug) even if the driver does not implement conf_tx() handler. In addition, fix the ACM-based AC downgrade code to not use the highest priority in error cases. We need to break the loop to get the correct AC_BK value (3) instead of returning 0 (which would indicate AC_VO). The comment here was not really very useful either, so let's provide somewhat more helpful description of the situation. Since it is very unlikely that the ACM flag would be set for AC_BK and AC_BE, these bugs are not likely to be seen in real life networks. Anyway, better do these things correctly should someone really use silly AP configuration (and to pass some functionality tests, too). Remove the TODO comment about handling ACM. Downgrading AC is perfectly valid mechanism for ACM. Eventually, we may add support for WMM-AC and send a request for a TS, but anyway, that functionality won't be here at the location of this TODO comment. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:27 -04:00
Jouni Malinen	055249d20d	mac80211: Fix panic on fragmentation with power saving It was possible to hit a kernel panic on NULL pointer dereference in dev_queue_xmit() when sending power save buffered frames to a STA that woke up from sleep. This happened when the buffered frame was requeued for transmission in ap_sta_ps_end(). In order to avoid the panic, copy the skb->dev and skb->iif values from the first fragment to all other fragments. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:01:59 -04:00
John W. Linville	6f16bf3bdb	lib80211: silence excessive crypto debugging messages When they were part of the now defunct ieee80211 component, these messages were only visible when special debugging settings were enabled. Let's mirror that with a new lib80211 debugging Kconfig option. Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:01:58 -04:00
Herbert Xu	d1c76af9e2	GRO: Move netpoll checks to correct location As my netpoll fix for net doesn't really work for net-next, we need this update to move the checks into the right place. As it stands we may pass freed skbs to netpoll_receive_skb. This patch also introduces a netpoll_rx_on function to avoid GRO completely if we're invoked through netpoll. This might seem paranoid but as netpoll may have an external receive hook it's better to be safe than sorry. I don't think we need this for 2.6.29 though since there's nothing immediately broken by it. This patch also moves the GRO_* return values to netdevice.h since VLAN needs them too (I tried to avoid this originally but alas this seems to be the easiest way out). This fixes a bug in VLAN where it continued to use the old return value 2 instead of the correct GRO_DROP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-16 10:50:02 -07:00
Pablo Neira Ayuso	0269ea4937	netfilter: xtables: add cluster match This patch adds the iptables cluster match. This match can be used to deploy gateway and back-end load-sharing clusters. The cluster can be composed of 32 nodes maximum (although I have only tested this with two nodes, so I cannot tell what is the real scalability limit of this solution in terms of cluster nodes). Assuming that all the nodes see all packets (see below for an example on how to do that if your switch does not allow this), the cluster match decides if this node has to handle a packet given: (jhash(source IP) % total_nodes) & node_mask For related connections, the master conntrack is used. The following is an example of its use to deploy a gateway cluster composed of two nodes (where this is the node 1): iptables -I PREROUTING -t mangle -i eth1 -m cluster \ --cluster-total-nodes 2 --cluster-local-node 1 \ --cluster-proc-name eth1 -j MARK --set-mark 0xffff iptables -A PREROUTING -t mangle -i eth1 \ -m mark ! --mark 0xffff -j DROP iptables -A PREROUTING -t mangle -i eth2 -m cluster \ --cluster-total-nodes 2 --cluster-local-node 1 \ --cluster-proc-name eth2 -j MARK --set-mark 0xffff iptables -A PREROUTING -t mangle -i eth2 \ -m mark ! --mark 0xffff -j DROP And the following commands to make all nodes see the same packets: ip maddr add 01:00:5e:00:01:01 dev eth1 ip maddr add 01:00:5e:00:01:02 dev eth2 arptables -I OUTPUT -o eth1 --h-length 6 \ -j mangle --mangle-mac-s 01:00:5e:00:01:01 arptables -I INPUT -i eth1 --h-length 6 \ --destination-mac 01:00:5e:00:01:01 \ -j mangle --mangle-mac-d 00:zz:yy:xx:5a:27 arptables -I OUTPUT -o eth2 --h-length 6 \ -j mangle --mangle-mac-s 01:00:5e:00:01:02 arptables -I INPUT -i eth2 --h-length 6 \ --destination-mac 01:00:5e:00:01:02 \ -j mangle --mangle-mac-d 00:zz:yy:xx:5a:27 In the case of TCP connections, pickup facility has to be disabled to avoid marking TCP ACK packets coming in the reply direction as valid. echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose BTW, some final notes: * This match mangles the skbuff pkt_type in case that it detects PACKET_MULTICAST for a non-multicast address. This may be done in a PKTTYPE target for this sole purpose. * This match supersedes the CLUSTERIP target. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 17:10:36 +01:00
Cyrill Gorcunov	1546000fe8	net: netfilter conntrack - add per-net functionality for DCCP protocol Module specific data moved into per-net site and being allocated/freed during net namespace creation/deletion. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 16:30:49 +01:00
Cyrill Gorcunov	81a1d3c31e	net: sysctl_net - use net_eq to compare nets Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 16:23:30 +01:00
Linus Torvalds	8e91f178a2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits) r8169: revert "r8169: read MAC address from EEPROM on init (2nd attempt)" r8169: use hardware auto-padding. igb: remove ASPM L0s workaround netxen: remove old flash check. mv643xx_eth: fix unicast address filter corruption on mtu change xfrm: Fix xfrm_state_find() wrt. wildcard source address. emac: Fix clock control for 405EX and 405EXr chips ixgbe: fix multiple unicast address support via-velocity: Fix DMA mapping length errors on transmit. qlge: bugfix: Pad outbound frames smaller than 60 bytes. qlge: bugfix: Move netif_napi_del() to common call point. qlge: bugfix: Tell hw to strip vlan header. qlge: bugfix: Increase filter on inbound csum. dnet: replace obsolete netif_rx_ functions with napi_ net: Add be2net driver. dnet: Fix warnings on 64-bit. dnet: Dave DNET ethernet controller driver (updated) ipv6: Fix BUG when disabled ipv6 module is unloaded bnx2x: Using DMAE to initialize the chip bnx2x: Casting page alignment ...	2009-03-16 07:56:58 -07:00
Christoph Paasch	d1238d5337	netfilter: conntrack: check for NEXTHDR_NONE before header sanity checking NEXTHDR_NONE doesn't has an IPv6 option header, so the first check for the length will always fail and results in a confusing message "too short" if debugging enabled. With this patch, we check for NEXTHDR_NONE before length sanity checkings are done. Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:52:11 +01:00
Christoph Paasch	ec8d540969	netfilter: conntrack: fix dropping packet after l4proto->packet() We currently use the negative value in the conntrack code to encode the packet verdict in the error. As NF_DROP is equal to 0, inverting NF_DROP makes no sense and, as a result, no packets are ever dropped. Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:51:29 +01:00
Pablo Neira Ayuso	626ba8fbac	netfilter: ctnetlink: fix crash during expectation creation This patch fixes a possible crash due to the missing initialization of the expectation class when nf_ct_expect_related() is called. Reported-by: BORBELY Zoltan <bozo@andrews.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:50:51 +01:00
Jan Engelhardt	acc738fec0	netfilter: xtables: avoid pointer to self Commit `784544739a` (netfilter: iptables: lock free counters) broke a number of modules whose rule data referenced itself. A reallocation would not reestablish the correct references, so it is best to use a separate struct that does not fall under RCU. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:35:29 +01:00
Jonathan Corbet	76398425bb	Move FASYNC bit handling to f_op->fasync() Removing the BKL from FASYNC handling ran into the challenge of keeping the setting of the FASYNC bit in filp->f_flags atomic with regard to calls to the underlying fasync() function. Andi Kleen suggested moving the handling of that bit into fasync(); this patch does exactly that. As a result, we have a couple of internal API changes: fasync() must now manage the FASYNC bit, and it will be called without the BKL held. As it happens, every fasync() implementation in the kernel with one exception calls fasync_helper(). So, if we make fasync_helper() set the FASYNC bit, we can avoid making any changes to the other fasync() functions - as long as those functions, themselves, have proper locking. Most fasync() implementations do nothing but call fasync_helper() - which has its own lock - so they are easily verified as correct. The BKL had already been pushed down into the rest. The networking code has its own version of fasync_helper(), so that code has been augmented with explicit FASYNC bit handling. Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Miller <davem@davemloft.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2009-03-16 08:32:27 -06:00
Scott James Remnant	95ba434f89	netfilter: auto-load ip_queue module when socket opened The ip_queue module is missing the net-pf-16-proto-3 alias that would causae it to be auto-loaded when a socket of that type is opened. This patch adds the alias. Signed-off-by: Scott James Remnant <scott@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:31:10 +01:00
Scott James Remnant	26c3b67806	netfilter: auto-load ip6_queue module when socket opened The ip6_queue module is missing the net-pf-16-proto-13 alias that would cause it to be auto-loaded when a socket of that type is opened. This patch adds the alias. Signed-off-by: Scott James Remnant <scott@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:30:14 +01:00
Pablo Neira Ayuso	f0a3c0869f	netfilter: ctnetlink: move event reporting for new entries outside the lock This patch moves the event reporting outside the lock section. With this patch, the creation and update of entries is homogeneous from the event reporting perspective. Moreover, as the event reporting is done outside the lock section, the netlink broadcast delivery can benefit of the yield() call under congestion. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:28:09 +01:00
Pablo Neira Ayuso	e098360f15	netfilter: ctnetlink: cleanup conntrack update preliminary checkings This patch moves the preliminary checkings that must be fulfilled to update a conntrack, which are the following: * NAT manglings cannot be updated * Changing the master conntrack is not allowed. This patch is a cleanup. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:27:22 +01:00
Pablo Neira Ayuso	7ec4749675	netfilter: ctnetlink: cleanup master conntrack assignation This patch moves the assignation of the master conntrack to ctnetlink_create_conntrack(), which is where it really belongs. This patch is a cleanup. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:25:46 +01:00
Pablo Neira Ayuso	1db7a748df	netfilter: conntrack: increase drop stats if sequence adjustment fails This patch increases the statistics of packets drop if the sequence adjustment fails in ipv4_confirm(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:18:50 +01:00
Stephen Hemminger	67c0d57930	netfilter: Kconfig spelling fixes (trivial) Signed-off-by: Stephen Hemminger <sheminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:17:23 +01:00
Christoph Paasch	9d2493f88f	netfilter: remove IPvX specific parts from nf_conntrack_l4proto.h Moving the structure definitions to the corresponding IPvX specific header files. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:15:35 +01:00
Eric Leblond	c7a913cd55	netfilter: print the list of register loggers This patch modifies the proc output to add display of registered loggers. The content of /proc/net/netfilter/nf_log is modified. Instead of displaying a protocol per line with format: proto:logger it now displays: proto:logger (comma_separated_list_of_loggers) NONE is used as keyword if no logger is used. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 14:55:27 +01:00
Eric Leblond	ca735b3aaa	netfilter: use a linked list of loggers This patch modifies nf_log to use a linked list of loggers for each protocol. This list of loggers is read and write protected with a mutex. This patch separates registration and binding. To be used as logging module, a module has to register calling nf_log_register() and to bind to a protocol it has to call nf_log_bind_pf(). This patch also converts the logging modules to the new API. For nfnetlink_log, it simply switchs call to register functions to call to bind function and adds a call to nf_log_register() during init. For other modules, it just remove a const flag from the logger structure and replace it with a __read_mostly. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 14:54:21 +01:00
Ilpo Järvinen	afece1c658	tcp: make sure xmit goal size never becomes zero It's not too likely to happen, would basically require crafted packets (must hit the max guard in tcp_bound_to_half_wnd()). It seems that nothing that bad would happen as there's tcp_mems and congestion window that prevent runaway at some point from hurting all too much (I'm not that sure what all those zero sized segments we would generate do though in write queue). Preventing it regardless is certainly the best way to go. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:55 -07:00
Ilpo Järvinen	2a3a041c4e	tcp: cache result of earlier divides when mss-aligning things The results is very unlikely change every so often so we hardly need to divide again after doing that once for a connection. Yet, if divide still becomes necessary we detect that and do the right thing and again settle for non-divide state. Takes the u16 space which was previously taken by the plain xmit_size_goal. This should take care part of the tso vs non-tso difference we found earlier. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:55 -07:00
Ilpo Järvinen	0c54b85f28	tcp: simplify tcp_current_mss There's very little need for most of the callsites to get tp->xmit_goal_size updated. That will cost us divide as is, so slice the function in two. Also, the only users of the tp->xmit_goal_size are directly behind tcp_current_mss(), so there's no need to store that variable into tcp_sock at all! The drop of xmit_goal_size currently leaves 16-bit hole and some reorganization would again be necessary to change that (but I'm aiming to fill that hole with u16 xmit_goal_size_segs to cache the results of the remaining divide to get that tso on regression). Bring xmit_goal_size parts into tcp.c Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:54 -07:00
Ilpo Järvinen	72211e9050	tcp: don't check mtu probe completion in the loop It seems that no variables clash such that we couldn't do the check just once later on. Therefore move it. Also kill dead obvious comment, dead argument and add unlikely since this mtu probe does not happen too often. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:53 -07:00
Ilpo Järvinen	c887e6d2d9	tcp: consolidate paws check Wow, it was quite tricky to merge that stream of negations but I think I finally got it right: check & replace_ts_recent: (s32)(rcv_tsval - ts_recent) >= 0 => 0 (s32)(ts_recent - rcv_tsval) <= 0 => 0 discard: (s32)(ts_recent - rcv_tsval) > TCP_PAWS_WINDOW => 1 (s32)(ts_recent - rcv_tsval) <= TCP_PAWS_WINDOW => 0 I toggled the return values of tcp_paws_check around since the old encoding added yet-another negation making tracking of truth-values really complicated. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:52 -07:00
Ilpo Järvinen	c43d558a51	tcp: kill dead end_seq variable in clean_rtx_queue I've already forgotten what for this was necessary, anyway it's no longer used (if it ever was). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:51 -07:00
Ilpo Järvinen	5861f8e58d	tcp: remove pointless .dsack/.num_sacks code In the pure assignment case, the earlier zeroing is still in effect. David S. Miller raised concerns if the ifs are there to avoid dirtying cachelines. I came to these conclusions: > We'll be dirty it anyway (now that I check), the first "real" statement > in tcp_rcv_established is: > > tp->rx_opt.saw_tstamp = 0; > > ...that'll land on the same dword. :-/ > > I suppose the blocks are there just because they had more complexity > inside when they had to calculate the eff_sacks too (maybe it would > have been better to just remove them in that drop-patch so you would > have had less head-ache :-)). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:51 -07:00
Jarek Poplawski	7cd0a63872	pkt_sched: Change misleading code in class delete. While looking for a possible reason of bugzilla report on HTB oops: http://bugzilla.kernel.org/show_bug.cgi?id=12858 I found the code in htb_delete calling htb_destroy_class on zero refcount is very misleading: it can suggest this is a common path, and destroy is called under sch_tree_lock. Actually, this can never happen like this because before deletion cops->get() is done, and after delete a class is still used by tclass_notify. The class destroy is always called from cops->put(), so without sch_tree_lock. This doesn't mean much now (since 2.6.27) because all vulnerable calls were moved from htb_destroy_class to htb_delete, but there was a bug in older kernels. The same change is done for other classful scheds, which, it seems, didn't have similar locking problems here. Reported-by: m0sia <m0sia@m0sia.ru> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:00:19 -07:00
Roel Kluin	a2025b8b10	tcp: '< 0' test on unsigned promote 'cnt' to size_t, to match 'len'. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 16:05:14 -07:00
Roel Kluin	8db09f26f9	x25: '< 0' and '>= 0' test on unsigned skb->len is an unsigned int, so the test in x25_rx_call_request() always evaluates to true. len in x25_sendmsg() is unsigned as well. so -ERRORS returned by x25_output() are not noticed. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 16:04:12 -07:00
Denys Fedoryshchenko	73ce7b01b4	ipv4: arp announce, arp_proxy and windows ip conflict verification Windows (XP at least) hosts on boot, with configured static ip, performing address conflict detection, which is defined in RFC3927. Here is quote of important information: " An ARP announcement is identical to the ARP Probe described above, except that now the sender and target IP addresses are both set to the host's newly selected IPv4 address. " But it same time this goes wrong with RFC5227. " The 'sender IP address' field MUST be set to all zeroes; this is to avoid polluting ARP caches in other hosts on the same link in the case where the address turns out to be already in use by another host. " When ARP proxy configured, it must not answer to both cases, because it is address conflict verification in any case. For Windows it is just causing to detect false "ip conflict". Already there is code for RFC5227, so just trivially we just check also if source ip == target ip. Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 16:02:07 -07:00
David S. Miller	08ec9af1c0	xfrm: Fix xfrm_state_find() wrt. wildcard source address. The change to make xfrm_state objects hash on source address broke the case where such source addresses are wildcarded. Fix this by doing a two phase lookup, first with fully specified source address, next using saddr wildcarded. Reported-by: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 14:22:40 -07:00
Yi Zou	1c8dbcf649	[SCSI] net: add NETIF_F_FCOE_CRC to can_checksum_protocol Add FC CRC offload check for ETH_P_FCOE. Signed-off-by: Yi Zou <yi.zou@intel.com> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-03-13 15:11:38 -05:00
Neil Horman	273ae44b9c	Network Drop Monitor: Adding Build changes to enable drop monitor Network Drop Monitor: Adding Build changes to enable drop monitor Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/Kbuild \| 1 + net/Kconfig \| 11 +++++++++++ net/core/Makefile \| 1 + 3 files changed, 13 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 12:09:29 -07:00
Neil Horman	9a8afc8d39	Network Drop Monitor: Adding drop monitor implementation & Netlink protocol Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/net_dropmon.h \| 56 +++++++++ net/core/drop_monitor.c \| 263 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 319 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 12:09:29 -07:00
Neil Horman	ead2ceb0ec	Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying end-of-line points for skbs Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/skbuff.h \| 4 +++- net/core/datagram.c \| 2 +- net/core/skbuff.c \| 22 ++++++++++++++++++++++ net/ipv4/arp.c \| 2 +- net/ipv4/udp.c \| 2 +- net/packet/af_packet.c \| 2 +- 6 files changed, 29 insertions(+), 5 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 12:09:28 -07:00
Neil Horman	4893d39e86	Network Drop Monitor: Add trace declaration for skb frees Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/trace/skb.h \| 8 ++++++++ net/core/Makefile \| 2 ++ net/core/net-traces.c \| 29 +++++++++++++++++++++++++++++ 3 files changed, 39 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 12:09:27 -07:00
malc	6fc791ee63	sctp: add Adaptation Layer Indication parameter only when it's set RFC5061 states: Each adaptation layer that is defined that wishes to use this parameter MUST specify an adaptation code point in an appropriate RFC defining its use and meaning. If the user has not set one - assume they don't want to sent the param with a zero Adaptation Code Point. Rationale - Currently the IANA defines zero as reserved - and 1 as the only valid value - so we consider zero to be unset - to save adding a boolean to the socket structure. Including this parameter unconditionally causes endpoints that do not understand it to report errors unnecessarily. Signed-off-by: Malcolm Lashley <mlashley@gmail.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 11:37:58 -07:00
Wei Yongjun	76595024ff	sctp: fix to send FORWARD-TSN chunk only if peer has such capable RFC3758 Section 3.3.1. Sending Forward-TSN-Supported param in INIT Note that if the endpoint chooses NOT to include the parameter, then at no time during the life of the association can it send or process a FORWARD TSN. If peer does not support PR-SCTP capable, don't send FORWARD-TSN chunk to peer. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 11:37:58 -07:00
Wei Yongjun	5ffad5aceb	sctp: fix to indicate ASCONF support in INIT-ACK only if peer has such capable This patch fix to indicate ASCONF support in INIT-ACK only if peer has such capable. This patch also fix to calc the chunk size if peer has no FWD-TSN capable. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 11:37:56 -07:00
Vlad Yasevich	5e8f3f703a	sctp: simplify sctp listening code sctp_inet_listen() call is split between UDP and TCP style. Looking at the code, the two functions are almost the same and can be merged into a single helper. This also fixes a bug that was fixed in the UDP function, but missed in the TCP function. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 11:37:56 -07:00
Rusty Russell	a70f730282	cpumask: replace node_to_cpumask with cpumask_of_node. Impact: cleanup node_to_cpumask (and the blecherous node_to_cpumask_ptr which contained a declaration) are replaced now everyone implements cpumask_of_node. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-13 14:49:46 +10:30
Trond Myklebust	5e3771ce2d	SUNRPC: Ensure that xs_nospace return values are propagated If xs_nospace() finds that the socket has disconnected, it attempts to return ENOTCONN, however that value is then squashed by the callers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:01 -04:00
Trond Myklebust	8a2cec295f	SUNRPC: Delay, then retry on connection errors. Enforce the comment in xs_tcp_connect_worker4/xs_tcp_connect_worker6 that we should delay, then retry on certain connection errors. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:01 -04:00
Trond Myklebust	2a4919919a	SUNRPC: Return EAGAIN instead of ENOTCONN when waking up xprt->pending While we should definitely return socket errors to the task that is currently trying to send data, there is no need to propagate the same error to all the other tasks on xprt->pending. Doing so actually slows down recovery, since it causes more than one tasks to attempt socket recovery. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:00 -04:00
Trond Myklebust	482f32e65d	SUNRPC: Handle socket errors correctly Ensure that we pick up and handle socket errors as they occur. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:00 -04:00
Trond Myklebust	c8485e4d63	SUNRPC: Handle ECONNREFUSED correctly in xprt_transmit() If we get an ECONNREFUSED error, we currently go to sleep on the 'xprt->sending' wait queue. The problem is that no timeout is set there, and there is nothing else that will wake the task up later. We should deal with ECONNREFUSED in call_status, given that is where we also deal with -EHOSTDOWN, and friends. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:59 -04:00
Trond Myklebust	40d2549db5	SUNRPC: Don't disconnect if a connection is still in progress. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:58 -04:00
Trond Myklebust	670f945731	SUNRPC: Ensure we set XPRT_CLOSING only after we've sent a tcp FIN... ...so that we can distinguish between when we need to shutdown and when we don't. Also remove the call to xs_tcp_shutdown() from xs_tcp_connect(), since xprt_connect() makes the same test. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:58 -04:00
Trond Myklebust	15f081ca8d	SUNRPC: Avoid an unnecessary task reschedule on ENOTCONN If the socket is unconnected, and xprt_transmit() returns ENOTCONN, we currently give up the lock on the transport channel. Doing so means that the lock automatically gets assigned to the next task in the xprt->sending queue, and so that task needs to be woken up to do the actual connect. The following patch aims to avoid that unnecessary task switch. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:57 -04:00
Tom Talpey	441e3e2429	SUNRPC: dynamically load RPC transport modules on-demand Provide an api to attempt to load any necessary kernel RPC client transport module automatically. By convention, the desired module name is "xprt"+"transport name". For example, when NFS mounting with "-o proto=rdma", attempt to load the "xprtrdma" module. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:56 -04:00
Tom Talpey	b38ab40ad5	XPRTRDMA: correct an rpc/rdma inline send marshaling error Certain client rpc's which contain both lengthy page-contained metadata and a non-empty xdr_tail buffer require careful handling to avoid overlapped memory copying. Rearranging of existing rpcrdma marshaling code avoids it; this fixes an NFSv4 symlink creation error detected with connectathon basic/test8 to multiple servers. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:55 -04:00
Tom Talpey	b1e1e15877	SVCRDMA: remove faulty assertions in rpc/rdma chunk validation. Certain client-provided RPCRDMA chunk alignments result in an additional scatter/gather entry, which triggered nfs/rdma server assertions incorrectly. OpenSolaris nfs/rdma client connectathon testing was blocked by these in the special/locking section. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Cc: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:55 -04:00
Chuck Lever	fe315e76fc	SUNRPC: Avoid spurious wake-up during UDP connect processing To clear out old state, the UDP connect workers unconditionally invoke xs_close() before proceeding with a new connect. Nowadays this causes a spurious wake-up of the task waiting for the connect to complete. This is a little racey, but usually harmless. The waiting task immediately retries the connect via a call_bind/call_connect sequence, which usually finds the transport already in the connected state because the connect worker has finished in the background. To avoid a spurious wake-up, factor the xs_close() logic that resets the underlying socket into a helper, and have the UDP connect workers call that helper instead of xs_close(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:10:21 -04:00
Trond Myklebust	01d37c428a	SUNRPC: xprt_connect() don't abort the task if the transport isn't bound If the transport isn't bound, then we should just return ENOTCONN, letting call_connect_status() and/or call_status() deal with retrying. Currently, we appear to abort all pending tasks with an EIO error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:09:39 -04:00
Trond Myklebust	fba91afbec	SUNRPC: Fix an Oops due to socket not set up yet... We can Oops in both xs_udp_send_request() and xs_tcp_send_request() if the call to xs_sendpages() returns an error due to the socket not yet being set up. Deal with that situation by returning a new error: ENOTSOCK, so that we know to avoid dereferencing transport->sock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:06:41 -04:00
Eric Dumazet	fc1ad92dfc	tcp: allow timestamps even if SYN packet has tsval=0 Some systems send SYN packets with apparently wrong RFC1323 timestamp option values [timestamp tsval=0 tsecr=0]. It might be for security reasons (http://www.secuobs.com/plugs/25220.shtml ) Linux TCP stack ignores this option and sends back a SYN+ACK packet without timestamp option, thus many TCP flows cannot use timestamps and lose some benefit of RFC1323. Other operating systems seem to not care about initial tsval value, and let tcp flows to negotiate timestamp option. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-11 09:23:57 -07:00
John Dykstra	ff8cf9a938	ipv6: Fix BUG when disabled ipv6 module is unloaded Do not try to "uninitialize" ipv6 if its initialization had been skipped because module parameter disable=1 had been specified. Reported-by: Thomas Backlund <tmb@mandriva.org> Signed-off-by: John Dykstra <john.dykstra1@gmail.com> Acked-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-11 09:22:51 -07:00
Ingo Molnar	78b020d035	Merge branches 'x86/cleanups', 'x86/kexec', 'x86/mce2' and 'linus' into x86/core	2009-03-11 10:49:15 +01:00

... 2 3 4 5 6 ...

12188 Коммитов