WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Al Viro	4de930efc2	net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:38:06 -04:00
David S. Miller	b4c11cb437	Merge branch 'amd-xgbe-next' Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver updates 2015-03-19 The following series of patches includes functional updates and changes to the driver. - Use the phydev->advertising field instead of the phydev->supported field when configuring for auto-negotiation, etc. - Use the phy_driver flags field for setting the transceiver type instead of hardcoding it in the ethtool support. - Provide an auto-negotiation timeout check - Clarify the Tx/Rx queue information messages - Use the new DMA memory barrier operations - Set the device DMA mask based on what the hardware reports - Remove the software implementation of Tx coalescing - Fix the reporting of the Rx coalescing value - Use napi_alloc_skb when allocating an SKB in softirq This patch series is based on net-next. Changes from v2: - Use jiffies instead of timespec for the auto-negotiation timeout check - Remove the Rx path SKB allocation re-work patch since we should only inline the headers and the current code guards better against any hardware bugs Changes from v1: - Default to 32-bit DMA width (minimum supported) if hardware returns an unexpected DMA width value ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:34:03 -04:00
Lendacky, Thomas	385565a1f0	amd-xgbe: Use napi_alloc_skb when allocating skb in softirq Use the napi_alloc_skb function to allocate an skb when running within the softirq context to avoid calls to local_irq_save/restore. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	4a57ebcc2c	amd-xgbe: Fix Rx coalescing reporting The Rx coalescing value is internally converted from usecs to a value that the hardware can use. When reporting the Rx coalescing value, this internal value is converted back to usecs. During the conversion from and back to usecs some rounding occurs. So, for example, when setting an Rx usec of 30, it will be reported as 29. Fix this reporting issue by keeping the original usec value and using that during reporting. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	c635eaacbf	amd-xgbe: Remove Tx coalescing The Tx coalescing support in the driver was a software implementation for something lacking in the hardware. Using hrtimers, the idea was to trigger a timer interrupt after having queued a packet for transmit. Unfortunately, as the timer value was lowered, the timer expired before the hardware actually did the transmit and so it was racey and resulted in unnecessary interrupts. Remove the Tx coalescing support and hrtimer and replace with a Tx timer that is used as a reclaim timer in case of inactivity. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	386d325dbd	amd-xgbe: Set DMA mask based on hardware register value The hardware supplies a value that indicates the DMA range that it is capable of using. Use this value rather than hard-coding it in the driver. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	ceb8f6be7e	amd-xgbe: Use the new DMA memory barriers where appropriate Use the new lighter weight memory barriers when working with the device descriptors. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	600c8811d3	amd-xgbe: Clarify output message about queues Clarify that the queues referred to in a message when the device is brought up are hardware queues and not necessarily related to the Linux network queues. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	9ae5eecdba	amd-xgbe-phy: Provide support for auto-negotiation timeout Currently, there is no interrupt code that indicates auto-negotiation has timed out. If the auto-negotiation has timed out then the start of a new auto-negotiation will begin again with a new base page being received. The state machine could be in a state that is not expecting this interrupt code which results in an error during auto-negotiation. Update the code to timestamp when the auto-negotiation starts. Should another page received interrupt code occur before auto-negotiation has completed but after the auto-negotiation timeout, then reset the state machine to allow the auto-negotiation to continue. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:56 -04:00
Lendacky, Thomas	65f57cb152	amd-xgbe-phy: Use the phy_driver flags field Remove the setting of the transceiver type when retrieving the device settings using ethtool and instead set the transceiver type in the phy_driver structure flags field. Change the transceiver type to be internal, also. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:56 -04:00
Lendacky, Thomas	d9663c8c21	amd-xgbe-phy: Use phydev advertising field vs supported With ethtool being able to control what is advertised, the advertising field is what should be used for priming the auto-negotiation registers and for various other checks, instead of the supported field. Also, move the initial setting of the supported and advertising fields into the probe function so that they are not reset each time the device is brought up, thus allowing the user to set as desired before bringing the device up. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:56 -04:00
Catalin Marinas	91edd096e2	net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour Commit `db31c55a6f` (net: clamp ->msg_namelen instead of returning an error) introduced the clamping of msg_namelen when the unsigned value was larger than sizeof(struct sockaddr_storage). This caused a msg_namelen of -1 to be valid. The native code was subsequently fixed by commit `dbb490b965` (net: socket: error on a negative msg_namelen). In addition, the native code sets msg_namelen to 0 when msg_name is NULL. This was done in commit (`6a2a2b3ae0` net:socket: set msg_namelen to 0 if msg_name is passed as NULL in msghdr struct from userland) and subsequently updated by `08adb7dabd` (fold verify_iovec() into copy_msghdr_from_user()). This patch brings the get_compat_msghdr() in line with copy_msghdr_from_user(). Fixes: `db31c55a6f` (net: clamp ->msg_namelen instead of returning an error) Cc: David S. Miller <davem@davemloft.net> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:31:09 -04:00
David S. Miller	ebd6af092a	Merge branch 'rhashtable-inlined-interface' Herbert Xu says: ==================== rhashtable: Introduce inlined interface This series of patches introduces the inlined rhashtable interface. The idea is to make all the function pointers visible to the compiler by providing the rhashtable_params structure explicitly to each inline rhashtable function. For example, instead of doing obj = rhashtable_lookup(ht, key); you would now do obj = rhashtable_lookup_fast(ht, key, params); Where params is the same data that you would give to rhashtable_init. In particular, within rhashtable.c itself we would simply supply ht->p. So to convert users over, you simply have to make params globally accessible, e.g., by placing it in a static const variable, which can then be used at each inlined call site, as well as by the rhashtable_init call. The only ticky bit is that some users (i.e., netfilter) has a dynamic key length. This is dealt with by using params.key_len in the inline functions when it is non-zero, and otherwise falling back on ht->p.key_len. Note that I've only tested this on one compiler, gcc 4.7.2. So please test this with your compilers as well and make sure that the code is actually inlined without indirect function calls. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:32 -04:00
Herbert Xu	dc0ee268d8	rhashtable: Rip out obsolete out-of-line interface Now that all rhashtable users have been converted over to the inline interface, this patch removes the unused out-of-line interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	6cca7289d5	tipc: Use inlined rhashtable interface This patch converts tipc to the inlined rhashtable interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	b182aa6e96	test_rhashtable: Use inlined rhashtable interface This patch converts test_rhashtable to the inlined rhashtable interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	fa3773211e	netfilter: Convert nft_hash to inlined rhashtable This patch converts nft_hash to the inlined rhashtable interface. This patch also replaces the call to rhashtable_lookup_compare with a straight rhashtable_lookup_fast because it's simply doing a memcmp (in fact nft_hash_lookup already uses memcmp instead of nft_data_cmp). Furthermore, the compare function is only meant to compare, it is not supposed to have side-effects. The current side-effect code can simply be moved into the nft_hash_get. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	c428ecd1a2	netlink: Move namespace into hash key Currently the name space is a de facto key because it has to match before we find an object in the hash table. However, it isn't in the hash value so all objects from different name spaces with the same port ID hash to the same bucket. This is bad as the number of name spaces is unbounded. This patch fixes this by using the namespace when doing the hash. Because the namespace field doesn't lie next to the portid field in the netlink socket, this patch switches over to the rhashtable interface without a fixed key. This patch also uses the new inlined rhashtable interface where possible. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	02fd97c3d4	rhashtable: Allow hash/comparison functions to be inlined This patch deals with the complaint that we make indirect function calls on the fast paths unnecessarily in rhashtable. We resolve it by moving the fast paths into inline functions that take struct rhashtable_param (which obviously must be the same set of parameters supplied to rhashtable_init) as an argument. The only remaining indirect call is to obj_hashfn (or key_hashfn it obj_hashfn is unset) on the rehash as well as the insert-during- rehash slow path. This patch also extends the support of vairable-length keys to include those where the key is fixed but scattered in the object. For example, in netlink we want to key off the namespace and the portid but they're not next to each other. This patch does this by directly using the object hash function as the indicator of whether the key is accessible or not. It also adds a new function obj_cmpfn to compare a key against an object. This means that the caller no longer needs to supply explicit compare functions. All this is done in a backwards compatible manner so no existing users are affected until they convert to the new interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	488fb86ee9	rhashtable: Make rhashtable_init params argument const This patch marks the rhashtable_init params argument const as there is no reason to modify it since we will always make a copy of it in the rhashtable. This patch also fixes a bug where we don't actually round up the value of min_size unless it is less than HASH_MIN_SIZE. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Daniel Borkmann	0b8c707ddf	ebpf, filter: do not convert skb->protocol to host endianess during runtime Commit `c249739579` ("bpf: allow BPF programs access 'protocol' and 'vlan_tci' fields") has added support for accessing protocol, vlan_present and vlan_tci into the skb offset map. As referenced in the below discussion, accessing skb->protocol from an eBPF program should be converted without handling endianess. The reason for this is that an eBPF program could simply do a check more naturally, by f.e. testing skb->protocol == htons(ETH_P_IP), where the LLVM compiler resolves htons() against a constant automatically during compilation time, as opposed to an otherwise needed run time conversion. After all, the way of programming both from a user perspective differs quite a lot, i.e. bpf_asm ["ld proto"] versus a C subset/LLVM. Reference: https://patchwork.ozlabs.org/patch/450819/ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 15:24:26 -04:00
Suzuki K. Poulose	7132813c38	arm64: Honor __GFP_ZERO in dma allocations Current implementation doesn't zero out the pages allocated. Honor the __GFP_ZERO flag and zero out if set. Cc: <stable@vger.kernel.org> # v3.14+ Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2015-03-20 18:18:54 +00:00
Marcelo Ricardo Leitner	c4a6853d8f	ipv6: invert join/leave anycast rtnl/socket locking order Commit `baf606d9c9` ("ipv4,ipv6: grab rtnl before locking the socket") missed to update two setsockopt options, IPV6_JOIN_ANYCAST and IPV6_LEAVE_ANYCAST, causing a lock inverstion regarding to the updated ones. As ipv6_sock_ac_join and ipv6_sock_ac_leave are only called from do_ipv6_setsockopt, we are good to just move the rtnl lock upper. Fixes: `baf606d9c9` ("ipv4,ipv6: grab rtnl before locking the socket") Reported-by: Ying Huang <ying.huang@intel.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:32:38 -04:00
Marcelo Ricardo Leitner	149d7549c2	vxlan: fix possible use of uninitialized in vxlan_igmp_{join, leave} Test robot noticed that we check the return of vxlan_igmp_join and leave but inside them there was a path that it could be used initialized. It's not really possible because those if() inside these igmp functions would always match as we can't have sockets of other type in there, but this way we keep the compiler happy. Fixes: `56ef9c909b` ("vxlan: Move socket initialization to within rtnl scope") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:31:24 -04:00
David S. Miller	de58a6da85	Merge branch 'be2net' Sathya Perla says: ==================== be2net: patch set Hi David, this patch set includes 3 bug fixes to the be2net driver. Patch 1 fixes a vlan isolation issue with VFs. When a VF is placed in promiscous mode, it could receive packets belonging to any vlan, as the PF driver grants vlan promisc capability to VFs. The PF driver now disables the vlan promisc capability for VFs to fix this problem. Patch 2 fixes the call to MODIFY_EQ_DELAY FW cmd to not include more than 8 EQs per cmd. The FW is not capable of handling more than 8 EQs per cmd. Patch 3 fixes an EEH error detection issue. On Power platforms, when an EEH error occurs, the slot disconnect state is more reliably detected via an MMIO read compared to a config read. So, the error register reads that occur every second are now done via MMIO. Pls apply this patch set to the "net" tree. Thanks! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:25:56 -04:00
Suresh Reddy	25848c9015	be2net: use PCI MMIO read instead of config read for errors When an EEH error occurs, the device/slot is disconnected. This condition is more reliably detected (i.e., returns all ones) with an MMIO read rather than a config read -- especially on power platforms. Hence, this patch fixes EEH error detection by replacing config reads with MMIO reads for reading the error registers. The error registers in Skyhawk-R/BE2/BE3 are accessible both via the config space and the PCICFG (BAR0) memory space. Reported-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:25:51 -04:00
Suresh Reddy	c8ba4ad0b5	be2net: restrict MODIFY_EQ_DELAY cmd to a max of 8 EQs Issuing this cmd for more than 8 EQs does not have the intended effect even on BEx and Skyhawk-R. This patch fixes this by issuing this cmd for upto 8 EQs at a time. Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:25:51 -04:00
Vasundhara Volam	435452aa88	be2net: Prevent VFs from enabling VLAN promiscuous mode Currently, a PF does not restrict its VF interface from enabling vlan promiscuous mode. This breaks vlan isolation when a vlan (transparent tagging) is configured on a VF. This patch fixes this problem by disabling the vlan promisc capability for VFs. Reported-by: Yoann Juet <veilletechno-irts@univ-nantes.fr> Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:25:51 -04:00
Josh Hunt	d22e153718	tcp: fix tcp fin memory accounting tcp_send_fin() does not account for the memory it allocates properly, so sk_forward_alloc can be negative in cases where we've sent a FIN: ss example output (ss -amn \| grep -B1 f4294): tcp FIN-WAIT-1 0 1 192.168.0.1:45520 192.0.2.1:8080 skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0) Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:18:52 -04:00
Will Deacon	130c93fd10	arm64: efi: don't restore TTBR0 if active_mm points at init_mm init_mm isn't a normal mm: it has swapper_pg_dir as its pgd (which contains kernel mappings) and is used as the active_mm for the idle thread. When restoring the pgd after an EFI call, we write current->active_mm into TTBR0. If the current task is actually the idle thread (e.g. when initialising the EFI RTC before entering userspace), then the TLB can erroneously populate itself with junk global entries as a result of speculative table walks. When we do eventually return to userspace, the task can end up hitting these junk mappings leading to lockups, corruption or crashes. This patch fixes the problem in the same way as the CPU suspend code by ensuring that we never switch to the init_mm in efi_set_pgd and instead point TTBR0 at the zero page. A check is also added to cpu_switch_mm to BUG if we get passed swapper_pg_dir. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Fixes: `f3cdfd239d` ("arm64/efi: move SetVirtualAddressMap() to UEFI stub") Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2015-03-20 17:05:16 +00:00
Steven Barth	73ba57bfae	ipv6: fix backtracking for throw routes for throw routes to trigger evaluation of other policy rules EAGAIN needs to be propagated up to fib_rules_lookup similar to how its done for IPv4 A simple testcase for verification is: ip -6 rule add lookup 33333 priority 33333 ip -6 route add throw 2001:db8::1 ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333 ip route get 2001:db8::1 Signed-off-by: Steven Barth <cyrus@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:57:23 -04:00
Markos Chandras	87f966d97b	net: ethernet: pcnet32: Setup the SRAM and NOUFLO on Am79C97{3, 5} On a MIPS Malta board, tons of fifo underflow errors have been observed when using u-boot as bootloader instead of YAMON. The reason for that is that YAMON used to set the pcnet device to SRAM mode but u-boot does not. As a result, the default Tx threshold (64 bytes) is now too small to keep the fifo relatively used and it can result to Tx fifo underflow errors. As a result of which, it's best to setup the SRAM on supported controllers so we can always use the NOUFLO bit. Cc: <netdev@vger.kernel.org> Cc: <stable@vger.kernel.org> Cc: <linux-kernel@vger.kernel.org> Cc: Don Fry <pcnet32@frontier.com> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:56:40 -04:00
Sabrina Dubroca	8e199dfd82	ipv6: call ipv6_proxy_select_ident instead of ipv6_select_ident in udp6_ufo_fragment Matt Grant reported frequent crashes in ipv6_select_ident when udp6_ufo_fragment is called from openvswitch on a skb that doesn't have a dst_entry set. ipv6_proxy_select_ident generates the frag_id without using the dst associated with the skb. This approach was suggested by Vladislav Yasevich. Fixes: `0508c07f5e` ("ipv6: Select fragment id during UFO segmentation if not set.") Cc: Vladislav Yasevich <vyasevic@redhat.com> Reported-by: Matt Grant <matt@mattgrant.net.nz> Tested-by: Matt Grant <matt@mattgrant.net.nz> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:56:11 -04:00
Palik, Imre	edafc132ba	xen-netback: making the bandwidth limiter runtime settable With the current netback, the bandwidth limiter's parameters are only settable during vif setup time. This patch register a watch on them, and thus makes them runtime changeable. When the watch fires, the timer is reset. The timer's mutex is used for fencing the change. Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: Imre Palik <imrep@amazon.de> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:55:15 -04:00
David S. Miller	750f2f9165	Merge branch 'listener_refactor_part_14' Eric Dumazet says: ==================== inet: tcp listener refactoring part 14 OK, we have serious patches here. We get rid of the central timer handling SYNACK rtx, which is killing us under even medium SYN flood. We still use the listener specific hash table. This will be done in next round ;) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:33 -04:00
Eric Dumazet	becb74f0ac	net: increase sk_[max_]ack_backlog sk_ack_backlog & sk_max_ack_backlog were 16bit fields, meaning listen() backlog was limited to 65535. It is time to increase the width to allow much bigger backlog, if admins change /proc/sys/net/core/somaxconn & /proc/sys/net/ipv4/tcp_max_syn_backlog default values. Tested: echo 5000000 >/proc/sys/net/core/somaxconn echo 5000000 >/proc/sys/net/ipv4/tcp_max_syn_backlog Ran a SYNFLOOD test against a listener using listen(fd, 5000000) myhost~# grep request_sock_TCP /proc/slabinfo request_sock_TCP 4185642 4411940 304 13 1 : tunables 54 27 8 : slabdata 339380 339380 0 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Eric Dumazet	fa76ce7328	inet: get rid of central tcp/dccp listener timer One of the major issue for TCP is the SYNACK rtx handling, done by inet_csk_reqsk_queue_prune(), fired by the keepalive timer of a TCP_LISTEN socket. This function runs for awful long times, with socket lock held, meaning that other cpus needing this lock have to spin for hundred of ms. SYNACK are sent in huge bursts, likely to cause severe drops anyway. This model was OK 15 years ago when memory was very tight. We now can afford to have a timer per request sock. Timer invocations no longer need to lock the listener, and can be run from all cpus in parallel. With following patch increasing somaxconn width to 32 bits, I tested a listener with more than 4 million active request sockets, and a steady SYNFLOOD of ~200,000 SYN per second. Host was sending ~830,000 SYNACK per second. This is ~100 times more what we could achieve before this patch. Later, we will get rid of the listener hash and use ehash instead. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Eric Dumazet	52452c5425	inet: drop prev pointer handling in request sock When request sock are put in ehash table, the whole notion of having a previous request to update dl_next is pointless. Also, following patch will get rid of big purge timer, so we want to delete a request sock without holding listener lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Rafael J. Wysocki	9e8ce4b96b	Revert "x86/PCI: Refine the way to release PCI IRQ resources" Commit `b4b55cda58` (Refine the way to release PCI IRQ resources) introduced a regression in the PCI IRQ resource management by causing the IRQ resource of a device, established when pci_enabled_device() is called on a fully disabled device, to be released when the driver is unbound from the device, regardless of the enable_cnt. This leads to the situation that an ill-behaved driver can now make a device unusable to subsequent drivers by an imbalance in their use of pci_enable/disable_device(). That is a serious problem for secondary drivers like vfio-pci, which are innocent of the transgressions of the previous driver. Since the solution of this problem is not immediate and requires further discussion, revert commit `b4b55cda58` and the issue it was supposed to address (a bug related to xen-pciback) will be taken care of in a different way going forward. Reported-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-03-20 14:56:19 +01:00
Pablo Neira Ayuso	3d8c6dce53	netfilter: xt_TPROXY: fix invflags check in tproxy_tg6_check() We have to check for IP6T_INV_PROTO in invflags, instead of flags. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Balazs Scheidler <bazsi@balabit.hu>	2015-03-20 14:35:33 +01:00
Dave Airlie	8265d4486d	Merge tag 'drm-intel-fixes-2015-03-19' of git://anongit.freedesktop.org/drm-intel into drm-fixes Backporting a couple of plane related fixes from drm-next to v4.0. * tag 'drm-intel-fixes-2015-03-19' of git://anongit.freedesktop.org/drm-intel: drm/i915: Make sure the primary plane is enabled before reading out the fb state drm/i915: Ensure plane->state->fb stays in sync with plane->fb	2015-03-20 17:32:21 +10:00
Dave Airlie	f42e2c2429	Merge tag 'drm-amdkfd-fixes-2015-03-19' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes - Fixing SDMA initialization when in non-HWS mode (debug mode) - Memory leak fix when destroying kernel queue - Fix number of available compute pipelines according to new firmware * tag 'drm-amdkfd-fixes-2015-03-19' of git://people.freedesktop.org/~gabbayo/linux: drm/radeon: Changing number of compute pipe lines drm/amdkfd: Fix SDMA queue init. in non-HWS mode drm/amdkfd: destroy mqd when destroying kernel queue	2015-03-20 17:32:01 +10:00
Christophe Vu-Brugier	9bc6548f37	target: do not reject FUA CDBs when write cache is enabled but emulate_write_cache is 0 A check that rejects a CDB with FUA bit set if no write cache is emulated was added by the following commit: `fde9f50` target: Add sanity checks for DPO/FUA bit usage The condition is as follows: if (!dev->dev_attrib.emulate_fua_write \|\| !dev->dev_attrib.emulate_write_cache) However, this check is wrong if the backend device supports WCE but "emulate_write_cache" is disabled. This patch uses se_dev_check_wce() (previously named spc_check_dev_wce) to invoke transport->get_write_cache() if the device has a write cache or check the "emulate_write_cache" attribute otherwise. Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christophe Vu-Brugier <cvubrugier@fastmail.fm> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-03-19 23:26:46 -07:00
Nicholas Bellinger	5f7da044f8	target: Fix virtual LUN=0 target_configure_device failure OOPs This patch fixes a NULL pointer dereference triggered by a late target_configure_device() -> alloc_workqueue() failure that results in target_free_device() being called with DF_CONFIGURED already set, which subsequently OOPses in destroy_workqueue() code. Currently this only happens at modprobe target_core_mod time when core_dev_setup_virtual_lun0() -> target_configure_device() fails, and the explicit target_free_device() gets called. To address this bug originally introduced by commit `0fd97ccf45`, go ahead and move DF_CONFIGURED to end of target_configure_device() code to handle this special failure case. Reported-by: Claudio Fleiner <cmf@daterainc.com> Cc: Claudio Fleiner <cmf@daterainc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: <stable@vger.kernel.org> # v3.7+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-03-19 23:26:44 -07:00
Nicholas Bellinger	215a8fe419	target/pscsi: Fix NULL pointer dereference in get_device_type This patch fixes a NULL pointer dereference OOPs with pSCSI backends within target_core_stat.c code. The bug is caused by a configfs attr read if no pscsi_dev_virt->pdv_sd has been configured. Reported-by: Olaf Hering <olaf@aepfle.de> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-03-19 23:26:42 -07:00
Dan Carpenter	d556546e7e	tcm_fc: missing curly braces in ft_invl_hw_context() This patch adds a missing set of conditional check braces in ft_invl_hw_context() originally introduced by commit `dcd998ccd` when handling DDP failures in ft_recv_write_data() code. commit `dcd998ccdb` Author: Kiran Patil <kiran.patil@intel.com> Date: Wed Aug 3 09:20:01 2011 +0000 tcm_fc: Handle DDP/SW fc_frame_payload_get failures in ft_recv_write_data Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Kiran Patil <kiran.patil@intel.com> Cc: <stable@vger.kernel.org> # 3.1+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-03-19 23:26:41 -07:00
Bart Van Assche	7544e59734	target: Fix reference leak in target_get_sess_cmd() error path This patch fixes a se_cmd->cmd_kref leak buf when se_sess->sess_tearing_down is true within target_get_sess_cmd() submission path code. This se_cmd reference leak can occur during active session shutdown when ack_kref=1 is passed by target_submit_cmd_[map_sgls,tmr]() callers. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: <stable@vger.kernel.org> # 3.6+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-03-19 23:26:31 -07:00
Bart Van Assche	2f450cc1fb	loop/usb/vhost-scsi/xen-scsiback: Fix use of __transport_register_session This patch changes loopback, usb-gadget, vhost-scsi and xen-scsiback fabric code to invoke transport_register_session() instead of the unprotected flavour, to ensure se_tpg->session_lock is taken when adding new session list nodes to se_tpg->tpg_sess_list. Note that since these four fabric drivers already hold their own internal TPG mutexes when accessing se_tpg->tpg_sess_list, and consist of a single se_session created through configfs attribute access, no list corruption can currently occur. So for correctness sake, go ahead and use the se_tpg->session_lock protected version for these four fabric drivers. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-03-19 23:15:14 -07:00
Bart Van Assche	75c3d0bf9c	tcm_qla2xxx: Fix incorrect use of __transport_register_session This patch fixes the incorrect use of __transport_register_session() in tcm_qla2xxx_check_initiator_node_acl() code, that does not perform explicit se_tpg->session_lock when accessing se_tpg->tpg_sess_list to add new se_sess nodes. Given that tcm_qla2xxx_check_initiator_node_acl() is not called with qla_hw->hardware_lock held for all accesses of ->tpg_sess_list, the code should be using transport_register_session() instead. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: <stable@vger.kernel.org> # 3.5+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-03-19 23:14:14 -07:00
Nicholas Bellinger	f068fbc82e	iscsi-target: Avoid early conn_logout_comp for iser connections This patch fixes a iser specific logout bug where early complete() of conn->conn_logout_comp in iscsit_close_connection() was causing isert_wait4logout() to complete too soon, triggering a use after free NULL pointer dereference of iscsi_conn memory. The complete() was originally added for traditional iscsi-target when a ISCSI_LOGOUT_OP failed in iscsi_target_rx_opcode(), but given iser-target does not wait in logout failure, this special case needs to be avoided. Reported-by: Sagi Grimberg <sagig@mellanox.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Slava Shwartsman <valyushash@gmail.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-03-19 23:10:01 -07:00

... 2 3 4 5 6 ...

507948 Коммитов Все ветки Поиск

507948 Коммитов

Все ветки