WSL2-Linux-Kernel

The source for the Linux kernel used in Windows Subsystem for Linux 2 (WSL2)

David S. Miller	2d1b138505	2018-07-04 14:06:20 +09:00

Merge branch 'Handle-multiple-received-packets-at-each-stage'

Edward Cree says:

====================
Handle multiple received packets at each stage

This patch series adds the capability for the network stack to receive a
list of packets and process them as a unit, rather than handling each
packet singly in sequence. This is done by factoring out the existing
datapath code at each layer and wrapping it in list handling code.

The motivation for this change is twofold:
* Instruction cache locality. Currently, running the entire network
  stack receive path on a packet involves more code than will fit in the
  lowest-level icache, meaning that when the next packet is handled, the
  code has to be reloaded from more distant caches. By handling packets
  in "row-major order", we ensure that the code at each layer is hot for
  most of the list. (There is a corresponding downside in _data_ cache
  locality, since we are now touching every packet at every layer, but in
  practice there is easily enough room in dcache to hold one cacheline of
  each of the 64 packets in a NAPI poll.)
* Reduction of indirect calls. Owing to Spectre mitigations, indirect
  function calls are now more expensive than ever; they are also heavily
  used in the network stack's architecture (see [1]). By replacing 64
  indirect calls to the next-layer per-packet function with a single
  indirect call to the next-layer list function, we can save CPU cycles.

Drivers pass an SKB list to the stack at the end of the NAPI poll; this
gives a natural batch size (the NAPI poll weight) and avoids waiting at
the software level for further packets to make a larger batch (which
would add latency). It also means that the batch size is automatically
tuned by the existing interrupt moderation mechanism.

The stack then runs each layer of processing over all the packets in the
list before proceeding to the next layer. Where the 'next layer' (or the
context in which it must run) differs among the packets, the stack splits
the list; this 'late demux' means that packets which differ only in later
headers (e.g. same L2/L3 but different L4) can traverse the early part of
the stack together.

Also, where the next layer is not (yet) list-aware, the stack can revert
to calling the rest of the stack in a loop; this allows gradual/creeping
listification, with no 'flag day' patch needed to listify everything.

Patches 1-2 simply place received packets on a list during the event
processing loop on the sfc EF10 architecture, then call the normal stack
for each packet singly at the end of the NAPI poll. (Analogues of patch
#2 for other NIC drivers should be fairly straightforward.)
Patches 3-9 extend the list processing as far as the IP receive handler.

Patches 1-2 alone give about a 10% improvement in packet rate in the
baseline test; adding patches 3-9 raises this to around 25%.

Performance measurements were made with NetPerf UDP_STREAM, using 1-byte
packets and a single core to handle interrupts on the RX side; this was
in order to measure as simply as possible the packet rate handled by a
single core. Figures are in Mbit/s; divide by 8 to obtain Mpps. The setup
was tuned for maximum reproducibility, rather than raw performance. Full
details and more results (both with and without retpolines) from a
previous version of the patch series are presented in [2].

The baseline test uses four streams, and multiple RXQs all bound to a
single CPU (the netperf binary is bound to a neighbouring CPU). These
tests were run with retpolines.

    net-next: 6.91 Mb/s (datum)
     after 9: 8.46 Mb/s (+22.5%)

Note however that these results are not robust; changes in the parameters
of the test sometimes shrink the gain to single-digit percentages. For
instance, when using only a single RXQ, only a 4% gain was seen.

One test variation was the use of software filtering/firewall rules.
Adding a single iptables rule (UDP port drop on a port range not matching
the test traffic), thus making the netfilter hook have work to do,
reduced baseline performance but showed a similar gain from the patches:

    net-next: 5.02 Mb/s (datum)
     after 9: 6.78 Mb/s (+35.1%)

Similarly, testing with a set of TC flower filters (kindly supplied by
Cong Wang) gave the following:

    net-next: 6.83 Mb/s (datum)
     after 9: 8.86 Mb/s (+29.7%)

These data suggest that the batching approach remains effective in the
presence of software switching rules, and perhaps even improves the
performance of those rules by allowing them and their codepaths to stay
in cache between packets.

Changes from v3:
* Fixed build error when CONFIG_NETFILTER=n (thanks kbuild).

Changes from v2:
* Used standard list handling (and skb->list) instead of the skb-queue
  functions (that use skb->next, skb->prev).
  - As part of this, changed from a "dequeue, process, enqueue" model to
    using list_for_each_safe, list_del, and (new) list_cut_before.
* Altered __netif_receive_skb_core() changes in patch 6 as per Willem de
  Bruijn's suggestions (separate *ppt_prev from pt_prev; renaming).
* Removed patches to Generic XDP, since they were producing no benefit.
  I may revisit them later.
* Removed RFC tags.

Changes from v1:
* Rebased across 2 years' net-next movement (surprisingly straightforward).
  - Added Generic XDP handling to netif_receive_skb_list_internal()
  - Dealt with changes to PFMEMALLOC setting APIs
* General cleanup of code and comments.
* Skipped function calls for empty lists at various points in the stack
  (patch #9).
* Added listified Generic XDP handling (patches 10-12), though it doesn't
  seem to help (see above).
* Extended testing to cover software firewalls / netfilter etc.

[1] http://vger.kernel.org/netconf2018_files/DavidMiller_netconf2018.pdf
[2] http://vger.kernel.org/netconf2018_files/EdwardCree_netconf2018.pdf
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

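
The 'late demux' list splitting described in the commit message can be illustrated with a small userspace sketch. This is not the kernel code: `struct pkt`, `l3_rcv_list()`, `l4_rcv_list()` and the two counters are hypothetical names invented for illustration, and a plain singly linked list stands in for the `skb->list` / `struct list_head` handling the series actually uses. The sketch only aims to show the idea of walking a batch once, cutting it wherever the next-layer key changes, and handing each run to the next layer in a single call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an sk_buff: only the fields this sketch needs. */
struct pkt {
    struct pkt *next;   /* singly linked batch list (assumption, not skb->list) */
    int l4_proto;       /* demux key for the next layer, e.g. 6 = TCP, 17 = UDP */
};

/* Counters so the effect of batching can be observed. */
static int list_calls;      /* number of calls into the next layer */
static int pkts_delivered;  /* total packets handed to the next layer */

/* Hypothetical list-aware next layer: handles a whole sublist per call. */
static void l4_rcv_list(struct pkt *head, int proto)
{
    list_calls++;
    for (struct pkt *p = head; p != NULL; p = p->next) {
        assert(p->l4_proto == proto);   /* every packet in a sublist shares the key */
        pkts_delivered++;
    }
}

/*
 * Walk the batch once. Whenever the demux key changes, cut the list and
 * dispatch the run accumulated so far as one unit.
 */
static void l3_rcv_list(struct pkt *head)
{
    while (head != NULL) {
        struct pkt *tail = head;
        struct pkt *rest;

        /* Extend the run while consecutive packets share the same key. */
        while (tail->next != NULL && tail->next->l4_proto == head->l4_proto)
            tail = tail->next;

        rest = tail->next;
        tail->next = NULL;              /* cut the sublist off the batch */
        l4_rcv_list(head, head->l4_proto);
        head = rest;
    }
}
```

For a batch whose keys are 17, 17, 17, 6, 6, this sketch calls the next layer twice rather than five times, which is the saving in (indirect) calls that the commit message is after.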
Documentation	net: dsa: Add DT bindings for Vitesse VSC73xx switches	2018-07-04 11:30:01 +09:00
LICENSES	LICENSES: Add Linux-OpenIB license text	2018-04-27 16:41:53 -06:00
arch	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	2018-07-03 10:29:26 +09:00
block	for-linus-20180629	2018-06-30 10:47:46 -07:00
certs	certs/blacklist: fix const confusion	2018-06-26 09:43:03 -07:00
crypto	Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL	2018-06-28 10:40:47 -07:00
drivers	sfc: batch up RX delivery	2018-07-04 14:06:19 +09:00
firmware	kbuild: remove all dummy assignments to obj-	2017-11-18 11:46:06 +09:00
fs	for-4.18-rc2-tag	2018-07-01 12:38:16 -07:00
include	net: ipv4: listified version of ip_rcv	2018-07-04 14:06:20 +09:00
init	Kbuild fixes for v4.18	2018-06-30 13:05:30 -07:00
ipc	rhashtable: split rhashtable.h	2018-06-22 13:43:27 +09:00
kernel	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-07-02 11:18:28 -07:00
lib	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	2018-07-03 10:29:26 +09:00
mm	slub: fix failure when we delete and create a slab cache	2018-06-28 11:16:44 -07:00
net	net: don't bother calling list RX functions on empty lists	2018-07-04 14:06:20 +09:00
samples	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	2018-07-04 08:53:53 +09:00
scripts	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-07-02 11:18:28 -07:00
security	selinux/stable-4.18 PR 20180629	2018-06-30 11:15:12 -07:00
sound	ALSA: seq: Fix UBSAN warning at SNDRV_SEQ_IOCTL_QUERY_NEXT_CLIENT ioctl	2018-06-25 11:18:04 +02:00
tools	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	2018-07-04 08:53:53 +09:00
usr	kbuild: rename built-in.o to built-in.a	2018-03-26 02:01:19 +09:00
virt	KVM: arm64: Prevent KVM_COMPAT from being selected	2018-06-21 17:17:50 +01:00
.clang-format	clang-format: add configuration file	2018-04-11 10:28:35 -07:00
.cocciconfig	scripts: add Linux .cocciconfig for coccinelle	2016-07-22 12:13:39 +02:00
.get_maintainer.ignore	Add hch to .get_maintainer.ignore	2015-08-21 14:30:10 -07:00
.gitattributes	.gitattributes: set git diff driver for C source code files	2016-10-07 18:46:30 -07:00
.gitignore	Kbuild updates for v4.17 (2nd)	2018-04-15 17:21:30 -07:00
.mailmap	Merge branch 'asoc-4.17' into asoc-4.18 for compress dependencies	2018-04-26 12:24:28 +01:00
COPYING	COPYING: use the new text with points to the license files	2018-03-23 12:41:45 -06:00
CREDITS	MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE	2018-03-05 16:34:24 +00:00
Kbuild	Kbuild updates for v4.15	2017-11-17 17:45:29 -08:00
Kconfig	kconfig: add basic helper macros to scripts/Kconfig.include	2018-05-29 03:31:19 +09:00
MAINTAINERS	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	2018-07-03 10:29:26 +09:00
Makefile	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-07-02 11:18:28 -07:00
README	Docs: Added a pointer to the formatted docs to README	2018-03-21 09:02:53 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.