WSL2-Linux-Kernel

History

Jakub Sitnicki d51c5907e9 net, gro: Set inner transport header offset in tcp/udp GRO hook

GSO expects inner transport header offset to be valid when
skb->encapsulation flag is set. GSO uses this value to calculate the length
of an individual segment of a GSO packet in skb_gso_transport_seglen().

However, tcp/udp gro_complete callbacks don't update the
skb->inner_transport_header when processing an encapsulated TCP/UDP
segment. As a result a GRO skb has ->inner_transport_header set to a value
carried over from earlier skb processing.

This can have mild to tragic consequences. From miscalculating the GSO
segment length to triggering a page fault [1], when trying to read TCP/UDP
header at an address past the skb->data page.

The latter scenario leads to an oops report like so:

  BUG: unable to handle page fault for address: ffff9fa7ec00d008
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 123f201067 P4D 123f201067 PUD 123f209067 PMD 0
  Oops: 0000 [#1] SMP NOPTI
  CPU: 44 PID: 0 Comm: swapper/44 Not tainted 5.4.53-cloudflare-2020.7.21 #1
  Hardware name: HYVE EDGE-METAL-GEN10/HS-1811DLite1, BIOS V2.15 02/21/2020
  RIP: 0010:skb_gso_transport_seglen+0x44/0xa0
  Code: c0 41 83 e0 11 f6 87 81 00 00 00 20 74 30 0f b7 87 aa 00 00 00 0f [...]
  RSP: 0018:ffffad8640bacbb8 EFLAGS: 00010202
  RAX: 000000000000feda RBX: ffff9fcc8d31bc00 RCX: ffff9fa7ec00cffc
  RDX: ffff9fa7ebffdec0 RSI: 000000000000feda RDI: 0000000000000122
  RBP: 00000000000005c4 R08: 0000000000000001 R09: 0000000000000000
  R10: ffff9fe588ae3800 R11: ffff9fe011fc92f0 R12: ffff9fcc8d31bc00
  R13: ffff9fe0119d4300 R14: 00000000000005c4 R15: ffff9fba57d70900
  FS: 0000000000000000(0000) GS:ffff9fe68df00000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff9fa7ec00d008 CR3: 0000003e99b1c000 CR4: 0000000000340ee0
  Call Trace:
   <IRQ>
   skb_gso_validate_network_len+0x11/0x70
   __ip_finish_output+0x109/0x1c0
   ip_sublist_rcv_finish+0x57/0x70
   ip_sublist_rcv+0x2aa/0x2d0
   ? ip_rcv_finish_core.constprop.0+0x390/0x390
   ip_list_rcv+0x12b/0x14f
   __netif_receive_skb_list_core+0x2a9/0x2d0
   netif_receive_skb_list_internal+0x1b5/0x2e0
   napi_complete_done+0x93/0x140
   veth_poll+0xc0/0x19f [veth]
   ? mlx5e_napi_poll+0x221/0x610 [mlx5_core]
   net_rx_action+0x1f8/0x790
   __do_softirq+0xe1/0x2bf
   irq_exit+0x8e/0xc0
   do_IRQ+0x58/0xe0
   common_interrupt+0xf/0xf
   </IRQ>

The bug can be observed in a simple setup where we send IP/GRE/IP/TCP
packets into a netns over a veth pair. Inside the netns, packets are
forwarded to dummy device:

  trafgen -> [veth A]--[veth B] -forward-> [dummy]

For veth B to GRO aggregate packets on receive, it needs to have an XDP
program attached (for example, a trivial XDP_PASS). Additionally, for UDP,
we need to enable GSO_UDP_L4 feature on the device:

  ip netns exec A ethtool -K AB rx-udp-gro-forwarding on

The last component is an artificial delay to increase the chances of GRO
batching happening:

  ip netns exec A tc qdisc add dev AB root \
     netem delay 200us slot 5ms 10ms packets 2 bytes 64k

With such a setup in place, the bug can be observed by tracing the skb
outer and inner offsets when GSO skb is transmitted from the dummy device:

tcp:

  FUNC              DEV   SKB_LEN  NH  TH ENC INH ITH GSO_SIZE GSO_TYPE
  ip_finish_output  dumB     2830 270 290   1 294 254     1383 (tcpv4,gre,)
                                                  ^^^
udp:

  FUNC              DEV   SKB_LEN  NH  TH ENC INH ITH GSO_SIZE GSO_TYPE
  ip_finish_output  dumB     2818 270 290   1 294 254     1383 (gre,udp_l4,)
                                                  ^^^

Fix it by updating the inner transport header offset in tcp/udp
gro_complete callbacks, similar to how {inet,ipv6}_gro_complete callbacks
update the inner network header offset, when skb->encapsulation flag is
set.

[1] https://lore.kernel.org/netdev/CAKxSbF01cLpZem2GFaUaifh0S-5WYViZemTicAg7FCHOnh6kug@mail.gmail.com/

Fixes: `bf296b125b` ("tcp: Add GRO support")
Fixes: `f993bc25e5` ("net: core: handle encapsulation offloads when computing segment lengths")
Fixes: `e20cf8d3f1` ("udp: implement GRO for plain UDP sockets.")
Reported-by: Alex Forster <aforster@cloudflare.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2021-08-02 10:20:56 +01:00
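
The effect of the stale offset can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct toy_skb, toy_skb_from_trace(), toy_gso_transport_seglen() and toy_gro_complete_fixup() are hypothetical names invented for the example, the length formula is a simplified reading of what the message says skb_gso_transport_seglen() computes, and the 20-byte inner IPv4 and TCP header lengths are assumptions (no options). The offsets are the TH/INH/ITH values from the tcp trace above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few sk_buff fields named in the commit
 * message. Offsets are byte offsets from the start of the buffer. */
struct toy_skb {
	bool encapsulation;              /* ENC column in the trace */
	uint16_t transport_header;       /* TH: outer transport (GRE) header */
	uint16_t inner_network_header;   /* INH: inner IP header */
	uint16_t inner_transport_header; /* ITH: inner TCP/UDP header */
	uint16_t gso_size;               /* payload bytes per segment */
};

/* Build a toy skb holding the offsets from the tcp trace line. */
static struct toy_skb toy_skb_from_trace(void)
{
	struct toy_skb skb = {
		.encapsulation = true,
		.transport_header = 290,
		.inner_network_header = 294,
		.inner_transport_header = 254, /* stale: lies before INH */
		.gso_size = 1383,
	};
	return skb;
}

/* Simplified model of the per-segment length from the outer transport
 * header onward: tunnel headers + inner L4 header + payload.
 * inner_l4_hdrlen is passed in because the toy has no packet data. */
static int toy_gso_transport_seglen(const struct toy_skb *skb,
				    int inner_l4_hdrlen)
{
	assert(skb != NULL);
	if (!skb->encapsulation)
		return inner_l4_hdrlen + skb->gso_size;

	/* Goes negative when the inner offset is stale. */
	int tunnel_len = skb->inner_transport_header - skb->transport_header;

	return tunnel_len + inner_l4_hdrlen + skb->gso_size;
}

/* Model of the fix: when the skb is encapsulated, record where the
 * inner transport header actually is. */
static void toy_gro_complete_fixup(struct toy_skb *skb, int inner_l4_offset)
{
	assert(skb != NULL);
	if (skb->encapsulation)
		skb->inner_transport_header = (uint16_t)inner_l4_offset;
}
```

With the stale ITH of 254 and a 20-byte inner TCP header, the model yields 254 - 290 + 20 + 1383 = 1367. After toy_gro_complete_fixup(&skb, skb.inner_network_header + 20) sets ITH to 314, it yields 314 - 290 + 20 + 1383 = 1427. In the real kernel the inner header length is read from the packet at the stale offset, which is how the page fault described above can arise. The toy sidesteps that by taking the length as a parameter.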
..
bpfilter	net: Revert "net: optimize the sockptr_t for unified kernel/user address spaces"	2020-08-10 12:06:44 -07:00
netfilter	netfilter: nf_tables: add and use nft_sk helper	2021-05-29 01:04:53 +02:00
Kconfig	net: ipv4: remove duplicate "the the" phrase in Kconfig text	2020-08-18 16:02:16 -07:00
Makefile	bpf: Clean up sockmap related Kconfigs	2021-02-26 12:28:03 -08:00
af_inet.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-06-18 19:47:02 -07:00
ah4.c	Networking changes for 5.14.	2021-06-30 15:51:09 -07:00
arp.c	net: Exempt multicast addresses from five-second neighbor lifetime	2020-11-13 14:24:39 -08:00
bpf_tcp_ca.c	bpf: Limit static tcp-cc functions in the .BTF_ids list to x86	2021-05-11 23:23:07 +02:00
cipso_ipv4.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-06-18 19:47:02 -07:00
datagram.c	inet: stop leaking jiffies on the wire	2019-11-01 14:57:52 -07:00
devinet.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-06-18 19:47:02 -07:00
esp4.c	Networking changes for 5.14.	2021-06-30 15:51:09 -07:00
esp4_offload.c	xfrm: remove description from xfrm_type struct	2021-06-09 09:38:52 +02:00
fib_frontend.c	net: Use nlmsg_unicast() instead of netlink_unicast()	2021-07-13 09:28:29 -07:00
fib_lookup.h	ipv4: Fix spelling mistakes	2021-06-07 14:08:30 -07:00
fib_notifier.c	net: fib_notifier: propagate extack down to the notifier block callback	2019-10-04 11:10:56 -07:00
fib_rules.c	fib: use indirect call wrappers in the most common fib_rules_ops	2020-07-28 17:42:31 -07:00
fib_semantics.c	ipv4: Fix fall-through warnings for Clang	2021-05-17 19:29:10 -05:00
fib_trie.c	IPv4: Extend 'fib_notify_on_flag_change' sysctl	2021-02-08 16:47:03 -08:00
fou.c	genetlink: move to smaller ops wherever possible	2020-10-02 19:11:11 -07:00
gre_demux.c	net: Remove the member netns_ok	2021-05-17 15:29:35 -07:00
gre_offload.c	ip_gre: add csum offload support for gre header	2021-01-29 20:39:14 -08:00
icmp.c	ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages	2021-06-28 14:29:45 -07:00
igmp.c	net: ipv4: fix memory leak in ip_mc_add1_src	2021-06-16 12:41:01 -07:00
inet_connection_sock.c	tcp: Add stats for socket migration.	2021-06-23 12:56:08 -07:00
inet_diag.c	net: Use nlmsg_unicast() instead of netlink_unicast()	2021-07-13 09:28:29 -07:00
inet_fragment.c	inet: frags: batch fqdir destroy works	2020-12-12 15:08:54 -08:00
inet_hashtables.c	tcp: Keep TCP_CLOSE sockets in the reuseport group.	2021-06-15 18:01:05 +02:00
inet_timewait_sock.c	net: Use generic ns_common::count	2020-08-19 14:06:36 +02:00
inetpeer.c	inetpeer: use div64_ul() and clamp_val() calculate inet_peer_threshold	2021-03-01 13:32:12 -08:00
ip_forward.c	ipv4: Revert removal of rt_uses_gateway	2019-09-20 18:23:33 -07:00
ip_fragment.c	inet: frags: re-introduce skb coalescing for local delivery	2019-08-08 15:55:10 -07:00
ip_gre.c	gre: let mac_header point to outer header only when necessary	2021-06-28 12:44:17 -07:00
ip_input.c	net: use indirect call helpers for dst_input	2021-02-03 14:51:39 -08:00
ip_options.c	net: clean up codestyle for net/ipv4	2020-08-25 06:28:02 -07:00
ip_output.c	net: ip: avoid OOM kills with large UDP sends over loopback	2021-06-24 11:17:21 -07:00
ip_sockglue.c	net: Remove duplicated midx check against 0	2020-08-25 06:23:59 -07:00
ip_tunnel.c	net: Set true network header for ECN decapsulation	2021-07-23 16:38:57 +01:00
ip_tunnel_core.c	net: ip_tunnel: clean up endianness conversions	2021-01-08 19:25:35 -08:00
ip_vti.c	ipv4: Fix fall-through warnings for Clang	2021-05-17 19:29:10 -05:00
ipcomp.c	Networking changes for 5.14.	2021-06-30 15:51:09 -07:00
ipconfig.c	net: ipconfig: Don't override command-line hostnames or domains	2021-06-02 13:27:03 -07:00
ipip.c	ipip: allow redirecting ipip and mplsip packets to eth devices	2021-06-28 12:44:17 -07:00
ipmr.c	ipmr: Fix indentation issue	2021-07-07 20:52:25 -07:00
ipmr_base.c	net: fib_notifier: propagate extack down to the notifier block callback	2019-10-04 11:10:56 -07:00
metrics.c	treewide: rename nla_strlcpy to nla_strscpy.	2020-11-16 08:08:54 -08:00
netfilter.c	netfilter: Dissect flow after packet mangling	2021-04-18 22:04:16 +02:00
netlink.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
nexthop.c	nexthop: Restart nexthop dump based on last dumped nexthop identifier	2021-04-19 15:20:34 -07:00
ping.c	net: sock: introduce sk_error_report	2021-06-29 11:28:21 -07:00
proc.c	tcp: Add stats for socket migration.	2021-06-23 12:56:08 -07:00
protocol.c	net: Remove the member netns_ok	2021-05-17 15:29:35 -07:00
raw.c	net: sock: introduce sk_error_report	2021-06-29 11:28:21 -07:00
raw_diag.c	net: Use nlmsg_unicast() instead of netlink_unicast()	2021-07-13 09:28:29 -07:00
route.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-06-29 15:45:27 -07:00
syncookies.c	selinux/stable-5.11 PR 20201214	2020-12-16 11:01:04 -08:00
sysctl_net_ipv4.c	net: Introduce net.ipv4.tcp_migrate_req.	2021-06-15 18:01:05 +02:00
tcp.c	tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path	2021-07-09 11:25:24 -07:00
tcp_bbr.c	tcp: only postpone PROBE_RTT if RTT is < current min_rtt estimate	2020-11-17 11:03:22 -08:00
tcp_bic.c	tcp: fix stretch ACK bugs in BIC	2020-03-16 18:26:54 -07:00
tcp_bpf.c	bpf, sockmap, tcp: sk_prot needs inuse_idx set for proc stats	2021-07-15 19:54:22 +02:00
tcp_cdg.c	treewide: Add SPDX license identifier for more missed files	2019-05-21 10:50:45 +02:00
tcp_cong.c	net: Only allow init netns to set default tcp cong to a restricted algo	2021-05-04 11:58:28 -07:00
tcp_cubic.c	tcp: Rename bictcp function prefix to cubictcp	2021-03-26 20:41:51 -07:00
tcp_dctcp.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152	2019-05-30 11:26:32 -07:00
tcp_dctcp.h	tcp: refactor DCTCP ECN ACK handling	2018-10-10 22:26:00 -07:00
tcp_diag.c	inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data	2020-02-27 18:50:19 -08:00
tcp_fastopen.c	tcp: disable TFO blackhole logic by default	2021-07-21 22:50:31 -07:00
tcp_highspeed.c	Replace HTTP links with HTTPS ones: IPv*	2020-07-06 13:23:03 -07:00
tcp_htcp.c	Replace HTTP links with HTTPS ones: IPv*	2020-07-06 13:23:03 -07:00
tcp_hybla.c	treewide: Add SPDX license identifier for more missed files	2019-05-21 10:50:45 +02:00
tcp_illinois.c	treewide: Add SPDX license identifier for more missed files	2019-05-21 10:50:45 +02:00
tcp_input.c	mptcp: avoid processing packet if a subflow reset	2021-07-09 18:38:53 -07:00
tcp_ipv4.c	tcp: disable TFO blackhole logic by default	2021-07-21 22:50:31 -07:00
tcp_lp.c	ipv4: tcp_lp.c: Couple of typo fixes	2021-03-28 17:31:13 -07:00
tcp_metrics.c	fixes-v5.11	2020-12-14 16:40:27 -08:00
tcp_minisocks.c	tcp: Add stats for socket migration.	2021-06-23 12:56:08 -07:00
tcp_nv.c	treewide: Add SPDX license identifier for more missed files	2019-05-21 10:50:45 +02:00
tcp_offload.c	net, gro: Set inner transport header offset in tcp/udp GRO hook	2021-08-02 10:20:56 +01:00
tcp_output.c	ipv6: tcp: drop silly ICMPv6 packet too big messages	2021-07-08 12:27:08 -07:00
tcp_rate.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
tcp_recovery.c	tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN	2021-01-23 21:33:01 -08:00
tcp_scalable.c	net: ipv4: delete repeated words	2020-08-24 17:31:20 -07:00
tcp_timer.c	net: sock: introduce sk_error_report	2021-06-29 11:28:21 -07:00
tcp_ulp.c	bpf: sockmap: Only check ULP for TCP sockets	2020-03-09 22:34:58 +01:00
tcp_vegas.c	tcp: use semicolons rather than commas to separate statements	2020-10-13 17:11:52 -07:00
tcp_vegas.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
tcp_veno.c	Replace HTTP links with HTTPS ones: IPv*	2020-07-06 13:23:03 -07:00
tcp_westwood.c	treewide: Add SPDX license identifier for more missed files	2019-05-21 10:50:45 +02:00
tcp_yeah.c	tcp_yeah: check struct yeah size at compile time	2021-06-29 11:54:36 -07:00
tunnel4.c	net: Remove the member netns_ok	2021-05-17 15:29:35 -07:00
udp.c	udp: check encap socket in __udp_lib_err	2021-07-21 08:49:31 -07:00
udp_bpf.c	bpf, sockmap, udp: sk_prot needs inuse_idx set for proc stats	2021-07-15 19:54:36 +02:00
udp_diag.c	net: Use nlmsg_unicast() instead of netlink_unicast()	2021-07-13 09:28:29 -07:00
udp_impl.h	net: pass a sockptr_t into ->setsockopt	2020-07-24 15:41:54 -07:00
udp_offload.c	net, gro: Set inner transport header offset in tcp/udp GRO hook	2021-08-02 10:20:56 +01:00
udp_tunnel_core.c	udp_tunnel: reshuffle NETIF_F_RX_UDP_TUNNEL_PORT checks	2021-01-07 12:53:29 -08:00
udp_tunnel_nic.c	udp_tunnel: add the ability to share port tables	2020-09-28 12:50:12 -07:00
udp_tunnel_stub.c	udp_tunnel: add central NIC RX port offload infrastructure	2020-07-10 13:54:00 -07:00
udplite.c	net: Remove the member netns_ok	2021-05-17 15:29:35 -07:00
xfrm4_input.c	xfrm: state: remove extract_input indirection from xfrm_state_afinfo	2020-05-06 09:40:08 +02:00
xfrm4_output.c	xfrm: fix unused variable warning if CONFIG_NETFILTER=n	2020-05-11 15:12:27 +02:00
xfrm4_policy.c	net: add bool confirm_neigh parameter for dst_ops.update_pmtu	2019-12-24 22:28:54 -08:00
xfrm4_protocol.c	net: Remove the member netns_ok	2021-05-17 15:29:35 -07:00
xfrm4_state.c	xfrm: remove output_finish indirection from xfrm_state_afinfo	2020-05-06 09:40:08 +02:00
xfrm4_tunnel.c	xfrm: remove description from xfrm_type struct	2021-06-09 09:38:52 +02:00