WSL2-Linux-Kernel/net/core
Yan Zhai 2f1d402dcc net: report RCU QS on threaded NAPI repolling
[ Upstream commit d6dbbb11247c71203785a2c9da474c36f4b19eae ]

NAPI threads can keep polling packets under load. Currently the thread
only calls cond_resched() before repolling, which is not sufficient to
clear the RCU tasks holdout. This prevents BPF tracing programs from
detaching for a long period. It can be reproduced easily with the
following setup:

ip netns add test1
ip netns add test2

ip -n test1 link add veth1 type veth peer name veth2 netns test2

ip -n test1 link set veth1 up
ip -n test1 link set lo up
ip -n test2 link set veth2 up
ip -n test2 link set lo up

ip -n test1 addr add 192.168.1.2/31 dev veth1
ip -n test1 addr add 1.1.1.1/32 dev lo
ip -n test2 addr add 192.168.1.3/31 dev veth2
ip -n test2 addr add 2.2.2.2/31 dev lo

ip -n test1 route add default via 192.168.1.3
ip -n test2 route add default via 192.168.1.2

for i in `seq 10 210`; do
 for j in `seq 10 210`; do
    ip netns exec test2 iptables -I INPUT -s 3.3.$i.$j -p udp --dport 5201
 done
done

ip netns exec test2 ethtool -K veth2 gro on
ip netns exec test2 bash -c 'echo 1 > /sys/class/net/veth2/threaded'
ip netns exec test1 ethtool -K veth1 tso off

Then running an iperf3 client/server together with a bpftrace script triggers it:

ip netns exec test2 iperf3 -s -B 2.2.2.2 >/dev/null&
ip netns exec test1 iperf3 -c 2.2.2.2 -B 1.1.1.1 -u -l 1500 -b 3g -t 100 >/dev/null&
bpftrace -e 'kfunc:__napi_poll{@=count();} interval:s:1{exit();}'

Reporting RCU quiescent states periodically resolves the issue.
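
The idea of reporting a quiescent state periodically from a busy repoll
loop can be sketched in userspace C. This is only an illustration and not
the kernel change itself: report_qs_periodic(), simulate_busy_poll(),
tick_after() and QS_INTERVAL_TICKS are hypothetical names, the tick counter
stands in for jiffies, and the interval value is an arbitrary assumption.

```c
#include <stdbool.h>

/* Hypothetical interval between quiescent-state reports, in ticks.
 * The value is an assumption chosen for illustration only. */
#define QS_INTERVAL_TICKS 100UL

struct poll_stats {
	unsigned long polls;      /* how many times the poll body ran */
	unsigned long qs_reports; /* how many quiescent states were reported */
};

/* Wraparound-safe "a is later than b" for a free-running tick counter. */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Report a quiescent state only if more than QS_INTERVAL_TICKS have
 * passed since the last report; update *last_qs when reporting. */
static bool report_qs_periodic(unsigned long now, unsigned long *last_qs)
{
	if (!tick_after(now, *last_qs + QS_INTERVAL_TICKS))
		return false;
	*last_qs = now;
	return true;
}

/* Simulate a polling thread that stays busy for busy_ticks ticks,
 * doing one poll per tick and repolling until the load stops. */
static struct poll_stats simulate_busy_poll(unsigned long start,
					    unsigned long busy_ticks)
{
	struct poll_stats st = { 0, 0 };
	unsigned long now = start;
	unsigned long last_qs = start;

	for (;;) {
		st.polls++; /* stands in for one poll of the device */
		now++;

		if (now - start >= busy_ticks)
			break; /* no more work: leave the repoll loop */

		/* Before repolling, report a quiescent state if due. */
		if (report_qs_periodic(now, &last_qs))
			st.qs_reports++;
		/* A voluntary reschedule point would also go here. */
	}
	return st;
}
```

With these assumed values, a loop that stays busy for 1000 ticks reports
9 quiescent states, while a burst shorter than the interval reports none,
so the extra cost is bounded by the interval rather than by the poll count.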

Fixes: 29863d41bb ("net: implement threaded-able napi poll loop support")
Reviewed-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/4c3b0d3f32d3b18949d75b18e5e1d9f13a24f025.1710877680.git.yan@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-26 18:21:37 -04:00
Makefile
bpf_sk_storage.c bpf: Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing 2023-08-11 15:13:50 +02:00
datagram.c net: datagram: fix data-races in datagram_poll() 2023-05-24 17:36:42 +01:00
datagram.h
dev.c net: report RCU QS on threaded NAPI repolling 2024-03-26 18:21:37 -04:00
dev_addr_lists.c
dev_ioctl.c net: dev: Convert sa_data to flexible array in struct sockaddr 2024-03-01 13:21:59 +01:00
devlink.c devlink: report devlink_port_type_warn source device 2024-03-01 13:21:55 +01:00
drop_monitor.c drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group 2023-12-13 18:36:38 +01:00
dst.c ipv6: remove max_size check inline with ipv4 2024-01-15 18:51:25 +01:00
dst_cache.c
failover.c
fib_notifier.c
fib_rules.c
filter.c bpf: net: Change sk_getsockopt() to take the sockptr_t argument 2024-03-26 18:21:23 -04:00
flow_dissector.c net/core: Fix ETH_P_1588 flow dissector 2023-10-06 13:18:05 +02:00
flow_offload.c
gen_estimator.c
gen_stats.c
gro_cells.c
hwbm.c
link_watch.c
lwt_bpf.c lwt: Fix return values of BPF xmit ops 2023-09-19 12:22:33 +02:00
lwtunnel.c
neighbour.c neighbour: Don't let neigh_forced_gc() disable preemption for long 2024-01-25 14:52:29 -08:00
net-procfs.c
net-sysfs.c
net-sysfs.h
net-traces.c
net_namespace.c net: fix UaF in netns ops registration error path 2023-02-01 08:27:26 +01:00
netclassid_cgroup.c
netevent.c
netpoll.c net: move from strlcpy with unused retval to strscpy 2023-10-25 11:59:01 +02:00
netprio_cgroup.c
of_net.c of: net: add a helper for loading netdev->dev_addr 2023-07-27 08:46:59 +02:00
page_pool.c page_pool: fix inconsistency for page_pool_ring_[un]lock() 2023-06-05 09:21:22 +02:00
pktgen.c net: pktgen: Fix interface flags printing 2023-10-25 11:58:58 +02:00
ptp_classifier.c
request_sock.c tcp: make sure init the accept_queue's spinlocks once 2024-02-23 08:54:27 +01:00
rtnetlink.c rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back 2024-03-06 14:38:46 +00:00
scm.c io_uring/unix: drop usage of io_uring socket 2024-03-26 18:21:11 -04:00
secure_seq.c
selftests.c
skbuff.c net: prevent mss overflow in skb_segment() 2024-02-23 08:55:14 +01:00
skmsg.c bpf, sockmap: Fix bug that strp_done cannot be called 2023-08-16 18:22:00 +02:00
sock.c bpf: net: Change sk_getsockopt() to take the sockptr_t argument 2024-03-26 18:21:23 -04:00
sock_destructor.h
sock_diag.c sock_diag: annotate data-races around sock_diag_handlers[family] 2024-03-26 18:21:17 -04:00
sock_map.c bpf, sockmap: Reject sk_msg egress redirects to non-TCP sockets 2023-10-10 21:59:07 +02:00
sock_reuseport.c
stream.c net: deal with most data-races in sk_wait_event() 2023-05-24 17:36:42 +01:00
sysctl_net_core.c
timestamping.c
tso.c
utils.c
xdp.c xdp: xdp_mem_allocator can be NULL in trace_mem_connect(). 2023-06-05 09:21:23 +02:00