WSL2-Linux-Kernel/net
Varad Gautam d7b0408934 xfrm: policy: Read seqcount outside of rcu-read side in xfrm_policy_lookup_bytype
xfrm_policy_lookup_bytype loops on seqcount mutex xfrm_policy_hash_generation
within an RCU read side critical section. Although ill-advised, this is fine if
the loop is bounded.

xfrm_policy_hash_generation wraps mutex hash_resize_mutex, which is used to
serialize writers (xfrm_hash_resize, xfrm_hash_rebuild). This is fine too.

On PREEMPT_RT=y, the read_seqcount_begin call within xfrm_policy_lookup_bytype
emits a mutex lock/unlock for hash_resize_mutex. Mutex locking is fine, since
RCU read side critical sections are allowed to sleep with PREEMPT_RT.

xfrm_hash_resize can, however, block on synchronize_rcu while holding
hash_resize_mutex.

This leads to the following situation on PREEMPT_RT, where the writer is
blocked on RCU grace period expiry, while the reader is blocked on a lock held
by the writer:

Thread 1 (xfrm_hash_resize)	Thread 2 (xfrm_policy_lookup_bytype)

				rcu_read_lock();
mutex_lock(&hash_resize_mutex);
				read_seqcount_begin(&xfrm_policy_hash_generation);
				mutex_lock(&hash_resize_mutex); // block
xfrm_bydst_resize();
synchronize_rcu(); // block
		<RCU stalls in xfrm_policy_lookup_bytype>

Move the read_seqcount_begin call outside of the RCU read side critical section,
and do an rcu_read_unlock/retry if we got stale data within the critical section.

On non-PREEMPT_RT, this shortens the time spent within the RCU read side
critical section when the seqcount needs a retry, and avoids unbounded looping
inside it.
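
The reordering described above can be sketched in plain userspace C. This is
only an illustration of the control flow, not the kernel change itself: every
sketch_* helper, the rcu_depth counter, and table_value are hypothetical
stand-ins for rcu_read_lock/rcu_read_unlock, the seqcount API, and the policy
hash table. The sequence is sampled before entering the read-side section, and
a stale result leaves the section before retrying.

```c
/* Userspace sketch only. All names below are hypothetical stand-ins for the
 * kernel primitives mentioned in the commit message, not the kernel API. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_uint hash_generation;  /* even: stable, odd: writer active */
static int rcu_depth;                /* models read-side nesting depth */
static int seq_reads_inside_rcu;     /* stays 0 when the ordering is right */
static int table_value = 42;         /* stands in for the policy hash table */

static void sketch_rcu_read_lock(void)
{
    rcu_depth++;
}

static void sketch_rcu_read_unlock(void)
{
    assert(rcu_depth > 0);
    rcu_depth--;
}

/* Models read_seqcount_begin(): spin until no writer is active.
 * On PREEMPT_RT the real call may instead sleep on the writer's mutex,
 * which is why it must not run inside the read-side section. */
static unsigned sketch_seq_begin(void)
{
    unsigned seq;

    if (rcu_depth > 0)
        seq_reads_inside_rcu++;  /* record the ordering the fix avoids */
    while ((seq = atomic_load(&hash_generation)) & 1u)
        ;
    return seq;
}

/* Models read_seqcount_retry(): true if a writer ran since seq was read. */
static bool sketch_seq_retry(unsigned seq)
{
    return atomic_load(&hash_generation) != seq;
}

/* Models a writer such as a hash resize: bump the generation around it. */
static void sketch_resize(int new_value)
{
    atomic_fetch_add(&hash_generation, 1);
    table_value = new_value;
    atomic_fetch_add(&hash_generation, 1);
}

/* Lookup with the sequence read hoisted out of the read-side section. */
static int sketch_lookup(bool simulate_concurrent_resize)
{
    unsigned seq;
    int found;

retry:
    seq = sketch_seq_begin();          /* outside the read-side section */
    sketch_rcu_read_lock();
    found = table_value;
    if (simulate_concurrent_resize) {  /* pretend a writer raced with us */
        simulate_concurrent_resize = false;
        sketch_resize(found + 1);
    }
    if (sketch_seq_retry(seq)) {
        sketch_rcu_read_unlock();      /* leave the section, then retry */
        goto retry;
    }
    sketch_rcu_read_unlock();
    return found;
}
```

Calling sketch_lookup(true) forces exactly one retry in this model; the
seq_reads_inside_rcu counter then shows whether the sequence was ever sampled
while the read-side section was held.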

Fixes: 77cc278f7b ("xfrm: policy: Use sequence counters with associated lock")
Signed-off-by: Varad Gautam <varad.gautam@suse.com>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org # v4.9
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: "Ahmed S. Darwish" <a.darwish@linutronix.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Ahmed S. Darwish <a.darwish@linutronix.de>
2021-06-01 07:48:46 +02:00
6lowpan
9p net: 9p: advance iov on empty read 2021-03-03 16:57:59 -08:00
802
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-01-14 18:34:50 -08:00
appletalk appletalk: Fix skb allocation size in loopback case 2021-02-12 16:40:28 -08:00
atm net: atm: pppoatm: use new API for wakeup tasklet 2021-01-29 18:24:05 -08:00
ax25
batman-adv batman-adv: initialize "struct batadv_tvlv_tt_vlan_data"->reserved field 2021-04-05 15:06:03 -07:00
bluetooth Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kern 2021-02-11 14:59:01 -08:00
bpf Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-01-20 12:16:11 -08:00
bpfilter net: remove redundant 'depends on NET' 2021-01-27 17:04:12 -08:00
bridge netfilter: bridge: add pre_exit hooks for ebtable unregistration 2021-04-10 21:16:54 +02:00
caif net: caif: Use netif_rx_any_context(). 2021-02-15 13:21:48 -08:00
can can: isotp: fix msg_namelen values depending on CAN_REQUIRED_SIZE 2021-03-29 09:51:43 +02:00
ceph libceph: remove osdtimeout option entirely 2021-02-16 12:09:52 +01:00
core gro: ensure frag0 meets IP header alignment 2021-04-13 15:09:31 -07:00
dcb net: dcb: use obj-$(CONFIG_DCB) form in net/Makefile 2021-01-27 17:03:52 -08:00
dccp ipv6: weaken the v4mapped source check 2021-03-18 11:19:23 -07:00
decnet net: decnet: fix netdev refcount leaking on error path 2021-01-27 17:33:46 -08:00
dns_resolver net: remove redundant 'depends on NET' 2021-01-27 17:04:12 -08:00
dsa net: dsa: Fix type was not set for devlink port 2021-03-29 13:49:04 -07:00
ethernet
ethtool ethtool: pause: make sure we init driver stats 2021-04-14 13:03:06 -07:00
hsr net: hsr: Reset MAC header for Tx path 2021-04-07 14:25:12 -07:00
ieee802154 net: ieee802154: stop dump llsec params for monitors 2021-04-06 22:34:38 +02:00
ife net: remove redundant 'depends on NET' 2021-01-27 17:04:12 -08:00
ipv4 xfrm: xfrm_state_mtu should return at least 1280 for ipv6 2021-04-19 12:43:50 +02:00
ipv6 xfrm: xfrm_state_mtu should return at least 1280 for ipv6 2021-04-19 12:43:50 +02:00
iucv net/af_iucv: build SG skbs for TRANS_HIPER sockets 2021-01-28 20:36:22 -08:00
kcm net: group skb_shinfo zerocopy related bits together. 2021-01-07 16:08:37 -08:00
key af_key: relax availability checks for skb size calculation 2021-01-04 10:05:50 +01:00
l2tp net: l2tp: reduce log level of messages in receive path, add counter instead 2021-03-03 16:55:02 -08:00
l3mdev net: l3mdev: use obj-$(CONFIG_NET_L3_MASTER_DEV) form in net/Makefile 2021-01-27 17:03:52 -08:00
lapb net: lapb: Copy the skb before sending a packet 2021-02-02 08:40:48 -08:00
llc net: remove redundant 'depends on NET' 2021-01-27 17:04:12 -08:00
mac80211 mac80211: fix time-is-after bug in mlme 2021-04-08 10:14:53 +02:00
mac802154 net: mac802154: Fix general protection fault 2021-04-06 22:42:16 +02:00
mpls net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0 2021-03-09 16:12:20 -08:00
mptcp mptcp: revert "mptcp: provide subflow aware release function" 2021-04-01 16:02:50 -07:00
ncsi net/ncsi: Avoid channel_monitor hrtimer deadlock 2021-03-30 13:16:23 -07:00
netfilter netfilter: nftables: clone set element expression template 2021-04-13 00:19:05 +02:00
netlabel cipso,calipso: resolve a number of problems with the DOI refcounts 2021-03-04 15:26:57 -08:00
netlink netlink: don't call ->netlink_bind with table lock held 2021-04-16 17:01:04 -07:00
netrom
nfc nfc: Avoid endless loops caused by repeated llcp_sock_connect() 2021-03-25 17:02:01 -07:00
nsh
openvswitch openvswitch: fix send of uninitialized stack memory in ct limit reply 2021-04-05 12:54:42 -07:00
packet net/packet: Improve the comment about LL header visibility criteria 2021-02-06 14:59:28 -08:00
phonet
psample net: psample: Fix netlink skb length with tunnel info 2021-02-25 09:49:46 -08:00
qrtr net: qrtr: Fix memory leak on qrtr_tx_wait failure 2021-03-30 13:48:29 -07:00
rds net/rds: Avoid potential use after free in rds_send_remove_from_sock 2021-04-07 14:01:24 -07:00
rfkill rfkill: revert back to old userspace API by default 2021-04-08 10:14:45 +02:00
rose
rxrpc rxrpc: Fix dependency on IPv6 in udp tunnel config 2021-02-12 16:42:05 -08:00
sched net: sched: sch_teql: fix null-pointer dereference 2021-04-08 14:14:42 -07:00
sctp net/sctp: fix race condition in sctp_destroy_sock 2021-04-13 14:59:46 -07:00
smc net/smc: use memcpy instead of snprintf to avoid out of bounds read 2021-01-12 20:22:01 -08:00
strparser
sunrpc Miscellaneous NFSD fixes for v5.12-rc. 2021-03-16 10:22:50 -07:00
switchdev net: bridge: propagate extack through switchdev_port_attr_set 2021-02-14 17:38:11 -08:00
tipc net: tipc: Fix spelling errors in net/tipc module 2021-04-07 14:29:29 -07:00
tls net/tls: Select SOCK_RX_QUEUE_MAPPING from TLS_DEVICE 2021-02-11 19:08:06 -08:00
unix af_unix: handle idmapped mounts 2021-01-24 14:27:18 +01:00
vmw_vsock selinux: vsock: Set SID for socket returned by accept() 2021-03-19 13:46:55 -07:00
wireless nl80211: fix beacon head validation 2021-04-08 16:43:05 +02:00
x25 net: x25: Remove unimplemented X.25-over-LLC code stubs 2020-12-12 17:15:33 -08:00
xdp xsk: Fold xp_assign_dev and __xp_assign_dev 2021-01-25 23:56:33 +01:00
xfrm xfrm: policy: Read seqcount outside of rcu-read side in xfrm_policy_lookup_bytype 2021-06-01 07:48:46 +02:00
Kconfig net/sock: Add kernel config SOCK_RX_QUEUE_MAPPING 2021-02-11 19:08:06 -08:00
Makefile net: l3mdev: use obj-$(CONFIG_NET_L3_MASTER_DEV) form in net/Makefile 2021-01-27 17:03:52 -08:00
compat.c
devres.c
socket.c io_uring-worker.v3-2021-02-25 2021-02-27 08:29:02 -08:00
sysctl_net.c