WSL2-Linux-Kernel/net/ipv4
Neal Cardwell a96b1d33ec tcp: fix early ETIMEDOUT after spurious non-SACK RTO
[ Upstream commit 686dc2db2a ]

Fix a bug reported and analyzed by Nagaraj Arankal, where the handling
of a spurious non-SACK RTO could cause a connection to fail to clear
retrans_stamp, causing a later RTO to very prematurely time out the
connection with ETIMEDOUT.

Here is the buggy scenario, expanding upon Nagaraj Arankal's excellent
report:

(*1) Send one data packet on a non-SACK connection

(*2) Because no ACK packet is received, the packet is retransmitted
     and we enter CA_Loss; but this retransmission is spurious.

(*3) The ACK for the original data is received. The transmitted packet
     is acknowledged.  The TCP timestamp is before the retrans_stamp,
     so tcp_may_undo() returns true, and tcp_try_undo_loss() returns
     true without changing state to Open (because tcp_is_sack() is
     false), and tcp_process_loss() returns without calling
     tcp_try_undo_recovery().  Normally after undoing a CA_Loss
     episode, tcp_fastretrans_alert() would see that the connection
     has returned to CA_Open and fall through and call
     tcp_try_to_open(), which would set retrans_stamp to 0.  However,
     for non-SACK connections we hold the connection in CA_Loss, so do
     not fall through to call tcp_try_to_open() and do not set
     retrans_stamp to 0. So retrans_stamp is (erroneously) still
     non-zero.

     At this point the first "retransmission event" has passed and
     been recovered from. Any future retransmission is a completely
     new "event". However, retrans_stamp is erroneously still
     set. (And we are still in CA_Loss, which is correct.)

(*4) After 16 minutes (to correspond with tcp_retries2=15), a new data
     packet is sent. Note: No data is transmitted between (*3) and
     (*4) and we disabled keep alives.

     The socket's timeout SHOULD be calculated from this point in
     time, but instead it's calculated from the prior "event" 16
     minutes ago (step (*2)).

(*5) Because no ACK packet is received, the packet is retransmitted.

(*6) At the time of the 2nd retransmission, the socket returns
     ETIMEDOUT, prematurely, because retrans_stamp is (erroneously)
     too far in the past (set at the time of (*2)).

This commit fixes this bug by ensuring that we reuse in
tcp_try_undo_loss() the same careful logic for non-SACK connections
that we have in tcp_try_undo_recovery(). To avoid duplicating logic,
we factor out that logic into a new
tcp_is_non_sack_preventing_reopen() helper and call that helper from
both undo functions.

Fixes: da34ac7626 ("tcp: only undo on partial ACKs in CA_Loss")
Reported-by: Nagaraj Arankal <nagaraj.p.arankal@hpe.com>
Link: https://lore.kernel.org/all/SJ0PR84MB1847BE6C24D274C46A1B9B0EB27A9@SJ0PR84MB1847.NAMPRD84.PROD.OUTLOOK.COM/
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220903121023.866900-1-ncardwell.kernel@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-15 11:30:06 +02:00
..
bpfilter net: Revert "net: optimize the sockptr_t for unified kernel/user address spaces" 2020-08-10 12:06:44 -07:00
netfilter ip: Fix data-races around sysctl_ip_default_ttl. 2022-07-29 17:25:09 +02:00
Kconfig net: ipv4: remove duplicate "the the" phrase in Kconfig text 2020-08-18 16:02:16 -07:00
Makefile bpf: Clean up sockmap related Kconfigs 2021-02-26 12:28:03 -08:00
af_inet.c tcp: Fix data-races around sysctl_tcp_fastopen. 2022-07-29 17:25:19 +02:00
ah4.c Networking changes for 5.14. 2021-06-30 15:51:09 -07:00
arp.c ipv4: Invalidate neighbour for broadcast address upon address addition 2022-04-13 20:59:05 +02:00
bpf_tcp_ca.c bpf: Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs 2021-11-25 09:49:07 +01:00
cipso_ipv4.c cipso: Fix data-races around sysctl. 2022-07-21 21:24:21 +02:00
datagram.c
devinet.c net: Fix data-races around sysctl_devconf_inherit_init_net. 2022-08-31 17:16:44 +02:00
esp4.c esp: limit skb_page_frag_refill use to a single page 2022-04-27 14:38:52 +02:00
esp4_offload.c esp: Fix BEET mode inter address family tunneling on GSO 2022-03-16 14:23:36 +01:00
fib_frontend.c ip: fix triggering of 'icmp redirect' 2022-09-08 12:28:07 +02:00
fib_lookup.h ipv4: fix data races in fib_alias_hw_flags_set 2022-02-23 12:03:10 +01:00
fib_notifier.c
fib_rules.c ipv4: convert fib_num_tclassid_users to atomic_t 2021-12-08 09:04:49 +01:00
fib_semantics.c ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh. 2022-07-29 17:25:21 +02:00
fib_trie.c ipv4: Fix data-races around sysctl_fib_notify_on_flag_change. 2022-08-03 12:03:52 +02:00
fou.c fou: remove sparse errors 2021-08-31 12:03:33 +01:00
gre_demux.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
gre_offload.c ip_gre: add csum offload support for gre header 2021-01-29 20:39:14 -08:00
icmp.c ip: Fix data-races around sysctl_ip_no_pmtu_disc. 2022-07-29 17:25:13 +02:00
igmp.c igmp: Fix data-races around sysctl_igmp_qrv. 2022-08-03 12:03:48 +02:00
inet_connection_sock.c tcp: Fix data-races around sysctl_tcp_syn(ack)?_retries. 2022-07-29 17:25:18 +02:00
inet_diag.c inet_diag: fix kernel-infoleak for UDP sockets 2021-12-22 09:32:40 +01:00
inet_fragment.c inet: frags: annotate races around fqdir->dead and fqdir->high_thresh 2022-01-27 11:05:35 +01:00
inet_hashtables.c ipv6: add READ_ONCE(sk->sk_bound_dev_if) in INET6_MATCH() 2022-08-17 14:23:36 +02:00
inet_timewait_sock.c net: Use generic ns_common::count 2020-08-19 14:06:36 +02:00
inetpeer.c inetpeer: Fix data-races around sysctl. 2022-07-21 21:24:21 +02:00
ip_forward.c ip: Fix data-races around sysctl_ip_fwd_update_priority. 2022-07-29 17:25:13 +02:00
ip_fragment.c inet: frags: annotate races around fqdir->dead and fqdir->high_thresh 2022-01-27 11:05:35 +01:00
ip_gre.c erspan: do not assume transport header is always set 2022-06-29 09:03:24 +02:00
ip_input.c net: ipv4: use kfree_skb_reason() in ip_rcv_finish_core() 2022-07-29 17:25:16 +02:00
ip_options.c net: clean up codestyle for net/ipv4 2020-08-25 06:28:02 -07:00
ip_output.c net: Fix data-races around sysctl_[rw]mem_(max|default). 2022-08-31 17:16:42 +02:00
ip_sockglue.c net: Fix data-races around sysctl_optmem_max. 2022-08-31 17:16:43 +02:00
ip_tunnel.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-07-31 09:14:46 -07:00
ip_tunnel_core.c tunnels: do not assume mac header is set in skb_tunnel_check_pmtu() 2022-07-07 17:53:29 +02:00
ip_vti.c ip_tunnel: use ndo_siocdevprivate 2021-07-27 20:11:44 +01:00
ipcomp.c Networking changes for 5.14. 2021-06-30 15:51:09 -07:00
ipconfig.c net: ipconfig: Don't override command-line hostnames or domains 2021-06-02 13:27:03 -07:00
ipip.c ip_tunnel: use ndo_siocdevprivate 2021-07-27 20:11:44 +01:00
ipmr.c ipmr,ip6mr: acquire RTNL before calling ip[6]mr_free_table() on failure path 2022-02-16 12:56:29 +01:00
ipmr_base.c
metrics.c treewide: rename nla_strlcpy to nla_strscpy. 2020-11-16 08:08:54 -08:00
netfilter.c netfilter: Dissect flow after packet mangling 2021-04-18 22:04:16 +02:00
netlink.c
nexthop.c nexthop: Fix data-races around nexthop_compat_mode. 2022-07-21 21:24:28 +02:00
ping.c ping: fix address binding wrt vrf 2022-05-18 10:26:57 +02:00
proc.c ip: Fix data-races around sysctl_ip_default_ttl. 2022-07-29 17:25:09 +02:00
protocol.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
raw.c ipv4: raw: lock the socket in raw_bind() 2022-02-01 17:27:14 +01:00
raw_diag.c net: Use nlmsg_unicast() instead of netlink_unicast() 2021-07-13 09:28:29 -07:00
route.c ipv4: Fix data-races around sysctl_fib_multipath_hash_fields. 2022-07-29 17:25:21 +02:00
syncookies.c tcp: Fix data-races around sysctl knobs related to SYN option. 2022-07-29 17:25:22 +02:00
sysctl_net_ipv4.c ip: Fix data-races around sysctl_ip_prot_sock. 2022-07-29 17:25:22 +02:00
tcp.c tcp: TX zerocopy should not sense pfmemalloc status 2022-09-15 11:30:05 +02:00
tcp_bbr.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_bic.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_bpf.c sock: redo the psock vs ULP protection check 2022-06-29 09:03:25 +02:00
tcp_cdg.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_cong.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_cubic.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_dctcp.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_dctcp.h
tcp_diag.c inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data 2020-02-27 18:50:19 -08:00
tcp_fastopen.c tcp: Fix data-races around sysctl_tcp_fastopen_blackhole_timeout. 2022-07-29 17:25:19 +02:00
tcp_highspeed.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_htcp.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_hybla.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_illinois.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_input.c tcp: fix early ETIMEDOUT after spurious non-SACK RTO 2022-09-15 11:30:06 +02:00
tcp_ipv4.c tcp: Fix data-races around sysctl_tcp_reflect_tos. 2022-08-03 12:03:52 +02:00
tcp_lp.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_metrics.c tcp: Fix data-races around sysctl_tcp_no_ssthresh_metrics_save. 2022-08-03 12:03:45 +02:00
tcp_minisocks.c tcp: Fix a data-race around sysctl_tcp_abort_on_overflow. 2022-07-29 17:25:23 +02:00
tcp_nv.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_offload.c net, gro: Set inner transport header offset in tcp/udp GRO hook 2021-08-02 10:20:56 +01:00
tcp_output.c net: Fix data-races around sysctl_[rw]mem_(max|default). 2022-08-31 17:16:42 +02:00
tcp_rate.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_recovery.c tcp: Fix data-races around sysctl_tcp_recovery. 2022-07-29 17:25:22 +02:00
tcp_scalable.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_timer.c tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts. 2022-07-29 17:25:22 +02:00
tcp_ulp.c bpf: sockmap: Only check ULP for TCP sockets 2020-03-09 22:34:58 +01:00
tcp_vegas.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_vegas.h
tcp_veno.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_westwood.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tcp_yeah.c tcp: add accessors to read/set tp->snd_cwnd 2022-06-14 18:36:11 +02:00
tunnel4.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
udp.c rxrpc: Fix ICMP/ICMP6 error handling 2022-09-15 11:30:05 +02:00
udp_bpf.c net: Implement ->sock_is_readable() for UDP and AF_UNIX 2021-10-26 12:29:33 -07:00
udp_diag.c net: Use nlmsg_unicast() instead of netlink_unicast() 2021-07-13 09:28:29 -07:00
udp_impl.h net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
udp_offload.c fou: remove sparse errors 2021-08-31 12:03:33 +01:00
udp_tunnel_core.c rxrpc: Fix ICMP/ICMP6 error handling 2022-09-15 11:30:05 +02:00
udp_tunnel_nic.c udp_tunnel: Fix end of loop test in udp_tunnel_nic_unregister() 2022-03-02 11:47:59 +01:00
udp_tunnel_stub.c udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
udplite.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
xfrm4_input.c xfrm: state: remove extract_input indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm4_output.c xfrm: fix unused variable warning if CONFIG_NETFILTER=n 2020-05-11 15:12:27 +02:00
xfrm4_policy.c
xfrm4_protocol.c net: xfrm: unexport __init-annotated xfrm4_protocol_init() 2022-06-14 18:36:18 +02:00
xfrm4_state.c xfrm: remove output_finish indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm4_tunnel.c xfrm: remove description from xfrm_type struct 2021-06-09 09:38:52 +02:00