WSL2-Linux-Kernel/net/ipv4
Eric Dumazet 0d4e0afdd6 tcp: do not accept ACK of bytes we never sent
[ Upstream commit 3d501dd326fb1c73f1b8206d4c6e1d7b15c07e27 ]

This patch is based on a detailed report and ideas from Yepeng Pan
and Christian Rossow.

ACK seq validation is currently following RFC 5961 5.2 guidelines:

   The ACK value is considered acceptable only if
   it is in the range of ((SND.UNA - MAX.SND.WND) <= SEG.ACK <=
   SND.NXT).  All incoming segments whose ACK value doesn't satisfy the
   above condition MUST be discarded and an ACK sent back.  It needs to
   be noted that RFC 793 on page 72 (fifth check) says: "If the ACK is a
   duplicate (SEG.ACK < SND.UNA), it can be ignored.  If the ACK
   acknowledges something not yet sent (SEG.ACK > SND.NXT) then send an
   ACK, drop the segment, and return".  The "ignored" above implies that
   the processing of the incoming data segment continues, which means
   the ACK value is treated as acceptable.  This mitigation makes the
   ACK check more stringent since any ACK < SND.UNA wouldn't be
   accepted, instead only ACKs that are in the range ((SND.UNA -
   MAX.SND.WND) <= SEG.ACK <= SND.NXT) get through.

This can be refined for new (and possibly spoofed) flows,
by not accepting ACK for bytes that were never sent.
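
As a rough illustration of the refinement described above, the following userspace sketch models the two checks. It is not the kernel code: the function names, parameters, and the idea of passing bytes_acked as a plain argument are assumptions made for this example, and only the range test from RFC 5961 5.2 and the "never sent" refinement are taken from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is before b" for 32-bit TCP sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * RFC 5961 5.2 range test (hypothetical helper):
 *   (SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT
 */
static bool ack_ok_rfc5961(uint32_t ack, uint32_t snd_una,
			   uint32_t snd_nxt, uint32_t max_wnd)
{
	if (seq_before(snd_nxt, ack))
		return false;	/* acknowledges something not yet sent */
	return !seq_before(ack, snd_una - max_wnd);
}

/*
 * Refined test (hypothetical helper): the look-back window is clamped
 * to the number of bytes the peer has actually acknowledged so far,
 * so an old ACK can never point below data that was really sent.
 */
static bool ack_ok_refined(uint32_t ack, uint32_t snd_una,
			   uint32_t snd_nxt, uint32_t max_wnd,
			   uint64_t bytes_acked)
{
	uint32_t lookback = bytes_acked < max_wnd ?
			    (uint32_t)bytes_acked : max_wnd;

	return ack_ok_rfc5961(ack, snd_una, snd_nxt, lookback);
}
```

In this model, a freshly established flow with no acknowledged payload (bytes_acked taken as 0 here, which is an assumption of the sketch) has a look-back window of zero, so only SEG.ACK == SND.UNA passes the lower bound, while a long-lived flow keeps the full RFC 5961 range.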

This greatly improves TCP security at little cost.

I added a Fixes: tag to make sure this patch will reach stable trees,
even if the 'blamed' patch was adhering to the RFC.

tp->bytes_acked was added in linux-4.2

The following packetdrill test (courtesy of Yepeng Pan) shows
the issue at hand:

0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1024) = 0

// ---------------- Handshake ------------------- //

// When window scale is set to 14, the window size can be extended to
// 65535 * (2^14) = 1073725440. Linux would accept an ACK packet
// with an ack number in (Server_ISN+1-1073725440, Server_ISN+1),
// even though this ack number acknowledges data never
// sent by the server.

+0 < S 0:0(0) win 65535 <mss 1400,nop,wscale 14>
+0 > S. 0:0(0) ack 1 <...>
+0 < . 1:1(0) ack 1 win 65535
+0 accept(3, ..., ...) = 4

// For the established connection, we send an ACK packet
// that uses ack number 1 - 1073725300 + 2^32,
// where 2^32 is used to wrap around.
// Note: we used 1073725300 instead of 1073725440 to avoid possible
// edge cases.
// 1 - 1073725300 + 2^32 = 3221241997

// Oops, old kernels happily accept this packet.
+0 < . 1:1001(1000) ack 3221241997 win 65535

// After the kernel fix, the following is replaced by a challenge ACK,
// and the prior malicious frame is dropped.
+0 > . 1:1(0) ack 1001
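
The two numbers used in the test comments can be rechecked with unsigned 32-bit arithmetic. The sketch below is only a worked example of that arithmetic; the helper names are made up for illustration and do not exist in the kernel or in packetdrill.

```c
#include <stdint.h>

/* Largest window a 16-bit window field can advertise for a given scale. */
static uint32_t max_scaled_window(uint16_t win, unsigned int wscale)
{
	return (uint32_t)win << wscale;
}

/*
 * Sequence number 'back' bytes behind 'base'. Unsigned subtraction
 * wraps modulo 2^32, which is the "+ 2^32" step in the test comments.
 */
static uint32_t seq_minus(uint32_t base, uint32_t back)
{
	return base - back;
}
```

With these helpers, max_scaled_window(65535, 14) gives 1073725440, and seq_minus(1, 1073725300) gives 3221241997, the ack number injected by the test.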

Fixes: 354e4aa391 ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Yepeng Pan <yepeng.pan@cispa.de>
Reported-by: Christian Rossow <rossow@cispa.de>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20231205161841.2702925-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13 18:36:37 +01:00
bpfilter
netfilter
Kconfig
Makefile
af_inet.c
ah4.c
arp.c
bpf_tcp_ca.c
cipso_ipv4.c
datagram.c
devinet.c
esp4.c net: ipv4: fix return value check in esp_remove_trailer 2023-10-25 11:58:56 +02:00
esp4_offload.c
fib_frontend.c
fib_lookup.h
fib_notifier.c
fib_rules.c
fib_semantics.c ipv4/fib: send notify when delete source address routes 2023-10-25 11:59:00 +02:00
fib_trie.c ipv4/fib: send notify when delete source address routes 2023-10-25 11:59:00 +02:00
fou.c
gre_demux.c
gre_offload.c
icmp.c
igmp.c ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet 2023-12-08 08:48:02 +01:00
inet_connection_sock.c
inet_diag.c net: inet: Retire port only listening_hash 2023-11-28 16:56:22 +00:00
inet_fragment.c
inet_hashtables.c net: set SOCK_RCU_FREE before inserting socket into hashtable 2023-11-28 16:56:22 +00:00
inet_timewait_sock.c
inetpeer.c
ip_forward.c
ip_fragment.c
ip_gre.c ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit() 2023-12-13 18:36:36 +01:00
ip_input.c
ip_options.c
ip_output.c
ip_sockglue.c
ip_tunnel.c
ip_tunnel_core.c
ip_vti.c
ipcomp.c
ipconfig.c
ipip.c
ipmr.c
ipmr_base.c
metrics.c
netfilter.c
netlink.c
nexthop.c
ping.c
proc.c
protocol.c
raw.c
raw_diag.c
route.c ipv4: Correct/silence an endian warning in __ip_do_redirect 2023-12-03 07:31:22 +01:00
syncookies.c tcp: fix cookie_init_timestamp() overflows 2023-11-20 11:08:16 +01:00
sysctl_net_ipv4.c
tcp.c net: inet: Retire port only listening_hash 2023-11-28 16:56:22 +00:00
tcp_bbr.c
tcp_bic.c
tcp_bpf.c
tcp_cdg.c
tcp_cong.c
tcp_cubic.c
tcp_dctcp.c
tcp_dctcp.h
tcp_diag.c
tcp_fastopen.c
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: do not accept ACK of bytes we never sent 2023-12-13 18:36:37 +01:00
tcp_ipv4.c net: inet: Retire port only listening_hash 2023-11-28 16:56:22 +00:00
tcp_lp.c
tcp_metrics.c tcp_metrics: do not create an entry from tcp_init_metrics() 2023-11-20 11:08:15 +01:00
tcp_minisocks.c
tcp_nv.c
tcp_offload.c
tcp_output.c net: annotate data-races around sk->sk_dst_pending_confirm 2023-11-28 16:56:16 +00:00
tcp_rate.c
tcp_recovery.c tcp: fix excessive TLP and RACK timeouts from HZ rounding 2023-10-25 11:58:57 +02:00
tcp_scalable.c
tcp_timer.c
tcp_ulp.c
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tunnel4.c
udp.c udp: add missing WRITE_ONCE() around up->encap_rcv 2023-11-20 11:08:14 +01:00
udp_bpf.c
udp_diag.c
udp_impl.h
udp_offload.c
udp_tunnel_core.c
udp_tunnel_nic.c
udp_tunnel_stub.c
udplite.c
xfrm4_input.c
xfrm4_output.c
xfrm4_policy.c
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c