Commit e6f189e

nealcardwell authored and NipaLocal committed
tcp: fix tcp_packet_delayed() for tcp_is_non_sack_preventing_reopen() behavior
After the following commit from 2024:

commit e37ab73 ("tcp: fix to allow timestamp undo if no retransmits were sent")

...there was buggy behavior where TCP connections without SACK support could easily see erroneous undo events at the end of fast recovery or RTO recovery episodes. The erroneous undo events could cause those connections to suffer repeated loss recovery episodes and high retransmit rates.

The problem was an interaction between the non-SACK behavior on these connections and the undo logic. The problem is that, for non-SACK connections at the end of a loss recovery episode, if snd_una == high_seq, then tcp_is_non_sack_preventing_reopen() holds steady in CA_Recovery or CA_Loss, but clears tp->retrans_stamp to 0. Then upon the next ACK the "tcp: fix to allow timestamp undo if no retransmits were sent" logic saw the tp->retrans_stamp at 0 and erroneously concluded that no data was retransmitted, and erroneously performed an undo of the cwnd reduction, restoring cwnd immediately to the value it had before loss recovery. This caused an immediate burst of traffic and build-up of queues and likely another immediate loss recovery episode.

This commit fixes tcp_packet_delayed() to ignore zero retrans_stamp values for non-SACK connections when snd_una is at or above high_seq, because tcp_is_non_sack_preventing_reopen() clears retrans_stamp in this case, so it's not a valid signal that we can undo.

Note that the commit named in the Fixes footer restored long-present behavior from roughly 2005-2019, so apparently this bug was present for a while during that era, and this was simply not caught.
Fixes: e37ab73 ("tcp: fix to allow timestamp undo if no retransmits were sent")
Reported-by: Eric Wheeler <netdev@lists.ewheeler.net>
Closes: https://lore.kernel.org/netdev/64ea9333-e7f9-0df-b0f2-8d566143acab@ewheeler.net/
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Co-developed-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: NipaLocal <nipa@local>
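The scenario described in the commit message can be reproduced with a small userspace model. This is only a sketch: `struct model_conn`, `model_seq_before()`, `packet_delayed_old()`, `packet_delayed_new()` and `model_self_check()` are hypothetical names invented for illustration, and the `ecr_before_retrans` flag stands in for whatever tcp_tsopt_ecr_before() would return. It is not kernel code; it mirrors only the branch structure of the old and new tcp_packet_delayed().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the few fields the check reads. */
struct model_conn {
	bool     sack_enabled;        /* models tcp_is_sack(tp) */
	bool     syn_sent;            /* models sk->sk_state == TCP_SYN_SENT */
	uint32_t retrans_stamp;       /* 0 means "no retransmit recorded" */
	uint32_t snd_una;
	uint32_t high_seq;
	bool     ecr_before_retrans;  /* models the tcp_tsopt_ecr_before() result */
};

/* Wraparound-safe "a is before b" for 32-bit sequence numbers. */
static bool model_seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Branch structure of the code removed by this commit. */
static bool packet_delayed_old(const struct model_conn *c)
{
	if (c->retrans_stamp && c->ecr_before_retrans)
		return true;
	if (!c->retrans_stamp && !c->syn_sent)
		return true;
	return false;
}

/* Branch structure of the code added by this commit. */
static bool packet_delayed_new(const struct model_conn *c)
{
	if (c->retrans_stamp)
		return c->ecr_before_retrans;
	/* (1) non-SACK and snd_una at or above high_seq: stamp was cleared */
	if (!c->sack_enabled && !model_seq_before(c->snd_una, c->high_seq))
		return false;
	/* (2) SYN_SENT: stamp is cleared even if the SYN was retransmitted */
	if (c->syn_sent)
		return false;
	return true;
}

/* Non-SACK connection at the end of recovery, retrans_stamp already cleared. */
static void model_self_check(void)
{
	struct model_conn c = {
		.sack_enabled = false, .syn_sent = false, .retrans_stamp = 0,
		.snd_una = 5000, .high_seq = 5000, .ecr_before_retrans = false,
	};

	assert(packet_delayed_old(&c));   /* old logic: spurious undo signal */
	assert(!packet_delayed_new(&c));  /* new logic: no undo */
}
```

Calling `model_self_check()` exercises exactly the case from the bug report: with `snd_una == high_seq` on a non-SACK connection, the old logic treats the cleared stamp as "nothing was retransmitted", while the new logic does not.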
1 parent 0152e98 commit e6f189e

1 file changed: +25 -12 lines changed

net/ipv4/tcp_input.c

Lines changed: 25 additions & 12 deletions
@@ -2863,20 +2863,33 @@ static inline bool tcp_packet_delayed(const struct tcp_sock *tp)
 {
 	const struct sock *sk = (const struct sock *)tp;
 
-	if (tp->retrans_stamp &&
-	    tcp_tsopt_ecr_before(tp, tp->retrans_stamp))
-		return true;  /* got echoed TS before first retransmission */
-
-	/* Check if nothing was retransmitted (retrans_stamp==0), which may
-	 * happen in fast recovery due to TSQ. But we ignore zero retrans_stamp
-	 * in TCP_SYN_SENT, since when we set FLAG_SYN_ACKED we also clear
-	 * retrans_stamp even if we had retransmitted the SYN.
+	/* Received an echoed timestamp before the first retransmission? */
+	if (tp->retrans_stamp)
+		return tcp_tsopt_ecr_before(tp, tp->retrans_stamp);
+
+	/* We set tp->retrans_stamp upon the first retransmission of a loss
+	 * recovery episode, so normally if tp->retrans_stamp is 0 then no
+	 * retransmission has happened yet (likely due to TSQ, which can cause
+	 * fast retransmits to be delayed). So if snd_una advanced while
+	 * (tp->retrans_stamp is 0 then apparently a packet was merely delayed,
+	 * not lost. But there are exceptions where we retransmit but then
+	 * clear tp->retrans_stamp, so we check for those exceptions.
 	 */
-	if (!tp->retrans_stamp &&	   /* no record of a retransmit/SYN? */
-	    sk->sk_state != TCP_SYN_SENT)  /* not the FLAG_SYN_ACKED case? */
-		return true;  /* nothing was retransmitted */
 
-	return false;
+	/* (1) For non-SACK connections, tcp_is_non_sack_preventing_reopen()
+	 * clears tp->retrans_stamp when snd_una == high_seq.
+	 */
+	if (!tcp_is_sack(tp) && !before(tp->snd_una, tp->high_seq))
+		return false;
+
+	/* (2) In TCP_SYN_SENT tcp_clean_rtx_queue() clears tp->retrans_stamp
+	 * when setting FLAG_SYN_ACKED is set, even if the SYN was
+	 * retransmitted.
+	 */
+	if (sk->sk_state == TCP_SYN_SENT)
+		return false;
+
+	return true;  /* tp->retrans_stamp is zero; no retransmit yet */
 }
 
 /* Undo procedures. */
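The new non-SACK check is written as `!before(tp->snd_una, tp->high_seq)` rather than `snd_una >= high_seq` because TCP sequence numbers wrap around at 2^32. The sketch below illustrates that comparison in userspace, assuming the usual signed-difference idiom; `seq_before()` and `recovery_point_reached()` are illustrative names, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-safe "seq1 is before seq2": the unsigned difference is
 * reinterpreted as signed, so values within 2^31 of each other compare
 * correctly even across the 2^32 wrap.
 */
static bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/*
 * Illustrative helper: true when snd_una is at or above high_seq,
 * i.e. the "!before(snd_una, high_seq)" half of exception (1).
 */
static bool recovery_point_reached(uint32_t snd_una, uint32_t high_seq)
{
	return !seq_before(snd_una, high_seq);
}
```

A plain `>=` on the raw values would give the wrong answer when `high_seq` sits just below the wrap point and `snd_una` has already wrapped past zero; the signed-difference form treats `snd_una` as "at or above" in that case.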
