Cloudflare Targets 'Silent Drops'

Cloudflare just rolled out Dynamic Path MTU Discovery for its Cloudflare One Client to fix the maddening "silent drop" problem, in which oversized packets are discarded without notification on complex networks, causing hard-to-diagnose failures. The new feature lets the client adapt to network conditions automatically, improving reliability for distributed applications.

The "silent drop" problem stems from a failure in a decades-old mechanism called Path MTU Discovery (PMTUD). Defined in RFC 1191, PMTUD relies on ICMP messages to determine the maximum transmission unit (MTU) along a network path so that senders can avoid IP fragmentation. This matters most on paths that include tunnels or VPNs, where encapsulation overhead lowers the usable MTU.

The issue arises when a router or firewall drops an oversized packet but fails to send back the required "Fragmentation Needed" ICMP message (Type 3, Code 4). The result is a "black hole": the sender keeps retransmitting the large packet, unaware of the size restriction, and applications hang or time out. The problem is particularly acute for IPv6, which does not allow routers to fragment packets and relies entirely on the endpoints to adjust packet size.

Historically, common workarounds have been to manually configure a lower MTU on servers or to use MSS clamping on routers, which lowers the maximum segment size advertised during the TCP handshake. Both are often inefficient and difficult to manage at scale. Some operating systems can also detect MTU black holes, but they can misinterpret congestion-based packet loss as an MTU issue.

Cloudflare's implementation avoids the reliance on ICMP by using a method based on RFC 8899, which specifies Datagram Packetization Layer Path MTU Discovery (DPLPMTUD). Built on top of the QUIC protocol, this approach lets the Cloudflare One Client proactively send probes of varying sizes to the Cloudflare edge. The client can then determine the true path capacity without waiting for ICMP feedback that may be blocked, and adjust its virtual interface MTU on the fly.

This ensures seamless transitions for users moving between networks with different MTU limits, such as from a corporate Wi-Fi network (typically a 1500-byte MTU) to a cellular network, which may have a lower MTU. The result is a more resilient connection for applications that are sensitive to packet loss, such as large file uploads, video calls, or SSH sessions. This client-side intelligence is a significant shift from the traditional passive, ICMP-dependent approach to MTU discovery.

The development, led by Koko Uko, Rhett Griggs, and Todd Murray, addresses a long-standing and frustrating networking problem that often manifests as mysterious connection failures. The feature requires an MTU of at least 1281 bytes to function and is enabled via a mobile device management (MDM) setting.
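
The active probing idea can be illustrated with a small sketch. This is not Cloudflare's implementation and not the full RFC 8899 state machine; it is a simplified binary search, assuming a hypothetical `send_probe(size)` callback that returns `True` only when a probe of that size is acknowledged. The function names, the 1280 to 1500 byte search range, and the three attempts per size are all illustrative assumptions. A small helper also shows the arithmetic behind MSS clamping for a TCP segment with no IP or TCP options.

```python
from typing import Callable

# Hypothetical probe callback: True if a probe of `size` bytes was acknowledged.
ProbeFn = Callable[[int], bool]


def probe_with_retries(send_probe: ProbeFn, size: int, attempts: int = 3) -> bool:
    """Treat `size` as usable if any of `attempts` probes is acknowledged.

    Retrying lowers the chance that ordinary packet loss is mistaken
    for a size limit. Stops at the first acknowledged probe.
    """
    return any(send_probe(size) for _ in range(attempts))


def discover_path_mtu(send_probe: ProbeFn, floor: int = 1280,
                      ceiling: int = 1500, attempts: int = 3) -> int:
    """Return the largest size in [floor, ceiling] whose probe gets through.

    Illustrative binary search; `floor` and `ceiling` are assumed bounds.
    """
    if not probe_with_retries(send_probe, floor, attempts):
        raise RuntimeError(f"path cannot carry even {floor} bytes")
    low, high = floor, ceiling  # invariant: `low` is known to work
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe_with_retries(send_probe, mid, attempts):
            low = mid
        else:
            high = mid - 1
    return low


def tcp_mss(mtu: int, ipv6: bool = False) -> int:
    """MSS for a given MTU: subtract the IP header (20 bytes for IPv4,
    40 for IPv6) and the 20-byte TCP header, assuming no options."""
    return mtu - (40 if ipv6 else 20) - 20


def make_simulated_path(path_mtu: int) -> ProbeFn:
    """Simulated black-hole path: silently drops anything above `path_mtu`."""
    def send_probe(size: int) -> bool:
        return size <= path_mtu
    return send_probe


if __name__ == "__main__":
    mtu = discover_path_mtu(make_simulated_path(1380))
    print("discovered MTU:", mtu)
    print("matching IPv4 TCP MSS:", tcp_mss(mtu))
```

The simulated path never sends any error back, which mirrors the black-hole case: the search learns the limit purely from which probes are acknowledged. A real client would also need to re-probe periodically and after network changes, which this sketch leaves out.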
