Google denylist at 100M scale

- Arnav Gupta’s May 2026 social write-up described a Google-style system-design problem: a distributed denylist that propagates blocks worldwide in under one minute.
- The central constraint was scale: 100 million-plus entries with local cache reads, pub/sub fanout, invalidation, and bounded propagation windows rather than instant consistency.
- The next step is the source thread on X, where Gupta walks through storage, cache refresh, and propagation trade-offs.

Arnav Gupta, who posts on X as championswimmer, published a May 2026 thread examining a Google-style system-design question: how to build a distributed denylist or blocklist that can push updates globally in less than a minute at 100 million-plus scale. The scenario is familiar in infrastructure interviews because it forces candidates to balance fast reads, broad distribution and consistency that is eventual rather than instant.

Gupta tied the problem to a recent discussion around Google API key security, where fast revocation matters if a key is exposed. Google Cloud, in a May 21 post by developer relations engineer Leonid Yankulin, warned that unrestricted or stolen API keys can expand an attacker’s reach and urged users to limit the services and applications tied to each key.

### Why would a denylist problem come up in a Google-style interview?

Google-style system design questions often start with a simple product rule, such as “block this key,” “ban this IP” or “revoke this token,” and then widen into a global distribution problem. Gupta’s thread framed the denylist as a read-heavy system where every serving node may need to answer “is this entity blocked?” in milliseconds, while writes are much rarer but far more urgent. The design pressure comes from geography and scale. (cloud.google.com)

A single central database can hold source-of-truth state, but it cannot sit on the hot path for every request from every region without adding latency and creating a bottleneck. Gupta’s walkthrough therefore centered on local copies, cache invalidation and event propagation rather than direct global reads from one store, according to the thread cited in the briefing.

### If reads are local, where does the truth actually live?

The source-of-truth layer in this kind of system is usually a durable replicated store that records the canonical status of each key, token, user or IP.
Gupta’s outline, as summarized in the briefing, treated that layer as authoritative for writes and audits, while edge or regional services serve reads from local memory or nearby caches.

That split matters because the workload is asymmetric. A denylist check may happen on every incoming request, but an add or remove event happens only when abuse is detected, a policy changes, or a credential is revoked. Keeping the write path centralized and the read path distributed is the standard way to absorb that imbalance, and Gupta’s thread focused on storage plus propagation rather than on complex query logic.

### How do updates reach the world in under a minute?

Pub/sub fanout is the usual answer when a system needs sub-minute propagation without polling every database replica continuously. Gupta’s thread described a model in which a denylist change is written once, then published as an event that regional services consume to update local caches or in-memory filters.

The trade-off is that propagation is fast, not instantaneous. A service can promise a target window, for example under 60 seconds in normal conditions, but not perfect simultaneity across every machine. Gupta’s write-up highlighted those bounded windows and the possibility of temporary divergence between regions, according to the source briefing.

### Why are caches and invalidation the hard part?

Local caches make the denylist usable at 100 million-plus scale because they keep the check close to the request path. But caches create two hard questions: how long data can stay stale, and what happens when an invalidation message is delayed or dropped.

Gupta’s design focused on explicit invalidation and refresh behavior rather than long-lived blind caching. In practice, systems often combine event-driven updates with periodic reconciliation so a missed message does not leave a stale block status in place indefinitely.
That is the operational difference between a fast demo design and one meant to survive message loss, restarts and regional lag.

### What did API key security have to do with the example?

Google Cloud’s May 21 API key guidance gave the denylist scenario a current security hook. Yankulin wrote that API keys are easy to use unsafely, that a hijacked key can be abused at the owner’s expense, and that new keys should be restricted to specific services and client applications.

A denylist system is one of the controls that follows from that risk model. If a key is exposed, operators may need to revoke or block it globally before abuse spreads across regions or products. The faster that block propagates, the smaller the window in which a stolen credential remains usable, but the cost of shrinking that window is more infrastructure for fanout, reconciliation and local state management.

Arnav Gupta’s thread remains the clearest next stop for the full design walk-through, including its storage choices, invalidation path and propagation window assumptions, as posted on X on May 23, 2026. (cloud.google.com)
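
For readers who want something concrete, the pattern described in the caching section (event-driven updates backed by periodic reconciliation) can be illustrated with a minimal Python sketch. It is not taken from Gupta’s thread. The class names (`SourceOfTruth`, `RegionalCache`, `DenylistEvent`), the per-write version counter and the in-process `fan_out` function are all assumptions made for illustration, and a plain dictionary stands in for both the durable replicated store and the pub/sub system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DenylistEvent:
    """One denylist change, as it might be published to regional consumers."""
    key: str
    blocked: bool
    version: int  # assumption: the store assigns an increasing version per write


@dataclass
class SourceOfTruth:
    """Hypothetical stand-in for the durable store that owns canonical state."""
    entries: dict[str, DenylistEvent] = field(default_factory=dict)
    next_version: int = 1

    def write(self, key: str, blocked: bool) -> DenylistEvent:
        event = DenylistEvent(key, blocked, self.next_version)
        self.next_version += 1
        self.entries[key] = event
        return event

    def snapshot(self) -> list[DenylistEvent]:
        return list(self.entries.values())


class RegionalCache:
    """Local in-memory copy that answers block checks without a remote call."""

    def __init__(self) -> None:
        self._versions: dict[str, int] = {}
        self._blocked: set[str] = set()

    def apply(self, event: DenylistEvent) -> bool:
        """Apply an event unless an equal or newer version was already seen."""
        if event.version <= self._versions.get(event.key, 0):
            return False  # duplicate or out-of-order delivery, safely ignored
        self._versions[event.key] = event.version
        if event.blocked:
            self._blocked.add(event.key)
        else:
            self._blocked.discard(event.key)
        return True

    def reconcile(self, snapshot: list[DenylistEvent]) -> int:
        """Periodic repair: replay canonical state, return how many entries changed."""
        return sum(self.apply(event) for event in snapshot)

    def is_blocked(self, key: str) -> bool:
        return key in self._blocked


def fan_out(event, caches, drop_for=()):
    """Simulated pub/sub delivery; regions named in drop_for miss the message."""
    for name, cache in caches.items():
        if name not in drop_for:
            cache.apply(event)


store = SourceOfTruth()
caches = {"us": RegionalCache(), "eu": RegionalCache()}

# A leaked key is blocked, but the message to the "eu" region is lost.
fan_out(store.write("api-key-123", True), caches, drop_for={"eu"})
print(caches["us"].is_blocked("api-key-123"))  # True
print(caches["eu"].is_blocked("api-key-123"))  # False: regions have diverged

# The next reconciliation pass closes the gap.
repaired = caches["eu"].reconcile(store.snapshot())
print(repaired, caches["eu"].is_blocked("api-key-123"))  # 1 True
```

In this sketch the version check makes duplicate or out-of-order deliveries harmless, while `reconcile` repairs the dropped message. Replaying a full snapshot would likely be too expensive at 100 million-plus entries, so a production design would presumably reconcile incrementally, which the sketch does not attempt.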
