27 core system-design problems

- A widely shared thread listed 27 system-design problems targeting senior interview levels.
- It covers hot topics like cache stampedes, distributed rate limiting, leader election, and exactly-once processing.
- The list serves as a focused study map for candidates aiming at Google L5, Meta E5, or Staff-level roles. (x.com)

A widely shared April 14 thread turned senior-level system design prep into a 27-problem checklist, giving candidates one compact map instead of a pile of scattered notes. (unrollnow.com)

The post came from Puneet Patwari, who writes under @system_monarch, and said the sequence was based on “60+ interviews” across Amazon, Google, Uber, Salesforce, Walmart, Confluence and Deliveroo. The thread was published on April 14, 2026 and organized topics from foundations through reliability and correctness. (unrollnow.com)

System design interviews ask engineers to sketch how a large service works under load: where requests enter, where data is stored, where traffic spikes break things, and which trade-offs are acceptable. The long-running open-source System Design Primer on GitHub describes that interview category as a standard part of hiring at many tech companies and now has about 343,000 stars. (github.com)

Patwari’s checklist starts below the buzzwords. It opens with networking and latency, operating-system and concurrency basics, and data formats and application programming interfaces before moving into request flow, storage choices, caching, scale, and reliability. (unrollnow.com)

That ordering tracks how many interview guides frame senior hiring. Google system design interviews are typically used for software engineering roles at level L5 and up, while current Meta E5 guides describe system design as a core part of the senior-engineer loop. (igotanoffer.com) (hellointerview.com)

The problems that get the most attention are the ones that show up after a system is already popular. A cache stampede, also called a thundering herd, happens when a hot cached item expires and many requests hit the database at once; Redis describes locks and “promises” as one way to keep one miss from becoming many. (redis.io)

A distributed rate limiter is a traffic cop spread across many servers, so every machine has to count requests without letting users slip past limits by hopping between nodes. One current interview guide models that problem at 1 million requests per second, with less than 10 milliseconds of overhead per check and eventual consistency accepted across nodes. (hellointerview.com)

Leader election is the question behind “which server is in charge right now.” In practice, systems often use a consensus mechanism such as Raft so one node can coordinate work without two nodes acting as leader during a network split. (systeminternals.dev)

Exactly-once processing is the hardest phrase on many of these lists because retries are cheap but duplicates are expensive. Kafka’s exactly-once semantics, introduced with idempotence and transactions in version 0.11, were built to keep stream-processing pipelines from writing the same result twice after failures or restarts. (confluent.io)

The thread’s appeal is its compression. Instead of treating system design as one giant whiteboard exercise, it breaks the field into concrete failure modes (caches expiring, queues backing up, leaders disappearing, writes duplicating) that map more closely to what senior engineers are expected to diagnose in production. (unrollnow.com)
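
To make one of those failure modes concrete, here is a minimal sketch of the "promise" idea behind cache-stampede protection: when a hot key is missing or expired, the first caller loads it and every concurrent caller waits on the same in-flight result instead of querying the backend again. This is an in-process illustration only, not the thread's or Redis's implementation. The names `SingleFlightCache`, `loader`, and `demo` are hypothetical, and a multi-server deployment would need a shared lock or a similar coordination step instead of a local one.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


class SingleFlightCache:
    """Cache where concurrent misses on one key share a single backend load."""

    def __init__(self, loader, ttl_seconds):
        self._loader = loader      # hypothetical slow backend call, e.g. a DB query
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._values = {}          # key -> (value, expires_at)
        self._in_flight = {}       # key -> Future shared by waiting callers

    def get(self, key):
        with self._lock:
            entry = self._values.get(key)
            if entry is not None and entry[1] > time.monotonic():
                return entry[0]    # fresh hit, no backend call
            future = self._in_flight.get(key)
            is_leader = future is None
            if is_leader:
                # First caller to see the miss registers the shared promise.
                future = Future()
                self._in_flight[key] = future

        if not is_leader:
            # Followers block until the leader publishes a value or an error.
            return future.result()

        try:
            value = self._loader(key)
        except Exception as exc:
            with self._lock:
                del self._in_flight[key]
            future.set_exception(exc)
            raise
        with self._lock:
            self._values[key] = (value, time.monotonic() + self._ttl)
            del self._in_flight[key]
        future.set_result(value)
        return value


def demo(workers=20):
    """Send many simultaneous requests for one key and count backend loads."""
    calls = []

    def slow_loader(key):
        calls.append(key)          # record each real backend hit
        time.sleep(0.2)            # simulate a slow database query
        return f"value-for-{key}"

    cache = SingleFlightCache(slow_loader, ttl_seconds=60)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(cache.get, ["hot-key"] * workers))
    return results, len(calls)


if __name__ == "__main__":
    results, backend_calls = demo()
    print(f"{len(results)} requests served with {backend_calls} backend call(s)")
```

The design choice worth noting is that the slow load runs outside the lock, so requests for other keys are never blocked behind one slow query; the lock only guards the two small dictionaries.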
