Google-style system design problem posted
- Pulkit Mittal posted a Google-style system design prompt on X on May 21 asking candidates to explain why a recommendation rollout stalled at 60%.
- The prompt’s key split is 60% on the new model and 40% on the old one, with fixes centered on caching, sharding, rollout strategy and monitoring.
- The post remains available on X, while HackerRank’s April 2026 materials describe AI-assisted interviews that assess troubleshooting and review skills.

Pulkit Mittal posted a system design practice prompt on X on May 21 that asks candidates to diagnose why a new recommendation algorithm reached only 60% of users while 40% remained on the old model. The exercise frames the issue as a production rollout problem rather than a pure coding question, and asks for root causes and fixes. The post points candidates toward caching, sharding, rollout strategy and monitoring as the main areas to inspect.

HackerRank’s April 2026 product materials, published last month, describe AI-assisted interviews that evaluate how candidates troubleshoot, review output and work through practical engineering tasks, offering a contemporaneous backdrop for the kind of question circulating online.

### What is the candidate actually being asked to solve?

The May 21 prompt starts with a concrete symptom: a recommendation-system rollout was intended to move users to a new algorithm, but only 60% received it. That setup turns the interview into an investigation of serving paths, state consistency and deployment controls, rather than model quality alone. The 60/40 split matters because it suggests partial adoption rather than a total outage. (x.com)

In practice, that usually directs attention to traffic allocation, cache invalidation, sticky sessions, shard-specific lag, feature-flag targeting or uneven deployment across regions or clusters. Those are the categories named in the post itself through references to caching, sharding, rollout strategies and monitoring.

### Why would caching keep part of the audience on the old model?

Caching is one of the first suspects in a mixed rollout because recommendation results, model IDs or user-treatment assignments may be stored upstream of the new serving path. If caches are keyed too broadly, refreshed too slowly or invalidated inconsistently, some users can continue to receive outputs generated by the prior model even after the new version is live.
That is consistent with the prompt’s explicit focus on caching as a candidate explanation. (x.com)

Shaped, a recommender-systems company, says caching in recommendation systems is used to reduce recomputation and latency by storing frequently accessed data and results. In a rollout setting, that same mechanism can preserve stale responses unless invalidation and versioning are handled correctly.

### How do sharding and rollout strategy create a 60/40 split?

Sharding can create uneven exposure when different database partitions, user cohorts or serving clusters are updated on different schedules. (x.com) If one shard set points to the new model and another still resolves to the old one, users can see different behavior based on where their traffic lands. The prompt names sharding directly, which signals that the candidate is expected to reason about infrastructure boundaries, not just application code. (shaped.ai)

Rollout strategy can produce the same pattern by design or by mistake. A percentage rollout, canary release or region-by-region deployment may stall if guardrails trigger, if targeting rules are misconfigured or if monitoring shows regressions in one segment. The candidate’s job is to say how they would verify each possibility with logs, metrics and cohort breakdowns. (x.com)

### Why does monitoring sit at the center of the exercise?

Monitoring is the only way to distinguish between stale cache, bad targeting, shard imbalance and client-side stickiness once the system is live. A strong answer would normally ask for metrics by region, shard, app version, feature-flag bucket and cache hit rate, then compare assignment logs with actual model-serving logs to find where the split begins. The prompt includes monitoring among the core areas to inspect. (x.com)

HackerRank said in an April 22 update that employers want to assess whether developers can “review AI-generated output” and “troubleshoot effectively” without sacrificing quality.
In separate support documentation updated the same day, the company said AI-assisted interviews are designed to reveal a candidate’s “technical thinking” and interaction with an AI assistant in real time. Those product descriptions do not mention this X post, but they document an interview market that is explicitly testing debugging and judgment in applied settings. (x.com)

### What would a strong interview answer include?

A strong answer would usually start with instrumentation: confirm intended rollout percentage, inspect feature-flag rules, compare cache hit and invalidation patterns, check shard-level deployment status, and trace a user request from assignment to final recommendation response. It would then propose fixes such as versioned cache keys, explicit cache flushes, shard parity checks, sticky-assignment audits, and alerts tied to treatment mismatch. (pages.hackerrank.com) Those steps follow directly from the failure modes named in the prompt.

The X post remains the primary public artifact for the exercise as of May 21. HackerRank’s AI-assisted interview and release-note pages, both updated on April 22, 2026, remain available for employers and candidates reviewing how practical troubleshooting is being assessed. (x.com)
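
Two of the fixes listed in the strong-answer section, versioned cache keys and alerts tied to treatment mismatch, can be illustrated with a minimal Python sketch. The log schema (`ServedRequest`), the key format, the function names and the 5% alert threshold are all assumptions made for illustration; none of them come from the X post or from HackerRank’s materials.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ServedRequest:
    """One served recommendation request (hypothetical log schema)."""
    user_id: str
    shard: str
    assigned_model: str  # what the rollout or feature-flag system intended
    served_model: str    # what the serving path actually returned


def versioned_cache_key(user_id: str, model_version: str) -> str:
    """Build a cache key that includes the model version.

    With the version in the key, an entry written by the old model is
    never looked up for a user who is assigned to the new one.
    """
    return f"recs:{model_version}:{user_id}"


def mismatch_rate_by_shard(requests):
    """Return {shard: fraction of requests where served != assigned}."""
    totals = defaultdict(int)
    mismatches = defaultdict(int)
    for req in requests:
        totals[req.shard] += 1
        if req.served_model != req.assigned_model:
            mismatches[req.shard] += 1
    return {shard: mismatches[shard] / totals[shard] for shard in totals}


def shards_over_threshold(rates, threshold=0.05):
    """List shards whose mismatch rate exceeds the (assumed) alert threshold."""
    return sorted(shard for shard, rate in rates.items() if rate > threshold)


if __name__ == "__main__":
    # Illustrative data only: shard-b serves the old model to one user.
    sample = [
        ServedRequest("u1", "shard-a", "v2", "v2"),
        ServedRequest("u2", "shard-a", "v2", "v2"),
        ServedRequest("u3", "shard-b", "v2", "v1"),
        ServedRequest("u4", "shard-b", "v2", "v2"),
    ]
    rates = mismatch_rate_by_shard(sample)
    print(rates)
    print(shards_over_threshold(rates))
```

Grouping by shard is only one cut. The same comparison could be keyed by region, app version or feature-flag bucket to find where assignment and serving diverge.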