Instacart ML Lead: Constraints Matter More Than Algorithms

Ahsaas Bajaj, who works on ML at Instacart, summarized key takeaways from a talk on building large-scale recommender systems. The core message was that system constraints and a clear definition of a "good" recommendation are more critical than the specific algorithm chosen. The talk also noted that LLMs are seen as augmenting, not replacing, traditional recsys architectures.

Ahsaas Bajaj's work on Instacart's product substitution system, which handles hundreds of millions of item replacements annually with a satisfaction rate over 95%, has been highlighted in shareholder letters for measurably improving the "perfect order fill rate." His path from software engineering at Samsung to ML at Instacart shaped his focus on how systems behave in production rather than on models in isolation.

Real-world constraints often dictate the design of recommender systems more than the algorithm itself. At Instacart, this includes managing high-cardinality decision spaces (millions of niche products with little interaction data) and dynamic, real-time inventory availability: a perfect recommendation is useless if the item is out of stock when the shopper reaches the aisle. Similar constraints at services like Pandora include user tiers (free vs. premium) and query origins (voice vs. typed).

Defining a "good" recommendation goes far beyond simple accuracy. Mature recommender teams at major tech companies now optimize for a portfolio of metrics including diversity (how varied the recommendations are), coverage (what percentage of the item catalog is shown), and serendipity (how new and pleasantly surprising the suggestions are). These "beyond-accuracy" metrics are critical for long-term user satisfaction and preventing filter bubbles.

This focus on operational reliability has elevated the importance of MLOps. Production systems require robust frameworks for continuous retraining to combat data drift, automated monitoring of data and prediction distributions, and managing the full lifecycle of models. The goal is to unify development and operations to deploy and maintain models reliably at scale, a core tenet of ML engineering at FAANG companies.

Large Language Models (LLMs) are being integrated to enhance, not replace, existing recommender architectures. Their strength in semantic understanding is used to create richer embeddings from user reviews and product descriptions, which is particularly effective for the cold-start problem where interaction data is sparse. Companies are also exploring LLMs to power conversational and more explainable recommendation interfaces.

Thinking in terms of trade-offs is a hallmark of FAANG-level system design. For instance, algorithms with lower offline accuracy but faster inference speed, like ItemKNN, are often preferred for real-time applications where low latency is a hard constraint. This mirrors a broader industry shift from focusing purely on what is possible in research to what is reliable and efficient in production.
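To make the ItemKNN trade-off concrete, here is a minimal Python sketch of an item-based nearest-neighbor recommender. It is an illustration under stated assumptions, not Instacart's implementation: the function names (build_item_neighbors, recommend), the toy interaction data, and the in_stock filter are all hypothetical. The point it tries to show is that the expensive similarity computation can run offline, so that serving reduces to a few dictionary lookups plus an availability check.

```python
from collections import defaultdict
from math import sqrt


def build_item_neighbors(interactions, k=3):
    """Offline step: for each item, keep its top-k neighbors by cosine
    similarity over binary user-item interactions.

    interactions: dict mapping user id -> set of item ids (hypothetical format).
    Returns: dict mapping item id -> list of (neighbor id, similarity).
    """
    # Invert the data: which users interacted with each item?
    item_users = defaultdict(set)
    for user, items in interactions.items():
        for item in items:
            item_users[item].add(user)

    neighbors = {}
    for a, users_a in item_users.items():
        sims = []
        for b, users_b in item_users.items():
            if a == b:
                continue
            overlap = len(users_a & users_b)
            if overlap:
                # Cosine similarity for binary vectors.
                sims.append((b, overlap / sqrt(len(users_a) * len(users_b))))
        # Highest similarity first; ties broken by item id for determinism.
        sims.sort(key=lambda pair: (-pair[1], pair[0]))
        neighbors[a] = sims[:k]
    return neighbors


def recommend(basket, neighbors, in_stock, n=3):
    """Online step: sum neighbor similarities for the items in the basket,
    skipping anything already in the basket or not currently in stock."""
    scores = defaultdict(float)
    for item in basket:
        for neighbor, sim in neighbors.get(item, []):
            if neighbor not in basket and neighbor in in_stock:
                scores[neighbor] += sim
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n]


# Toy data, invented for illustration only.
interactions = {
    "u1": {"oat_milk", "granola", "bananas"},
    "u2": {"oat_milk", "granola"},
    "u3": {"oat_milk", "almond_milk"},
    "u4": {"almond_milk", "granola"},
}

neighbors = build_item_neighbors(interactions, k=3)
# "bananas" is a close neighbor of "oat_milk" but is out of stock here,
# so it is dropped at serving time.
print(recommend({"oat_milk"}, neighbors, in_stock={"granola", "almond_milk"}))
```

The in_stock argument stands in for the real-time availability constraint described above. In a production system it would presumably come from a live inventory service, which this sketch does not attempt to model.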
