LinkedIn Shifts Search to 'Cognitive' Models
LinkedIn is moving from traditional keyword matching to cognitive search systems that use LLMs to understand user intent and context, according to Staff Software Engineer Rahul Raja. In a recent podcast, Raja emphasized that the retrieval layer is the most critical part of Retrieval-Augmented Generation (RAG) systems. The company is also shifting focus from simple accuracy metrics to broader measures of user satisfaction, such as reduced follow-up queries.
- LinkedIn's talent search has historically relied on a multi-pass ranking architecture built on its custom search engine, Galene. This system first retrieves a broad set of candidates and then uses progressively more complex machine-learning models to refine the ranking, a common pattern in large-scale recommendation systems.
- The initial retrieval stage in LinkedIn's job recommendation funnel uses a "Two-Tower" neural network model. This architecture learns separate vector representations (embeddings) for users and jobs, allowing for a fast initial screening of millions of postings to find a few thousand relevant candidates in under 100 milliseconds.
- Retrieval-Augmented Generation (RAG) addresses a key LLM weakness by allowing the model to query external, real-time knowledge bases before generating a response. This grounds the output in factual, current data, reducing the risk of "hallucinations" and saving on the significant computational cost of constantly retraining the model on new information.
- The move beyond simple accuracy metrics involves analyzing complex user behaviors to measure satisfaction. For example, search engineers distinguish between "good abandonment," where a user finds their answer without clicking, and "bad abandonment," which indicates dissatisfaction, noting that approximately 30% of abandoned searches are actually successful.
- A unique challenge in LinkedIn's talent search is modeling for mutual interest, which is more complex than traditional relevance. The system must optimize for the likelihood that a recruiter will message a candidate *and* that the candidate will respond positively, a key metric for A/B testing.
- For its final, high-precision ranking of job recommendations, LinkedIn employs a large-scale, multi-task learning (MTL) framework called LiRank. This "Deep Brain" stage uses a wide array of signals to predict specific user engagement, moving beyond simple relevance to optimize for multiple objectives.
- The OWASP Top 10 for Large Language Model Applications highlights key security vulnerabilities inherent in these new systems, such as "Prompt Injection," where malicious inputs can hijack the LLM's output. This represents a critical production concern for engineering teams deploying generative AI.
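
The retrieve-then-rank funnel and the mutual-interest objective described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not LinkedIn's implementation: the function names (`retrieve_top_k`, `mutual_interest`, `rerank`), the toy embeddings, and the probabilities are all hypothetical, and treating the two probabilities as independent so they can be multiplied is a simplifying assumption. A production system would also likely replace the brute-force scan with an approximate nearest-neighbor index.

```python
import heapq


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(query_vec, item_vecs, k):
    """Stage 1 (two-tower style): score each item embedding against the
    query embedding by dot product and keep the k highest-scoring ids."""
    scored = ((dot(query_vec, vec), item_id) for item_id, vec in item_vecs.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


def mutual_interest(p_message, p_reply):
    """Assumed objective: P(recruiter messages) * P(candidate replies),
    treating the two events as independent for illustration."""
    return p_message * p_reply


def rerank(candidate_ids, p_message, p_reply):
    """Stage 2: reorder the retrieved candidates by mutual-interest score."""
    return sorted(
        candidate_ids,
        key=lambda c: mutual_interest(p_message[c], p_reply[c]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy embeddings standing in for the output of the item tower.
    item_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.5, 0.5]}
    query_vec = [1.0, 0.0]  # stand-in for the output of the query tower

    candidates = retrieve_top_k(query_vec, item_vecs, k=3)

    # Hypothetical per-candidate predictions from a downstream model.
    p_message = {"a": 0.9, "b": 0.6, "d": 0.5}
    p_reply = {"a": 0.1, "b": 0.5, "d": 0.8}

    print(candidates)
    print(rerank(candidates, p_message, p_reply))
```

In this toy run, candidate `a` is the closest embedding match but falls to last place after reranking, because a high chance of being messaged is offset by a low chance of replying. That gap between relevance and mutual interest is what the second stage is meant to capture.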