Databricks posts RAG agent research
Databricks shared a new paper on a multi‑behavior RAG agent that generalizes across ambiguous queries, a clear research push toward more agentic enterprise search and retrieval workflows. Work of this kind increases the demand for low‑latency retrieval and inference infrastructure to run agents in production. (x.com)
Databricks’ preprint and tech report identify the system as KARL (Knowledge Agents via Reinforcement Learning) and list a large multi‑author team that includes Jonathan Frankle. (arxiv.org) The paper’s evaluation suite, KARLBench, spans six enterprise search regimes: constraint‑driven entity search, cross‑document report synthesis, long‑document traversal with tabular numerical reasoning, exhaustive entity retrieval, procedural reasoning over technical documentation, and fact aggregation from internal notes. (arxiv.org)
Databricks and press coverage report that KARL matches Anthropic’s Claude Opus 4.6 on KARLBench, with a claimed ~33% lower cost per query and ~47% lower latency. KARL’s training pipeline uses an agentic data‑synthesis loop that produced the training corpus entirely from synthetic, self‑generated examples, and Databricks says training consumed only a few thousand GPU hours. The team introduces OAPL, an iterative, large‑batch off‑policy post‑training RL method (expanded in related work as Optimal Advantage‑based Policy Optimization with a Lagged Inference Policy), to stabilize highly off‑policy training and improve sample efficiency and test‑time scaling.
Databricks says the RL pipelines and lessons from KARL are being folded into customer tooling such as Agent Bricks. Independent verification remains limited, however, because KARLBench was created by Databricks and the company has not published the full benchmark dataset for third‑party reproduction.