Doc‑to‑LoRA bridges RAG and tuning
Sakana AI introduced Doc‑to‑LoRA, a technique for injecting document content into models without full retraining, and described it as a bridge between RAG and fine‑tuning built on meta‑trained hypernetworks. The approach aims to make domain adaptation faster and lighter-weight for production search and knowledge tasks. (x.com)
The Doc‑to‑LoRA preprint lists Rujikorn Charakorn, Edoardo Cetin, Shinnosuke Uesaka and Robert Tjarko Lange as authors and was posted to arXiv on Feb 13, 2026. (arxiv.org) The paper reports near‑perfect zero‑shot performance on a “needle‑in‑a‑haystack” long‑context benchmark at sequence lengths more than 4× the target model’s native context window. (arxiv.org) In the disclosed measurements, the baseline model needed over 12 GB of extra KV‑cache to serve queries over a 128K‑token haystack, while the version with the document internalized as a LoRA adapter used under 50 MB of additional memory during inference. (arxiv.org)
Sakana’s project page and coverage note that adapter generation takes under a second for single‑document internalization, and that the demo illustrates “internalize once, then answer many” behavior, which avoids re‑reading long documents at query time. (pub.sakana.ai) Code, example checkpoints and a demo release live on GitHub (repo: SakanaAI/doc‑to‑lora), and a corresponding Hugging Face project page hosts model artifacts and a web demo; the GitHub release includes an MIT license and runnable examples. (github.com)
OpenReview metadata for the submission records generally positive reviewer interest, but the meta‑review raised concerns about missing baselines and cost versus quality, and the program‑chair decision entry is listed as “Reject” in January 2026. (openreview.net)
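The memory gap reported in the paper can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: the model dimensions, the adapter rank and the choice of adapted matrix are hypothetical values chosen for the example, not figures taken from the paper or the repository. It shows why a KV‑cache grows with the number of tokens held in context while a LoRA adapter's size depends only on the model's shape and the adapter rank.

```python
# Rough memory arithmetic: KV-cache vs. LoRA adapter.
# All dimensions below are hypothetical, chosen only for illustration.

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Bytes needed to cache keys and values for seq_len tokens.

    The leading factor 2 covers keys plus values; one entry is stored
    per layer, per KV head, per token.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


def lora_bytes(num_layers, modules_per_layer, d_in, d_out, rank, bytes_per_elem=2):
    """Bytes needed for LoRA factors A (rank x d_in) and B (d_out x rank).

    Note that seq_len does not appear: the adapter size is independent
    of how long the internalized document was.
    """
    return num_layers * modules_per_layer * rank * (d_in + d_out) * bytes_per_elem


# Hypothetical small decoder, 16-bit weights and cache, 128K-token context.
kv_bytes = kv_cache_bytes(num_layers=26, num_kv_heads=4, head_dim=256,
                          seq_len=128 * 1024)

# Hypothetical adapter: rank 8 on one 2304 x 9216 matrix per layer.
adapter_bytes = lora_bytes(num_layers=26, modules_per_layer=1,
                           d_in=2304, d_out=9216, rank=8)

print(f"KV-cache:     {kv_bytes / 2**30:.1f} GiB")
print(f"LoRA adapter: {adapter_bytes / 2**20:.1f} MiB")
```

Under these assumed dimensions the cache comes to about 13 GiB and the adapter to under 5 MiB, the same order-of-magnitude gap as the reported "over 12 GB" versus "under 50 MB". The actual configuration used in the paper may differ.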