One layer, 100-layer match

- Ohio State researchers showed a recurrent-depth approach where a single layer, looped at inference, matches deep multi-hop models. (x.com)
- The team reported equivalence to a 100-layer model on multi-hop reasoning tasks through inference looping of one layer. (x.com)
- The technique suggests fewer parameters may be needed for complex reasoning by reusing computation across steps. (x.com)

Ohio State researchers say running a single transformer layer in a loop at inference can match the performance of a 100‑layer non‑looped model on multi‑hop reasoning tasks. (arxiv.org) Transformers usually build "depth" by stacking many distinct layers; recurrent‑depth (or looped) transformers instead reuse the same layer multiple times to add iterative computation without adding new parameters. (arxiv.org) The preprint "Loop, Think, & Generalize" was submitted April 9, 2026 by Harsh Kohli, Srinivasan Parthasarathy, Huan Sun and Yuekun Yao of The Ohio State University. (arxiv.org)

In controlled experiments the team trained models from scratch on synthetic multi‑hop tasks and showed that increasing inference‑time recurrence unlocked "depth extrapolation," with looped models approximating much deeper non‑looped baselines. (arxiv.org) The authors report cases where a one‑layer model run for many inference iterations performs similarly to a 100‑layer baseline, implying the same weights reused across steps can substitute for stacked depth. (github.com) That pattern suggests parameter counts and structural depth are separable: reusing computation across time can reduce the need for hundreds of distinct layers to achieve multi‑hop composition. (arxiv.org)

The paper also documents a three‑stage "grokking" process for systematic generalization and flags a limitation called "overthinking," where too many recurrence steps degrade output; the authors explore training strategies to mitigate this. (arxiv.org) Looped/depth‑recurrent ideas build on prior work showing k‑layer models looped L times can approach k·L‑layer performance on synthetic reasoning problems, and the OSU team has published code and outputs for inspection on GitHub. (openreview.net) (github.com)
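
To make the distinction between stacked depth and looped depth concrete, here is a minimal Python sketch. It is not the authors' code: the toy residual update in `apply_layer`, the names `looped_forward` and `stacked_forward`, and the sizes `DIM = 8` and `DEPTH = 100` are all illustrative assumptions, and the weights are random rather than trained. It shows only the parameter accounting, namely that a looped model can run 100 steps of computation with the weights of a single layer.

```python
import math
import random

def make_layer(dim, rng):
    """Build one toy layer: a dim x dim weight matrix with small random entries."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

def apply_layer(weights, x):
    """Toy residual update x + tanh(W x); a stand-in for a full transformer block."""
    return [
        xi + math.tanh(sum(w * xj for w, xj in zip(row, x)))
        for xi, row in zip(x, weights)
    ]

def looped_forward(shared, x, n_loops):
    """Recurrent depth: apply the same weights n_loops times."""
    for _ in range(n_loops):
        x = apply_layer(shared, x)
    return x

def stacked_forward(layers, x):
    """Conventional depth: apply a sequence of distinct layers once each."""
    for weights in layers:
        x = apply_layer(weights, x)
    return x

def param_count(layers):
    """Count the weights across a list of distinct layers."""
    return sum(len(row) for weights in layers for row in weights)

rng = random.Random(0)
DIM, DEPTH = 8, 100  # hypothetical sizes chosen for illustration
shared = make_layer(DIM, rng)
stacked = [make_layer(DIM, rng) for _ in range(DEPTH)]
x0 = [1.0] * DIM

looped_out = looped_forward(shared, x0, n_loops=DEPTH)   # 100 steps, one layer
stacked_out = stacked_forward(stacked, x0)               # 100 steps, 100 layers

print(param_count([shared]))  # 64
print(param_count(stacked))   # 6400
```

Both forward passes perform 100 sequential updates, but the looped version stores 100 times fewer weights. Because `n_loops` is an ordinary argument, it can also be raised at inference without touching the weights, which is the knob the article describes as inference-time recurrence. Whether the looped outputs are as useful as the stacked ones depends on training, which this sketch does not attempt.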
