OpenMythos proposes recurrent‑depth transformers
- OpenMythos, an open-source project published on GitHub and PyPI in April 2026, proposes recurrent-depth transformers that reuse a looped block instead of stacking unique layers.
- The repository by Kye Gomez says attention can switch between MLA and GQA, while sparse MoE routing and configurable loop counts target compute-adaptive reasoning.
- MarkTechPost published a May 22, 2026 Colab tutorial showing OpenMythos installs from PyPI or GitHub and trains on a synthetic reasoning task.


OpenMythos is an open-source software project that proposes a different way to build transformer models: reuse a recurrent block multiple times inside one forward pass instead of adding a fresh set of layers at every depth. The repository, published by Kye Gomez on GitHub, describes the design as a “Recurrent-Depth Transformer” with three stages: Prelude, a looped Recurrent Block and a final Coda. PyPI shows the `open-mythos` package was released on April 22, 2026, and the project says it is an “independent, community-driven theoretical reconstruction” rather than an Anthropic product.

That matters because OpenMythos is not pitching another straightforward scale-up in parameter count. The code and package documentation say the same recurrent weights can be executed up to a configurable `max_loop_iters`, with attention selectable between MLA and GQA and feed-forward layers using sparse mixture-of-experts routing. In plain terms, the framework is testing whether more iterative computation can substitute for some of the cost of ever-larger dense stacks. (github.com)

### Why is “recurrent depth” different from a normal transformer stack?

GitHub’s README says OpenMythos uses three stages: “Prelude,” a looped “Recurrent Block,” and “Coda.” In a standard transformer, each layer has its own weights; here, the recurrent block is reused for multiple passes, with the number of passes set at inference or training time through `n_loops` and bounded by `max_loop_iters`.

MarkTechPost’s May 22 tutorial framed that as parameter reuse for “deeper computation.” Its example builds small MLA and GQA variants, runs forward and generation tests, and then trains on a synthetic compositional reasoning task involving digit-chain sums modulo a fixed value. (github.com) The tutorial says the point is to study how recurrent loops let one model perform more internal computation without adding a new block for every extra step.

### What do MLA, GQA and sparse MoE add here?
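
As a rough illustration of why attention variants such as GQA and compressed-latent schemes are attractive at serving time, the key/value cache arithmetic can be sketched in a few lines of Python. Every number below is hypothetical and none comes from OpenMythos, and the latent-cache formula is a simplifying assumption that stores one compressed vector per token per layer.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # The factor 2 covers one key tensor and one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


def latent_cache_bytes(n_layers: int, latent_dim: int,
                       seq_len: int, bytes_per_value: int = 2) -> int:
    """Simplified assumption: one compressed latent vector per token per layer."""
    return n_layers * latent_dim * seq_len * bytes_per_value


# Hypothetical toy configuration, not an OpenMythos preset.
N_LAYERS, N_Q_HEADS, N_KV_HEADS = 4, 16, 4
HEAD_DIM, SEQ_LEN, LATENT_DIM = 64, 1024, 128

# Full multi-head attention: one key/value head per query head.
full_bytes = kv_cache_bytes(N_LAYERS, N_Q_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped-query attention: four query heads share each key/value head.
gqa_bytes = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, SEQ_LEN)
# Compressed-latent cache under the simplifying assumption above.
latent_bytes = latent_cache_bytes(N_LAYERS, LATENT_DIM, SEQ_LEN)

print(full_bytes, gqa_bytes, latent_bytes)
```

In this toy setup the grouped variant caches a quarter of what the full-head variant does, because four query heads share each key/value head.
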
The OpenMythos repository says attention is “switchable between MLA and GQA.” MarkTechPost describes MLA as Multi-Latent Attention with a compressed KV cache, and GQA as Grouped-Query Attention with fewer KV heads than query heads. Both are serving-oriented choices: they aim to reduce memory pressure or KV-cache cost relative to a naive full-attention setup.

The same repository says feed-forward layers use a sparse MoE with routed and shared experts. (marktechpost.com) That means only part of the model is active for a given token, which is a familiar way to raise total parameter capacity without activating every parameter on every step. An inference from the design, rather than an explicit benchmark claim, is that OpenMythos is combining two levers at once, looped depth and sparse activation, to explore better compute-per-quality tradeoffs. (github.com)

### What is the project actually showing today?

PyPI’s package page includes example code that instantiates models, prints parameter counts, generates tokens and checks the recurrent injection matrix. The example specifically prints a spectral-radius check for that matrix and labels stability as requiring the value to stay below 1. GitHub shows the same check in the README example.

MarkTechPost’s tutorial follows that path in Colab. It installs from PyPI, falls back to GitHub if needed, fixes a seed, uses CUDA when available, compares MLA and GQA parameter counts and evaluates the recurrent mechanism before moving to the synthetic task. (github.com) That is a tooling and architecture demonstration, not an independently verified frontier benchmark result.

### Where does this leave the bigger architecture debate?

OpenMythos lists preconfigured variants ranging from 1B to 1T parameters, with larger entries showing up to 1 million-token context and 128,000-token output in the package metadata. Those are configuration targets in the package, not reported trained-model results.

The concrete next place to watch is the project’s GitHub repository and PyPI package. GitHub shows 13,100 stars as of May 23, 2026, while MarkTechPost’s May 22 tutorial provides the current public walkthrough for building and testing MLA and GQA variants in Colab. (marktechpost.com) (github.com) (pypi.org)
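
The looped-block idea at the center of the project can be sketched in plain Python. This is a toy under stated assumptions, not OpenMythos code: the Prelude and Coda are stand-in arithmetic, the recurrent update is a linear rule h ← A·h + B·e with diagonal A and B (chosen so the spectral radius is simply the largest absolute diagonal entry), and the names `recurrent_depth_forward`, `spectral_radius_diag`, `a_diag` and `b_diag` are hypothetical. Only `n_loops` and `max_loop_iters` echo names from the project's documentation.

```python
from typing import List

Vector = List[float]


def spectral_radius_diag(a_diag: Vector) -> float:
    """Spectral radius of a diagonal matrix: its largest absolute entry."""
    return max(abs(a) for a in a_diag)


def recurrent_depth_forward(x: Vector, a_diag: Vector, b_diag: Vector,
                            n_loops: int, max_loop_iters: int = 16) -> Vector:
    """Toy Prelude -> looped block -> Coda pass with one shared set of weights."""
    if not 1 <= n_loops <= max_loop_iters:
        raise ValueError("n_loops must be between 1 and max_loop_iters")
    e = [2.0 * v for v in x]      # Prelude (stand-in): runs once
    h = [0.0] * len(e)            # recurrent state
    for _ in range(n_loops):      # the same a_diag and b_diag on every pass
        h = [a * hi + b * ei for a, hi, b, ei in zip(a_diag, h, b_diag, e)]
    return [hi + ei for hi, ei in zip(h, e)]  # Coda (stand-in): runs once


# Hypothetical toy weights; the spectral radius of A is 0.5, below 1.
a_diag = [0.5, -0.25]
b_diag = [1.0, 1.0]
x = [1.0, 2.0]

shallow = recurrent_depth_forward(x, a_diag, b_diag, n_loops=1)
deep = recurrent_depth_forward(x, a_diag, b_diag, n_loops=16)
print(spectral_radius_diag(a_diag), shallow, deep)
```

Because the spectral radius in this toy is below 1, extra loops pull the state toward a fixed point instead of letting it grow, which is the intuition behind a stability check of that kind. The parameter count is identical for 1 loop and 16 loops; only the amount of computation changes.
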