OrbitHigher posts /dev/mem_hint kernel hint
- Manish Klach published a GitHub reference implementation for `/dev/mem_hint`, a proposed Linux interface that lets runtimes tell the kernel which memory phase starts next.
- The repo defines explicit hints for prefill, decode, agentic execution, and training, plus fallback PMU auto-classification and a single 64-bit encoded hint path.
- The design is still a reference driver, not a Linux kernel patch, and ships with modeled hardware guardrails. (github.com)

Modern AI jobs do not use memory the same way all the time, and Manish Klach’s new `/dev/mem_hint` proposal tries to tell Linux when those phases change. (github.com)

Klach published the reference implementation on GitHub as `manishklach/mem-hint`, with a kernel module, userspace helpers, and docs describing a character-device interface called `/dev/mem_hint`. (github.com)

The core idea is simple: a runtime that knows it is entering prefill, token decode, agentic execution, or training writes that phase ID to the kernel instead of leaving the operating system to guess. (github.com) That matters because those phases can stress memory very differently. Prefill can be bandwidth-heavy, while decode is often more latency-sensitive and repetitive, so one policy does not fit both. (github.com)

The repo describes three deployment modes. The preferred one is explicit runtime hints; the fallback is auto-classification based on the performance monitoring unit (PMU); the lightest option is sysfs-only tuning for lab work. (github.com)

In the explicit mode, developers add hooks around calls such as `generate`, token decode steps, `forward`, and `loss.backward`, then emit the matching phase through `/dev/mem_hint`. (github.com) The docs say those explicit writes should outrank inferred behavior. If the runtime says it just entered decode, the kernel should trust that more than a heuristic built from counters. (github.com)

Under the hood, the reference design packs each hint into a single 64-bit command. That layout includes `phase_id`, `latency_ns`, `bw_target`, `security`, and `priority` fields. (github.com)

Klach also models three ways that hint could reach hardware: model-specific registers (MSRs), memory-mapped input/output (MMIO) registers, and a Compute Express Link Designated Vendor-Specific Extended Capability, or CXL DVSEC. The default in the reference driver is the model-specific-register path. (github.com)

The project is not presented as an upstream Linux patch set.
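
To make the 64-bit hint described above more concrete, here is a minimal Python sketch of packing the five named fields into one word and writing it to a device file. Only the field names come from the repo's description. The bit widths, the field ordering, the numeric phase IDs, the little-endian byte order, and the helper names `pack_hint`, `unpack_hint`, and `send_hint` are all assumptions made for illustration and may not match the reference driver.

```python
import struct
from enum import IntEnum


class Phase(IntEnum):
    # Hypothetical numeric IDs; the repo's actual values may differ.
    PREFILL = 1
    DECODE = 2
    AGENTIC = 3
    TRAINING = 4


# Assumed layout, least-significant field first. The field names come from
# the article; the widths and ordering are illustrative only (they sum to 64).
FIELDS = (
    ("phase_id", 8),
    ("latency_ns", 24),
    ("bw_target", 16),
    ("security", 8),
    ("priority", 8),
)


def pack_hint(**values):
    """Pack the named fields into one 64-bit integer (missing fields are 0)."""
    word, shift = 0, 0
    for name, width in FIELDS:
        value = int(values.get(name, 0))
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack_hint(word):
    """Inverse of pack_hint: split a 64-bit integer back into its fields."""
    out, shift = {}, 0
    for name, width in FIELDS:
        out[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return out


def send_hint(dev, **values):
    """Write one packed hint to an open binary file-like object.

    Assumption: the device accepts a raw 8-byte little-endian word.
    """
    dev.write(struct.pack("<Q", pack_hint(**values)))
```

Under these assumptions, a runtime hook around a decode step might open `/dev/mem_hint` in unbuffered binary write mode and call `send_hint(dev, phase_id=Phase.DECODE, latency_ns=500)` before the step runs. Whether the reference driver accepts a write in this form is not confirmed here; the repo's docs are the authority on the actual format.
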
The repo calls it a “reference implementation,” and GitHub shows no packaged releases for it yet. (github.com 1) (github.com 2)

The docs also spend time on what can go wrong if software pushes memory too aggressively. A “safety limiter” section says real enforcement should live in immutable hardware logic, not in a bypassable kernel module. (github.com) That modeled limiter uses DDR5-6400-style bounds, watches correctable error rates, and can automatically relax timing when errors rise. The repo says software may propose a setting, but hardware should be the final authority. (github.com)

The documentation ties the work to Indian Patent Application No. 202641053160 and frames the code as a way to demonstrate interface semantics across AI inference, training, and other low-latency workloads. (github.com 1) (github.com 2)

For now, `/dev/mem_hint` is a concrete proposal with code, not a kernel feature Linux users can rely on. Its next test is whether runtime authors, hardware vendors, or kernel developers decide the extra hint is worth standardizing. (github.com)
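
The limiter behavior described above (software proposes, bounds clamp, rising error rates relax timing) can be modeled in a few lines of Python. This is a toy software model only, not the repo's code, and the repo argues real enforcement belongs in hardware. The class name `Limiter`, its fields, and every numeric value are hypothetical placeholders and are not actual DDR5-6400 timings.

```python
from dataclasses import dataclass


@dataclass
class Limiter:
    # All numbers are illustrative placeholders, not real DDR5-6400 values.
    min_timing: int = 32           # tightest timing allowed, in cycles
    max_timing: int = 52           # most relaxed timing allowed, in cycles
    error_threshold: float = 10.0  # correctable errors per second
    relax_step: int = 2            # cycles added when errors exceed threshold
    current: int = 40              # timing currently in effect

    def propose(self, requested: int) -> int:
        """Software proposes a timing; the limiter clamps it to its bounds."""
        self.current = max(self.min_timing, min(self.max_timing, requested))
        return self.current

    def observe(self, errors_per_sec: float) -> int:
        """Relax timing by one step if the correctable error rate is too high."""
        if errors_per_sec > self.error_threshold:
            self.current = min(self.max_timing, self.current + self.relax_step)
        return self.current
```

The point of the sketch is the direction of authority: `propose` never returns a value outside the bounds, and `observe` can only loosen timing, never tighten it, so a misbehaving caller cannot push the setting past the limits.
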