M.2 AI module appears
An enthusiast product write-up describes an M.2 AI module rated at 60 TOPS with 32 GB of memory, pitched as able to run LLMs of up to 20 billion parameters in a standard slot. (wccftech.com) The coverage frames this as part of a trend toward modular, local AI acceleration, not as mainstream cloud infrastructure news. (wccftech.com)
An M.2 card that looks like a storage drive is being sold as a local artificial intelligence accelerator, with Unigen saying its new module can run models up to 20 billion parameters in a standard slot. (unigen.com) The product is Unigen’s Amaretti E1.S AI Module, announced April 13, 2026, and the company says it delivers 60 trillion operations per second (TOPS) of artificial intelligence compute while using 10 watts of power. (unigen.com)
Unigen says the module uses EdgeCortix’s SAKURA-II processor and comes with up to 32 gigabytes of LPDDR4x memory, a low-power memory type mounted on the card itself instead of relying on the host computer’s main memory. (unigen.com) (edgecortix.com)
An M.2 slot is the small connector most people know as the place for a solid-state drive, and vendors have been repurposing that slot for compact artificial intelligence add-ons that handle inference, the step where a trained model generates answers. (hailo.ai 1) (hailo.ai 2) That puts this module in a growing edge-computing market: Hailo sells M.2 cards rated at 26 TOPS and 40 TOPS, and Hewlett Packard Enterprise has published materials for an M.2 accelerator card built around Hailo-10H for local generative artificial intelligence workloads. (hailo.ai 1) (hailo.ai 2) (hp.com) EdgeCortix says SAKURA-II is offered in M.2 2280 modules and low-profile PCI Express cards, with single-chip modules rated at 60 TOPS and dual-chip cards rated at up to 120 TOPS. (edgecortix.com)
The catch is bandwidth and fit. Wccftech’s write-up says “standard slot,” but Unigen markets Amaretti as an E1.S module and says it is designed for servers and industrial systems, not as a drop-in upgrade for every laptop with an empty storage bay. (wccftech.com) (unigen.com) There is also a difference between fitting a model in memory and running it quickly. Unigen’s 20-billion-parameter claim refers to supported model size, while EdgeCortix’s published numbers separate 60 TOPS for 8-bit integer workloads from 30 trillion floating-point operations per second for bfloat16, a 16-bit format often used in generative artificial intelligence. (unigen.com) (ggmania.com)
What Unigen is selling, then, is not a replacement for a graphics card but a compact coprocessor for on-premises inference, aimed at systems that need local models, lower power draw, or data kept off the cloud. (unigen.com) (hailo.ai) The module’s appeal is that it uses one of the most common expansion footprints in modern hardware, even if the practical buyers are more likely to be server and edge-device makers than people filling a spare laptop slot. (unigen.com) (wccftech.com)
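The gap between fitting a model and running it can be made concrete with rough arithmetic on the quoted figures. The sketch below is illustrative only: it counts weights alone, and the bytes-per-weight values are assumptions about how a model might be stored, not anything Unigen or EdgeCortix has published. The function names are hypothetical.

```python
# Back-of-envelope check using figures quoted in the article.
PARAMS = 20e9      # 20 billion parameters (Unigen's stated supported model size)
MEMORY_GB = 32     # on-card LPDDR4x capacity
TOPS_INT8 = 60     # quoted 8-bit integer throughput
POWER_W = 10       # quoted power draw

# ASSUMPTION: storage cost per weight for common numeric formats.
BYTES_PER_WEIGHT = {"bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_gb(params, bytes_per_weight):
    """Weights only, in decimal GB; ignores KV cache, activations, and runtime overhead."""
    return params * bytes_per_weight / 1e9


def fits(params, bytes_per_weight, memory_gb=MEMORY_GB):
    """True if the weights alone fit in the given memory."""
    return weight_footprint_gb(params, bytes_per_weight) <= memory_gb


for name, size in BYTES_PER_WEIGHT.items():
    gb = weight_footprint_gb(PARAMS, size)
    print(f"{name}: {gb:.0f} GB of weights, fits in {MEMORY_GB} GB: {fits(PARAMS, size)}")

print(f"Efficiency: {TOPS_INT8 / POWER_W:.0f} TOPS per watt")
```

Under these assumptions a 20-billion-parameter model needs about 40 GB at bfloat16, which exceeds the card's 32 GB, but about 20 GB at 8-bit and 10 GB at 4-bit. That suggests the headline model size depends on quantization, and it says nothing about how many tokens per second the module would actually produce.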