Google slashes AI memory needs

Google rolled out an AI compression algorithm this week that cuts inference memory footprints by roughly sixfold, and investors immediately punished DRAM makers like Samsung. The move signals big cost and deployment shifts for LLMs and makes memory-efficient system design a marketable portfolio skill. (timesofindia.indiatimes.com)

Google Research published the paper "TurboQuant" on March 24, 2026, authored by Amir Zandieh and Vahab Mirrokni, and announced that it will be presented at ICLR 2026. (research.google) Google describes TurboQuant as a training‑free vector quantization suite that in internal tests compressed LLM key‑value (KV) caches to roughly 3 bits per value with no measured accuracy drop and delivered up to an 8× speedup on Nvidia H100 attention workloads. (research.google) The release bundles three related techniques (TurboQuant, PolarQuant and Quantized Johnson‑Lindenstrauss, or QJL), and Google says TurboQuant uses PolarQuant’s random rotations to simplify vector geometry before quantization. (research.google)

Market moves after the paper’s publication included a near‑term selloff in memory names: SK Hynix fell about 6%, Samsung Electronics fell nearly 5%, and Kioxia dropped roughly 6%, while U.S. peers including Micron and SanDisk also declined, according to CNBC’s March 26 market report. (cnbc.com)

Independent developers began porting the method into local inference toolchains within a day, with community implementations and ports to projects like llama.cpp and MLX reported in coverage and follow‑on posts. (venturebeat.com)

Industry commentary diverged. A Forbes analysis noted the paper’s authors and argued that TurboQuant could both shrink KV memory needs and shift storage economics, while Bloomberg reported that the two‑day stock move revealed a split in the AI memory trade, with high‑bandwidth memory demand appearing less affected. (forbes.com)
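To put the "roughly 3 bits per value" and "roughly sixfold" figures in perspective, here is a back-of-the-envelope sketch in Python. It rests on assumptions that do not come from Google's paper: a hypothetical model shape (32 layers, 8 KV heads, head dimension 128, a 32,768-token context), a 16-bit baseline cache, and no allowance for any per-block metadata a real quantizer might store.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bits_per_value):
    """Approximate KV cache size in bytes for one sequence."""
    # Both keys and values are cached, hence the factor of 2.
    values = 2 * layers * kv_heads * head_dim * seq_len
    return values * bits_per_value / 8


# Hypothetical model shape, chosen only for illustration.
LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN = 32, 8, 128, 32768

fp16 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN, 16)  # assumed baseline
q3 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN, 3)     # ~3 bits per value
ratio = fp16 / q3

GIB = 2 ** 30
print(f"16-bit cache: {fp16 / GIB:.2f} GiB")
print(f"3-bit cache:  {q3 / GIB:.2f} GiB")
print(f"reduction:    {ratio:.2f}x")
```

Under these assumptions the cache shrinks from 4 GiB to 0.75 GiB per sequence, a reduction of about 5.3×, which is in the same range as the "roughly sixfold" figure quoted in coverage. The exact ratio in practice would depend on the baseline precision and on any overhead the method adds.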
