Alphabet’s TurboQuant shocks memory market
Alphabet unveiled TurboQuant, an AI memory-compression technique that is already reshaping expectations for memory demand and has sparked a selloff in high-end memory and storage names. The shift suggests memory-efficiency innovations could become as material to infrastructure total cost of ownership (TCO) as raw compute for trading AI workloads. (simplywall.st)
TurboQuant combines PolarQuant with a Quantized Johnson‑Lindenstrauss (QJL) stage to quantize KV caches to as few as 3 bits per value while eliminating per‑block quantization constants; Google’s tests show at least a 6× reduction in KV‑cache memory. (research.google) Google characterized TurboQuant as training‑free with zero accuracy loss and reported that 4‑bit TurboQuant delivered up to an 8× speedup in attention‑logit computation on Nvidia H100 GPUs. (tomshardware.com)
South Korean chip names reacted sharply: SK Hynix shares fell about 6% and Samsung Electronics fell nearly 5% in Seoul after the research circulated. (cnbc.com) U.S. memory suppliers also sold off, with Micron, Western Digital and SanDisk sliding roughly 7% in extended trading following the announcement. (bloomberg.com)
Morgan Stanley analyst Shawn Kim said the work could be positive for the industry because it relieves a key bottleneck, arguing the net effect might be to boost product adoption. (bloomberg.com) SemiAnalysis analyst Ray Wang cautioned that removing a bottleneck often enables larger models, which can raise overall memory use as model capability expands. (cnbc.com) Coverage in Forbes highlighted that lower per‑instance memory costs from TurboQuant could accelerate on‑prem LLM and vector‑search deployments, raising aggregate memory demand even as per‑instance footprints fall. (forbes.com) Commentators have invoked the Jevons paradox, in which efficiency improvements increase total consumption, as a key counterpoint to the immediate “demand crush” thesis circulating in markets. (ainvest.com)
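For readers who want to sanity-check the KV-cache figures, the back-of-envelope arithmetic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Google's method: the model shape (32 layers, 8 KV heads, head dimension 128, 32,768-token context), the 16-bit baseline, and the conventional block-wise scheme (blocks of 32 values carrying two 16-bit constants) are all hypothetical. The sketch shows only why dropping per-block constants matters at low bit widths; it does not attempt to reproduce the reported 6× result, which depends on Google's own baseline and measurements.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bits_per_value):
    """Approximate KV-cache size in bytes for a single sequence."""
    # Factor of 2 covers both keys and values.
    num_values = 2 * layers * kv_heads * head_dim * seq_len
    return num_values * bits_per_value / 8


def effective_bits(payload_bits, block_size=None, constant_bits=0):
    """Bits stored per value once per-block constants are amortized.

    With no block_size, the scheme is assumed to store no per-block
    constants, so the effective width equals the payload width.
    """
    if block_size is None:
        return float(payload_bits)
    return payload_bits + constant_bits / block_size


# Hypothetical model shape, chosen only for illustration.
LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN = 32, 8, 128, 32768

# Assumed 16-bit baseline cache.
baseline = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN, 16)

# Assumed conventional 3-bit scheme: a 16-bit scale and a 16-bit zero
# point per block of 32 values, which adds 1 bit of overhead per value.
blockwise_bits = effective_bits(3, block_size=32, constant_bits=32)
blockwise = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN, blockwise_bits)

# 3-bit scheme with no per-block constants.
constant_free = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN, effective_bits(3))

GIB = 1024 ** 3
print(f"16-bit baseline:       {baseline / GIB:.2f} GiB")
print(f"3-bit + block consts:  {blockwise / GIB:.2f} GiB ({baseline / blockwise:.2f}x smaller)")
print(f"3-bit, no constants:   {constant_free / GIB:.2f} GiB ({baseline / constant_free:.2f}x smaller)")
```

Under these assumptions, the per-block constants push a nominal 3-bit format to an effective 4 bits per value, so removing them accounts for the gap between a 4× and a roughly 5.3× reduction against a 16-bit cache.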