Galaxy Blackhole AI Servers Ship Now

- Tenstorrent said on April 28 its Galaxy Blackhole AI servers are now generally available, turning its long-teased Blackhole hardware into a shippable product.
- The flagship box is a 6U, air-cooled server with 32 Blackhole ASICs, 1 TB of GDDR6, 23 PFLOPS FP8, and a $110,000 starting price.
- It matters because Tenstorrent is pitching a cheaper, Ethernet-scaled alternative to Nvidia racks, if its software stack is finally ready.

AI servers are the part of the boom where big promises usually hit hard reality. Chips can benchmark well in a lab, but shipping a full box that customers can actually buy, rack, cool, and run is the real test. That’s the jump Tenstorrent says it just made. On April 28, the company said its Galaxy Blackhole systems are now generally available, with a single-server configuration starting at $110,000 and larger superclusters already on offer. (tenstorrent.com)

### What is this thing, exactly?

Galaxy Blackhole is Tenstorrent’s dense AI server built around its own Blackhole accelerators. One chassis holds 32 ASICs in a 6U air-cooled box, plus an AMD EPYC host CPU, up to 576 GB of system memory, and local NVMe storage. The system is not a dev kit or a board set; it’s a full rackmount server meant for production deployment. (tenstorrent.com)

The headline specs are aggressive for the price. Tenstorrent lists 23 PFLOPS of Block FP8 compute, 1 TB of GDDR6 across the accelerators, and 16 TB/s of memory bandwidth. Power draw is listed at 8 to 10 kW on average, with 12 kW max in the default setup. (tenstorrent.com)

### Why is shipping the real news?

Because “we have a chip” and “we have a shipping product” are different claims. Tenstorrent has been talking about Blackhole for a while, but general availability means the company is saying customers can now order deployed systems, not just hear architecture talks and roadmap claims. The company also says it has been building Galaxy Blackhole servers since January, which moves the product further into actual manufacturing and rollout. (tenstorrent.com)

### Why does the price matter so much?

At a $110,000 starting price, Tenstorrent is obviously aiming at the complaint everyone has about AI infrastructure: it costs a fortune. A four-node Galaxy Blackhole supercluster starts at $440,000. The pitch is basically this: maybe you don’t need the most famous GPU stack if a cheaper cluster can get you usable performance on real inference jobs.
(tenstorrent.com)

That does not mean it beats Nvidia box-for-box on every metric. Even friendly coverage notes Nvidia’s DGX systems are faster and carry more capacity. Tenstorrent’s angle is different: lower entry price, dense packaging, and scale-out over Ethernet rather than forcing buyers into a more exotic interconnect story. (theregister.com)

### What is it built for?

Not training-first moonshots. The company is leaning hard into inference, especially large-context LLM serving and AI video generation. It says Galaxy can handle both prefill and decode on the same machines, which matters because part of the market is drifting toward disaggregated setups where different hardware handles each phase. (theregister.com) Tenstorrent’s bet is that one system can do both jobs well enough. (tenstorrent.com)

### Are the performance claims real?

They’re real as claims, but they need careful reading. Tenstorrent says its superclusters can generate 720p, 81-frame video in 2.4 seconds with Prodia, and in “Blitz Mode” can hit 350+ tokens per second per user with sub-4-second time-to-first-token on DeepSeek-R1-0528 671B. (tenstorrent.com) EE Times said it […]. (tenstorrent.com)

### So what’s the catch?

Software. Basically always software. Earlier hands-on coverage found limited model support and weak scaling, and even newer reporting frames the big improvement as a software-stack story as much as a hardware one. Tenstorrent now says 90% of Hugging Face models “just work” through TT-Forge and TT-Lang. (tenstorrent.com)

### Why use Ethernet here?

Because Tenstorrent wants scale-out to feel less custom and less locked down. Each Blackhole ASIC exposes 10×400 GbE links, and the company’s larger clusters extend by adding more Galaxy systems rather than building around a single giant monolithic box. Think of it less like one monster server and more like a mesh of smaller compute tiles that can grow outward. (tenstorrent.com)

### Bottom line?

Tenstorrent did not just show another AI chip.
It moved to selling a full server at a price that undercuts the premium GPU incumbents by a lot. The hard part starts now: proving that customers can get the advertised model support, scaling, and token economics outside a demo. (tenstorrent.com)
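
For readers who want to sanity-check the headline numbers, here is a rough back-of-envelope sketch in Python. It only divides figures quoted in this article; the variable names are made up for illustration. It assumes “1 TB” means 1,024 GB and that compute, memory, and bandwidth are spread evenly across the 32 ASICs, neither of which is confirmed here. These are vendor-listed peak figures, not measured throughput.

```python
# Back-of-envelope arithmetic on the figures quoted in the article.
# All names are illustrative; the inputs are vendor-listed peak numbers.

ASICS = 32                   # Blackhole ASICs per Galaxy server
FP8_PFLOPS = 23.0            # Block FP8 compute for the whole server
GDDR6_GB = 1024              # "1 TB" of GDDR6, ASSUMED to mean 1,024 GB
MEM_BW_TBPS = 16.0           # aggregate memory bandwidth, TB/s
PRICE_USD = 110_000          # single-server starting price
CLUSTER_NODES = 4            # four-node supercluster
CLUSTER_PRICE_USD = 440_000  # supercluster starting price
ETH_LINKS = 10               # Ethernet links per ASIC
ETH_GBPS_PER_LINK = 400      # line rate per link, Gb/s

summary = {
    # ASSUMPTION: resources are divided evenly across the ASICs.
    "tflops_per_asic": FP8_PFLOPS * 1000 / ASICS,
    "gddr6_gb_per_asic": GDDR6_GB / ASICS,
    "mem_bw_gbs_per_asic": MEM_BW_TBPS * 1000 / ASICS,
    # Raw line rate only; says nothing about achievable throughput.
    "eth_tbps_per_asic": ETH_LINKS * ETH_GBPS_PER_LINK / 1000,
    # Price per listed peak PFLOPS, not per unit of delivered work.
    "usd_per_pflops": PRICE_USD / FP8_PFLOPS,
    "usd_per_cluster_node": CLUSTER_PRICE_USD / CLUSTER_NODES,
}

for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

On these assumptions, each ASIC works out to roughly 719 TFLOPS of Block FP8 and 32 GB of GDDR6, the server lands near $4,800 per listed PFLOPS, and the four-node starting price is exactly four times the single-server price.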
