Taalas Claims to Etch AI Models Directly on Silicon

Startup Taalas claims it can etch AI models directly into silicon by hardwiring a model's weights into the chip's transistors, a method it says delivers a step-change improvement in inference speed and efficiency. The approach aims to bypass memory bottlenecks and could shift cost-performance calculations for edge and latency-sensitive applications.

- Taalas was co-founded by Ljubisa Bajic, who also co-founded and formerly served as CEO and CTO of the AI chip company Tenstorrent.
- The company has raised over $200 million in funding from investors including Quiet Capital and Fidelity.
- Their first chip, the HC1, is designed specifically for the Llama 3.1 8B model and is built on TSMC's 6nm process.
- Taalas claims the HC1 can generate 17,000 output tokens per second, which it states is 73 times more than an Nvidia H200 GPU while using one-tenth of the power.
- This performance is achieved by "hardwiring" the model's weights onto the chip, which reduces the need for high-bandwidth memory and avoids related bottlenecks.
- While this specialization boosts performance for a single model, it also means a new chip is required for any new or updated AI model.
- The company's roadmap includes a chip for a 20-billion parameter model expected in the summer, followed by a next-generation "HC2" chip designed for frontier models.
- Taalas' business model will involve selling both inference-as-a-service and the specialized hardware itself.
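
The headline figures can be unpacked with a little arithmetic. The sketch below takes the company's claimed numbers at face value (they are not independently verified) and derives what they imply for the H200 baseline and for energy per token. It assumes "one-tenth of the power" refers to total chip power draw during generation; the function and constant names are illustrative only.

```python
# Figures as claimed by Taalas in the summary above; not independently verified.
HC1_TOKENS_PER_SEC = 17_000   # claimed HC1 output rate
SPEEDUP_VS_H200 = 73          # claimed throughput multiple over an Nvidia H200
POWER_DIVISOR = 10            # assumption: HC1 draws 1/10 of the H200's power


def implied_baseline_tokens_per_sec(hc1_tps: int, speedup: int) -> float:
    """H200 throughput implied by the claimed HC1 rate and speedup."""
    return hc1_tps / speedup


def energy_per_token_advantage(speedup: int, power_divisor: int) -> int:
    """How many times less energy per token the HC1 would use.

    Energy per token is power divided by tokens per second, so a chip that
    is `speedup` times faster at 1/`power_divisor` of the power uses
    speedup * power_divisor times less energy per token.
    """
    return speedup * power_divisor


baseline_tps = implied_baseline_tokens_per_sec(HC1_TOKENS_PER_SEC, SPEEDUP_VS_H200)
advantage = energy_per_token_advantage(SPEEDUP_VS_H200, POWER_DIVISOR)

print(f"Implied H200 baseline: about {baseline_tps:.0f} tokens/s")
print(f"Implied energy-per-token advantage: {advantage}x")
```

Under these assumptions the comparison implies an H200 baseline of roughly 233 tokens per second and a 730-fold reduction in energy per token. The summary does not state the batch size, precision, or measurement conditions behind the comparison, so the result is best read as a restatement of the claim, not a benchmark.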
