Startup Taalas Launches High-Speed AI Inference Product
Canadian startup Taalas Inc. has launched its first AI inference product after a $30 million development effort. The company claims its product can achieve speeds of up to 16,000 tokens per second. A demo of the new inference technology is available at chatjimmy.ai.
- Taalas was founded by CEO Ljubisa Bajic, who previously founded the AI chip company Tenstorrent, along with early Tenstorrent engineers Drago Ignjatovic and Lejla Bajic. The founding team has a background in designing processors for companies like AMD and Nvidia.
- The company has raised over $200 million in total funding. After initially raising $50 million in two rounds led by veteran semiconductor investor Pierre Lamond and Quiet Capital, Taalas secured an additional $169 million.
- The startup’s core technology involves creating highly specialized chips, effectively "hardwiring" a specific AI model directly onto the silicon. This approach sacrifices the flexibility to run different models in exchange for significant gains in speed, cost, and power efficiency.
- Their first product, the HC1, is a custom chip designed exclusively to run Meta's Llama 3.1 8B model. Taalas claims this chip is nearly 10 times faster than current state-of-the-art alternatives and consumes 10 times less power.
- Compared to competitors, Taalas reports significantly higher performance. For the Llama 3.1 8B model, they claim over 16,000 tokens per second, while companies like Cerebras and Groq have been benchmarked in the range of 600 to 2,000 tokens per second.
- Taalas's process with manufacturing partner TSMC allows them to create and deploy a custom chip for a new AI model in about two months, a much faster cycle than traditional semiconductor development.
- The company's business strategy may include selling chips, offering API access to models running on their hardware, or developing custom silicon for specific model developers.
- The team consists of about 25 engineers, with experience from companies such as AMD, Apple, Google, and Nvidia. Paresh Kharya, formerly a director of AI infrastructure product management at Google Cloud, has joined as the vice president of products.
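
To put the reported throughput figures in perspective, the short Python sketch below works out the speedup range they imply and the time each rate would need to produce a response. The token rates are the figures quoted above (company claims and third-party benchmarks whose test conditions are not stated here). The 1,000-token response length and the helper names `speedup` and `seconds_for_tokens` are illustrative assumptions, not anything published by Taalas.

```python
# Throughput figures as quoted in the article, in tokens per second.
TAALAS_TPS = 16_000          # Taalas's claimed rate for Llama 3.1 8B
COMPETITOR_TPS_LOW = 600     # low end of the range cited for Cerebras and Groq
COMPETITOR_TPS_HIGH = 2_000  # high end of that range

# Assumption for illustration only: a 1,000-token response.
RESPONSE_TOKENS = 1_000


def speedup(new_tps: float, baseline_tps: float) -> float:
    """Return how many times faster new_tps is than baseline_tps."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return new_tps / baseline_tps


def seconds_for_tokens(num_tokens: int, tps: float) -> float:
    """Return the seconds needed to generate num_tokens at a steady rate."""
    if tps <= 0:
        raise ValueError("tps must be positive")
    return num_tokens / tps


# Smallest implied speedup: compare against the fastest cited competitor rate.
min_speedup = speedup(TAALAS_TPS, COMPETITOR_TPS_HIGH)
# Largest implied speedup: compare against the slowest cited competitor rate.
max_speedup = speedup(TAALAS_TPS, COMPETITOR_TPS_LOW)

print(f"Implied speedup: {min_speedup:.1f}x to {max_speedup:.1f}x")
for label, tps in [
    ("Taalas (claimed)", TAALAS_TPS),
    ("Competitor, high end", COMPETITOR_TPS_HIGH),
    ("Competitor, low end", COMPETITOR_TPS_LOW),
]:
    print(f"{label}: {seconds_for_tokens(RESPONSE_TOKENS, tps):.3f} s")
```

On these numbers the implied speedup runs from 8x to roughly 27x, so the "nearly 10 times faster" claim matches a comparison against the high end of the cited competitor range. This calculation assumes a steady generation rate and ignores time to first token, batching, and other serving overheads.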