DeepSeek tailors model for Huawei chips
- DeepSeek on April 24 released preview versions of its new V4 artificial intelligence model, built to run on Huawei’s Ascend chips instead of relying mainly on Nvidia hardware.
- DeepSeek said V4 comes in Pro and Flash versions, with a 1-million-token context window, and Huawei said Ascend 950 supernode clusters already support the full V4 series.
- The launch lands as Washington accuses Chinese firms of distilling U.S. models and as DeepSeek explores fresh funding above a $20 billion valuation. (cnbc.com)
DeepSeek on April 24 released preview versions of V4, a new artificial intelligence model adapted to run on Huawei’s Ascend chips. (finance.yahoo.com) (money.usnews.com)

The Chinese startup said V4 comes in two versions, Pro and Flash, and both support a 1-million-token context window. DeepSeek said Pro is aimed at higher-end coding, reasoning and world-knowledge tasks, while Flash is cheaper and faster. (money.usnews.com)

Huawei said V4 is supported across its Ascend 950-based supernode clusters and that its chips were used for part of V4-Flash’s training. Reuters reported the release marked a break from DeepSeek’s earlier V3 and R1 models, which were trained on Nvidia chips. (money.usnews.com) (finance.yahoo.com)

An artificial intelligence model is software that learns patterns from huge amounts of text and code, while the chip is the engine that runs it. Adapting a model to a new chip means rewriting and tuning parts of the system so it works efficiently on different hardware. (money.usnews.com) (finance.yahoo.com)

That shift has become more important since the United States began restricting China’s access to advanced artificial intelligence chips in 2022. Reuters said DeepSeek worked directly with Huawei and Cambricon for months to rewrite code and test V4 on Chinese hardware. (finance.yahoo.com) (money.usnews.com)

Reuters reported on April 3 that Alibaba, ByteDance and Tencent had placed bulk orders totaling hundreds of thousands of Huawei chips ahead of the V4 launch. The same report said DeepSeek gave early access to domestic suppliers such as Huawei instead of U.S. chipmakers for performance tuning. (money.usnews.com)

DeepSeek’s hardware pivot arrived a day after the White House accused China of stealing United States artificial intelligence intellectual property on an industrial scale. On April 25, CNBC reported that the State Department sent a cable ordering diplomats worldwide to warn governments about risks from Chinese model distillation. (finance.yahoo.com) (cnbc.com)

The cable named DeepSeek, Moonshot AI and MiniMax, according to Reuters, and described distillation as training smaller models on the outputs of larger proprietary ones. The Chinese Embassy in Washington rejected the accusations as baseless and said Beijing attaches great importance to intellectual property protection. (cnbc.com)

At the same time, Reuters reported on April 22 that Tencent and Alibaba were in talks to invest in DeepSeek at a valuation above $20 billion, after an earlier funding target of at least $300 million at a valuation of at least $10 billion. Reuters said the talks were still underway and the terms could change. (thestar.com.my)

V4 puts DeepSeek’s next test in plain view: whether a top Chinese model can keep improving on Chinese chips while U.S. pressure on hardware and intellectual property keeps rising. (finance.yahoo.com) (cnbc.com)