Zhipu AI's GLM-5 model leads coding benchmark

Zhipu AI's latest model, GLM-5, is a 745-billion-parameter Mixture-of-Experts (MoE) model that has reportedly scored 77.8% on the SWE-Bench Verified coding benchmark, close to the leading proprietary models. The MoE architecture lets the model scale efficiently by activating only a subset of its experts for each token, and the release is part of a growing arms race among open-source foundation models, particularly those optimized for code generation and agentic automation.

- The 77.8% score on the SWE-Bench Verified dataset places GLM-5's performance close to leading proprietary models like Claude Opus 4.6 (which scores between 79.4% and 80.9%) and GPT-5.3 (78.2%).
- A key strategic detail is that the model was trained entirely on Huawei Ascend chips, demonstrating a significant step towards China's hardware independence in developing frontier AI systems.
- Zhipu AI, the company behind the model, was founded in 2019 by researchers from Tsinghua University and recently became the first major Chinese generative AI firm to go public with a $558 million IPO in Hong Kong.
- The model's architecture uses a sparse attention mechanism from DeepSeek to manage its 200,000-token context window and employs 256 experts, activating the top 8 for each token during inference.
- Beyond SWE-Bench, GLM-5 also scored 56.2% on Terminal-Bench 2.0, a benchmark focused on terminal-based tasks, and ranked first among open-source models on Vending Bench 2, which evaluates long-term operational capabilities.
- Independent testing has highlighted potential discrepancies between benchmark results and real-world application; one analysis replicated the Terminal-Bench 2.0 score at 40.4% instead of the official 56.2%, attributing the gap to the official benchmark's lack of real-world time limits.
- The model is available under a permissive MIT license, and its API is priced to be highly competitive, costing around $0.11 per million tokens compared to approximately $5 per million tokens for a model like Claude Opus 4.6.
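
To make the "256 experts, top 8 per token" detail concrete, here is a minimal sketch of generic top-k expert routing. Only the two numbers come from the summary above. The function name `route_token`, the random router logits, and the choice to renormalize with a softmax over the selected experts are illustrative assumptions, not a description of GLM-5's actual router.

```python
import math
import random

NUM_EXPERTS = 256  # expert count reported in the summary
TOP_K = 8          # experts activated per token, as reported


def route_token(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts and return (indices, weights).

    Hypothetical helper: a generic top-k gate, not GLM-5's real router.
    """
    if k > len(router_logits):
        raise ValueError("k cannot exceed the number of experts")
    # Indices of the k largest logits, highest score first.
    top = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]
    # Assumption: mixing weights are a softmax over the selected logits only.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]
    return top, weights


# Stand-in router scores for a single token (random, for illustration only).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

experts, weights = route_token(logits)
print("selected experts:", experts)
print(f"share of experts active per token: {100 * TOP_K / NUM_EXPERTS:.3f}%")
```

Under these assumptions each token touches only 8 of the 256 experts, which is the general reason a sparse MoE model can carry a very large total parameter count while keeping per-token compute much lower.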
