Z.ai Debuts GLM-5, an Open-Weights LLM for System Agents

Z.ai has released GLM-5, an open-weights large language model optimized for “long-horizon” agents and systems engineering. Its focus on extended context and multi-step reasoning targets workloads that need persistent planning, such as the control software behind autonomous vehicles or industrial automation. The open-weights release fits the trend toward more transparent and customizable AI tools for developers, and the model posts competitive results on coding benchmarks such as SWE-bench Verified.

- Z.ai's GLM-5 is a Mixture-of-Experts (MoE) model with 744 billion total parameters, of which 40 billion are active during inference. This represents a significant increase from its predecessor's 355 billion total and 32 billion active parameters.
- The model was trained on 28.5 trillion tokens of data, an increase from the 23 trillion used for the previous version, and incorporates DeepSeek Sparse Attention to manage its large context window efficiently.
- GLM-5 features a 200,000-token context length and can generate a maximum of 128,000 output tokens. For local deployment, the full model requires approximately 1.5 TB of memory in its native BF16 precision.
- On the SWE-bench Verified coding benchmark, GLM-5 achieved a score of 77.8, outperforming Google's Gemini 3.0 Pro and approaching the performance of Claude Opus 4.6.
- Z.ai, formerly known as Zhipu AI, was founded in 2019 as a spin-out from Tsinghua University and has received funding from major tech companies like Alibaba and Tencent. The company was added to the U.S. Commerce Department's Entity List in January 2025 due to national security concerns.
- The term "open-weights" means that while the model's trained parameters (weights) are publicly accessible for use and fine-tuning, the underlying source code for training and the full dataset may not be. This contrasts with fully open-source models where the training code and data are also released.
- "Long-horizon" agents are designed for complex, multi-step tasks that require planning and maintaining context over an extended period. This is a shift from simpler, single-transaction AI tasks and often involves features like sub-agents, planning capabilities, and persistent memory like a file system.
- GLM-5 is available through Z.ai's developer platform and is priced at approximately $1.00 per million input tokens and $3.20 per million output tokens on third-party provider Novita.
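
The memory and pricing figures above can be sanity-checked with simple arithmetic. The Python sketch below is a back-of-the-envelope estimate, not anything published by Z.ai. The function names are hypothetical, the memory estimate covers the weights only (no KV cache, activations, or runtime overhead), and the prices are the approximate Novita rates quoted above.

```python
# Back-of-the-envelope checks on figures quoted in the brief.
# All helper names are illustrative; none come from Z.ai or Novita.

TOTAL_PARAMS = 744e9        # total parameters (MoE)
ACTIVE_PARAMS = 40e9        # parameters active during inference
BYTES_PER_PARAM_BF16 = 2    # BF16 stores each weight in 16 bits

INPUT_PRICE_PER_M = 1.00    # approx. USD per million input tokens (Novita)
OUTPUT_PRICE_PER_M = 3.20   # approx. USD per million output tokens (Novita)


def weight_memory_tb(params: float,
                     bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Memory for the weights alone, in decimal terabytes (1 TB = 1e12 bytes)."""
    return params * bytes_per_param / 1e12


def active_fraction(total: float, active: float) -> float:
    """Share of the parameters that is active during inference."""
    return active / total


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request at the quoted per-million-token rates."""
    return (input_tokens / 1e6 * INPUT_PRICE_PER_M
            + output_tokens / 1e6 * OUTPUT_PRICE_PER_M)


if __name__ == "__main__":
    print(f"BF16 weights: {weight_memory_tb(TOTAL_PARAMS):.2f} TB")
    print(f"Active share: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.1%}")
    # Hypothetical request: 150,000 input tokens and 20,000 output tokens.
    print(f"Example request: ${request_cost_usd(150_000, 20_000):.3f}")
```

At 2 bytes per parameter, 744 billion parameters come to about 1.49 TB, which matches the "approximately 1.5 TB" figure for BF16. Only about 5.4% of the parameters are active during inference, which is where an MoE design saves compute. The full set of weights still has to be held in memory.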
