GLM-5 is a new milestone for open-source models and one of the most ambitious projects from Chinese developers. Z.ai has introduced a fresh model that invites comparison with giants like Opus 4.5, Gemini 3 Pro, and GPT-5.2. The primary focus is on programming tasks, creative writing, mathematics, and agent systems. The model is reported to handle long texts exceptionally well: the release cites a context window of up to 200,000 tokens.
In terms of performance metrics, GLM-5 looks quite impressive: it achieves excellent results on the HLE and SWE-bench tests, especially for a model with open weights. Another important point: the model is said to have been trained entirely on Huawei Ascend chips using the MindSpore framework, eliminating dependence on American hardware. If this is indeed the case, it represents a real technological breakthrough.
What’s inside?
– The model uses MoE architecture with 745 billion parameters (with approximately 44 billion actively engaged), doubling the size of the previous GLM-4.5 version.
– It features 78 layers: the first three are fully dense, while the rest utilize DeepSeek Sparse Attention (DSA), specifically designed for working with long sequences.
– Additionally, Multi-Token Prediction (MTP) technology has been implemented: the model can predict multiple tokens simultaneously in one pass, providing processing speeds of over 50 tokens per second — nearly twice as fast as the previous generation.
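To make the "745 billion total, ~44 billion active" distinction concrete, here is a toy mixture-of-experts layer. It is a minimal sketch, not GLM-5's actual router: the expert count, dimensions, and top-k value are illustrative. The key idea is that a gate scores all experts but only the top-k selected experts run for a given token, so active parameters are a small fraction of the total (for GLM-5, roughly 44B/745B, about 6%).

```python
import numpy as np

def moe_layer(x, experts, gate, top_k=2):
    """Route one token vector x through the top_k highest-scoring experts.

    x: (d,) token vector; experts: list of (d, d) weight matrices;
    gate: (n_experts, d) router weights. Toy shapes, for illustration only.
    """
    logits = gate @ x                         # one routing score per expert
    top = np.argsort(logits)[-top_k:]         # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over selected experts
    # Only the selected experts do any computation; the rest stay idle.
    return sum(w * (experts[i] @ x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate = rng.standard_normal((n_experts, d))
y = moe_layer(rng.standard_normal(d), experts, gate, top_k=2)
print(y.shape)  # (8,) -- computed using 2 of 16 experts
```

Here 2 of 16 experts fire per token, so only 1/8 of the expert parameters are touched per forward pass; GLM-5's reported ratio is in the same spirit at a much larger scale.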
The model weights are distributed under the MIT license.
It is available via WaveSpeed API at approximately $0.90 per million input tokens and $2.88 per million output tokens.
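A quick back-of-the-envelope calculation shows what those rates mean in practice. This is simple arithmetic at the listed prices; the example token counts are hypothetical.

```python
# Listed WaveSpeed rates, USD per million tokens.
IN_RATE, OUT_RATE = 0.90, 2.88

def cost_usd(input_tokens, output_tokens):
    """Estimated request cost at the listed per-million-token rates."""
    return input_tokens / 1e6 * IN_RATE + output_tokens / 1e6 * OUT_RATE

# e.g. a 150K-token context with a 4K-token answer:
print(round(cost_usd(150_000, 4_000), 5))  # 0.14652
```

So even a near-full 200K-token context costs well under a dollar per request at these rates.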
Overall, GLM-5 is a strong bid for leadership among open-weight models and demonstrates robust technical capabilities.
