GLM-5.1 Cracks Code Arena Top 3, First Open-Weight Model to Do So

Michael Ouroumis · 2 min read

Z.ai's GLM-5.1 has become the first open-weight model to reach the top three on Code Arena, the human-judged coding leaderboard that has long been dominated by closed frontier systems from Anthropic, OpenAI and Google. The Chinese lab — formerly known as Zhipu AI — confirmed the milestone this week after its new flagship posted a 1530 Elo score on the board.

A new benchmark for open models

Released on April 7, 2026 under the permissive MIT License on Hugging Face, GLM-5.1 is a 754-billion-parameter Mixture-of-Experts model targeting agentic engineering and long-horizon coding work. Its Code Arena placement on April 10 puts it behind only Anthropic's claude-opus-4-6-thinking (1548) and claude-opus-4-6 (1542), while sitting ahead of every GPT-5 and Gemini 3 model on the agentic webdev leaderboard.

The jump is unusually large for a point release. Reports indicate GLM-5.1 gained roughly 90 Elo points over GLM-5 and around 100 over Kimi K2.5 Thinking, the prior best open entrant. On SWE-Bench Pro, third-party coverage places GLM-5.1 ahead of both Claude Opus 4.6 and GPT-5.4 on a number of tasks.

Why Code Arena matters

Unlike synthetic benchmarks, Code Arena ranks models through head-to-head comparisons judged by developers on real coding prompts. A top-three finish there is harder to game than a leaderboard saturated with closed-book multiple-choice questions, and it is the metric many engineering teams watch when picking a default coding model.
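Elo scores like the ones above map directly onto expected win rates in those head-to-head votes. As a rough illustration, here is a minimal sketch of pairwise Elo updating, the rating scheme arena-style leaderboards are built on; the K-factor and exact update rule are textbook defaults, not Code Arena's documented methodology.

```python
# Minimal sketch of pairwise Elo scoring, the rating scheme behind
# arena-style leaderboards. K-factor and update rule are illustrative
# textbook defaults, not Code Arena's published methodology.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return both ratings after one human-judged head-to-head vote."""
    e_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - e_a)
    return rating_a + delta, rating_b - delta

# With the scores from the article, the 1548-rated leader is only a
# slight favorite over 1530-rated GLM-5.1 in any single matchup:
print(f"{expected_score(1548, 1530):.3f}")  # ~0.526
```

In other words, an 18-point Elo gap implies roughly a 52.6% expected win rate, which is why small rating differences at the top of the board reflect near-parity in individual judged matchups.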

Open weights, permissive license

The release is distributed on Hugging Face at zai-org/GLM-5.1 under the MIT License — one of the most permissive in open source. Enterprises and researchers can fine-tune, modify and redistribute the weights commercially without royalties, which removes one of the last structural advantages frontier labs have enjoyed against open alternatives.
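For teams that want to try the weights directly, a minimal loading sketch with Hugging Face transformers might look like the following. The repo id comes from the release itself; the dtype, device-mapping and trust_remote_code settings are assumptions, and a model of this size would realistically need multi-GPU or quantized serving rather than a single-process load.

```python
# Hedged sketch of loading the open weights with Hugging Face transformers.
# Repo id is from the release; dtype/device settings are assumptions, and a
# 754B MoE model will in practice need sharded or quantized serving.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zai-org/GLM-5.1"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # use the dtype stored in the checkpoint
    device_map="auto",       # shard across available accelerators
    trust_remote_code=True,  # MoE releases often ship custom modeling code
)

inputs = tokenizer("Write a binary search in Python.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```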

That combination — frontier-tier ranking plus an unrestricted license — is what makes this release different from previous open-weight milestones. Earlier Chinese-origin models such as DeepSeek-V3 and Qwen posted strong benchmark results, but none had cleared Code Arena's top three.

Implications for the frontier

For closed-lab incumbents, GLM-5.1 puts pressure on pricing power. When a freely downloadable model beats GPT-5.4 and Gemini 3.1 Pro on a human-judged coding board, the premium enterprises pay for API access has to be justified on things other than raw capability — latency, safety tooling, context length, or integration depth.

For open-source advocates, the release is a proof point that the capability gap between open and closed frontier models has narrowed materially in 2026. The question now is whether U.S. and European labs respond by accelerating their own open-weight releases, or by leaning further into proprietary agentic tooling where model weights alone no longer decide the winner.

GLM-5.1 is available today on Hugging Face, and Z.ai is hosting the model through its own API for teams that prefer not to self-host.
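For the hosted route, many model providers expose OpenAI-compatible chat endpoints; assuming Z.ai does the same, a call might look like this sketch. The base URL and model identifier below are illustrative assumptions, not documented values from the release.

```python
# Sketch of calling a hosted endpoint, assuming an OpenAI-compatible
# chat API. Base URL and model name are illustrative assumptions —
# check Z.ai's own documentation for the real values.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.z.ai/v1",  # assumed endpoint
    api_key="YOUR_ZAI_API_KEY",
)

resp = client.chat.completions.create(
    model="glm-5.1",  # assumed model identifier
    messages=[{"role": "user", "content": "Refactor this function for readability: ..."}],
)
print(resp.choices[0].message.content)
```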
