
OpenAI Launches GPT-5.4 With Native Computer-Use and 1M Token Context

Michael Ouroumis · 2 min read

OpenAI dropped its most ambitious model yet on March 5, and the AI community is still digesting what GPT-5.4 means for the industry. The new release merges the coding prowess of GPT-5.3-Codex with breakthrough computer-use capabilities, creating a single system that can reason, write code, and directly operate software interfaces.

What Makes GPT-5.4 Different

The headline feature is native computer-use. GPT-5.4 can view screenshots, move a cursor, click buttons, and type keystrokes — effectively operating any desktop or web application the way a human would. Previous models required third-party tooling or wrappers to achieve similar functionality. Now it is built into the model itself.

OpenAI also expanded the API context window to one million tokens, the largest the company has ever offered. For enterprise customers working with sprawling codebases or lengthy legal documents, this removes a persistent bottleneck.

Benchmark Performance

The numbers back up the marketing. GPT-5.4 set new records on OSWorld-Verified and WebArena-Verified, two benchmarks that measure a model's ability to complete real software tasks. It also scored 83 percent on OpenAI's internal GDPval test for knowledge work — tasks like drafting reports, analyzing spreadsheets, and managing project workflows.

Factual accuracy improved as well. Compared to GPT-5.2, individual claims are 33 percent less likely to be false, and full responses are 18 percent less likely to contain any factual errors.

Two Flavors at Launch

OpenAI released two variants simultaneously. GPT-5.4 Thinking is the default in ChatGPT, optimized for interactive conversations with visible chain-of-thought reasoning. GPT-5.4 Pro targets power users and enterprise customers who need maximum performance on complex, multi-step tasks.

Both versions are available through the API, where developers can access the full million-token context and integrate computer-use into agentic workflows.
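The shape of such an agentic workflow is a simple loop: capture the screen, ask the model for the next UI action, execute it, repeat. The sketch below illustrates that loop only; GPT-5.4's actual API schema is not documented here, so the `call_model` helper, the model's response fields, and the action format are all assumptions for demonstration, with the real API call stubbed out.

```python
# Illustrative computer-use agent loop (a sketch, not GPT-5.4's real API).
# `call_model`, the response fields, and the action format are assumptions.

def call_model(screenshot: bytes, goal: str) -> dict:
    """Stand-in for the API call: a real client would send the screenshot
    and goal to the model and receive the next UI action to perform."""
    # Hypothetical response: the model asks for one click, then signals done.
    return {"action": "click", "x": 120, "y": 48, "done": True}

def run_agent(goal: str, max_steps: int = 10) -> list[dict]:
    """Screenshot -> model -> action loop, bounded by max_steps."""
    history = []
    for _ in range(max_steps):
        screenshot = b"...pixels..."   # stand-in for a real screen capture
        step = call_model(screenshot, goal)
        history.append(step)           # a real agent would execute the action here
        if step.get("done"):           # model reports the task is complete
            break
    return history
```

In a production integration, executing the returned action (moving the cursor, clicking, typing) would be handled by a desktop-automation layer, and each new screenshot would feed the next model call.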

Industry Implications

The release intensifies an already crowded March. Google shipped Gemini 3.1 Pro in late February, Anthropic updated Claude Sonnet 4.6, and DeepSeek V4 arrived days earlier with its own trillion-parameter multimodal architecture. The pace of releases has shifted from quarterly cadences to what industry trackers now describe as weekly waves.

More significantly, GPT-5.4 signals that the frontier is moving beyond chat and code generation toward autonomous task execution. Models that can see and operate software blur the line between assistant and agent — a shift that will reshape how businesses think about automation, workforce planning, and software design in the months ahead.


More in Models

xAI Launches Grok Voice Think Fast 1.0, Tops τ-Voice Bench and Powers Starlink Support

xAI's new voice model scored 67.3% on the τ-Voice Bench — well ahead of Gemini 3.1 Flash Live and GPT Realtime — and is now powering Starlink's phone sales and support with a 70% autonomous resolution rate.

2 days ago · 2 min read
Tencent Drops Hy3 Preview: 295B Open-Source MoE Model Kicks DeepSeek Out of Yuanbao

Tencent has open-sourced Hy3 Preview, a 295B/21B-activated mixture-of-experts model built in under three months. The Yuanbao chatbot is switching its primary engine from DeepSeek to the new in-house model.

4 days ago · 2 min read
DeepSeek V4 Preview Lands: 1.6T-Parameter Open Model With 1M Context, Flash Pricing at $0.14/M

DeepSeek on April 24 released preview versions of V4-Pro and V4-Flash, an open-weight MoE family with a 1M-token context window and pricing that undercuts Western frontier labs.

4 days ago · 2 min read