OpenAI Launches GPT-5.4 With Computer Agent Capabilities, Beats Human Baseline on OSWorld
OpenAI has released GPT-5.4, its latest flagship model focused on reasoning, coding, and agent-style tasks. The model debuted on March 5, 2026, and is now available through OpenRouter and ChatGPT with both Thinking and Pro variants.
The standout feature of GPT-5.4 is native computer use. The model achieved a 75.0% success rate on the OSWorld-Verified benchmark, 2.6 points above the average human performance of 72.4%. That is a 27.7-point jump from GPT-5.2's 47.3% score and positions the model as a state-of-the-art option for desktop automation tasks.
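To put those OSWorld-Verified figures side by side, the short sketch below computes the gaps from the three scores quoted above. The helper names are illustrative and not part of any benchmark tooling; only the numbers come from the article.

```python
# Scores quoted in the article (percent success on OSWorld-Verified).
gpt_5_4 = 75.0
human_baseline = 72.4
gpt_5_2 = 47.3


def point_gap(a: float, b: float) -> float:
    """Absolute difference in percentage points, rounded to one decimal."""
    return round(a - b, 1)


def relative_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, as a percentage."""
    return round((new - old) / old * 100, 1)


print("Margin over human baseline:", point_gap(gpt_5_4, human_baseline), "points")
print("Gain over GPT-5.2:", point_gap(gpt_5_4, gpt_5_2), "points")
print("Relative gain over GPT-5.2:", relative_gain(gpt_5_4, gpt_5_2), "%")
```

The absolute gap to the human baseline is small compared with the generational jump, which is why the comparison with GPT-5.2 is the more striking of the two.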
Benchmark Performance
GPT-5.4 demonstrates strong performance across multiple agent-focused benchmarks:
- OSWorld-Verified: 75.0% (human baseline: 72.4%)
- WebArena-Verified: 67.3%
- BrowseComp: 82.7%
- Spreadsheet Modeling: 87.3%
- Toolathlon: 54.6%
The model also supports a 1-million-token context window, 2.5 times GPT-5.1's 400K tokens. The larger window allows longer documents and multi-step reasoning workflows to fit in a single request.
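As a rough illustration of what the larger window changes in practice, the sketch below checks whether a batch of documents fits under each limit. The token counts are made-up inputs and the `reserved_for_output` margin is a hypothetical parameter; a real application would measure tokens with the provider's tokenizer.

```python
# Context limits as stated in the article, in tokens.
CONTEXT_LIMITS = {"GPT-5.4": 1_000_000, "GPT-5.1": 400_000}


def fits_in_context(doc_token_counts, limit, reserved_for_output=0):
    """Return True if all documents plus a reserved output budget fit under `limit`."""
    return sum(doc_token_counts) + reserved_for_output <= limit


# Hypothetical workload: three documents with known token counts.
docs = [250_000, 300_000, 150_000]

for model, limit in CONTEXT_LIMITS.items():
    ok = fits_in_context(docs, limit, reserved_for_output=20_000)
    print(f"{model}: {'fits' if ok else 'does not fit'}")
```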
Pricing and Availability
GPT-5.4 is accessible through multiple channels:
- ChatGPT: Available for Plus and Pro subscribers
- OpenRouter: Publicly listed with token pricing
- OpenAI API: Standard and Pro versions available
The /fast mode runs at 1.5x token generation speed for coding and agentic tasks. In Reddit discussions, some users have said the API pricing appears significantly higher than that of competing models such as Anthropic's Sonnet.
What This Means for the Industry
GPT-5.4's computer use capabilities represent a meaningful advance in AI agents. The model navigates desktop environments by reading screenshots and issuing keyboard and mouse inputs, which suits it to practical automation workflows. With the 1M token context and improved steerability, developers can build more sophisticated agentic applications that handle complex, multi-step tasks.
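The screenshot, keyboard, and mouse pattern described above is usually structured as an observe-then-act loop. The sketch below shows that loop in minimal form. Everything in it is hypothetical: `FakeDesktop`, `Action`, and `scripted_policy` are stand-ins invented for illustration, and no real model or OpenAI API is called. In a real agent, the policy function would send the screenshot to the model and parse its chosen action.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    kind: str  # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real desktop; it only records what the agent did."""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> bytes:
        # A real environment would return image bytes of the current screen.
        return f"screen-after-{len(self.log)}-actions".encode()

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click({action.x},{action.y})")
        elif action.kind == "type":
            self.log.append(f"type({action.text!r})")
        else:
            raise ValueError(f"unsupported action: {action.kind}")


def run_agent(env: FakeDesktop,
              choose_action: Callable[[bytes], Action],
              max_steps: int = 10) -> bool:
    """Observe, decide, act; stop when the policy says 'done' or steps run out."""
    for _ in range(max_steps):
        shot = env.screenshot()
        action = choose_action(shot)
        if action.kind == "done":
            return True
        env.apply(action)
    return False


def scripted_policy() -> Callable[[bytes], Action]:
    """Hypothetical policy that replays fixed steps instead of calling a model."""
    steps = iter([
        Action("click", x=100, y=200),
        Action("type", text="hello"),
        Action("done"),
    ])
    return lambda _shot: next(steps)


env = FakeDesktop()
finished = run_agent(env, scripted_policy())
print(finished, env.log)
```

The `max_steps` cap is a deliberate design choice: an agent that never emits a "done" action would otherwise loop forever, so bounding the loop keeps a misbehaving policy from running unchecked.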
The model is being positioned for enterprise use cases rather than simple chat interactions, signaling OpenAI's continued push toward agentic AI workflows.