AI Haven

OpenAI Launches GPT-5.4 With Computer Agent Capabilities, Beats Human Baseline on OSWorld

OpenAI has released GPT-5.4, which scores 75% on OSWorld-Verified, above the human baseline of 72.4%. The model features a 1M-token context window and native computer use capabilities.

March 6, 2026

OpenAI has released GPT-5.4, its latest flagship model focused on reasoning, coding, and agent-style tasks. The model debuted on March 5, 2026, and is now available through OpenRouter and ChatGPT with both Thinking and Pro variants.

The standout feature of GPT-5.4 is native computer use. The model achieved a 75.0% success rate on the OSWorld-Verified benchmark, 2.6 percentage points above the average human performance of 72.4%. That is a 27.7-point jump from GPT-5.2's 47.3% and positions the model as a state-of-the-art option for desktop automation tasks.

Benchmark Performance

GPT-5.4 demonstrates strong performance across multiple agent-focused benchmarks:

  • OSWorld-Verified: 75.0% (human baseline: 72.4%)
  • WebArena-Verified: 67.3%
  • BrowseComp: 82.7%
  • Spreadsheet Modeling: 87.3%
  • Toolathlon: 54.6%

The model also supports a 1-million-token context window, up from GPT-5.1's 400K tokens. The larger window lets the model take in longer documents in a single pass and keep more intermediate state available during multi-step reasoning workflows.
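For readers who want to check the headline comparisons, the arithmetic behind them can be sketched in a few lines of Python. The numbers are the ones reported in this article; the variable names are illustrative only.

```python
# Figures as reported in the article; variable names are illustrative.
osworld_gpt_5_4 = 75.0   # OSWorld-Verified success rate, percent
osworld_human = 72.4     # human baseline, percent
osworld_gpt_5_2 = 47.3   # previous model's score, percent
context_gpt_5_4 = 1_000_000  # tokens
context_gpt_5_1 = 400_000    # tokens

# Differences between percentages are percentage points, not percent.
margin_over_human = round(osworld_gpt_5_4 - osworld_human, 1)
gain_over_gpt_5_2 = round(osworld_gpt_5_4 - osworld_gpt_5_2, 1)

# Relative improvement over the earlier score, as a fraction.
relative_gain = gain_over_gpt_5_2 / osworld_gpt_5_2

# How many times larger the new context window is.
context_ratio = context_gpt_5_4 / context_gpt_5_1

print(f"Margin over human baseline: {margin_over_human} points")
print(f"Gain over GPT-5.2: {gain_over_gpt_5_2} points ({relative_gain:.1%} relative)")
print(f"Context window ratio: {context_ratio}x")
```

By this calculation the model clears the human baseline by 2.6 points, improves on GPT-5.2 by 27.7 points, and has a context window 2.5 times the size of GPT-5.1's.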

Pricing and Availability

GPT-5.4 is accessible through multiple channels:

  • ChatGPT: Available for Plus and Pro subscribers
  • OpenRouter: Publicly listed with token pricing
  • OpenAI API: Standard and Pro versions available

A /fast mode offers 1.5x token speed for coding and agentic tasks. According to Reddit discussions, some users have noted that the API pricing appears significantly higher than that of competing models such as Anthropic's Sonnet.

What This Means for the Industry

GPT-5.4's computer use capabilities represent a meaningful advance in AI agents. The ability to navigate desktop environments through screenshots, keyboard, and mouse inputs positions the model for practical automation workflows. With the 1M token context and improved steerability, developers can now build more sophisticated agentic applications that handle complex, multi-step tasks.
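The screenshot, keyboard, and mouse loop described above follows a general observe, decide, act pattern. The sketch below illustrates that pattern only. It does not use OpenAI's API, and every name in it (`Action`, `FakeDesktop`, `run_agent`, `scripted_policy`) is hypothetical; a scripted policy stands in for the model, and a fake desktop records actions instead of executing them.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "click", "type", or "done" (hypothetical action set)
    payload: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real desktop: records actions instead of executing them."""
    log: List[Action] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real environment would return image data; text keeps this runnable.
        return f"screen after {len(self.log)} actions"

    def apply(self, action: Action) -> None:
        self.log.append(action)


def run_agent(env: FakeDesktop,
              policy: Callable[[str], Action],
              max_steps: int = 10) -> List[Action]:
    """Observe the screen, ask the policy for an action, apply it, repeat."""
    for _ in range(max_steps):
        action = policy(env.screenshot())
        if action.kind == "done":
            break
        env.apply(action)
    return env.log


def scripted_policy(observation: str) -> Action:
    # Assumption: in a real agent, a model call would replace this lookup.
    steps = {
        "screen after 0 actions": Action("click", "File menu"),
        "screen after 1 actions": Action("type", "report.txt"),
    }
    return steps.get(observation, Action("done"))


desktop = FakeDesktop()
history = run_agent(desktop, scripted_policy)
for step in history:
    print(step.kind, step.payload)
```

The `max_steps` cap is a design choice worth keeping in any real version of this loop, since an agent that never signals completion would otherwise run indefinitely.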

The model is being positioned for enterprise use cases rather than simple chat interactions, signaling OpenAI's continued push toward agentic AI workflows.

Source: OpenAI Blog