OpenAI and Anthropic unveiled new flagship AI models in their respective product lines within an hour of each other on Thursday, highlighting intensifying competition among leading developers to dominate enterprise software and advanced coding tools.
Anthropic announced Claude Opus 4.6, touting gains in long-context reasoning and agent-based workflows, while OpenAI shortly after released GPT-5.3 Codex, a model optimized for agentic coding and software development.
The near-simultaneous launches underscored how quickly rivals are iterating as companies race to secure long-term contracts with large corporate customers.
Benchmark results suggested the two models are optimized for different strengths.
Claude Opus 4.6 showed stronger performance on tasks tied to legal and financial reasoning, while GPT-5.3 Codex performed better on agentic coding tests and efficiency metrics, according to figures released by both companies.
The releases come as investors reassess the outlook for traditional software providers, with shares of several information and professional-services firms falling this week amid concerns that AI-native platforms could erode demand for established enterprise tools.
Anthropic said that Claude Opus 4.6 delivered gains in long-context reasoning and professional tasks, citing a 1-million-token context window and a 76% score on MRCR v2, a benchmark for complex information retrieval.
The company said the model also outperformed earlier versions on finance and legal tasks and introduced “agent teams” that allow multiple AI agents to work in parallel on coding and documentation.
OpenAI released GPT-5.3 Codex shortly afterward, positioning it as a model optimized for agentic coding and research.
OpenAI said Codex scored 77.3% on Terminal-Bench 2.0, an agentic coding benchmark on which Claude Opus 4.6 scored 65.4%, and that it completed tasks faster while using fewer tokens.
OpenAI also said early versions of Codex were used internally to help debug training and manage deployment, marking one of the first times a model played a direct role in accelerating its own development.
Taken together, the results suggest neither model holds a clear overall lead, with performance advantages depending on whether enterprises prioritize professional reasoning or autonomous software development.
Google is also expected to roll out updates to its Gemini models in the coming months, while other AI developers, including DeepSeek, are preparing new releases, adding to the pace of competition in the sector.
Still, benchmark results alone are unlikely to determine market leadership, as broader adoption and enterprise deployment increasingly shape the competitive landscape.
Whether agent-based workflows become a core component of economic activity remains to be seen, but OpenAI and Anthropic are banking on it.