Claude 4 benchmarks show improvements, but context is still 200K

Today, OpenAI rival Anthropic announced its Claude 4 models, which score significantly better than Claude 3 in benchmarks, but we're left disappointed that the context window is still limited to 200,000 tokens.

In a blog post, Anthropic said Claude Opus 4 is the company's most powerful model and described it as the best coding model in the industry.

For example, Claude Opus 4 scored 72.5 percent on SWE-bench (short for Software Engineering Benchmark) and 43.2 percent on Terminal-bench.

“It delivers sustained performance on long-running tasks that require focused effort and thousands of steps, with the ability to work continuously for several hours, dramatically outperforming all Sonnet models and significantly expanding what AI agents can accomplish,” Anthropic noted.

While benchmarks put Claude Sonnet 4 and Claude Opus 4 ahead of their predecessors and of competitors like Gemini 2.5 Pro in coding, we're still concerned about the models' 200,000-token context window limit.

This could be one of the reasons why Claude 4 models excel at coding and complex problem-solving tasks in these benchmarks: the benchmarks do not test the models against a large context.

For comparison, Google's Gemini 2.5 Pro ships with a 1 million-token context window, and support for a 2 million-token context window is also in the works.

OpenAI's GPT-4.1 models also offer a context window of up to 1 million tokens.

| Model | Description | Input | Prompt Caching Write | Prompt Caching Read | Output | Context Window | Batch Processing Discount |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4 | Most intelligent model for complex tasks | $15 / MTok | $18.75 / MTok | $1.50 / MTok | $75 / MTok | 200K | 50% discount with batch processing |
| Claude Sonnet 4 | Optimal balance of intelligence, cost, and speed | $3 / MTok | $3.75 / MTok | $0.30 / MTok | $15 / MTok | 200K | 50% discount with batch processing |

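
To put the pricing in perspective, here is a rough sketch of how the per-request cost could be estimated from the figures above. The function and variable names are hypothetical, prompt caching is ignored, and two simplifications are assumed: the 50% batch discount is applied to the whole request, and only the input is checked against the 200K window.

```python
# Prices in USD per million tokens (MTok), copied from the pricing table.
PRICES = {
    "Claude Opus 4": {"input": 15.00, "output": 75.00},
    "Claude Sonnet 4": {"input": 3.00, "output": 15.00},
}

CONTEXT_WINDOW = 200_000  # tokens, the same for both models
BATCH_DISCOUNT = 0.50     # assumption: applies to input and output alike


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> float:
    """Return an estimated request cost in USD (hypothetical helper)."""
    # Simplification: only the input is compared against the window.
    if input_tokens > CONTEXT_WINDOW:
        raise ValueError(
            f"{input_tokens} input tokens exceed the {CONTEXT_WINDOW}-token window"
        )
    price = PRICES[model]
    cost = (input_tokens * price["input"]
            + output_tokens * price["output"]) / 1_000_000
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return round(cost, 4)


# A 150,000-token prompt with a 4,000-token reply:
print(estimate_cost("Claude Opus 4", 150_000, 4_000))              # 2.55
print(estimate_cost("Claude Sonnet 4", 150_000, 4_000))            # 0.51
print(estimate_cost("Claude Opus 4", 150_000, 4_000, batch=True))  # 1.275
```

In this sketch, a prompt larger than 200,000 tokens raises an error instead of returning a price, which mirrors the limit discussed in this article: a project that does not fit in the window cannot be sent in a single request at any cost.
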
Claude is still lagging behind the competition when it comes to the context window, which is important in large projects.
