Claude 4 released! The world's most powerful programming model is here

Written by
Caleb Hayes
Updated on:June-28th-2025
Recommendation

A revolutionary breakthrough in the field of AI programming, Claude 4 redefines the boundaries of programming capabilities.

Core content:
1. Two new models in the Claude 4 family, with comprehensive performance improvements
2. Claude Opus 4 achieves amazing results in authoritative programming benchmarks
3. Positive feedback and expectations from industry leaders and users on Claude 4

Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)

Anthropic dropped a bombshell late at night!

The Claude 4 family was officially released, including two versions, Claude Opus 4 and Claude Sonnet 4 , which directly raised the ceiling of programming AI.

These two models are designed for different scenarios, but they have one thing in common - their coding capabilities far outshine those of competing products !

This major version upgrade from 3.x to 4 will naturally not be a small-scale change, but will completely redefine the capabilities of AI programming .

The upgrade of Claude 4 is not a simple performance improvement, but brings a series of new functions and breakthrough capabilities.

Let’s see how powerful these two models are!

The world's most powerful programming model is born

Claude Opus 4 is officially called "the world's best programming model" by Anthropic!

On SWE-bench Verified, the most authoritative programming benchmark test recognized by the industry, Opus 4 scored 72.5%!

What is the concept?

This benchmark is specifically designed to measure a model’s ability to solve real software engineering problems, meaning it can solve complex problems found in real software development.

What's even more terrifying is that Claude Opus 4 scored 43.2% on Terminal-bench , which means Opus 4 can work continuously for hours and maintain focus and high performance on long and complex tasks.

Imagine an AI that can independently refactor your entire code base and work for 7 hours without breaking a sweat - this is no longer science fiction.

Rakuten verified this by having Opus 4 independently complete a highly demanding open source reconstruction task that took a full 7 hours with stable performance .

More than just programming

Although Claude Sonnet 4 is positioned as a "daily use version", its strength should not be underestimated.

Compared with the previous generation Sonnet 3.7, Sonnet 4 has significantly improved programming and reasoning capabilities , scoring 72.7% on SWE-bench, surpassing most models on the market.

Both versions use a hybrid architecture and provide two working modes: near-instant response and deep thinking reasoning. When encountering complex problems, the model automatically switches to "thinking mode" and performs in-depth analysis like humans .

What’s even more amazing is that the model can also call on tools during the thinking process , such as online search, forming a “think-search-think again” work cycle.

This really takes the model's capabilities to a new dimension!

Industry leaders collectively praised

Some companies using Claude have given positive feedback:

Cursor  directly stated that Opus 4 is a major breakthrough in the field of programming and has made a qualitative leap in understanding complex code bases.

GitHub  announced that it will use Sonnet 4 as the base model for GitHub Copilot.

Replit  reports that the model has made “dramatic improvements” in handling complex changes across multiple files.

Rakuten verified its capabilities by allowing it to independently reconstruct the open source code and maintain stable performance while running for 7 hours in a row!

Judging from the reactions of X users, netizens are also very excited:

Christian Yun (@christiankyun) directly compared this release to a major event in the gaming industry:

The GTA6 of the AI ​​world is finally here!

kitze (@thekitze) can't wait to refactor the React components using Sonnet 4:

Can't wait to refactor my React components using Sonnet 4 to reinvent the universe from scratch

However, there are also voices of doubt.

voicesz (@voicesz_) expressed skepticism about the benchmark results:

These guys want us to believe it's not as good as o3 at high school math, but better at programming? Wake up

Hybrid Model, Two Swords Together

Claude Opus 4 and Sonnet 4 are hybrid models that offer two operating modes:

  1. Near-instant response

  2. Prolong thinking time and make deeper reasoning

Both models can also switch between reasoning and the use of tools — such as web searches — to improve the quality of responses.

what does that mean?

In short, Claude can quickly answer simple questions while also handling complex tasks that require careful consideration .

Most impressively, both models were able to use tools in parallel , follow instructions more accurately , and when developers granted access to local files, they demonstrated significantly improved memory , able to extract and preserve key facts, and maintain continuity during long interactions.

GitHub says Claude Sonnet 4 " performs well " in agent scenarios and is using it as the base model for new coded agents in GitHub Copilot.

iGent reports that Sonnet 4 excels in autonomous multi-function application development, with significant improvements in problem solving and code base navigation capabilities— navigation errors reduced from 20% to nearly zero !


Claude Code is now live

Along with the release of the model, Claude Code also moved from a research preview version to officially available .

Now developers can use Claude directly in the terminal, VS Code, and JetBrains IDEs. AI modification suggestions will be displayed directly in your code files , enabling a seamless pair programming experience.

Even more exciting is that Claude Code now supports GitHub Actions background tasks, and you can even @Claude Code in PRs to respond to code review feedback or fix CI errors.

Memory ability is greatly improved

The most surprising thing is the model's memory ability .

The Claude 4 model maintains continuous focus and full context through deep integration.

The Anthropic team also shared their experience spending a full day with Claude, conducting extended research, building application prototypes, and orchestrating complex project plans.

When developers provide Claude with local file access rights, Opus 4 will actively create and maintain "memory files" to store key information . This means that AI will be able to maintain continuity and accumulate experience and knowledge in long-term tasks.

The official showed an interesting example: When Opus 4 was playing the "Pokemon" game, it created a "navigation guide" to record the game progress and strategy .

This memory ability enables AI to truly learn and accumulate knowledge, so that every conversation no longer has to start from scratch.

More importantly, the models also improved in preventing the use of shortcuts or loopholes to complete tasks. For both models, Claude 4 was 65% less likely than Sonnet 3.7 to perform proxy tasks that were prone to shortcuts and loopholes .


Immediately available, no change in price

The Claude 4 series is available today , and Sonnet 4 is even available to free users.

Paid users can use both versions and extended thinking capabilities. API pricing remains the same: Opus 4 is $15/75 per million tokens (input/output), Sonnet 4 is $3/15 .

The model is now available on Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI.

The arms race in AI programming has entered a new phase again!

A new round of AI war has begun.

The cycle of competition never ends. 

Almost every month, there is at least one new  " most powerful model on the planet "  crowned.

People applauded, compared, and waited for the next one.

This is a race with no end!

The last time was O3, the time before was Gemini 2.5 pro, this time is Claude 4…

Who will it be next time ?