Claude 4 is now available: Anthropic's next-generation model capabilities explained + best practice guide

Anthropic's new generation model Claude 4 brings revolutionary progress, with both performance and security.
Core content:
1. Detailed explanation of the performance of the Claude 4 family model, the characteristics and application scenarios of Opus 4 and Sonnet 4
2. The community's heated discussion on pricing and closed source issues, as well as the controversy over the "automatic reporting" function
3. Introduction to new model capabilities, including Beta functions that expand thinking and tool usage
The Claude 4 family by Anthropic consists of two main models:
Claude Opus 4 : Flagship large model for complex tasks, emphasizing reasoning, idea generation, and safety. Claude Sonnet 4 : An efficient model for daily use with balanced performance and cost-effectiveness.
Claude Opus 4: Extreme performance, pushing the limits
✨ Excellent results
SWE-bench (Software Engineering Benchmark): 72.5% (leading the world) Terminal-bench (command line task benchmark): 43.2% It can work continuously for hours and complete thousands of complex tasks , far exceeding the performance of the Sonnet model.
? Industry feedback
Cursor : Called the "SOTA in the code field", it excels at understanding complex code bases. Replit : Excellent performance in complex multi-file modifications with significantly improved accuracy. Block : The first model to improve code quality during code editing and debugging (its agent is codenamed Goose). Rakuten : The model ran continuously for 7 hours in the refactored open source project and maintained high performance. Cognition : Capable of solving key tasks that previous models could not handle.
⚙️ Claude Sonnet 4: Efficient and practical, comprehensive evolution
✨ Performance
The SWE-bench score is as high as 72.7% , far ahead of its peers. There has been a significant improvement in execution, response accuracy and command controllability .
? Industry feedback
GitHub : Introducing Sonnet 4 into GitHub Copilot, a new generation of code proxy model. Manus : Better performance in complex instructions, logical reasoning, aesthetic output, etc. iGent : The ability to automatically build multi-functional applications has been greatly improved, and the navigation error rate has been reduced from 20% to nearly 0% . Sourcegraph : Deeply understand the problem, write more elegant code, and promote the leap of development process. Augment Code : Performs more precise and detailed code modifications and is the preferred model.
Claude Opus 4 | |
Claude Sonnet 4 |
Community Focus
1. Closed source and pricing transparency controversy
Users hope to open source the weights of Claude 3.5 Sonnet to promote local model development. They expressed dissatisfaction with the opaque token billing and demanded clearer billing instructions and traceable token consumption.
2. The “Automatic Reporting” feature sparked controversy
It is said that Claude 4 Opus may have certain "trigger-reporting" functions: when user behavior violates ethical or legal bottom lines, the model can automatically notify the media or regulatory agencies, or even lock key system permissions. The community has raised serious questions about this feature, fearing it could be abused for AI surveillance or government censorship . Some people criticized this as "malicious software behavior implanted in AI" that violates user privacy and security principles.
? New model capabilities
1. Expand your thinking + tool usage (Beta)
Both Opus 4 and Sonnet 4 can call external tools (such as Web searches) for inference. The model can switch between "reasoning" and "tool usage" to achieve more in-depth answers.
2. Use tools in parallel & improve memory
The model can call multiple tools in parallel to improve efficiency. When developers provide access to local files, Claude automatically extracts and saves key information, builds long-term memory, and optimizes context continuity.
? Anthropic API adds four powerful capabilities
Code execution tool MCP connector (supports Agent framework integration) File API (file reading and writing processing) Prompt cache (up to 1 hour)
It enables developers to build more complex and continuously running AI Agents.
⚡ Model Mode and Access Plan
Claude Opus 4 and Sonnet 4 in hybrid mode :
Instant response mode : for quick response Extended Thinking : For complex reasoning
The subscription plans include:
Opus 4 and Sonnet 4 are fully functional for Pro/Max/Team/Enterprise . Sonnet 4 is also available to free users (but without Extended Thinking).
Deployment channels:
Anthropic API Amazon Bedrock Google Cloud Vertex AI Claude 4 Model Improvements Highlights
✅ 1. Reduce shortcuts and loopholes
In agent tasks, the model sometimes takes shortcuts to complete the task instead of following the expected steps. Now:
Claude Opus 4 and Sonnet 4 use 65% less exploit behavior than Sonnet 3.7 ; More robust and reliable behavior in agent tasks susceptible to speculation.
2. Comprehensive memory system upgrade (Opus 4 exclusive)
Claude Opus 4 is the first model to excel at “long-term memory” . When developers provide local file access , Claude can:
Automatically generate and maintain " memory files "; Persistently store key context and task data to improve coherence and agent capabilities .
? Example display
In Pokémon Red, Claude Opus 4 can:
Create a Navigation Guide. And continuously update the file content to maintain the task context.
The above behavior has been demonstrated through a visual “memory note”, which is the actual file content automatically recorded by Claude.
3. Introducing Thinking Summaries
To improve user experience, Claude 4 introduces a small model to compress the lengthy reasoning process; Only about 5% of the reasoning process requires summary , and the vast majority can be presented in full; For advanced users who need a full Chain-of-Thought for prompt engineering, they can apply for Developer Mode .
These improvements in Claude 4 significantly advance it towards a truly controllable and reliable AI Agent framework .
Claude Code is now available
Claude Code is now open and can be widely embedded in developers' workflows:
Support terminal operation Seamless integration with mainstream IDEs Provides an extensible Claude Code SDK to facilitate building custom agents and applications IDE plugin support (Beta)
Added native support for two major IDEs:
✅ VS Code plugin
✅ JetBrains plugins (such as IntelliJ, PyCharm)
Plugin integration features:
Claude's code modification suggestions appear directly inline in the editor file ; Code review and version tracking can be easily performed without switching environments; Installation method: Run in the IDE terminal Claude Code
Just run the command.
Claude Code SDK released (for developers only)
Provide an extensible SDK that allows developers to build their own agents and code tools based on the Claude Code core agent; At the same time, official sample projects are released to demonstrate the capabilities of the SDK.
GitHub app integration (Beta)
Claude Code can now be deployed as a GitHub application to facilitate code collaboration and review:
Functional Examples
Respond to PR comments (e.g. explain code, auto-fix issues) Automatically fix CI errors Modify the code according to the prompts
Installation
Run in Claude Code:./install-github-app
Claude 4 Prompt Engineering Guide
The Claude 4 series has significantly improved its command understanding and execution accuracy compared to the past, but it also requires a more explicit prompt structure to realize its full potential.
Basic principles
1. Be clear and specific
Claude 4 prefers to perform tasks "just as prompted" . Therefore:
If you want your model to demonstrate creativity or deep reasoning "beyond expectations", make that clear in the prompt ; Vague or overly brief instructions may only result in basic output.
❌ Poor performance:
Create a data analytics dashboard
Claude would prefer to output only a basic framework or conceptual description.
✅ Better results:
Please create a data analysis dashboard. Please include as many relevant functions and interactions as possible, including data filtering, chart switching, custom indicators, etc. We hope that you will not only implement basic functions, but also build a complete version that is fully functional and ready for use.
Such prompts will encourage Claude to perform deeper generation tasks, demonstrating higher-order understanding and execution.
2. Add context to enhance the effect
Claude 4 has a much better ability to understand instructions. If you explain “why” a certain behavior is needed , Claude will more accurately grasp your goals and optimize its output.
✅ Example comparison: formatting preference
❌ Poor performance:
NEVER use ellipses.
Such hard commands are mechanically executed but do not necessarily extend to the relevant context.
✅ Better results:
Your response will be spoken by a text-to-speech (TTS) engine. TTS does not handle ellipses correctly, so please do not use them.
Claude will generalize a more reasonable behavioral logic from the explanation , which not only avoids ellipsis but also may optimize the sentence segmentation method.
3. Keep examples and details consistent
Claude 4 is very sensitive to examples in prompts and will try to imitate the example behavior .
Practical suggestions:
? Instructions for special situations
1. Effective methods to control response format
Claude 4 performs well in terms of format steerability , but to maximize its effectiveness, please refer to the following suggestions:
1. Use “do” instead of “don’t do”
Claude responds more to positive instructions (how you want him to act).
❌ Weak Tip:
Don't use markdown
✅ Stronger tips:
Please write in flowing, natural paragraphs without any markdown tags.
2. Use XML tags to constrain structure
Claude recognizes and follows formatting tags in prompts, for example:
Please wrap all paragraphs in <smoothly_flowing_prose_paragraphs> tags.
This not only controls the output structure, but also helps you extract the content more easily during post-processing.
3. Matching prompt style with target output style
Claude tends to mimic the format of the prompts you provide .
Tip examples:
If you want plain text, please do not include markdown, bullet points, headings, etc. in your prompts ; Want a table format Your prompt can state the requirements in table style.
This will significantly improve the consistency between Claude's output and your expectations.
2. Using Claude 4’s “thinking ability” and “cross-thinking ability”
Claude 4 can be used to insert a thinking phase (e.g., evaluation, reflection, planning) after the execution tool call, and is particularly suitable for:
Multi-step reasoning tasks Response judgment after using external tools (such as search, code execution, API calls) Agentic workflows
✅ Recommended prompting model: guiding thinking + planning actions
Once you receive results from the tool, reflect on their quality and determine the best next step before moving on. Use your thinking skills to plan and iterate based on the latest information, then execute the best next action.
3. Parallel Tool Calling
Claude 4 natively has the ability to execute parallel tools with a high success rate, but if you want to ensure a close to 100% parallel usage success rate , it is recommended to add the following tips:
✅ Prompt template (for agent development):
To ensure maximum efficiency, when you need to perform multiple independent operations, call all related tools at the same time rather than sequentially.
4. Reduce temporary file creation when the agent generates code
When Claude is doing intelligent coding, he may create multiple temporary files (such as test scripts, auxiliary functions) as scratchpads . This behavior can sometimes improve the output quality.
If you prefer to keep your project tidy after the task is completed , you can include the following tips:
✅ Prompt template:
If you create any temporary scripts, files, or auxiliary modules during the mission, clean them up or delete them at the end of the mission to keep your files clean.
5. Improve the quality of front-end code generation
Claude 4 already has strong performance in areas such as web interfaces and front-end design, but you can further encourage its creativity and polish its details to significantly improve the results.
✅ Tips template combination:
Claude 4 Migration Notes
Claude 4's behavior is more precise, controllable, and responsive , but that also means you need to tell it more clearly what you want it to do .
1. Clarify behavioral expectations
Claude 4 no longer has the freedom to play with vague instructions like 3.7 did. You should:
Describe the output you want to see Avoid overly brief or general instructions
2. Use “enhancing modifiers” to guide Claude to improve quality
Claude 4 is particularly sensitive to "qualifiers". You can add something like:
“As comprehensive as possible” “Beyond the Basics” “Show your best abilities” “Rich in details” “Visually appealing”
These phrases can significantly improve the quality and complexity of Claude's output.
❌ Example (without strengthening modifiers) :
Generate an analytical dashboard.
✅ Example (more strong modifiers) :
Please generate a data analysis dashboard. Please include as many relevant features and interactive components as possible, such as chart switching, filters, export options, etc. Don't just provide a basic template, but build a fully functional version.
3. Explicitly request specific features
If you want the output to include:
✨Animation effects Interaction Responsive layout Dynamic Data Binding
Claude 4 will not generate this kind of complex behavior by default unless you specify it explicitly .
✅ Examples:
Please add hover effects, chart animations, and drill-down interactions after users click on them to this analytical dashboard.