Claude quietly evolves: a new "think" tool lets the AI "stop and think" like a human

Written by

Jasper Cole

Updated on: July 9, 2025

Claude has a new feature called the "think tool"

In simple terms, this tool is like adding a "pause button" and "scratch paper" to Claude, creating a dedicated space for structured thinking during complex tasks.

This is different from the "extended thinking" Anthropic introduced earlier. Extended thinking is the reasoning and iteration the model does before it starts generating an answer. The "think tool" lets Claude pause at any point while generating an answer, review the information it already has, and decide whether further analysis is needed.

What is this "thinking" good for?

According to Anthropic, this approach is particularly well suited to scenarios that involve complex tool calls. For example:

When information is overloaded: Claude has to process the results returned by multiple tools, and too much information makes it easy to lose track. The "think tool" lets it slow down and analyze the results carefully.

When the rules are complicated: When faced with detailed policies or guidelines, Claude needs to check them one by one to ensure compliance. The "think tool" gives it a place to work through the applicable rules.

When every step matters: In a multi-step task, each step builds on the previous one, so a mistake is costly. The "think tool" lets Claude stop at key points and assess the risks.

Technical details: JSON configuration, easy to use

Anthropic also provides a JSON configuration example for the "think tool" that developers can integrate into their own applications. The configuration is concise: it defines the tool's name, its description, and a single required input parameter (a string named "thought").

{
  "name": "think",
  "description": "Use the tool to think about something. It will not obtain new information or change the database, but just append the thought to the log. Use it when complex reasoning or some cache memory is needed.",
  "input_schema": {
    "type": "object",
    "properties": {
      "thought": {
        "type": "string",
        "description": "A thought to think about."
      }
    },
    "required": ["thought"]
  }
}
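
Because the tool fetches nothing and changes nothing, the application side can be very small. The following is a minimal sketch, not Anthropic's implementation: it assumes the model's request arrives as a "tool_use" content block (with "id", "name", and "input" fields) and that the reply is a "tool_result" block carrying the matching "tool_use_id", as in Anthropic's tool use message format. The names handle_tool_use and thought_log, the acknowledgement string, and the sample block are hypothetical.

```python
import json

# Tool definition, mirroring the JSON configuration above.
THINK_TOOL = {
    "name": "think",
    "description": (
        "Use the tool to think about something. It will not obtain new "
        "information or change the database, but just append the thought "
        "to the log. Use it when complex reasoning or some cache memory "
        "is needed."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {
                "type": "string",
                "description": "A thought to think about.",
            }
        },
        "required": ["thought"],
    },
}

# Hypothetical in-memory log; a real application might write to a file
# or a tracing system instead.
thought_log = []


def handle_tool_use(block):
    """Record the thought and build the tool_result block to send back."""
    if block["name"] != THINK_TOOL["name"]:
        raise ValueError(f"unexpected tool: {block['name']}")
    thought_log.append(block["input"]["thought"])
    # The tool has no side effects, so the result is only an acknowledgement.
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],
        "content": "Thought logged.",
    }


# Simulated tool_use block (assumed shape; the id and thought are made up).
example_block = {
    "type": "tool_use",
    "id": "toolu_example_1",
    "name": "think",
    "input": {"thought": "Check the fare rules before changing the booking."},
}
print(json.dumps(handle_tool_use(example_block), indent=2))
```

In a real integration, THINK_TOOL would be included in the list of tools sent with each request, and the returned block would go back to the model in the next user message so it can continue its answer.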

Effect measurement: significant performance improvement

To verify the effectiveness of the "think tool", Anthropic used two benchmarks: τ-Bench (tau-bench) and SWE-Bench.

τ-Bench (Customer Service Scenario): In simulated customer service conversations, the "think tool" combined with an optimized prompt delivered a 54% relative improvement in the pass^1 metric in the Airline domain. The Retail domain also improved, though by a smaller margin.
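
A percentage quoted on a pass-rate metric can mean either a relative gain or a gain in percentage points, and the two differ a lot. The short sketch below shows the distinction. The two scores are illustrative assumptions, chosen so that the relative gain comes out near 54%; they are not figures taken from this article.

```python
def relative_gain(baseline, improved):
    """Improvement as a fraction of the baseline score."""
    return (improved - baseline) / baseline


def absolute_gain_points(baseline, improved):
    """Improvement in percentage points, for scores on a 0-to-1 scale."""
    return (improved - baseline) * 100


# Assumed scores for illustration only.
baseline_score = 0.370
improved_score = 0.570

rel = relative_gain(baseline_score, improved_score)
pts = absolute_gain_points(baseline_score, improved_score)
print(f"relative gain: {rel:.0%}")
print(f"absolute gain: {pts:.0f} percentage points")
```

With these assumed scores the relative gain is about 54%, while the absolute gain is 20 percentage points, so it matters which of the two a headline number refers to.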

SWE-Bench (Software Engineering Scenario): On software engineering tasks, adding the "think tool" improved performance by 1.6% on average.

When to use it and when not to use it

Although the “think tool” is useful, it is not a panacea. Anthropic also gives some suggestions for use:

Recommended usage scenarios:

• Tool output analysis
• Policy-heavy environments that require compliance checks
• Sequential decision-making tasks

Not recommended scenarios:

• Non-sequential tool calls
• Simple instruction following

Last words

Anthropic's research shows that the "think tool" can significantly improve the performance of Claude 3.7 Sonnet on complex tasks that require policy compliance and reasoning over long chains of tool calls. The tool is not a one-size-fits-all solution, but it provides substantial benefits for the right use cases with minimal implementation complexity.