Gemini 2.0 Flash Thinking: Google's big move! An AI that shows its thinking in real time is here. Does it reason better than OpenAI's models?

Written by
Audrey Miles
Updated on: July 16, 2025
Recommendation

Google's latest AI model, Gemini 2.0 Flash Thinking, is a reasoning model that shows its thinking in real time, a notable step forward for the field of AI.

Core content:
1. The launch background and importance of Gemini 2.0 Flash Thinking
2. The difference between Gemini 2.0 and traditional AI models: showing the reasoning process and enhancing transparency
3. Gemini 2.0's multi-modal input support and ultra-large context window features

Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)

.01

Overview
Google recently launched the new Gemini 2.0 Flash Thinking Experimental and expanded its availability: previously limited to Google AI Studio, the Gemini API, and Vertex AI, it is now also open to users of the Gemini app. This is an important milestone for artificial intelligence. It brings more capable reasoning models to a wider audience and changes the way we interact with AI.
.02
What is Gemini 2.0 Flash Thinking?
Gemini 2.0 Flash Thinking is a reasoning-focused AI model from Google. Unlike traditional language models, it does not simply provide an answer; it shows users the reasoning behind it. It lays out its thinking step by step, evaluates different options, and explains how it reached its conclusion.
The core of reasoning: demonstrating the thought process
Compared with OpenAI's o-series and DeepSeek's R-series models, the biggest advantages of Gemini 2.0 Flash Thinking are its speed and transparency. Traditional AI models answer questions by generating fluent text, while Flash Thinking works like a smart assistant that shows you how it thinks at each step, how it makes decisions, and which alternatives it considered.
It not only answers questions but also lets you see every step of its solution. This transparency greatly strengthens users' trust in its reasoning.
.03
2.0 Flash Thinking: Combination of Multimodality and Large-Scale Reasoning
Multimodal input support
Gemini 2.0 Flash Thinking is a multimodal model, which means it can process images as well as text. In simple terms, it can handle complex tasks that involve images, such as interpreting charts, analyzing complex documents, and extracting information from pictures. This opens up new possibilities for tasks that depend on visual information, and Flash Thinking is particularly strong at chart analysis and the interpretation of complex documents.
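To make the idea of mixed text and image input concrete, here is a minimal sketch of how a question and a chart image might be packaged together. It assumes the JSON convention used by the Gemini REST API, where a request carries a list of "parts" and images are sent base64-encoded under `inline_data`. The helper name `build_multimodal_parts` and the placeholder image bytes are hypothetical and only serve the illustration.

```python
import base64


def build_multimodal_parts(question: str, image_bytes: bytes,
                           mime_type: str = "image/png") -> list:
    """Pair a text question with a base64-encoded image in one list of parts."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return [
        {"text": question},
        {"inline_data": {"mime_type": mime_type, "data": encoded}},
    ]


# Placeholder bytes stand in for a real chart image read from disk.
parts = build_multimodal_parts("What trend does this chart show?", b"fake-image-bytes")
print(parts[1]["inline_data"]["mime_type"])
```

In a real request, the bytes would come from an actual image file and the parts list would be placed inside the request body.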
Extra-large context window
A particularly exciting feature is the very large context window of Gemini 2.0 Flash Thinking: it accepts inputs of up to 1 million tokens and generates outputs of up to 64,000 tokens. This lets it handle large amounts of data and stay coherent across long books, research papers, or long conversations.
This large context capacity means that it can reason over more information at once, and it reduces the need for users to re-enter context.
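To get a feel for what these limits mean in practice, the sketch below estimates whether a text fits in the input window. The two limits are the figures quoted above. The ratio of roughly 4 characters per token is an assumption (a common rule of thumb for English text, not the model's real tokenizer), and the function names are hypothetical.

```python
# Limits quoted in the article for Gemini 2.0 Flash Thinking.
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 64_000

# Assumption: about 4 characters per token, a rough rule of thumb for English.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate (ceiling of len(text) / CHARS_PER_TOKEN)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, limit: int = MAX_INPUT_TOKENS) -> bool:
    """Check whether the estimated token count stays within the input limit."""
    return estimate_tokens(text) <= limit


book = "word " * 150_000  # 750,000 characters of dummy text
print(estimate_tokens(book), fits_in_context(book))
```

Under this rough estimate, a 750,000-character text comes to about 187,500 tokens, well inside the window. An exact count would require the model's own tokenizer.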
Knowledge Cutoff and Tool Integration
However, there is one limitation to note: Gemini 2.0 Flash Thinking has a knowledge cutoff of June 2024. It has no built-in knowledge of events after that date, so in some cases it may "hallucinate", that is, produce wrong reasoning or assumptions. For example, it may guess the timing of an event incorrectly or have only a shallow understanding of newer technologies and updates.
To address this, Google has integrated YouTube, Maps, and Search into Flash Thinking. These tools give users access to more up-to-date information, although that information can itself be inaccurate. For example, when I asked for the release date of Gemini 2.0 Flash Thinking, the model retrieved the information through the Search tool, but because the search results contained an incorrect date (February 6, 2025), it reported the wrong release date.
Automatic tool selection
Gemini 2.0 Flash Thinking can also automatically select the most appropriate tool for the user's question. For example, when I asked for the best driving route from Bucharest to London, it automatically chose the Google Maps tool. This automatic selection makes the model more efficient and accurate across different types of questions.
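The article does not describe how the model actually chooses a tool, so the following is only a toy illustration of the idea of routing a question to a tool. It uses simple keyword matching, which is an assumption made for clarity and not Google's mechanism; the keyword lists and the function `pick_tool` are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical keyword lists; a real model decides from the full meaning of the query.
TOOL_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "maps": ("route", "driving", "directions"),
    "youtube": ("video", "watch"),
    "search": ("latest", "release date", "news"),
}


def pick_tool(question: str) -> str:
    """Return the first tool whose keywords appear in the question, else 'none'."""
    lowered = question.lower()
    for tool, keywords in TOOL_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return tool
    return "none"


print(pick_tool("What is the best driving route from Bucharest to London?"))
```

For the driving-route question from the example above, this toy router picks "maps". A question that matches no keyword falls through to "none", meaning the model would answer from its own knowledge.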
.04
Flash Thinking Benchmark Performance
Breakthroughs in mathematics and science
Gemini 2.0 Flash Thinking performs significantly better in several key areas, especially mathematics, science, and multimodal reasoning. On the AIME 2024 (mathematics) benchmark, it scored 73.3%, a large improvement over its predecessor (35.5%). It still trails OpenAI's o3-mini (87.3%), but this is a strong result.
On the GPQA Diamond (science) benchmark, Flash Thinking scored 74.2%, a significant improvement over the previous version (58.6%). That puts it ahead of DeepSeek's R1 (71.5%) and just behind OpenAI's o1 (75.7%), which shows that it is competitive in science.
On the MMMU (multimodal reasoning) benchmark, Gemini 2.0 Flash Thinking scored 75.4%, again surpassing its predecessor and showing its strength in multimodal reasoning.
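The size of these improvements is easier to compare when expressed in percentage points. The short sketch below does that arithmetic using only the scores quoted above; MMMU is left out because no predecessor score is given for it. The helper name `gain_in_points` is hypothetical.

```python
# Scores (percent) as quoted in the article: (Flash Thinking, predecessor).
SCORES = {
    "AIME 2024": (73.3, 35.5),
    "GPQA Diamond": (74.2, 58.6),
}


def gain_in_points(new: float, old: float) -> float:
    """Absolute improvement in percentage points, rounded to one decimal."""
    return round(new - old, 1)


for benchmark, (new, old) in SCORES.items():
    print(f"{benchmark}: +{gain_in_points(new, old)} points")
```

This gives a gain of 37.8 points on AIME 2024 and 15.6 points on GPQA Diamond, so the improvement in mathematics is more than twice the improvement in science.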
Reasoning ability and inference-time compute
Like other reasoning models, Gemini 2.0 Flash Thinking reasons better as inference-time compute increases. Inference-time compute is the amount of computation the model performs after the user asks a question. Given more of it, Gemini 2.0 Flash Thinking can complete complex reasoning tasks more accurately.
.05
How to Use Gemini 2.0 Flash Thinking
How to get access
Google currently provides users with access to Gemini 2.0 Flash Thinking through multiple platforms:
  • Gemini app (mobile and web): Users can try Flash Thinking for free directly through the Gemini web application or mobile app.
  • Google AI Studio: A web platform better suited to advanced users, who can explore the model's reasoning ability further by adjusting model parameters and testing complex queries.
  • Gemini API: Developers can integrate Flash Thinking into their own applications through the Gemini API, which offers more customization and flexibility.
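For the API route, a request might look roughly like the sketch below, which uses only the Python standard library and the Gemini REST `generateContent` endpoint. The model identifier `gemini-2.0-flash-thinking-exp` is an assumption: experimental model names change, so check the current model list before relying on it. `YOUR_API_KEY` is a placeholder, and the helper names `build_request` and `send` are hypothetical.

```python
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"
# Assumption: experimental model identifier; verify it against the current model list.
MODEL = "gemini-2.0-flash-thinking-exp"


def build_request(prompt: str, api_key: str, model: str = MODEL) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for the Gemini REST API."""
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/{model}:generateContent",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs network and a valid key)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request("Explain step by step why 17 is prime.", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the example runnable without credentials. With a real key, calling `send(request)` would return the model's JSON response.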
.06
Conclusion: The future of reasoning and the bright future of AI
Gemini 2.0 Flash Thinking is an important step for Google in the field of reasoning AI. By showing its thinking process and reasoning in a structured way, Flash Thinking improves both the quality of interaction between AI and humans and the accuracy of reasoning tasks. It still faces challenges, such as occasional inaccuracies and over-reliance on tools, but it sets a benchmark for future reasoning models.
As Flash Thinking continues to improve and competitors catch up, we can expect a more intelligent, accurate, and transparent AI reasoning experience. If you are building AI products, or are simply curious about this field, try Gemini 2.0 Flash Thinking yourself. Its performance in reasoning, scientific computing, and multimodal tasks may surprise you.