In-depth article | Is AI really "intelligent"? DeepMind's new paper gives a subversive answer

Written by
Silas Grey
Updated on: July 15, 2025
Recommendation

Does AI's "intelligence" really exist? DeepMind's latest paper redefines the concept of "intelligence".

Core content:
1. The DeepMind paper's disruptive take on AI "agency"
2. From the "Turing test" to the "Chinese room": how standards for judging AI intelligence have evolved
3. GPT-4's "understanding" of Chinese and what AI intelligence really means


Do you remember the moment when AlphaGo defeated Lee Sedol? The whole world was shocked by the "intelligence" of artificial intelligence, and people exclaimed that the "era of artificial intelligence" had arrived. However, a recent paper by Google DeepMind puts a big question mark over this "intelligence". The paper, titled "Agency Is Frame-Dependent" [1], argues that the "agency" of AI is not an inherent attribute of the system, but depends on the "frame" in which we observe and evaluate it.

Wait, "agency"? "Frame"? What do these terms mean? Don't worry, that is exactly what this article explores. This DeepMind paper is not only of philosophical interest; it also has profound implications for how we understand AI and the future development of AGI (artificial general intelligence), and it may well upend your view of "intelligence".

Stop obsessing over “objective intelligence”!

We seem to have been searching for an objective, unified standard of AI intelligence. From the "Turing test" to the "Chinese room", thought experiment after thought experiment has tried to uncover whether AI truly "understands". But DeepMind's research tells us: stop obsessing over "objective intelligence"; it doesn't exist!

“Chinese Room” and GPT-4: Does AI really “understand” Chinese?

Let's first review the famous "Chinese Room" thought experiment. Philosopher John Searle imagined a person who knows only English locked in a closed room. At hand is a detailed rule book telling him which Chinese symbols to output in response to the Chinese symbols he receives. Even if he executes these rules perfectly and produces convincing Chinese answers, does he really "understand" Chinese?

Searle argued that this is mere symbol manipulation and has nothing to do with real understanding, because the person has no mechanism for connecting the symbols to real-world experience. He is like a machine, running according to a program without knowing what he is doing.

This thought experiment has also sparked widespread discussion in China. After all, understanding is not just about literal meaning; it also involves imagery, emotion, cultural background, and more.

So, what about GPT-4? It has demonstrated amazing capabilities in various language tasks, and can even generate fluent, coherent, and logical text. In the Chinese context, it can also answer questions fluently, and can even write poetry and code. Does it really "understand" Chinese?

From the perspective of the "Chinese Room", GPT-4 is still a symbol-manipulating system. It has learned the association rules between symbols from massive amounts of text, but those rules remain statistical rather than semantic.

Don't believe it? Try asking GPT-4 to explain what "fire tongs Liu Ming" means. It may well earnestly analyze the literal meanings of "fire tongs" and "Liu Ming" for you, without realizing that the phrase is a homophone for "leave a name before the post catches fire", a joke Chinese netizens use when commenting early on a post they expect to become popular.

For another example, ask GPT-4 to translate a classical Chinese poem. It may render the meaning of each word, yet struggle to convey the poem's mood and charm, because it lacks a grounding in traditional Chinese culture and any resonance with the poet's emotions. It is like a translation machine without a soul: word-for-word correspondence, but the spirit is lost.

The Trap of the “Turing Test”: Could We Have Been “Cheated” by AI All Along?

Since "understanding" is so difficult to define, can we judge whether AI is intelligent through "behavior"? The "Turing test" is such an attempt.

The core idea of the "Turing Test" is that if a machine can communicate with humans and humans cannot distinguish whether it is a machine or a real person, then the machine should be considered intelligent.

However, can AI really pass the "Turing test"? In other words, does AI that passes the "Turing test" really have "intelligence"?

I'm afraid it's not that simple.  In recent years, more and more studies have shown that the "Turing Test" has serious limitations.

On the one hand, AI systems are becoming increasingly adept at “deceiving” humans. In 2014, a chatbot named “Eugene Goostman” managed to convince 33% of the judges that it was a 13-year-old Ukrainian boy.

On the other hand, human perception of AI is also changing. We are increasingly inclined to confuse AI's "behavior" with its "understanding" ability. As long as AI's answers look "like" humans, we tend to think it has intelligence.

What is even more worrying is that in recent years a phenomenon called the "reverse Turing test" has emerged: now it is humans who need to prove that they are not AI!

In one study, researchers asked participants to have a conversation with an unknown person online and determine whether the other party was a human or AI. The results showed that humans were only about 60% accurate in identifying AI. And when humans were asked to prove that they were not AI, their "pass rate" was only 63%!

What does this mean? The threshold of the Turing test is getting lower and lower. It is becoming easier and easier for AI to pass the Turing test, while it is becoming increasingly difficult for humans to prove that they are human. In some cases, humans even have to actively imitate the way AI speaks in order to pass the reverse Turing test, such as deliberately making some spelling mistakes or using some blunt expressions. This is simply an insult to human intelligence! Do we really have to learn from AI how to be "not like AI"?

DeepMind's "soul-searching question": What is AI "agency"?

Faced with the various controversies over AI "intelligence", researchers at DeepMind propose a new perspective: we should focus on the "agency" of AI rather than on "intelligence" in general.

What is "agency"?

DeepMind defines "agency" as "the ability of a system to direct outcomes toward goals."

This definition contains four key elements, none of which can be missing:

  1. Individuality:  The system must have a clear boundary to distinguish it from the external environment, just like human skin, which distinguishes "me" from "non-me".
  2. Source of Action:  The behavior of the system must be autonomous, not completely determined by the external environment. Just like my hand, it is me who makes it move, not the wind.
  3. Normativity:  The system must have a goal, or a set of norms to guide its behavior. Just like playing chess, the goal is to win.
  4. Adaptivity:  The system must be able to adjust its behavior according to changes in the environment to better achieve its goals, just like driving a car, you have to stop at a red light and go around an obstacle.

These four elements together constitute the framework for judging "agency". Only when all four conditions are met at the same time can a system be said to have agency.
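To make the checklist concrete, here is a minimal sketch in Python (the class and field names are my own illustration, not code or terminology from the paper): a system counts as an agent only if all four checks pass in a given assessment.

```python
from dataclasses import dataclass

@dataclass
class AgencyAssessment:
    """One observer's judgment of a system against the four elements.

    Field names are illustrative; they are not taken from the DeepMind paper.
    """
    individuality: bool     # clear boundary separating the system from its environment?
    source_of_action: bool  # behavior originates from the system, not fully dictated from outside?
    normativity: bool       # a goal or set of norms guides its behavior?
    adaptivity: bool        # it adjusts its behavior to environmental changes in pursuit of the goal?

    def has_agency(self) -> bool:
        # All four elements must hold at the same time.
        return all([self.individuality, self.source_of_action,
                    self.normativity, self.adaptivity])

# Example: one observer's (debatable) verdict on a simple thermostat.
thermostat = AgencyAssessment(individuality=True, source_of_action=True,
                              normativity=True, adaptivity=True)
print(thermostat.has_agency())  # True under this observer's judgment
```

The point is not the booleans themselves but that each one is a judgment call, and that is exactly where the "frame" comes in.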

“Framework”: The “invisible lens” that determines AI’s “intelligence”

However, the DeepMind researchers further point out that the judgment of each of these four elements depends on a "frame".

If the four elements of "intelligent entity" are "hardware", then "framework" is "software". It is the "perspective" or "reference system" from which we observe and evaluate AI. It determines how we view AI and how we define the "intelligence" of AI.

More specifically, a "frame" consists of the following four elements:

  • Boundary Definition:  How do we define the boundaries of an AI system? Is it the entire system or a certain module within it?
    • For example, we can view an autonomous vehicle as a whole system (including all its sensors, computing units, actuators, etc.), or we can focus on just one of its modules, such as the perception module (responsible for identifying road conditions, pedestrians, etc.) or the decision-making module (responsible for planning routes, controlling speed, etc.). Different boundary definitions will affect our judgment of the system's "individuality".
  • Causal Variables:  To what do we attribute the behavior of the AI? Is it the internal algorithm, or the external data?
    • Example:  We can attribute the behavior of self-driving cars to the decisions of the onboard computer (internal algorithm), or to external factors such as training data, road conditions, and traffic rules. Different choices of causal variables will affect our judgment of the "source of action" of the system.
  • Goal Identification:  How do we determine what the AI’s goal is? Is it assigned by the designer or generated autonomously by the AI?
    • For example,  we can think that the goal of an autonomous vehicle is to arrive at the destination safely, or we can think that its goal is to obey traffic rules, or to maximize fuel efficiency, or even to "avoid accidents." Different goal recognition principles will affect our judgment of the "normativity" of the system.
  • Adaptivity Criteria:  How do we measure the adaptability of an AI? Is it based on its performance on a specific task, or its ability to generalize across different environments?
    • For example,  we can evaluate the adaptability of an autonomous vehicle by its performance in different road conditions (urban roads, highways, country roads, etc.), or by its learning speed in a simulated environment. Different adaptability criteria will affect our judgment of the "adaptability" of the system.

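As a hedged sketch (the names are hypothetical, not from the paper), these four choices can be written down as a small data structure; filling in its fields is precisely what it means to pick a "frame":

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """One observer's frame: the four choices that shape an agency judgment (illustrative names)."""
    boundary: str              # Boundary definition: what counts as "the system"?
    causal_variables: str      # Causal variables: what do we attribute its behavior to?
    goal: str                  # Goal identification: which goal do we credit it with pursuing?
    adaptivity_criterion: str  # Adaptivity criterion: how do we measure whether it adapts?
```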

DeepMind likens a "frame" to a pair of lenses: put on different lenses and the same AI presents a different form of "intelligence". Different "frames" can lead to completely different agency judgments about the very same AI system.

To explain the concept of "framework" more clearly, let's look at an example of autonomous driving:

| Frame element | Framework A (system level) | Framework B (module level) |
| --- | --- | --- |
| Boundary | All sensors, computing units, and actuators | Perception module only (camera, lidar, etc.) |
| Causal variables | Decisions made by the onboard computer | Environmental information received by the sensors |
| Goal | Reach the destination safely and efficiently | Accurately identify roads, vehicles, pedestrians, etc. |
| Adaptivity | Handles complex road conditions and emergencies | Robustness to different lighting and weather conditions |
| Agency judgment | Has agency (meets all four elements) | May not have agency (depends on the module's specific design) |

Do you see? The same autonomous driving system can be considered "intelligent" or "unintelligent" under different "frameworks". It's like "Rashomon". The truth that everyone sees is only part of the truth.
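Continuing the hypothetical Frame sketch above, the two columns of the table become two Frame instances, and the agency verdict becomes a function of which frame we pick rather than of the car alone (again, a sketch under my own assumptions, not an implementation from the paper):

```python
# Framework A: judge the self-driving car as a whole system.
frame_a = Frame(
    boundary="all sensors, computing units, and actuators",
    causal_variables="decisions made by the onboard computer",
    goal="reach the destination safely and efficiently",
    adaptivity_criterion="handles complex road conditions and emergencies",
)

# Framework B: judge only the perception module.
frame_b = Frame(
    boundary="perception module only (camera, lidar, etc.)",
    causal_variables="environmental information received by the sensors",
    goal="accurately identify roads, vehicles, and pedestrians",
    adaptivity_criterion="robustness to different lighting and weather conditions",
)

def meets_all_four(frame: Frame) -> bool:
    """A stand-in for an observer's judgment: the verdict tracks the frame, not just the car."""
    if frame.boundary.startswith("all sensors"):
        return True   # system level: all four elements plausibly hold
    return False      # module level: 'source of action' and 'normativity' are debatable

print(meets_all_four(frame_a))  # True  -- counted as an agent under Framework A
print(meets_all_four(frame_b))  # False -- may not count as an agent under Framework B
```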

Looking at AlphaGo from a different “angle”: Is it really “invincible”?

AlphaGo is undoubtedly a milestone in the history of AI development. It defeated top human players in the field of Go and demonstrated amazing "intelligence".

However, if we look at it from the perspective of “frame dependency”, what is the “intelligence” of AlphaGo?

If we look at AlphaGo as a whole, then it clearly exhibits agency: it makes autonomous decisions in complex games with the goal of winning, and it keeps improving its play through learning.

However, if we narrow the “framework” to the inside of AlphaGo and focus only on the neural network, then it is just a complex function that outputs a move position based on the input board state. Its behavior is completely determined by the parameters of the neural network and does not have any “autonomy”.

Going further, if we compare AlphaGo with AlphaGo Zero and AlphaZero, we find subtle differences in their "agency".

AlphaGo's training relied on a large number of human game records. AlphaGo Zero [2] starts from scratch and learns through self-play. AlphaZero [3] goes a step further, extending this "self-learning" ability to other board games (such as chess and shogi).

From the perspective of "agency", AlphaGo Zero and AlphaZero are more "autonomous" than AlphaGo, because they do not rely on human prior knowledge but build their understanding of the game through their own exploration and learning. AlphaZero's strong performance across different board games also shows that this "agency" can transfer.

Redefining AI "intelligence": from "objective" to "subjective"

DeepMind's "frame dependency" theory has far-reaching implications for AI research and applications. It makes us realize that AI's "intelligence" is not an objective existence, but a product of our subjective construction.

AGI’s “N Possibilities”: There is no best, only the most suitable

"Framework dependence" means that there is no objective and absolute standard for AGI (artificial general intelligence). Our understanding and evaluation of AGI is always relative and subjective.

Different "frameworks" will lead to different requirements for AGI.

  • If we focus on the “practicality” of AGI, then we may pay more attention to its performance on specific tasks and care less about whether it has “consciousness” or “emotions”.
  • If we care about the “explainability” of AGI, then we might prefer AGIs that can clearly demonstrate their decision-making process, even if their performance may be slightly inferior.
  • If we focus on the "controllability" of AGI, then we may place more emphasis on the safety of AGI, even if this will sacrifice some "intelligent" performance.

Therefore, the development of AGI in the future may show a diversified trend. Different AGIs may show different "intelligent" forms under different "frameworks". Perhaps, the future AGI is like a "Transformer", which can switch different "forms" according to different tasks and scenarios.

The “new yardstick” for AI evaluation: practical, explainable, and controllable

“Framework dependence” also requires us to rethink the evaluation criteria of AI.

The traditional "Turing test" overemphasizes how human-like AI is and overlooks AI's value in other respects.

Future AI evaluations should be more comprehensive and flexible. In addition to “intelligent” performance, we should also focus on AI’s “practicality,” “explainability,” and “controllability.”

  • Practicality:  Can AI solve real-world problems? How effectively can it do so? How much does it cost?
  • Explainability:  Is the AI’s decision-making process transparent? Can we understand its behavior?
  • Controllability:  Is AI safe and reliable? Can we control its behavior to prevent it from causing harm?

These three dimensions constitute the "new yardstick" for AI evaluation. In different application scenarios, the weights of these three dimensions may vary.

For example, in the field of medical diagnosis, explainability may be more important than practicality, because doctors need to understand the diagnostic basis of AI to make the final decision. In the field of autonomous driving, controllability may be the most important, because any mistake of AI may lead to serious safety accidents.
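As a toy illustration only (the weights and scores below are invented for this article, not taken from any benchmark), the "new yardstick" can be thought of as a weighted score whose weights shift with the scenario:

```python
# Hypothetical, scenario-dependent weights for the three dimensions.
WEIGHTS = {
    "medical_diagnosis":  {"practicality": 0.3, "explainability": 0.5, "controllability": 0.2},
    "autonomous_driving": {"practicality": 0.3, "explainability": 0.1, "controllability": 0.6},
}

def evaluate(scores: dict[str, float], scenario: str) -> float:
    """Weighted sum of per-dimension scores (each in [0, 1]) under a given scenario's weights."""
    weights = WEIGHTS[scenario]
    return sum(weights[dim] * scores[dim] for dim in weights)

# Invented scores for some AI system.
system_scores = {"practicality": 0.9, "explainability": 0.4, "controllability": 0.7}
print(evaluate(system_scores, "medical_diagnosis"))   # ~0.61 -- explainability weighs heavily
print(evaluate(system_scores, "autonomous_driving"))  # ~0.73 -- controllability weighs heavily
```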

Human-machine empathy: building a bridge of understanding and trust

"Framework dependence" also poses new challenges to the relationship between man and machine.

If we cannot truly understand the “intelligence” of AI, how can we build trust with AI? How can we communicate and collaborate effectively with AI?

The AI of the future should have a capacity for "empathy": that is, AI should be able to understand human emotions and respond accordingly.

However, AI's "empathy" is different from human "empathy". AI's "empathy" is based on the analysis of human behavior and language, not on its own feelings.

Despite this, AI's "empathy" can still help us build trust and understanding with AI.

For example, a chatbot with empathy can judge the user's emotional state based on the user's tone and wording, and respond accordingly. This can enhance the interactive experience between the user and AI and reduce the user's frustration.

In recent years, there have been many advances in the study of human-machine empathy. Researchers are developing various technologies to enable AI to better understand and respond to human emotions.

For example, researchers at Stanford University developed a human-machine collaboration system called HAILEY [4]. The system analyzes human language and behavior in real time and provides feedback, helping humans express empathy better in conversations. Its core is a "nested frame" design: the human leads the conversation, while the AI analyzes its linguistic features and offers real-time suggestions that help the human express empathy more effectively. Experiments showed that HAILEY improved human empathy expression by 19.6% overall, and by 38.9% in difficult cases.

DeepMind researchers are also exploring how to give AI the ability to be "frame-aware." That is, to enable AI to understand how humans view its "intelligence" and adjust its behavior based on human "frames."

These studies show that AI's capacity for empathy depends not only on the technology itself, but also on how we design and use AI. The AI of the future may be more like an understanding partner than a cold tool.

The ethical questioning of “framework dependence”

The "framework dependence" theory not only puts forward new requirements for the development of AI technology, but also poses new challenges to AI ethics.

If AI’s “intelligence” is subjective and relative, how should we define AI’s “responsibility”?  If AI makes different decisions under different “frameworks,” how should we ensure that AI’s decisions are consistent with human values?

For example, an autonomous driving system, under the "safety framework", may choose to sacrifice the lives of passengers in the car to protect more pedestrians. But under the "fairness framework", this choice may be unethical. So, how should we choose the "frame"?

Going further, if AI "bias" is inevitable, how should we deal with it?

For example, an AI system used for recruitment may discriminate against job seekers from certain groups due to bias in the training data. So, how should we eliminate this bias?

Going further, if AI’s “values” can be shaped, then how should we guide AI to form correct values?

For example, an AI system used to generate news may produce false or misleading reports due to the bias in the training data. So, how should we ensure that the information generated by AI is true, objective, and fair?

These questions are ethical questions raised by the "framework dependency" theory. DeepMind's paper does not provide answers to these questions. But it reminds us that the development of AI is not only a technical issue, but also an ethical and social issue.  We need to fully consider the impact of "framework dependency" in all aspects of AI design, development, and application to ensure that the development of AI is in line with human interests and values.

DeepMind's "frame dependency" theory provides us with a new perspective to understand AI "intelligence". It tells us that AI's "intelligence" is not a fixed, objective attribute, but a dynamic, subjective construction.

Future AI research should pay more attention to AI's "agency", to its behavior under different "frames", and to its interaction and collaboration with humans. Perhaps the AI of the future will no longer be "artificial intelligence" but "augmented intelligence", or even "symbiotic intelligence".

Only in this way can we truly understand the "intelligence" of AI and make AI better serve humans. And this may be the true meaning of "intelligence". The future world is not a duel between humans and AI, but a dance between humans and AI. In this dance, the "framework dependency" theory may give birth to a new AI development paradigm, namely "frame engineering", which specializes in how to design, select, evaluate and optimize the "framework" of AI systems, so as to achieve a deep integration of AI and humans. And whether we can master this unknown dance will determine the direction of human civilization.