Revealing the secrets of the AI Agent! In 3 minutes, Professor Li Hongyi takes you through the frontiers of AI, from principles to applications, so you can understand the possibilities of intelligent agents in one article.

Written by
Caleb Hayes
Updated on: July 9, 2025

Core content:
1. The basic operating principle of AI Agent and its connection with reinforcement learning
2. The operation mode and advantages of LLM as an AI Agent
3. Analysis of the key capabilities of AI Agent: adjusting behavior based on experience and using tools


In the AI world, 2022 already feels prehistoric, so hurry up and learn!
----
Today I am sharing the latest class notes from "Understanding the Principles of AI Agent in One Class" (taken at 11 p.m., with 71 people testing online...).
1. The basic operating principle of AI Agent
• Core loop: Goal → Observation → Action.
• Action affects the environment, produces new observations, and repeats until the goal is achieved.
• Example: AlphaGo
  • Goal: Win the game.
  • Observation: The positions of the pieces on the board.
  • Action: Place a piece on the board.
• Relationship with reinforcement learning (RL)
• Traditional AI Agent relies on RL algorithms, but the limitation of RL is that the model needs to be trained separately for each task.
• New idea: Use LLM directly as AI Agent.
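The Goal → Observation → Action loop above can be sketched in a few lines. This is a toy illustration with a made-up counting environment, not any real agent framework; `Environment` and `choose_action` are hypothetical names.

```python
# Minimal sketch of the Goal → Observation → Action loop.
# The environment and policy are illustrative stand-ins.

class Environment:
    """Toy environment: the goal is to count up to a target number."""
    def __init__(self, target):
        self.target = target
        self.state = 0

    def observe(self):
        return self.state

    def step(self, action):
        self.state += action  # the action changes the environment

    def goal_reached(self):
        return self.state >= self.target


def choose_action(goal, observation):
    # A real agent (an RL policy or an LLM) decides here;
    # this toy policy just moves toward the goal.
    return 1 if observation < goal else 0


env = Environment(target=3)
while not env.goal_reached():
    obs = env.observe()                    # Observation
    act = choose_action(env.target, obs)   # Action chosen from goal + observation
    env.step(act)                          # Action affects the environment

print(env.state)  # → 3
```

In the AlphaGo case, `observe` would return the board position and `step` would place a piece; the loop structure is the same.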

2. LLM as an AI Agent

• How it works: goal (text description) → environment (converted to text, or images used directly) → action (a text description that must be translated into executable instructions).
• Core concept: The core of an LLM is text completion (predicting the next token); an AI Agent is an application built on top of an LLM.
• Historical review: There was an AI Agent boom (AutoGPT) in the spring of 2023, but the actual effect did not meet expectations.
• Advantages
• The possibilities for actions are almost unlimited and are no longer limited to preset behaviors.
• No need to define Reward, rich information such as error logs can be provided directly.
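The flow above (goal and observation serialized to text, the model's textual reply parsed into an executable action) can be sketched as follows. `call_llm` is a stub standing in for a real model API, and the JSON action format is an assumption for illustration.

```python
# Sketch of using an LLM as the agent: goal and observation go in as text,
# and the model's textual reply is parsed into an executable action.
import json

def call_llm(prompt):
    # Stub: a real implementation would call an LLM endpoint.
    # Here we fake a plausible reply so the sketch runs.
    return '{"tool": "search", "query": "weather in Taipei"}'

def run_step(goal, observation):
    prompt = (
        f"Goal: {goal}\n"
        f"Observation: {observation}\n"
        'Reply with JSON of the form {"tool": ..., "query": ...}'
    )
    reply = call_llm(prompt)    # the action, as a text description
    action = json.loads(reply)  # translate the text into an executable call
    return action

action = run_step("Find tomorrow's weather", "User asked about Taipei")
print(action["tool"])  # → search
```

Note how the "no need to define Reward" point fits here: instead of a scalar reward, the next prompt can simply include rich feedback such as an error log.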

3. AI Agent Examples
• AI villagers: Stanford Town
• AI using computers: Claude Computer Use, ChatGPT Operator
• AI training AI models: Google co-scientist, etc.
4. More real-time interaction
• Core requirement: Immediately adjust actions according to real-time changes in the environment.
• Application scenario: Voice dialogue

5. Analysis of the Key Capabilities of AI Agents
(I) Adjusting Behavior Based on Experience
• Traditional method: Adjust model parameters (not covered in this course).
• LLM capabilities: Directly provide error information to change behavior without adjusting parameters.
• Key question: How to manage and utilize past experience?
• Solution: Memory mechanism
• Read module: Select experience related to the current problem from memory (similar to RAG technology).
• Write module: Decide what information should be recorded.
• Reflection module: Abstract and organize the information in memory and establish connections between experiences (similar to Knowledge Graph).
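The Read/Write split above can be sketched with a toy memory class. The retrieval here is naive keyword overlap, standing in for the embedding search a real RAG-style Read module would use; all names are illustrative.

```python
# Toy memory mechanism with separate Read and Write modules.
# Retrieval is keyword overlap, a stand-in for real RAG-style embedding search.

class Memory:
    def __init__(self):
        self.entries = []

    def write(self, text):
        # Write module: decide what to record (here: anything non-empty)
        if text.strip():
            self.entries.append(text)

    def read(self, query, top_k=2):
        # Read module: select experiences relevant to the current problem
        q_words = set(query.lower().split())
        scored = sorted(
            self.entries,
            key=lambda e: len(q_words & set(e.lower().split())),
            reverse=True,
        )
        return scored[:top_k]

mem = Memory()
mem.write("Compiling with -O2 fixed the timeout")
mem.write("The user prefers answers in French")
relevant = mem.read("which flag fixed the slow build")
print(relevant[0])  # → Compiling with -O2 fixed the timeout
```

A Reflection module would sit on top of this, periodically summarizing and linking entries (closer to a Knowledge Graph) rather than storing raw text.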

(II) Using tools
• Tool definition: You only need to know how to use it, and you don’t need to understand its internal operation.
• Common tools: search engines, programs (written by LLM), and other AI models.
• Essence: Calling functions (Function Calling).
• Developer role: Build bridges and convert Tool instructions into actual function calls.
• Specific tools
• Search engines (RAG)
• Build your own tools: LLM writes its own programs and uses them as tools.
• Other AI as tools: Text models call speech recognition, emotion recognition and other tools to process speech; large models and small models work together.
• Risk: Over-trust in tools may lead to errors.
• Problem: Internal knowledge vs. external knowledge conflict
• LLM will weigh internal knowledge (beliefs) and external knowledge (tool results).
• The greater the gap between the external knowledge and the LLM's beliefs, the less likely the LLM is to accept it.
• LLM’s confidence in its own beliefs will also affect whether it is shaken by external information.
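The developer's "bridge" role in Function Calling can be sketched as a dispatch table: the model emits a tool name plus arguments as text, and the developer's code maps that to a real function call. The tool set and the reply format here are made up for illustration.

```python
# Sketch of Function Calling: the LLM's textual tool instruction is
# translated by developer-written glue code into an actual function call.
import json

def search(query):
    return f"results for: {query}"

def calculator(expression):
    # Toy evaluator; never eval untrusted input in real code.
    return eval(expression, {"__builtins__": {}})

TOOLS = {"search": search, "calculator": calculator}

def dispatch(model_reply):
    call = json.loads(model_reply)   # e.g. parsed from the LLM's output
    fn = TOOLS[call["name"]]         # translate the Tool instruction ...
    return fn(call["arguments"])     # ... into an actual function call

print(dispatch('{"name": "calculator", "arguments": "2 + 3"}'))  # → 5
```

The tool's result is then fed back to the model as text, which is exactly where the internal-vs-external knowledge conflict above arises: the model must weigh the tool's answer against its own beliefs.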

(III) Make a plan
• Current capability: The planning ability of today's LLMs is marginal, somewhere between having it and not having it.
• Strengthening planning capability: Interactive exploration with the environment (Tree Search), removing hopeless paths.
• Drawback: Some actions are irreversible in the real environment, so exploration cannot simply backtrack.
• Solution: Simulate in the brain (World Model), simulate environmental changes.
• Thinking model: DeepSeek-R1 and other thinking models have similar effects.
• Risk: Overthinking may lead to stagnation or even giving up.
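The tree-search-with-pruning idea above can be illustrated with a toy planner: action sequences are explored in a simulated model of the environment ("in the brain"), and hopeless branches are cut off, so no irreversible action is taken for real. Everything here is a made-up example, not the actual method of DeepSeek-R1 or any other model.

```python
# Toy planning by tree search over a world model: simulate outcomes,
# prune hopeless branches, return the first action sequence that hits the goal.

def simulate(state, action):
    # World model: predict the next state without touching the real environment
    return state + action

def plan(start, goal, actions=(1, 3), max_depth=5):
    frontier = [(start, [])]         # depth-first search over action sequences
    while frontier:
        state, path = frontier.pop()
        if state == goal:
            return path
        if state > goal or len(path) >= max_depth:
            continue                 # prune hopeless branches
        for a in actions:
            frontier.append((simulate(state, a), path + [a]))
    return None

print(plan(0, 7))  # a sequence of +1/+3 steps summing to 7
```

The "overthinking" risk maps directly onto `max_depth`: without some cutoff, the search (or a thinking model's chain of thought) can expand indefinitely instead of committing to an action.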
----