Li Hongyi: Understanding the Principles of AI Agents in One Lesson

Li Hongyi explains the principles of AI Agents in a simple yet in-depth way.
Core content:
1. The basic operating principles and core loop of an AI Agent
2. Using an LLM as an AI Agent: applications and advantages
3. The key capabilities of AI Agents, with practical examples
Li Hongyi has just released "Understanding the Principles of AI Agents in One Lesson", which is very accessible and highly recommended.
Li Hongyi believes that, from the LLM's perspective, an AI Agent task is still next-token prediction ("word chaining"). The AI Agent is not a new language-model technology; it is better seen as an application of language models.
Course content:
Note: The key information is in the fifth part, Analysis of the Key Capabilities of AI Agent
1. Basic operating principles of AI Agent
Core loop: Goal -> Observation -> Action.
Action affects the environment and generates new Observation.
Repeat the cycle until the goal is achieved.
Example: AlphaGo
- Goal: Win the game.
- Observation: The position of the chess pieces on the board.
- Action: Place a piece on the board.
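The Goal -> Observation -> Action loop above can be sketched as a toy program. The number-guessing "environment" and every name below are illustrative assumptions, not part of the lecture:

```python
# A minimal sketch of the Goal -> Observation -> Action loop.
# The "environment" is a toy number-guessing game; the agent's
# goal is to find a hidden target number.

def run_agent(target, low=0, high=100, max_steps=20):
    """Binary-search 'agent': act, observe feedback, repeat until the goal."""
    for _ in range(max_steps):
        action = (low + high) // 2            # Action: make a guess
        if action == target:                  # Goal achieved: stop looping
            return action
        # Observation: the environment reacts to the action
        observation = "higher" if target > action else "lower"
        if observation == "higher":           # Observation updates the agent's state
            low = action + 1
        else:
            high = action - 1
    return None                               # goal not reached within budget
```

The loop terminates either when the goal is reached or when the step budget runs out, mirroring "repeat the cycle until the goal is achieved".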
Relation to Reinforcement Learning (RL):
- Traditionally, the creation of AI Agents relies on RL algorithms.
- Limitations of RL: A separate model needs to be trained for each task.
- New idea: Can LLM be used directly as an AI Agent?
2. LLM as an AI Agent
Goal (a text description) -> Observation of the environment (converted to text, or images used directly) -> Action (a text description that must be translated into executable instructions).
The core of an LLM is next-token prediction (word chaining); the AI Agent is an application of the LLM.
No new models are trained in this course; it builds entirely on the general capabilities of existing LLMs.
Historical review: there was a wave of AI Agent enthusiasm (e.g. AutoGPT) in the spring of 2023, which later cooled because actual results fell short of expectations.
Advantages of LLM-driven AI Agent:
- The possibilities for action are nearly endless and are no longer limited to preset behaviors.
- No need to define rewards as in RL; rich information such as error logs can be provided directly.
3. AI Agent Examples
AI Villagers: Stanford Town
AI uses computers: Claude Computer Use, ChatGPT Operator.
AI training AI models: Google co-scientist, etc.
4. More immediate interaction
The agent must be able to adjust its actions immediately in response to real-time changes in the environment.
Application scenario: Voice conversation
5. Analysis of Key Capabilities of AI Agent
Li Hongyi believes that AI Agents need three core capabilities: 1. The ability to adjust behavior based on historical experience; 2. The ability to use tools; 3. Planning capabilities.
1. Adjusting behavior based on experience
Traditional method: adjust the model's parameters (not covered in this course).
LLM capability: provide the error information directly, and the model changes its behavior without any parameter updates.
Key question: How to manage and leverage past experience?
Solution: Memory mechanism, similar to human long-term memory.
- Read module: Select experiences from Memory that are relevant to the current problem, similar to the RAG technique.
- Write module: decides what information should be recorded.
- Reflection module: abstracts and organizes the information in memory and establishes connections between experiences (a knowledge graph), similar to GraphRAG and HippoRAG.
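The Read / Write / Reflection modules can be sketched as a toy class. The word-overlap retrieval below is a crude stand-in for a real RAG-style embedding search, and all names are illustrative assumptions:

```python
# A toy Memory with Read / Write / Reflect modules.

class Memory:
    def __init__(self):
        self.entries = []   # raw recorded experiences
        self.summary = {}   # Reflect: word -> related entries (crude "graph")

    def write(self, experience, important=True):
        # Write module: decide what gets recorded at all
        if important:
            self.entries.append(experience)

    def read(self, query, top_k=2):
        # Read module: return the experiences most relevant to the query,
        # scored here by simple word overlap instead of embeddings
        q = set(query.lower().split())
        scored = sorted(self.entries,
                        key=lambda e: len(q & set(e.lower().split())),
                        reverse=True)
        return scored[:top_k]

    def reflect(self):
        # Reflection module: organize entries into a word -> entries index,
        # a minimal stand-in for building connections between experiences
        self.summary = {}
        for e in self.entries:
            for w in set(e.lower().split()):
                self.summary.setdefault(w, []).append(e)
```

Writing, reading, and reflecting are deliberately separate steps, matching the three-module split described above.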
2. Use tools
Tool definition: something you only need to know how to use, without understanding its inner workings.
Commonly used tools: search engines, programs (written by the LLM itself), and other AI models.
Use Tool = Function Calling.
Developers need to build a bridge to convert Tool instructions into actual function calls.
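Such a "bridge" might look like the sketch below. The JSON request format, tool names, and functions are all assumptions for illustration, not the lecture's or any library's actual API:

```python
# Sketch of the bridge between a tool instruction emitted by the LLM
# and an actual function call. The LLM is assumed to output JSON like
# {"tool": "search", "args": {"query": "..."}}.

import json

def search(query):
    # placeholder for a real search-engine call
    return f"results for: {query}"

def calculator(expression):
    # deliberately restricted arithmetic evaluator
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    return eval(expression)  # tolerable only because of the whitelist above

TOOLS = {"search": search, "calculator": calculator}

def dispatch(llm_output):
    """Translate the model's textual tool request into a function call."""
    request = json.loads(llm_output)
    tool = TOOLS[request["tool"]]          # look up the named tool
    return tool(**request["args"])         # run it with the model's arguments
```

The model never executes anything itself; it only emits text, and the developer-written dispatcher turns that text into real function calls.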
Specific tools:
- Search Engine (RAG)
- Build your own tools: the LLM writes its own programs to use as tools.
- Other AI as tools:
  - A text model uses speech recognition, emotion recognition, and other tools to process speech.
  - Large and small models working together.
Risk of over-trusting tools: LLMs have some judgment, but they still make mistakes.
Problems encountered when using tools: conflicts between internal and external knowledge
- The LLM makes a trade-off between internal knowledge (its beliefs) and external knowledge (tool outputs).
- The larger the gap between the external knowledge and the LLM's beliefs, the less likely the LLM is to accept it.
- The LLM's confidence in its own beliefs also affects whether it is swayed by external information.
In addition, using tools is not always more efficient; it depends on the capabilities of the LLM itself.
3. Make a plan
The planning capability of current LLMs falls somewhere between present and absent.
Further strengthening planning: interactively explore the environment (tree search) and prune hopeless paths.
Tree search drawback: some actions are irreversible.
Solution: run the attempts inside a mental simulation (a World Model) that predicts how the environment would change.
Use your inner theatre to plan: think, verify possibilities, and simulate world changes.
Thinking models such as DeepSeek-R1 do have similar effects.
But there is also a risk of overthinking: LLMs may think too much, get stuck, or even give up entirely.
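The world-model idea above can be sketched as a depth-limited tree search over simulated, not real, state transitions. The toy 1-D world and every name below are illustrative assumptions:

```python
# Planning inside a "world model": try actions in a simulated copy of the
# environment, prune hopeless branches, and only afterwards act for real.

def plan(world_model, state, goal, depth=5, visited=None):
    """Depth-limited tree search over imagined state transitions."""
    visited = visited or set()
    if state == goal:
        return []                        # goal reached: no more actions needed
    if depth == 0 or state in visited:
        return None                      # prune a hopeless or repeated path
    visited = visited | {state}
    for action in world_model["actions"]:
        next_state = world_model["simulate"](state, action)  # imagined step
        rest = plan(world_model, next_state, goal, depth - 1, visited)
        if rest is not None:
            return [action] + rest       # a full plan exists down this branch
    return None

# Toy 1-D world: the state is a position on a line; actions move it.
world = {
    "actions": ["right", "left"],
    "simulate": lambda s, a: s + 1 if a == "right" else s - 1,
}
```

Because all the trying happens in `simulate`, irreversible real-world actions are never taken during the search, which is exactly the point of planning in the "inner theatre".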