Woter AI detection.Hurry - ends Jul 17th

New Year Sales :up to 80% OFF

AI Humanize AI Translator Bypass AI AI Rewriter AI Detector

PRICING

TRY FOR FREE

Ten thousand words long article, talk about the new paradigm of the next generation of AI Agent

Written by

Silas Grey

Updated on:July-09th-2025

Recently, a Chinese company released the world's first general-purpose AI Agent, Manus AI, which has attracted widespread attention in the technology community. Unlike traditional AI assistants, applications like Manus do not just generate text or provide suggestions, but are able to think independently, plan and execute complex tasks, and achieve a one-stop service "from instructions to results." At the recent NVIDIA Annual Technology Conference (GTC), Huang Renxun defined Agentic AI as a key stage in the evolution of artificial intelligence technology. Its core lies in upgrading from "single response of generative AI" to an intelligent entity with autonomous reasoning capabilities. The roundtable hosted by Tencent Research Institute and Tencent Academy made an in-depth interpretation of the product innovation and technical architecture of the next-generation Agent represented by Manus and Deep Research, exploring the new paradigm of the next-generation Agent.

【Roundtable Guests】

Hong Sirui

DeepWisdom (MetaGPT), a researcher in the field of NLP/Agent, is mainly responsible for algorithm development and scientific research, and is one of the core contributors to the OpenManus open source project. He has won the world championship of the NeurIPS 2019 AutoDL competition (NLP), and is the first author of the open source multi-agent framework MetaGPT paper (ICLR 2024 Oral) and the Data Interpreter paper, as well as one of the authors of the AFLOW paper (ICLR 2025 Oral). At present, he mainly focuses on the design and performance optimization of Agent (multi-agent) systems, focusing on the application effects of Agent in code generation, complex data analysis automation, and LLM reasoning ability enhancement. Many research results have been published in top international academic conferences and journals such as TPAM and ICLR.

kongjie (Kongjie)

Tencent expert engineer, early practitioner and evangelist of big model application. Lecturer of company-level courses such as "AI Agent: A New Paradigm for Building Intelligent Applications" and "Big Model Efficiency R&D, from Copilot to Autopilot". Responsible for supporting the operation and management matrix system group of Tencent Video Media Asset Xinghai; responsible for the architecture design and upgrade of low-code platforms such as Linglong CMS, UN, Feiliu, logical orchestration and media asset BFF. Formerly the head of the company's low-code Oteam, he led the design and development of Agent construction and operation platform Edan (AKA goose egg) and logical orchestration system Loki, and is the main author of the IEEE low-code standard.

Yu Yi

Tencent Qingteng AI and Globalization Project Manager. Previously, he was the Vice President of Post-Investment Marketing of a venture capital firm. He also worked in the technology venture media for many years and was named LinkedIn China Expert of the Year for two consecutive years. Wild AI evangelist, Tencent's Outstanding Expert of the Year in 2024, a mentor of the Get AI Learning Circle, the 2024 AI Expert of the Year for Everyone is a Product Manager, and a wild external brain for many AI product companies and unicorn companies. He has received multiple courses from Get and Tencent Academy, as well as open source documents such as the "AI Personal Exploration Guide Series" and the "AI Product and Company Transformation Research Series", which have been read and studied by more than 200,000 people. Within the company, he also supported Tencent Technology's Good Will Week, Tencent programmers, and internal AI training and sharing for multiple departments.

Host: syanxu (Xu Siyan)

PhD, senior researcher at Tencent Research Institute, and the person in charge of the "AGI Roadmap". His main research area is the Internet industry economy. He is responsible for tracking and researching cutting-edge Internet technologies and trends, and studying the innovative economic models brought about by cutting-edge digital technologies. His main focus is on AIGC and blockchain. He led the research and writing of "Machine Brain: Ten Trends in Big Models", "Industrial Blockchain", "Industrial Internet: Building a New Economic Landscape in the Age of Intelligence +", etc. PhD from Tsinghua University, visiting scholar at Massachusetts Institute of Technology.

【Content introduction】

Focusing on Manus and other similar product innovations, Agent technology frontiers, and the next generation of Agent new paradigms, we will deeply explore the following issues:

1. What is the actual effect of Manus? How do you evaluate its product design?

2. What are the current scenarios in which AI Agents demonstrate their capabilities?

3.What are the major technological advances in AI Agent?

4.What insights do applications such as Manus bring to the development of AI Agents?

5. How do you view the “second half” of Agent development?

6. What core capabilities do AI Agents need to strengthen in the future?

......

(Based on the roundtable content: Comparison between DeepResearch and Manus)

(Based on the roundtable content: Next-generation Agent features)

Key points:

1. Evolution of Next-Generation Agent Technology

Multi-Agent System: Manus uses multiple AI assistants to work together. Although the operation mode is basically fixed, it demonstrates the potential of multi-agent systems.
Memory and context management: Future agents need to have enhanced memory and context understanding capabilities to better handle complex tasks.
End-to-end training: Deep Research has demonstrated the direction of integrating the capabilities of an entire AI assistant directly into a single model through end-to-end training, which is considered to be the development direction of the next generation of AI assistants.

2. Characteristics of the Next Generation Agent

Self-assessment ability: Future agents need to have the ability to self-assess and reflect in order to improve their autonomy and intelligence.
Cross-environment capability: Agents need to be able to cross different application environments and autonomously use various software tools to solve problems.
Autonomous learning and evolution: Agents should have the ability to continuously learn and evolve from usage data to improve the efficiency and personalization of problem solving.

3. How should we respond to changes in employment in the AI era?

Incremental thinking: Look at the development of AI with an incremental mindset, realize that new industries and job opportunities will emerge, and everyone can become a "super individual."
AI leadership: Shift from executing specific tasks to setting goals, managing and accepting AI's work results, and becoming an AI leader.
Continuous learning and adaptation: The key to making good use of AI is continuous learning and adaptation, personally experiencing the advantages and limitations of AI in different scenarios, and finding your own value positioning.

The following is the full text of the roundtable:

Part I Application Practice

Xu Siyan:

I noticed that many students may not have used Manus or have not obtained a trial license. Yu Yi, could you please introduce to everyone how Manus works, and why do you think it is particularly like an intern?

Yu Yi:

As an ordinary user, I have been paying attention to the field of AI assistants for a long time, and I have also been paying attention to Meta GPT's products. I am not good at programming, and I always encountered many problems when using AI assistant products in the past. These products are not very user-friendly for ordinary users: installation is troublesome, the interface is not easy to use, and I don't know how to solve problems when I encounter them. Sometimes it is very frustrating to spend money to buy them but they don't work well.

But this time, the experience of using Manus was completely different. Maybe because there are not many users now, the whole product is very smooth to use. What surprised me was that it only took 17 minutes to help me complete a very complete industry analysis report. This report includes: industry status analysis, development trend forecast, important company screening and detailed information of each company. I also tried its other functions, such as making web pages, developing small programs, and posting pictures on Xiaohongshu. The whole experience was very smooth, especially the analysis report completed in 17 minutes, which really surprised me.

Although there are different opinions and controversies about this product in the market, I call it "Agent's DeepSeek moment" in my circle of friends. The reason why I say this is that its product design is really excellent. It will display a detailed task list, allowing users to clearly see how it plans and decomposes the work. Unlike previous AI, this product can handle problems intelligently without too many presets. It can handle multiple tasks at the same time, complete them step by step and finally deliver complete results. The whole process is very smooth.

This reminds me of the development of other AI products. I used O1 in the early days, but it didn't show the thinking process. Later, DeepSeek R1 could show the thinking process, but the effect was not ideal. However, R1's way of thinking has impressed me.

This new product allows users to see the complete work process by showing all the steps and task lists. Its performance is comparable to that of an excellent intern in terms of data analysis, summary and marketing strategy formulation. But it still needs to improve in programming - for example, when writing a snake game or developing small programs, it is not as effective as Claude 3.7. Other similar AI development assistants have also appeared on the market recently.

In general, this is a very user-friendly product. After using it, you will truly understand what an AI assistant is - it is like an intelligent assistant that can autonomously plan, decompose tasks and execute them. This is a complete but not perfect product. It is still limited by its own technology and AI model capabilities and is constantly improving and improving.

Xu Siyan:

There is no unified definition of Agent yet. How do you define Agent? Does Manus’s product design meet the expectations of Agent evolution?

Jie Guangfa:

Regarding the definition of Agent, we can understand it as follows: Unlike the traditional language model (LLM) pure text generation or chat mode, Agent is a system that can think, plan, and use tools to complete user tasks. It can form a complete task closed loop, not just generate a simple response to the question and end. Traditional chatbots may only say hello or write a story, but real Agents can understand user needs, search the Internet, generate files, write code, and finally deliver complete results to users.

In the past two years, the industry has been confused about the term Agent, and some practitioners even call simple language model text generation Agent. Therefore, we need to clarify the concept of Agent: the Agent we are talking about now should be a system that can complete specific tasks, just like an intern can complete the assigned tasks, rather than just giving simple feedback through dialogue.

From the use of language models to the present, this field has gone through quite a long development. Taking code generation as an example, Manus's product form borrows from at least two or three technical products. One of them is Devin, a product developed by a Chinese team. Manus draws on Devin's innovation of visually displaying the code writing process in the browser. In addition, Manus also adopts common practices of other products, such as making plans and task lists, which are common in code collaboration tools such as GPT Pilot. The way it operates imitates a real development team, breaking down large tasks into small tasks.

GPT Pilot has established a task database so that AI "engineers" can claim and complete coding test tasks. Therefore, the design of Manus is not completely original, but integrates the excellent functions of other products, which is why some people say that it is a product that "pieces together" different technologies.

However, Manus has cleverly integrated these excellent features to surprise ordinary users. As an industry insider, I think Manus's greatest success lies in the product experience: it allows users to clearly see every step of the AI to complete the task and understand the progress in real time. This experience design is indeed excellent. But from a technical perspective, it uses technologies that have been relatively common in the past two years, without many breakthroughs. We will discuss the future development of AI assistants later.

Xu Siyan:

Manus defines itself as the industry's "first universal agent". Is this just a hype of a concept, or does it actually have a certain degree of universal capabilities?

Jie Guangfa:

I think they are good at marketing concepts. In fact, the basic framework of this agent is not complicated - programmers can create a similar agent in one day using the existing open source framework.

Whether it is truly "universal" depends mainly on what tools it can be used with. For example, in the field of code generation, tools like Cursor focus on programming-related functions such as writing code, reading and writing files, and querying network information.

Manus does have 29 built-in tools, which has been analyzed by people on the Internet. These tools can complete many basic tasks in daily work, such as writing, collecting information, analyzing data, writing code, and browsing the web.

It is called a "general" agent because its tools cover a wide range of areas. But this does not mean that it can really do everything. For example, if you ask it to trade stocks or handle special tasks in some professional fields, it cannot do it.

So to be precise, it is just a basic intelligent assistant with relatively rich functions, rather than a general agent in the true sense. This is more of a marketing trick, showing that the team is very capable in marketing.

Xu Siyan:

If DeepSeek's improvement over previous large models lies in saving resources by combining large models with expert models, then for agents, which path is easier to implement: focusing on niche areas or "general-purpose" agents?

Jie Guangfa:

At the agent technology level, just like basic quality education, everyone needs to have basic abilities. But like human society, agents also need to have division of labor and specialization, and provide different solutions according to different scenarios.

This is a feasible development path. Because fully general agents have limitations in practical applications, and their professional depth is not as good as that of agents in vertical fields. For example, using Claude 3.7 for code generation, the quality of drawing SVG is much better than writing code. This is because more optimization is invested in vertical fields, and it is difficult for general agents to achieve such depth in every field.

Xu Siyan:

Please share with us, how do you think AI Agent is reshaping your workflow?

Yu Yi:

Today’s AI systems have strong underlying capabilities and integrate many tools that can be used in multiple fields. I have also been using these AI tools in depth recently. These AI assistants are really flexible and can handle both daily work and professional tasks. For example, when doing research, it can handle general data collection and provide analysis in professional fields.

I think that if an AI simply adds professional knowledge of a certain industry, it is not enough. On the contrary, general AI may have more advantages because it has more comprehensive technology, stronger basic capabilities, and is cheaper to use. This general AI is likely to replace some basic professional AI.

I am exploring its role in work and life to see what its advantages and disadvantages are. Although it has not completely changed the way I work, AI tools like Claude and GPT-4 Pro have become important helpers in my work. The biggest change is in search. Now I don’t have to do everything myself. I just hand over the task to AI and check the results regularly. And it will give feedback on progress in a timely manner.

What touched me the most was the statement “there is no need to assign a person to AI”, but I am also testing how many tasks can free me from this, or I may only need to do some checking and pointing out direction tasks at certain stages.

Part II Technical Understanding

Xu Siyan:

We just discussed the understanding of application and product innovation. In the second part, let's explore the technical aspects of Manus. Manus's AI core technology is an integration of various technologies in the past two years. So compared with OpenAI's Deep Research and Devin, what are the specific similarities and differences?

Jie Guangfa:

Well, let me explain the current state of technology. The working principle and flow chart of Manus are now available online. To be honest, Manus does not have much breakthrough in core technology. The main technological innovation is in products such as Deep ReSearch.

Manus uses multiple AI assistants (agents) to work together. It includes functions such as planning, summarizing, and reviewing, all of which require multiple calls to the large language model. Some people call this a "multi-agent system", but in fact it is more like a fixed workflow.

Although the official team said that they are not simple workflows, their operation methods are basically fixed. A true multi-agent system should be one in which each AI assistant can call and communicate with each other autonomously, rather than relying on a centralized control process.

In terms of technical implementation, Manus uses some post-training techniques, such as using a large model to distill data into a smaller model. This is necessary because if a large amount of context and description documents are loaded each time, the running cost will be very high. In general, Manus has done a solid job on the technical level, but there is no particularly outstanding innovation. Their greatest success is in product experience.

When it comes to product experience, there is an interesting change: in the past, when AI was thinking and searching for information, users would feel that the system was too slow and could not understand the intermediate steps. But starting with DeepSeek R1, it clearly displays the reasoning process, allowing users to understand that AI needs time to think. Manus goes a step further, even if it takes 10-20 minutes to complete a task, users can accept this waiting time.

Users now understand that AI programs are slower than ordinary programs and require time to think. When we describe this process as "AI thinking seriously," users' acceptance is greatly improved. This is an important change in product experience brought by Manus and DeepSeek R1.

Next, let's talk about several important AI products. MGX is a product of Mr. Hong's team. It features an AI development team that works 24/7 and completes software development through collaboration of multiple AI assistants. In terms of professional field applications, MetaGPT has done an excellent job and developed many excellent open source frameworks.

Finally, I want to talk about Deep Research, which I think represents the direction of the development of the next generation of AI assistants. It takes a completely different approach: through end-to-end training, the entire AI assistant's capabilities are directly made into a model. This is different from the current approach, which is to combine large language models and various tools, and programmers write control programs to coordinate.

I believe that the AI assistants of the future will return to the model training method. Deep Seek has mentioned the goal of warehouse-level code generation and proactive AI assistants in their NSA paper. The new attention mechanism they developed is designed to handle ultra-large-scale text. This is a development direction worth paying attention to.

Xu Siyan:

What are the main technical bottlenecks of Manus and other AI agents? If there is no bottleneck, can other teams quickly replicate it?

Jie Guangfa:

Regarding the technical bottleneck of AI Agent, it is not difficult to build a basic Agent. People with programming experience can understand its working principle within one day with the help of existing tools. However, the challenge of a truly usable Agent is greater. The key lies in error tolerance. If the content generated by a large model is incorrect, multiple calls will reduce the accuracy. In scenarios such as research reports and surveys, the error tolerance is high; in scenarios such as code writing, the error tolerance is low and professional programmers need to intervene. Especially for data operation agents, errors may affect production.

Therefore, the error tolerance needs to be determined according to different scenarios, which determines the availability of the Agent.

Looking at the cause of the error, the first is the intelligence of the model, that is, whether the result of a single call to the large model is acceptable. At present, the SOTA large model is usually better than humans in single generation. In simple scenarios, the performance of the large model has exceeded the average human level. For example, in terms of basic code writing, it can quickly generate high-quality functions and unit tests.

But in complex scenarios, such as dealing with multi-file code or involving implicit knowledge, its performance drops significantly. This leads to the core bottleneck: memory capacity.

At present, large models are stateless and memoryless. Technically, memory is simulated by providing rich associated context for large models, and providing context is not as simple as storing and providing historical conversations. It often involves several core technical points, including the context window size RAG technology. The former expands the context window of the model so that it can input as much content as possible in a conversation, but the performance of most current models on ultra-large contexts is still unsatisfactory. A powerful solution is needed here. The native sparse attention mentioned in the NSA paper of DeepSeek is expected to solve this problem; and using the RAG method to recall memory will face the problems of embedding quality and recall accuracy, which makes it extremely difficult to achieve effective memory.

There are many technical details here, but in short, there are two core factors that affect the performance of AI Agents: one is the intelligence of the model itself, and the other is the context and memory management when dealing with complex scale problems. These two points directly determine the performance of the Agent, and the latter is particularly critical and is also the focus of the current industry's investment in a lot of research.

Xu Siyan:

Professor Hong, please tell us about the background of the OpenManus project you are developing.

Hong Sirui:

Regarding the background of the OpenManus project, we originally wanted to use our own multi-agent framework to challenge the SweetBench benchmark. SweetBench is a project-level code repair dataset that requires locating and repairing code in the code repository. Because we need to process a large amount of code, we developed a dedicated code location, scanning, and reading tool.

As the capabilities of large models continue to improve, we have integrated these open source tools into the repository and simplified the use process. Although the project initially focused on code fixes, we later added features such as web browsing.

Open Manus adopts the traditional agent architecture, combined with prompt word engineering and tool calling capabilities, to provide a lightweight agent development framework. In order to handle the long context problem, we also try to optimize memory management.

In addition, our product (MetaGPT X) is different from this, it focuses on generating complete software projects. The biggest innovation is that it realizes the automatic assignment of different agents to solve problems based on task difficulty, and has dynamic routing and adaptive topology.

For example, data analysis tasks are automatically assigned to the data interpreter agent, while front-end and back-end development are handled by the engineer agent. The system can dynamically adjust this allocation scheme according to the complexity of the task and ensure that the task is completed at a high enough level.

Part III Trends in the Next Generation of AI Agents

Xu Siyan:

Now let’s move on to the third topic, which may be of most interest to everyone: What inspiration will Manus bring to future AI agents?

Yu Yi:

Regarding the barriers of this product, I think the core is its deep insight into user needs. Before the product was released, I talked with Xiaohong, the product manager, and they put a lot of thought into the product design, including technical innovations such as memory, multi-modal search, and multi-model calling.

They observed that although the current market is mainly focused on upgrading the underlying large model and B-side applications, there is a clear gap in ToC products. The general public needs more advanced and easy-to-use AI products, rather than just staying at the basic large model reasoning capabilities.

This product draws on many successful experiences, and the team believes that now is the best time to enter the consumer AI market. The product design pays special attention to user experience, including fluency and completeness, and has created a good example for future Agent products.

At present, the acceptance among the capital circle, the market and users is quite good, which also shows that the market has a strong demand for such products.

Xu Siyan:

Mr. Hong, what capabilities do you think the Agent needs to strengthen the most in the future? Deep Research also mentioned in a previous interview that its ultimate goal is to make an AGI-oriented Agent. What kind of Agent can meet such standards?

Hong Sirui:

Let me share my understanding from a technical perspective. First of all, whether it is Manus or other agent products. They all lack a key capability: self-evaluation. Although these agents can plan and solve problems with various tools, they do not yet evaluate whether the final results meet expectations. This evaluation capability or self-examination capability is very important, and the system needs to provide such feedback to the agent. Therefore, providing closed-loop feedback from the environment is an important link when building an agent system. Whether it is achieved through reward learning or setting up a corresponding feedback model in the environment. If the agent can evaluate the results, it can further evaluate the completion of the goal, thereby improving its autonomy and intelligence level. In addition, as for the core capabilities, we now see that whether it is a more general multi-step reasoning capability or a tool-using capability, the training cost is very high, and a large amount of trajectory data needs to be collected and trained through later training, including various reinforcement learning methods.

However, we can explore letting the agent enhance its ability autonomously in reasoning. Perhaps there is no need to confirm a specific model, but to let the agent actively explore the environment multiple times. During the entire exploration process, integrated or mixed capabilities can be introduced to improve the final effect.

Of course, this requires reducing the overall exploration cost. For example, if one execution does not work well and hallucinations occur, we can try five times with different settings and then mix the results. The key is to make the user feel as if they have only executed it once, maintaining speed and cost.

This does pose a huge challenge to engineering capabilities. The amount of learning data for current tasks is still insufficient. Even if the data required for a single task is several hundred, the collection and synthesis of a large number of different tasks will consume a lot of resources. We need to explore new methods, such as introducing meta learning into agent design. This will allow agents to learn new tasks and adapt to new environments in a more efficient way, thereby better solving various user problems. Although agents have dealt with many similar problems, the specific situations encountered each time are different, and migration capabilities are very important.

These subtle differences in problems are exactly what we need to focus on. At the same time, we also need to strengthen the agent's memory and context understanding capabilities. Currently, browser-side agents and agents with multimodal capabilities can perceive various types of data, which will enter their context. The key is how to maintain the unity of context representation, ensure the integrity of information, and effectively integrate information from different modalities to make decisions. These are the core capabilities that need to be strengthened in future agent design.

Xu Siyan:

Next, please let Teacher Jie give us some insights into how you think the second half of this future Agent will unfold.

Jie Guangfa:

We have just discussed the core abilities that need to be strengthened, such as reflection ability, memory ability, etc. From my observation, through the end-to-end training paradigm, we can actually solve these problems very well.

The two models, OpenAI's O1 and DeepSeek's R1, are essentially "agent-like" models. Their characteristics are that they generate more than once, but in stages: the first stage is to think, and the second stage is to generate answers based on the thinking results. This method compresses the agent function that originally required multiple interactions into the generation process inside the model.

According to the OpenAI team, these new generation agents are models in themselves, rather than traditional agent engineering. They use direct training methods, with reinforcement learning at the core. DeepSeek's R1 demonstrates the thinking process, proving that reinforcement learning combined with simple reward rules can enable the model to demonstrate autonomous thinking capabilities.

This training method is different from traditional prompt word teaching - just set goals and reward mechanisms to allow the model to autonomously learn planning and execution. DeepSeek has also open-sourced NSA (native sparse attention) technology to solve the problems of large-scale code generation and processing of ultra-large contexts. When reinforcement learning and sparse attention are mature, Agent training will usher in spring.

Next, we will train agents for specific scenarios. We are not pursuing a completely universal agent, because this may not be realistic. Instead, we will train specialized agents according to different fields and professions, just like training professional talents.

Recently, the industry has proposed a new viewpoint: the future product form will undergo a major change. The traditional method requires building modules and designing interactive processes, but in the future, it may only require training a model with service capabilities.

In 2023, I proposed a concept in the course "Big Models Reshaping Software Development": Generative big models will go through several stages, including generating text, generating code, generating software, and finally generating services. The first two were already very common two years ago. In the field of generating software, Cursor, Client AI IDE and plug-ins are proving it. At present, Agent is the one that directly provides services for us. The end-to-end trained OpenAI Research has further made services trainable and generative.

Their model was able to provide services directly, rather than giving users software that they had to operate themselves. This development was amazing, and it took only two years to achieve.

We have entered the era of "model as product, model as service" . This is exactly the technical direction we need to focus on in the second half.

Xu Siyan:

Could you please share your thoughts on the second half of Agent? In addition, what plans does the Open Manus team have for the next step?

Hong Sirui:

Yes, I think Professor Jie explained it very clearly. The second half of the Agent is to train the autonomous capabilities into the model. By combining the Agent's autonomous form, the model can further improve the success rate of problem solving.

There are many technical points that need to be broken through. We are studying how to train the agent to use the tool, such as inputting Chain of Thought (COT) data into the model, which is very helpful to improve the tool's decision-making ability. Another thing is how to synthesize the data of the execution trajectory. Because the agent will have erroneous behaviors during the execution process, we cannot directly use these behaviors for training, but need to process, synthesize and correct the data.

These are the things we are currently working on. If you check the GitHub of Open Manus, you will find that we have already started some academic collaborations. We hope to train our own agent model based on MetaGPT and Open Manus, and combine it with reinforcement learning to advance together.

I think Agent needs another important capability in the second half, which is cross-environment capability. Currently, Agent only lives in the browser or a single environment. Can it cross to different application environments? When we deal with problems, we often not only need to operate on the browser, but also need to use other application software, such as drawing software or professional reporting software. So can Agent cross these application environments to help us solve problems? I think this is a very important capability. Just like there are various code development products and tools now, each tool has its own expertise - some are good at front-end development, some are good at back-end development, and some are good at data analysis. So can Agent use these different software autonomously to help us build more complex applications? This is indeed a key capability.

Then we talked about cross-environment capabilities. The second important direction is the Agent's evolutionary capability at the product level, that is, autonomous learning and evolution.

Whether it is training through trajectory data or enhancing model capabilities, this is a phased process. Initially, we use data to improve its problem-solving capabilities. But as personal usage frequency increases and application scenario data accumulates, can it continue to learn from this data?

For example, can we improve the efficiency of problem solving and simplify the operation that originally required 50 steps to 10 steps? This will not only reduce costs, but also provide more personalized solutions.

This evolutionary capability is crucial for Agent. I believe that in the second half of the game, we will see more such products - experiences that better meet the personalized needs of users the more they are used.

Xu Siyan:

I would like to thank the three teachers today for helping us objectively understand the popular product manus from their respective professional perspectives. As well as the outlook for the second half of Agent, each of them has put forward very constructive suggestions and analysis. Due to time constraints, our roundtable discussion today ends here. There is still a little time left. I would like to ask you some questions on behalf of the online audience, and you can answer them.

Q&A

Xu Siyan:

Today's live broadcast was very popular, and everyone left a lot of questions in the background. The first question was asked by a classmate before the class, and it is also quite representative. Agent is developing so fast now, how long will it take for our jobs to be replaced?

Jie Guangfa:

I can tell this student that you are not alone in having this thought. It is not just you who has this thought. In fact, it may be a stage of productivity explosion, and everyone will have that common anxiety. A week or two ago, I actually went to listen to Liang Ning's product class. There was actually a question in it, saying that AI has become very powerful now. As product managers, can we still keep our jobs? Yes, it turns out that we programmers ourselves will find that tools like cursors for writing code are already very powerful. We all doubt whether we will need to write code in the future? Right. I didn't expect that the product manager next door didn't have this anxiety. Yes, so this is actually a common problem. AI's impact on the entire field or industry is the same for the entire industry. This is a fact, but we have to see how you look at this problem. The angle is very important.

Let me analyze it. We can think from two perspectives: stock thinking and incremental thinking. Stock thinking believes that we only have so many existing jobs and needs. Since AI can already complete these jobs, then don’t we have nothing to do? Are we going to face unemployment? This idea is natural, and I believe most people will think so at first. This situation is just like when the Jenny textile machine and the car appeared. The anxiety and panic of those industry workers are essentially the same.

But we can change our perspective and look at this problem from an incremental perspective. Looking back at those textile workers or coachmen, although their original industries disappeared, new industries were born. For example, when cars first appeared, who could have imagined that there would be hundreds of millions of cars in the world today? This was completely unimaginable at the time.

So we have to think in an incremental way. With the help of AI capabilities, each of us can become a "super individual". Correspondingly, the productivity of the entire team will be greatly improved. From a positive perspective, the increase in team strength allows us to try more new challenges. For example, developers do not have to be limited to the front-end or back-end, they can become full-stack engineers and even develop cross-product capabilities. Product managers can use AI to quickly develop and verify MVPs, so that all work can be accelerated.

As a super individual, your abilities also need to be transformed. The past skills of coding, document writing, and product prototyping may no longer be so important. You need to iterate and upgrade your abilities. You need to become a leader of AI and lead AI to work together. I call this "AI leadership" - from executing specific affairs to setting goals, managing, and accepting the results of AI's work. This means that everyone will change from an executor to a small manager. This is a major change in the nature of future work, and it also puts forward new capability requirements for us.

Xu Siyan:

We have observed that everyone has different thresholds for using Agents, and there are many complex issues in team writing that traditionally require people to communicate. Will this be an obstacle to the implementation of Agents in enterprises? Is there any solution?

Hong Sirui:

This is indeed a good question. The current consensus is that we will standardize the interface of the problem-solving process and provide it to the business side, so as to reduce the human-computer interaction link, because the non-standard interface will affect the final processing effect. As AI capabilities improve, it can not only evolve, but also be personalized and learn business data. This means that the interface will become more and more open, and when the same business needs to serve different teams within the enterprise, these information processes can be flexibly adapted to achieve self-adaptation.

Xu Siyan:

Okay, the last question is for Yu Yi, which is about the AI leadership building we just talked about. In the scenario of personal use of Agent, do you have any suggestions on how to make good use of Agent?

Yu Yi:

Let me first give you some background: I have previously shared my experience after 2,000 hours of AI collaboration at Tencent Intranet and Tencent Research Institute's Technology for Good Festival. There are more details there, but today I want to talk about it briefly. Although Mr. Jie just gave everyone a good psychological massage, I may have some not-so-optimistic news. This comes from my work experience of working with a lot of entrepreneurs.

This year, some very strong signals have emerged, which I think are a wake-up call for everyone. In the past two years, the entrepreneurs and business leaders I met have mainly discussed some macro issues related to AI.

But this year is different. Now they are asking specific questions: How to deploy AI privately? What successful cases can we learn from or directly refer to? How to restructure organizational processes?

They are all actively embracing AI. Many people told me that they have prepared funds and want to know what strategy to use to advance. What they mentioned repeatedly is improving labor efficiency and reducing costs. This is the core issue that business owners are thinking about now, and it also shows their strong willingness to embrace AI.

The second point is two shocking real data. I have a friend who runs a business. I asked him why AI has not been widely adopted in customer service and sales systems. He said that AI could only help him lay off two people at that time. He said: "We only have a team of eight people in total, and we can only lay off two people. The hidden cost of transforming the entire process is too high." But by the end of last year, there were only two people left in his team.

Let me share a few more examples. A friend of mine who works on a low-code platform said that half of the code in his company is now generated by AI. This shows that the extent to which enterprises use AI has reached a new stage.

Let's talk about the situation in Silicon Valley. Companies there are using AI to update old code systems. Why? Because in Silicon Valley, hiring programmers is expensive, and good programmers are unwilling to do this kind of repetitive work, and programmers with average ability can't do it well. Now they have a new way: spend $200 to $500 to let AI generate code, and then hire a senior programmer to check it. This saves money and is efficient. Of course, this also means that some junior programmers may lose their jobs, which is not good news.

I have observed that 2025 is an important turning point. Whether it is hybrid office mode or changes in work processes, including improving work efficiency, there will be great changes. This change is not only at the product level, but also in the increasing use of AI within enterprises.

When it comes to when AI will replace humans, laymen often give some very general answers. But I think that in the current wave of AI, general analysis is not very meaningful. I think the ability of AI is like a jagged line. It looks like a straight line from a distance, but only when you get closer can you see its strengths and weaknesses in different aspects. So I have been advising my friends: If you don’t start using AI this year, I am afraid you will be eliminated by the market.

As for how many jobs are left for humans, or how many working hours can be achieved each week, these questions can only be answered by yourself. You must introduce AI into your work and life, and personally experience its advantages and limitations in different scenarios.

Only in this way can we draw conclusions. The second point is about attitude. I have always believed, and still believe, that making good use of AI is the key. Just like what Mr. Jie said, you have to be a good boss of AI. A good boss does not necessarily have to be more capable than your subordinates, but you must be able to provide resources that they don't have and demonstrate your value. Otherwise, just like employees will start their own businesses, you must not only be good at collaboration. You have to prove that "cooperating with me, I can provide you with unique resources and capabilities." My current attitude is: We must learn to work for AI. The attitude of collaborating with AI is to confirm what value I can provide to AI.

Xu Siyan:

2025 is destined to be an extraordinary year. I believe that while everyone sees various novel and practical AI models and products, their own work will also usher in tremendous changes, integration and challenges.

Today's roundtable ends here. Thanks again to the three experts for their wonderful sharing. This year we will continue to hold roundtable events from time to time to discuss the latest progress and innovation of generative AI and its impact on us. If you want to get on the right boat, please continue to pay attention to our roundtable events.

Thank you everyone, see you at the next roundtable!