Is the AI Agent Hype Overheated, and Will It Produce "Fake Agents"?

An in-depth expert analysis of the real state of the AI Agent field and the challenges it faces.
Core content:
1. Definition and hype of AI Agents
2. Technical bottlenecks and measuring actual business value
3. Four-dimensional analysis of AI investment ROI
How should an AI Agent be defined? How should we view the over-hype around AI Agents? Where is the best entry point for AI Agents? Which KPIs can verify the actual business value of an AI Agent project?
In the "DeepTalk|DeepSeek Conjecture Series" dialogue organized by Cui Niuhui, host Cui Qiang, founder and CEO of Cui Niuhui, invited Julian Sun (Sun Xin), vice president at Gartner, for an in-depth discussion on the theme "True and False AI Agents: OEM Trap vs. Technology Bubble".
Sun Xin said that AI Agents in China are currently at the peak of the hype. Over the next two to five years AI Agents may reach production maturity, but companies should still be fairly cautious in exploring them: technical bottlenecks, reliability, cost, and scenario fit remain important constraints. DeepSeek has narrowed the gap between companies in applying large-model technology, but companies should think harder about creating synergy between their own data and large models and putting those models to work for themselves. On measuring the ROI of AI investment, Sun Xin named four dimensions: efficiency, quality, finance, and security.
Contents
1. “Agent Washing”: Rational thinking behind the excitement
2. Return to customer expectations: How to define AI Agent?
3. AI gravity + user usage: the "moat" in the era of large models
4. AI investment metrics and the “three steps” to implementation
“Agent Washing”: Rational thinking behind the excitement
Cui Qiang: Today we are going to talk about a topic everyone cares about: true and false AI Agents, and whether we are looking at an OEM trap or a technology bubble. Tonight's guest is Sun Xin, vice president at Gartner. Tonight's topic also comes from a Gartner article titled "Gartner: Beware of "Agent Washing" and distinguish between hype and substance."
Is Agent just old wine in a new bottle? This question is very common both in China and abroad. In response, Gartner has set out a very strict definition of an AI Agent. In Gartner's eyes, what is a real AI Agent? How should an enterprise-level AI Agent be defined? Salesforce and many Chinese vendors have launched AI Agent products; do these count as AI Agents? Let's talk about this tonight. Mr. Sun, please introduce yourself first, and then tell us the background to this article.
Sun Xin: Okay, thank you, Professor Cui. My name is Sun Xin, and you may be more familiar with my other name, Julian. I work for Gartner, an American research and consulting firm that specializes in serving chief information officers (CIOs) worldwide and their teams, including enterprise big data managers and AI leaders. I lead Gartner's research team in China, and my main research areas are artificial intelligence and data analytics.
This article starts from a word Americans often use: "FOMO" (Fear of Missing Out), meaning that companies or individuals are afraid of missing out on something. It happens all the time. For many years, many Chinese vendors called themselves "Cloud" companies; a few years ago they were "Middle Platform" companies; a dozen or so months ago they were "GPT"; and today they may be "Agent".
We use "Agent Washing" to describe vendors repackaging their existing technologies as agents. These existing technologies actually lack autonomy and complex decision-making capabilities, which confuses enterprise buyers and leads to misguided investment.
We have seen many organizations overspend while greatly underestimating the cost and complexity of deploying AI Agents today, and in the end they may fail to meet the expectations created by excessive hype. That is one of the reasons we wrote this article.
Cui Qiang: In the current climate, vendors are certainly eager to ride the AI Agent concept. When big data arrived, everyone was a big data company; now that AI has arrived, everyone is an AI company. What are the main characteristics of "Agent Washing" in today's market?
Sun Xin: The characteristics are very obvious. One is simple and crude: changing the name. A former RPA company or application company directly becomes an AI Agent company. This is a kind of "minimalist narrative" on the marketing side. It may not care much about product logic, but it is very eye-catching and can attract ordinary users.
AI products are interesting in this respect. They appeal more to prosumers (consumers who are also involved in production) than to general consumers, and they reach these professional users very effectively. So once a vendor renames its product an Agent, it strongly attracts consumer-side (to-C) users who like to dig into new products and try them. Many vendors' products were called Copilot a few months ago and have since been renamed Agent.
Let me share a Gartner data point: our inquiry volume on Agents grew by 750% from Q2 to Q4 of 2024, which is a striking number. However, according to our research on client-side (Party A) Agent deployments, the success rate is below 30%. The market looks lively, but the share of successful deployments is actually very low.
So this "washing" may count as a success for many vendors (Party B), but it is far from a success for clients (Party A).
Cui Qiang: You mentioned two statistics just now. Inquiries about Agents have increased by 750%, but fewer than 30% of customers have successfully deployed an AI Agent. What are the main scenarios for successful deployment?
Sun Xin: In fact, many of the successful deployments in that under-30% group may not be AI Agents in the strict sense; they are more likely workflows. The relatively successful scenario categories we have seen are customer service, agentic RAG (retrieval-augmented generation) over knowledge bases, and coding (software development).
Cui Qiang: Are the overall trends at home and abroad similar, or are there differences?
Sun Xin: There may be more choices abroad. Whatever the technology product, there is always a "buy or build" choice.
In the early days of ChatGPT, we saw a clear difference between the Chinese and overseas markets: Chinese enterprises mostly chose to build, while overseas enterprises mostly chose to buy, much as they do with SaaS. In building for themselves, however, Chinese enterprise customers may run into many technical limitations.
However, since January this year, and especially with the rise of DeepSeek, the situation has changed significantly, helped by what reasoning models can do for Agents. According to a survey we conducted in June last year, the success rate of generative AI deployments among Chinese enterprise customers was 8%, while the global success rate at the time was around 23-24%. This year's figure has not yet been released, but Chinese enterprises are already very close to the global level. So DeepSeek has been a driving force here.
Cui Qiang: Yes, even people with no connection to the IT industry are talking about how to use DeepSeek. You mentioned some figures just now, and also the Gartner technology curve. Where does this wave of generative AI stand on that curve?
Sun Xin: Let me briefly introduce the Gartner technology cycle. Gartner believes that technology trends basically all follow a technology maturity curve (the Hype Cycle). It begins with the Innovation Trigger: when a new technology achieves a breakthrough or is widely publicized, it arouses great interest from the media and the industry. Next comes the Peak of Inflated Expectations, when the outside world pours too much enthusiasm and unrealistic expectation into a technology trend. Some leading companies may promote it vigorously, but often only a small number succeed.
I think generative AI is currently in the period of inflated expectations, and Agents in China are at the peak. Expectations are very high; some vendors even claim to have built a universal AI agent, which is in fact unlikely. After the peak, a technology falls into the Trough of Disillusionment, dropping from the highest point to the lowest, and then gradually climbs the Slope of Enlightenment. As it is integrated with new technologies and as more commercial methodologies and tools take shape, it eventually reaches the Plateau of Productivity.
(Image source: Gartner website)
At each stage we give customers guidance and evaluate technology investment strategies. On our technology maturity curve, AI Agent carries a very high benefit rating.
In addition, we predict that AI Agents may reach production maturity within two to five years, and that the realization of AGI will come earlier. To some extent, Agents give many companies a good channel for realizing their ambitions, helped by the emergence of supporting technologies such as MCP. Many companies feel that large models, which could not do real work in the past, can now help them get work done. This is also an example of vendors delivering on some client expectations.
Cui Qiang: The way capital once favored SaaS has reappeared in the AI sector. I feel AI may move through the Gartner technology curve faster than SaaS did. On AI technology investment, and given the current state of AI in China, what advice would you give vendors?
Sun Xin: I suggest being fairly cautious in exploring AI Agents. First, AI Agents still have many bottlenecks and technical limitations, and we cannot yet build a very good Agent.
Second, the most common problem is reliability. Today's AI Agents rely on some unreliable components, the most common of which is the large language model. Suppose an AI Agent workflow has 10 steps, each based on large-model reasoning and each with roughly a 10% chance of error. If the errors are independent, the chance that all 10 steps are right is 0.9 to the 10th power, about 35%, so the end-to-end accuracy of the Agent may be only about one third. This accumulation of errors is unacceptable to enterprises.
Third, cost. Letting AI Agents burn tokens on tasks regardless of cost may not be a suitable choice for enterprises. So the question more enterprises face is: is it really necessary to build an Agent, or to turn this project into an Agent? After all, an Agent's complexity grows in proportion to its value.
In addition, Agents do not suit every application scenario. Not every scenario and not every company needs an Agent right now.
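The compounding arithmetic behind the reliability point above can be checked with a minimal Python sketch. It assumes that errors at each step are independent, which the interview does not state explicitly, and the function name is illustrative only.

```python
def chain_success_rate(step_accuracy: float, steps: int) -> float:
    """Probability that every step succeeds, ASSUMING independent step errors."""
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_accuracy ** steps


if __name__ == "__main__":
    # 90% per-step accuracy corresponds to the 10% error probability in the text.
    for steps in (1, 5, 10, 20):
        rate = chain_success_rate(0.9, steps)
        print(f"{steps:>2} steps at 90% per step: {rate:.1%} end-to-end")
```

Under this assumption, a 10-step workflow at 90% per-step accuracy lands near 35%, in line with the "about one third" figure quoted in the interview, and the rate keeps falling as steps are added.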
Cui Qiang: You mentioned that AI Agents suit only certain scenarios. Which scenarios in China can be implemented with AI Agents at low cost and with reasonable reliability?
Sun Xin: It is best to look at this from four dimensions. First, complexity: is the task complex enough, with enough steps, and carried out in a sufficiently uncontrollable external environment? Such scenarios may suit AI Agents better. Second, benefit: can it bring enough benefit? Third, the feasibility of existing technology. Fourth, the error rate of the task.
Put these dimensions together and you will find that writing code is very well suited to an agent. First, writing code is complex enough. Second, hiring programmers is very expensive, especially in North America. Third, AI is now fairly reliable at writing code; Claude 3.7 Sonnet, for example, does it very well. Finally, code can be verified through very rigorous testing.
Using these four dimensions together, we can make a sound judgment about whether a task is worth implementing with an Agent.
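One way to make the four-dimension check concrete is a simple scoring rubric. The sketch below is an illustration, not a Gartner tool: the 1-5 scale, the field names, the "every dimension must reach a minimum" rule, and the example scores are all assumptions introduced here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ScenarioScore:
    """Hypothetical 1-5 ratings for the four dimensions named in the interview."""
    complexity: int      # enough steps, uncontrollable external environment
    benefit: int         # size of the payoff if the task is automated
    feasibility: int     # how well existing technology handles the task
    error_control: int   # how cheaply errors can be caught (e.g. by tests)

    def __post_init__(self) -> None:
        for f in fields(self):
            value = getattr(self, f.name)
            if not 1 <= value <= 5:
                raise ValueError(f"{f.name} must be between 1 and 5, got {value}")


def weakest_dimension(score: ScenarioScore) -> str:
    """Name of the lowest-rated dimension (first one wins on ties)."""
    return min(fields(score), key=lambda f: getattr(score, f.name)).name


def is_agent_candidate(score: ScenarioScore, minimum: int = 4) -> bool:
    """Assumed rule: a scenario qualifies only if every dimension reaches `minimum`."""
    return all(getattr(score, f.name) >= minimum for f in fields(score))


if __name__ == "__main__":
    # Illustrative scores only; they are not measurements from the interview.
    coding = ScenarioScore(complexity=5, benefit=5, feasibility=4, error_control=5)
    print(is_agent_candidate(coding), weakest_dimension(coding))
```

Requiring every dimension to clear a bar, rather than averaging, reflects the point that a task must satisfy all four conditions at once; a weighted average would be an equally defensible design.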
Cui Qiang: A viewer asked: what are some good examples of agents that have been successfully commercialized?
Sun Xin: You may have heard of several agents that are well recognized in the market. For example, OpenAI's Deep Research is a research-oriented agent; on the coding side there are Cursor and Devin. These are all very good agents that can solve a certain type of problem end-to-end on a unified platform.
Cui Qiang: They are all foreign products. Are there any successful examples in China?
Sun Xin: At present, I have not clearly seen a standout Agent company in China, meaning a company that offers a ready-made and reasonably good Agent product. But there are indeed many good Agent Builder (agent-building platform) companies that provide toolsets for enterprises to build their own Agents. There are quite a few of them.
Return to customer expectations: How to define AI Agent?
Cui Qiang: When I read the "Agent Washing" article, my first impression was that the standard is very strict. By this standard, almost no product in China, or only a handful, can be called an AI Agent. Why define it so strictly? Could you walk everyone through the standard?
Sun Xin: First, this is a big market trend. For AI Agents to attract so much capital, they must have painted a very grand vision, and their capabilities must match it. Gartner defines an AI Agent as an autonomous or semi-autonomous software entity that uses artificial intelligence technology to perceive, make decisions, take actions, and achieve the business goals of enterprises or individuals in digital and physical environments.
Let me go through the keywords. First, it is an autonomous or semi-autonomous software entity. A semi-autonomous form still counts as an agent: it can bring humans in at certain key points. But the most important thing is that it must be able to make decisions autonomously.
Second, it is a software entity rather than a large model. The software entity embeds AI components, but execution is still coordinated at the software layer, so it is the software entity that ultimately does the work.
Third, it uses AI technology, and that does not mean only something built on a large language model can be called an AI Agent. Before large models emerged, many companies had already tried building Agents with more traditional machine learning, or with symbolic AI to make results more predictable and stable, or even with plain code for some of the work. These can still be called AI Agents.
An AI Agent needs to use AI technology to perceive and obtain external information. Perception may be a major bottleneck in current technology, because the external environment may first have to be brought together in a unified process, or even within a unified cloud platform or a single large vendor's environment.
Fourth, it makes decisions: an AI Agent may call different functions to draw up action plans and decide what to do. Fifth, it takes action, meaning it gets the job done: the Agent calls tools, interfaces, skills, and functions to affect the target environment.
These keywords reflect what our customers have told us, or what they expect from AI Agents. The current market is indeed full of hype, and such a vision is hard to realize.
At present, very few software entities in the Chinese market can be called Agents. Enterprises may care more about how to build Agents that meet their needs on an Agent Builder platform.
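The perceive, decide, act sequence in the definition above can be sketched as a small loop. This is a toy illustration under stated assumptions, not Gartner's reference design: the inventory scenario, the reorder thresholds, and every name below are hypothetical. The decision step is deliberately plain rule-based code, consistent with the point that an agent need not be built on a large language model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Environment:
    """Toy target environment: a stock level plus a log of actions taken on it."""
    stock: int
    log: List[str] = field(default_factory=list)


def perceive(env: Environment) -> Dict[str, int]:
    """Perception: read the state the agent is allowed to observe."""
    return {"stock": env.stock}


def decide(observation: Dict[str, int],
           reorder_point: int = 20, target: int = 100) -> Tuple[str, int]:
    """Decision: a rule-based policy (hypothetical thresholds) picks the action."""
    if observation["stock"] < reorder_point:
        return ("reorder", target - observation["stock"])
    return ("wait", 0)


def reorder(env: Environment, quantity: int) -> None:
    env.stock += quantity
    env.log.append(f"reorder {quantity}")


def wait(env: Environment, quantity: int) -> None:
    env.log.append("wait")


# Action: the software layer maps each decision to a tool that changes the environment.
TOOLS: Dict[str, Callable[[Environment, int], None]] = {"reorder": reorder, "wait": wait}


def run_agent(env: Environment, demand: List[int]) -> None:
    """One perceive-decide-act cycle per demand event."""
    for sold in demand:
        env.stock -= sold  # the environment changes outside the agent's control
        action, quantity = decide(perceive(env))
        TOOLS[action](env, quantity)


if __name__ == "__main__":
    env = Environment(stock=50)
    run_agent(env, demand=[20, 15, 10])
    print(env.stock, env.log)
```

The split into `perceive`, `decide`, and `TOOLS` mirrors the keywords in the definition; swapping `decide` for a model call would change the policy without changing the software entity that coordinates the work.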
Cui Qiang: You mentioned that Chinese companies still prefer building their own products to buying mature commercial ones, and that there are almost no mature agent products in China. After the DeepSeek wave, do many CIOs and enterprises now have a way to build AI Agents quickly?
Sun Xin: The emergence of DeepSeek, and of the open-source models released by many large foreign companies, has narrowed the gap between many companies, but it has had little impact on some leading client-side (Party A) companies. It has opened new possibilities for Tier Two (second-tier) suppliers and for companies that previously found it very hard to get access to large models, but in many cases companies still have to seize the opportunity themselves.
What enterprises may need to think about more is how to create real synergy between their data and large models and put those models to work for themselves. If everyone uses the same large model, how do you stand out? For a supplier, how to use large models to realize its vision and build its own moat is really worth thinking about.
In addition, we are now designing workflows very ingeniously and building agents through AI engineering, but this may cause problems at some point in the future, because large-model capabilities will extend to more and richer scenarios. A model can process multimodal data on the one hand, and on the other it may itself become a tool, or evolve from a large language model into a large action model.
That means you may have built a lot of agents today, but with a little more effort from the large models, those capabilities will simply be replaced.
So what we should do is not wait for a new large model to appear, but consider how to combine our own data and corporate know-how, add new capabilities through reinforcement learning, and build our own moat. This is very important.
Cui Qiang: That is indeed the case. There is a comment here saying that a good agent must have data boundaries, and that the quality of the data is the quality of the agent. What do you think of this view?
Sun Xin: Good data quality may determine the quality of the large model itself. And since enterprises have basically realized that a general-purpose agent is not feasible, what matters more is defining the agent's boundaries clearly: document processing, data acquisition, visualization, and so on. How the boundaries are drawn, and how the agent completes its work within a predefined workflow and efficiently delivers what the enterprise wants, is what defines an agent's boundaries.
Most companies will probably need multi-agent collaboration; a single agent can hardly meet customers' expectations. Of course, the better and higher-quality your data, the stronger the foundation.
Cui Qiang: You mentioned earlier that DeepSeek gives small and medium-sized enterprises, which lack the budgets of leading enterprises, a way to build AI agents. How can they find the best approach? What suggestions do you have?
Sun Xin: Although there are very few good, ready-made agent products to buy directly in China, our suggestion for anyone entering this field is to first learn from the buy-to-build path some AI pioneers have taken, or to try both paths in parallel.
The most successful pilots we have seen focus on demonstrating business potential rather than technical feasibility. If a technology pilot only verifies that existing AI methods work in a company's systems or workflows, it will bring only minor improvements and will miss the truly transformative power the technology may have.
Therefore, we do not recommend that companies build a general-purpose agent right now. Instead, we recommend first doing one thing well that you believe can benefit the company and realize business potential, and only then considering a multi-agent model.
AI Gravity + User Usage: The "Moat" in the Era of Large Models
Cui Qiang: DeepSeek has made everyone very anxious. Almost every industry will be reshaped and transformed by AI. Focusing on enterprise software and SaaS, what will the relationship between Agents and SaaS or enterprise software look like in the future? I would like to hear your view, or what Gartner has observed.
Sun Xin: This is a very interesting question. We often discuss what kind of "love-hate" relationship will exist between large-model vendors and agent vendors, because agents need to call tool capabilities, and those may come from SaaS and traditional software.
From the SaaS vendors' point of view, as long as their tools get used more and generate usage, they are happy to be called. But large-model vendors also want to do this work and are capable of building the tools themselves, in which case they have no need to connect external tools through MCP. That amounts to doing the SaaS vendors' original work inside the large model, much like the relationship between large cloud vendors and ISVs.
We have seen similar signs abroad. For example, OpenAI's Deep Research does not have an open API. OpenAI hopes users will open the ChatGPT interface and use the large model as an Agent, so that ChatGPT becomes a platform that can deliver all kinds of business capabilities in the future.
For front-end SaaS vendors and tool vendors, the short-term approach may be to build various agents on their own platforms, as Salesforce does, drawing on their own know-how. But in the future they will certainly consider building their own large models, so that they have a "central brain" and keep an AI gravity that pulls customers in, which is also very important for SaaS vendors.
In the long run, SaaS vendors and tool vendors will also invest in building their own models, as Perplexity has done in building its own large model. Everyone will focus more on the user side, gradually shifting from the data gravity and platform gravity of the past to AI gravity.
Why "AI gravity"? In the past we often talked about data gravity: if I hold an enterprise's data, I have more opportunities to attract it to build application capabilities on my data platform. For example, an enterprise that buys a database from a vendor is very likely to buy that vendor's data analysis products on top of the database and to build applications there.
AI gravity draws its pull from the unique agentic experience it offers. It also makes many companies willing to buy more tools and capabilities on a platform because of its AI capabilities. So future competition will certainly be over AI gravity.
In AI development, East and West actually apply very different yardsticks. Mainstream Western media, including in the United States, may pay more attention to the capabilities of the large model itself, while China may pay more attention to basic indicators such as daily and monthly active users.
In some recent interviews, OpenAI CEO Sam Altman also said that building the best model is not necessarily the most important thing; the more effective path is to have 1 billion daily active users on the platform.
Recently, GPT-4o launched an image generation capability that brought more users onto the platform. That is using AI gravity to attract more usage and thereby build the strongest moat. AI gravity plus user usage is undoubtedly a very good combination.
Cui Qiang: You mentioned the love-hate relationship between large vendors and SaaS. Large vendors and ISVs already went through one round of it in the early days, and now it may be repeating. How will the relationship between vendors such as Salesforce and general-purpose large-model vendors change? Do SaaS vendors also need to build their own AI ecosystems and grow together with the ISVs in them? What will this look like on the SaaS or ISV side?
Sun Xin: Salesforce and ServiceNow have very distinctive experience in their own fields and their own moats. In the short term they launched various GPTs, and today it is various Agents. They may be the ones with the best sense of boundaries; for now they are simply using a better agentic interface to give users a better experience.
For these vendors, the next step will be to prioritize the data layer. For example, Salesforce has repeatedly emphasized the importance of its Data Cloud in recent earnings reports and earnings calls.
This shows that data is what Salesforce considers truly important, and it will certainly train a model that better fits its own business context on its own data.
Salesforce built this platform-based way of agent collaboration partly to give users a better experience, and partly to make customers stickier by showing them a scenario, or a possibility, for using Salesforce in the future.
By comparison, Chinese companies may be more pragmatic. They do not care which model sits behind the scenes. They focus on actual value rather than on who wins on benchmark scores.
Cui Qiang: That is more pragmatic and down-to-earth. A viewer asked: how can Agents better handle end-to-end problems?
Sun Xin: In the past, building an Agent was more of a full-stack, in-house development process. The emergence of MCP has brought a shift from full-stack in-house development to protocol-based assembly. Many small and medium-sized enterprises can use MCP to assemble standardized parts into Agents. It cannot be called end-to-end, but it is a more time-saving and manageable way to build Agents.
Another way is to use the Agents launched by large companies such as ServiceNow and Salesforce, where all the applications grow on the vendor's own platform.
Cui Qiang: In the future, will SaaS companies, for example large platforms like Salesforce, be able to call capabilities outside their platform through MCP? Suppose ServiceNow needs a sales module: could that also be called through MCP? From the customer's perspective, solving a specific scenario may involve calling many products from different platforms and vendors. Is this possible?
Sun Xin: This depends on the further development of the MCP protocol. On the one hand, it depends on whether these large companies and SaaS vendors will open their own MCP services; on the other hand, it also depends on whether customers really need it.
If a Chinese customer has never used Salesforce services, will he choose to connect to Salesforce Agent?
In addition, cost is also an important factor. If something that used to cost 10 yuan now costs 15, it comes down to whether the customer is willing to pay. This is a very practical question. Do many companies really need to build an Agent Swarm (a multi-agent framework) to realize their digital ambitions? In fact, they do not.
Cui Qiang: What you said is very pragmatic. Don't chase technology for its own sake. The most important thing is to solve the enterprise's core problems with the least money.
Speaking of connection and openness, I visited a Chinese RPA + Agent company a few days ago. They also said that MCP can connect everyone, but the problem is that no one opens up their interfaces, so they can only fall back on the original RPA approach, which may be more practical. So we cannot ignore the current situation in China.
In addition, on the customer side, when choosing between in-house development and procurement, Chinese customers are still not very receptive to commercial products. There have been changes over the years, but not much has changed in essence. What do you think the core problem is?
Sun Xin: The build-versus-buy discussion is ongoing. The ecosystem in China is relatively closed. Abroad we see so-called data ecosystems or cloud ecosystems, where each vendor earns its own share and customers make the choice: as long as they use your product, you get the revenue.
Chinese vendors, however, may prefer to develop in a closed way rather than connect with others, which is why so many say they do end-to-end business. Of course, this is also something clients (Party A) ask for, which is understandable.
From the client's perspective, the ratio of in-house development to procurement can be decided by scenario. For a core, differentiating business, the share of in-house development may be higher; for a generic, high-frequency need such as document classification, it can be bought outright, and the share of in-house development can be very low. This requires a clear-headed top AI leader in the company to judge whether a scenario calls for building or buying.
Metrics for AI investment and the three steps to implementation
Cui Qiang: A few days ago we were joking that some bosses say they will draw up an AI plan for the next three years, yet AI changes almost every week, so even a one-month plan may feel long. What suggestions can you give CEOs, for example on which KPIs to use to measure the CIO's contribution in this AI wave?
Sun Xin: We have a lot of research in this area. Here I will briefly describe a four-layer evaluation model. The first layer is the efficiency layer: comparing work with AI and without AI, how much is the time spent on enterprise tasks reduced? This is a defensive KPI.
The second layer is the quality layer, which is a more aggressive KPI. In the past, we always worried that large models would hallucinate or fall short of enterprise needs. Take Chat BI: the accuracy of its decisions deserves scrutiny. Getting large models to meet enterprise expectations when making decisions, or at least to reach an acceptable accuracy, is why the quality layer is so important.
The third layer is the financial layer, which is also a relatively aggressive indicator. For example, there were things we could not do in the past, but now, thanks to DeepSeek and various open-source models, we can develop AI-enabled products that bring the company new sources of revenue growth. That is a financial-layer indicator.
The fourth layer is security, which is the bottom line. Many companies may not have paid much attention to it in the past, but MCP is now being adopted widely, and because it is still in its infancy there may be many security risks. How can we minimize the potential risks? In other words, how can we protect the enterprise's most important data assets while making good use of AI capabilities? This is also something CIOs need to watch.
Simply put: efficiency, quality, finance, and security.
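The four layers can be collected into a small scorecard. The sketch below is only an illustration: the interview names the layers but gives no formulas, so the specific metrics (time saved, decision accuracy, net new revenue, incident count), the field names, and the sample numbers are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PilotMeasurements:
    """Hypothetical raw measurements from one AI pilot."""
    minutes_without_ai: float   # task time before AI
    minutes_with_ai: float      # task time with AI
    correct_decisions: int      # decisions judged acceptable
    total_decisions: int        # all decisions reviewed
    new_revenue: float          # revenue attributed to AI-enabled products
    ai_cost: float              # spend on models, tokens, and engineering
    security_incidents: int     # incidents traced to the AI system


def scorecard(m: PilotMeasurements) -> Dict[str, float]:
    """Map measurements to one assumed metric per layer."""
    if m.minutes_without_ai <= 0 or m.total_decisions <= 0:
        raise ValueError("baseline time and decision count must be positive")
    return {
        # Efficiency (defensive): share of task time removed by AI.
        "efficiency_time_saved": (m.minutes_without_ai - m.minutes_with_ai)
        / m.minutes_without_ai,
        # Quality: share of AI-assisted decisions that were acceptable.
        "quality_accuracy": m.correct_decisions / m.total_decisions,
        # Finance: new revenue net of AI spend.
        "finance_net_revenue": m.new_revenue - m.ai_cost,
        # Security (bottom line): any value above zero needs attention.
        "security_incidents": float(m.security_incidents),
    }


if __name__ == "__main__":
    # Illustrative numbers only.
    pilot = PilotMeasurements(60, 45, 180, 200, 50_000.0, 30_000.0, 0)
    for layer, value in scorecard(pilot).items():
        print(f"{layer}: {value}")
```

Keeping the four values separate, rather than folding them into one score, preserves the distinction drawn above between defensive, aggressive, and bottom-line indicators.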
Cui Qiang: You also wrote a three-step roadmap for building agent capabilities. What suggestions would you give us in this regard?
Sun Xin: We do have an AI Roadmap design, though it is not as simple as three stages. Here I can briefly describe three steps. First, run capability pilots. In choosing pilots, you have to give some capabilities up; some things simply cannot be done. Where pilots succeed, you can try extending them across multiple business lines.
Second, capability expansion. Capabilities that enterprises may not have cared about in the past can now be put in users' hands. Can enterprises cultivate more prosumers who develop the applications they want, or even some agent capabilities, themselves?
Third, ecosystem integration: how to integrate AI-enabled products into the larger business model and create new sources of revenue for the enterprise in a larger environment. Together, these three stages are about making the leap to intelligence.