How can Agent break through the imagination of large models?

Written by

Silas Grey

Updated on: June 18, 2025

The event lasted more than two hours and included many lively discussions. I have compiled the most valuable information that can be disclosed and am sharing it here, hoping to give you a clearer picture of where agent development currently stands.

Q: When did you feel that agents were accelerating? What is your overall impression, and are there any clear keywords for this year?

Ji Gege: One thing I have felt strongly is that model capability improves in stages.

Demand for agents in this generation is very different from demand in the previous Internet generation. In the Internet era, 2C came before 2B. This time, enterprise-level demand is advancing in parallel with 2C, but because 2B requires more certainty, it was very difficult in the early stage.

But large models have improved markedly in performance, accuracy, and cost. That is my strongest impression. A new business model is about to emerge.

Liang Yu: My view may be slightly more conservative. Agents have not yet arrived at scale, but there is a trend of acceleration.

One thing that is widely recognized, I think, is that DeepSeek has changed expectations on the B side to a great extent.

What we really need to accomplish on the C side is to go from tools, to logic, to decomposition, to truly helping users complete a closed loop. Segmentation is happening, but there is still a long way to go in terms of generalization.

Yang Jinsong: Since DeepSeek came out this year, some agent scenarios have been implemented, and the number of customer inquiries has increased significantly. In the first quarter of this year, we reached roughly 80% of last year's revenue. Customers are clearer about their scenarios than they were last year.

Hou Hong: 2B needs to be recognized. B-side customers are prone to implementation difficulties because their data is not ready. The B side is still exploring. We need to think clearly about how investing in agents builds competitiveness and how it differs from previous IT investment.

I don't think there are any surprises on the C side. In China, it is still a matter of losing money to gain publicity. From an entrepreneur's perspective, the business model behind it has not been figured out yet.

There is also an intermediate type between 2B and 2C: small B and big C. Some of these companies have quietly started to make money. The core is to build industry experience into the model, so that the marginal benefit of what it produces covers the marginal cost.

Q: I read an article some time ago that said the capabilities of today's large models are already pretty good. Even if they stop improving, there is still a long way to go in terms of applications.

Now that big companies are also working on agents, do you think the training, inference, and compute of large models are still not optimal?

Hou Hong: Regarding the relationship between the model and the agent, I think the model-as-product path has limitations, or boundaries.

The trend of absorbing agent capabilities into the model is real. From an entrepreneur's perspective, though, the response is to think about building an ecosystem of intelligent systems around it.

The model has the capabilities, but how can it be integrated with local private data, internal tools, and even internal IT systems? The model itself is always just a hub. When using the model to integrate more tools, you must know the model's training characteristics, how to train it on the tools, and where the model's boundaries are. Where are these boundaries likely to appear?

First, is there private data? Second, look for boundaries in the scenario. If your startup's scenario has no boundaries, you should cut your losses. If you can build your own barriers by other means, such as industry resources, you need to step outside technical thinking and see what position you can occupy in the overall landscape of the industry.

Ji Gege: As a developer, I don't think I will work on enhancing the model itself. In agent development, only by giving the large model complete and accurate context can we control it well.

On the other hand, if there are no industry barriers, does that mean there are no advantages? I think there may still be opportunities, such as doing certain engineering work with enough refinement. In other words, it is dirty, tiring work, but if the accumulation is deep enough, it can also become a barrier.

In addition, I will deploy a model privately. Once enough data accumulates, a model refined through timely online feedback and iteration may also become a barrier.

Hou Hong: I proposed a term called the "intelligent flywheel", which has three corners: model, data, and intelligence. For any enterprise, whether it is ToC or ToB, whether you are Party A or Party B, you must consider how to make the flywheel turn as fast as possible. That is the only competitive advantage you can rely on.

It is not enough to just do the dirty work. You must accumulate experience from the dirty work and keep growing. This kind of competitive advantage has to be dynamic.

Liang Yu: What are we worried about when we discuss the relationship between large models, distilled small models, and agents? It is nothing more than the worry that the agents we build will be crushed by the giants.

In practice, though, large models often do not work well when applied to different vertical industries, and incomplete data is one of the reasons. So-called industry know-how and industry data are the real factors that determine whether a good closed business loop can be formed. So an agent is definitely not just a thin layer.

Agent practitioners can look back at the last wave of autonomous driving. There were actually two groups of people: those who wrote papers and did research, and those who worked in traditional industries. In the end, the traditional industries were not swallowed up.

AI is similar. Technology is a good start, but it is definitely not everything. What determines life and death is still customers, products, needs, and how quickly you find them.

Yang Jinsong: When a human expert completes work in a specific field, he or she actually needs two types of knowledge. One type is called explicit knowledge, which is the text collected from documents. This type of knowledge is relatively easy to collect.

The second type is implicit knowledge. For example, when solving a specific task, what do you do in the first step and what do you do in the second? Which tools do you reach for first, and how do you process the result next? This kind of knowledge has rarely been well documented.

If you want the Agent to be able to work in a specific field, you actually need a combination of these two sets of data.
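As an editorial illustration of that combination, here is a minimal Python sketch that stores explicit knowledge (documents) next to implicit knowledge (an ordered procedure with the tool used at each step) and merges both into one context for an agent. The class names, fields, and the refund example are hypothetical and do not come from the panel; this is one possible shape, not a description of any speaker's system.

```python
from dataclasses import dataclass, field


@dataclass
class ProcedureStep:
    """One step of implicit, procedural knowledge (hypothetical structure)."""
    action: str  # what the expert does at this step
    tool: str    # which tool the expert reaches for


@dataclass
class DomainKnowledge:
    """Pairs explicit documents with the implicit procedure an expert follows."""
    documents: list[str] = field(default_factory=list)            # explicit knowledge
    procedure: list[ProcedureStep] = field(default_factory=list)  # implicit knowledge

    def build_context(self, task: str) -> str:
        """Assemble a single text context containing both kinds of knowledge."""
        lines = [f"Task: {task}", "Reference documents:"]
        lines += [f"- {doc}" for doc in self.documents]
        lines.append("Expert procedure:")
        lines += [
            f"{i}. {step.action} (tool: {step.tool})"
            for i, step in enumerate(self.procedure, start=1)
        ]
        return "\n".join(lines)


# Made-up example data for illustration only.
knowledge = DomainKnowledge(
    documents=["Refund policy v3"],
    procedure=[
        ProcedureStep("Look up the order", "order_db"),
        ProcedureStep("Check refund eligibility", "policy_checker"),
    ],
)
context = knowledge.build_context("Handle a refund request")
print(context)
```

The point of the sketch is only that the procedure is recorded as data alongside the documents, so it can be collected, reviewed, and improved in the same way.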

We have a view on how to enter an industry. If the industry has certain barriers, such as licenses or data monopolies, traditional companies may still build the agent themselves, or they will invest in some companies and cooperate with them.

The other situation is a field that requires expertise but has none of the barriers mentioned above. In that case, industry experts can be brought in to develop the product together, complete the cold start, and then optimize its capabilities through actual use.

Q: Everyone generally feels that agents are growing but have not yet exploded. Is that because an agent may not be particularly useful if the large model's ability is limited to text? Will agents have more application possibilities, or more chances of landing, only once multimodal ability becomes relatively strong?

Ji Gege: I have two children of my own. I often take photos to identify insects or objects in nature using intelligent agents such as Quark and Doubao.

I think there are two types of multimodality. One is the artistic type, like Midjourney. The other is the type grounded in clear logical reasoning. A year ago, logical reasoning was a hard problem, but now, on junior high school geometry Olympiad problems, the model can give the right approach, which is amazing.

So I think if we want to build a super agent, for example one with 100 million daily active users, then its multimodal image generation capability must reach a certain level. This is slowly happening now.

Liang Yu: How did the mass consumer market grow with each technological revolution? It came from changes in the way people interact: from DOS to Windows to mobile phones, and then to voice and multimodal photos. Each time, input gets simpler. Only then can more people take part.

Human perception covers vision, hearing, taste, touch, and more, and many of these senses have not yet been brought in. Truly combining the virtual and the real also requires multimodal capabilities.

Hou Hong: The agent needs to perceive its environment. When users can pose questions with pictures, without even describing them clearly in words, and the large model can still understand the intention, the resulting expansion of capabilities is very significant.

The same applies on the B side. In the past, it was difficult for Chinese small and medium-sized enterprises to adopt IT and go digital. One reason was that they had no professional software talent, and small business owners did not know how to use such complex software. Now agents can call various software through MCP without people having to learn it, and productivity will increase rapidly. Multimodal capabilities are also critical here.
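To make the idea of an agent calling software on the user's behalf concrete, here is a minimal Python sketch of name-based tool dispatch. It is not the MCP protocol or any MCP SDK: the registry, the `tool` decorator, the two business functions, and the hard-coded request are all hypothetical stand-ins. In a real setup the model would choose the tool and arguments, and the tools would be exposed by a server rather than a local dictionary.

```python
from typing import Callable

# Hypothetical in-process tool registry (an assumption for illustration).
TOOLS: dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function under a name the agent can refer to."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return register


@tool("create_invoice")
def create_invoice(customer: str, amount: float) -> str:
    # Stand-in for a call into real invoicing software.
    return f"Invoice for {customer}: {amount:.2f}"


@tool("check_stock")
def check_stock(item: str) -> str:
    # Stand-in for a call into real inventory software.
    stock = {"widget": 12}
    return f"{item}: {stock.get(item, 0)} in stock"


def call_tool(name: str, **arguments) -> str:
    """Dispatch a structured request to the registered tool."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)


# Stand-in for the model's decision; a real agent would produce this request.
request = {"name": "create_invoice", "arguments": {"customer": "Acme", "amount": 120.0}}
result = call_tool(request["name"], **request["arguments"])
print(result)
```

The business owner in this picture only states the goal. Choosing `create_invoice` and filling in its arguments is the part the agent takes over.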

Yang Jinsong: Many teams working on large models assume that human knowledge can ultimately be expressed through words and symbols. If the goal is the ultimate superintelligence, perhaps text alone is enough.

So why is there multimodality? Because this kind of interaction is needed to increase understanding of the outside world. It may not be the best way for models, but it is the most suitable way for humans and has the greatest value.

Q: Should agents be vertical or general-purpose?

In addition, targeting young people is a very good approach, because from the perspective of innovative product theory, young people are most receptive to new things and have a longer life cycle. What do you think of the Quark product?

Hou Hong: As for suggestions for Quark, the core issue is the positioning of Alibaba Group as a whole, which needs to be clarified. Whether to take the platform traffic route or the intelligent transformation route, the logic is different.

Whether to go general or vertical depends on the entrepreneur's DNA. If you can focus on what you want to do, can endure a long stretch without recognition, and have financial support, then you can go general.

To go vertical, you need industry resources. You need to think about which resources can be integrated with AI at the core. The barriers must come from a system, not from a model, because the model is public. The question is not how to make money in general versus vertical business, but how to make more money in the vertical business.

Yang Jinsong: First of all, I think there are roughly three levels when discussing the impact of a technology on society or industry:

The first level: will this new technology lead to a new business model that overturns the current one?

The second level: if large models or agents are integrated into the original business model, do new business models or new ways of playing emerge?

The third level is reducing costs and increasing efficiency. This one should prompt entrepreneurs to ask: is it possible to create a business model that is ten times more efficient than traditional players in a specific scenario? And can it occupy the core part of the value chain?

Sometimes we also use Tian Ji's horse-racing strategy: against traditional industries, we play to our technological advantage; against large companies, we play to our deep industry penetration.

Ji Gege: I was just thinking about a question: for an agent, what functions would users be willing to pay for?

For creative work like working on a paper, I am not willing to pay, because I think of myself as creative. But repetitive work brings me no potential value, so I am more willing to pay for it and save my energy for creative work. So on the C side, willingness to pay may be strongest for work users do not want to do themselves.

Regarding vertical versus general, my own experience is to start with one function, slowly add functions, call more tools, and solve more problems. In short, move slowly from vertical to general. The two are not contradictory.

When your user base is general enough, the data you feed into the model will determine the product you create, and it will eventually grow into a general product.

Liang Yu: The core contradiction here is the separation of training and inference. If the two can be combined so the model changes in real time, the capabilities will be formidable, and it will be one of the technological trends worth looking forward to.

Q: Search is a widely accepted usage habit for most people. From traditional search to deep search, how do you see it from a technical or product perspective?

Yang Jinsong: In fact, users understand the product far less clearly than we want to convey. Whether it is a technology product or a fast-moving consumer good, all the publicity is really just meant to make users remember a small piece of key information. That is presentation, and it should also become positioning.

In traditional search results, the first three results are the golden click positions, accounting for 90% to 95% of all clicks. The information on the second page may not even have a chance to be displayed. User expectations are based on the first few lines.

So Google works very hard to ensure that the information ranking is good, because good ranking means that the results provided meet the user's expectations. If the user finds it useful, he will come back to use it again. The first thing is to be fast, and the second is to be accurate.

Users are actually impatient and do not ask questions. They keep changing search terms to find the answers they want. From this perspective, it is very valuable to give users more useful information by drawing on the understanding, reasoning, and summarizing capabilities of large models.

Moreover, the purpose of a user's search is to make a decision. Therefore, AI search must also be able to solve a decision-making problem for users efficiently and well.

How to build a business model may be a problem. After all, we can no longer use the traditional advertising model. But I am confident that as long as there is value, there is opportunity.

Ji Gege: Deep search is definitely valuable. The traditional mode can only look at three or four links, but AI can directly help me look at 30 links, which expands the search scope, provides more accurate information, and improves efficiency.

Deep search is more like an analyst, while Deep think is more like a scientist or philosopher who helps users think repeatedly.

Hou Hong: There are two value propositions for AI search in the future. One is convenience and time saving; the other is accurate information that users can trust. But at present, the formal logic is accurate while the professional depth is still insufficient. So there may be vertical search in the future.

And it’s not just about searching. Searching is not for obtaining information, but for making decisions. Along this path, AI search will have many interesting functions and developments.

Q: Does Agent have a first-mover advantage? What factors should be accumulated to form a better advantage?

Liang Yu: When we talk about agents, there is an unavoidable implication, which is that we largely treat them as tools. However, it is not easy for tools to win mindshare with users on the C side. As long as better tools exist, users will switch, and stickiness may not be enough.

For the B-side, there is a first-mover advantage because the resources on the B-side are limited. Customers are unwilling to build and experience the same tools repeatedly. If data, know-how, and scenarios can be combined, it will definitely be a stronger barrier.

Yang Jinsong: Having a large number of users can help improve the product, which in turn helps acquire more users, and that can form a first-mover advantage. In the general agent scenario, the first-mover advantage is not obvious, because a large user base cannot be used to keep producing better results. Moreover, the technical threshold for agent development is relatively low.

Only if you can become a new entry point as early as possible can you have a first-mover advantage. For ordinary startups, the opportunity to be acquired is also very good. However, it is actually quite challenging to form a solid first-mover advantage and be able to maintain it.

Ji Gege: I want to talk about how to build a first-mover advantage. My own experience is that if you are not a genius, you may start by doing the dirty work. Then start with the first customer, and build up accumulation and trust.

Hou Hong: Enterprises need strategies because they need to have competitive advantages and thus obtain excess profits. If you only do the dirty work, it will be difficult to achieve excess profits.

I think that in addition to accumulating trust, it is also important to occupy a position. Find an ecological niche where you are not easily replaced. In addition, don't worry about being eliminated by competition from large companies, because these projects also build the team's capabilities, and the team can look for other new opportunities.

It is difficult for entrepreneurs to find something they can do for 20 years, so it is also important to maintain an entrepreneurial mindset.

Q: What are the bottlenecks in the full implementation of agents? What events or landmarks may appear in the next two years that would convince us that agents are accelerating?

Ji Gege: I think the hardest thing for agents to overcome on the C side is the inertia of existing users. This has to respect the law of time. Although I have found that many AI products already surpass the user experience of existing products, switching still takes time. Wait quietly, and time will prove everything.

On the B side, data completeness is an important bottleneck. Another is accuracy. In many fields, 90% accuracy is not enough. However, this should continue to improve. In October or by the end of this year, we should see stronger inference models, and the cost will be very low. This will accelerate the implementation of large models.

Liang Yu: The first industries to explode will definitely be those with strong data completeness, such as finance. Data completeness here refers to large amounts of structured data that have been processed.

For the C side, we need to watch the people who are most sensitive to new technologies. In these small circles, there will be trends that the mainstream has not noticed. If such a trend is related to AI, it will be a good breakout point.

Yang Jinsong: In the previous generation of the Internet, there was a typical way of assessing new technologies and new products: new experience minus old experience minus migration cost.
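As an editorial aside, that rule of thumb can be written as a small calculation. The sketch below assumes the three terms are combined by subtraction and scored on one shared, arbitrary scale; the function names and the example scores are hypothetical, chosen only to show how a product with a better experience can still fail to win users when switching is expensive.

```python
def adoption_value(new_experience: float, old_experience: float, migration_cost: float) -> float:
    """Net value of switching: new experience minus old experience minus migration cost."""
    return new_experience - old_experience - migration_cost


def worth_switching(new_experience: float, old_experience: float, migration_cost: float) -> bool:
    """A user is assumed to switch only when the net value is positive."""
    return adoption_value(new_experience, old_experience, migration_cost) > 0


# Made-up scores on an arbitrary 0-10 scale, for illustration only.
better_but_costly = adoption_value(8, 6, 3)  # better experience, but switching costs too much
clearly_better = adoption_value(9, 5, 2)     # the gain outweighs the switching cost

print(better_but_costly, worth_switching(8, 6, 3))
print(clearly_better, worth_switching(9, 5, 2))
```

In the first case the new product is better than the old one, yet the net value is negative. That matches the earlier point in this discussion that user inertia takes time to overcome.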

Currently, users often cannot tell whether AI results are credible. This has to be improved through technological progress. In addition, if AI is used at scale inside enterprises, organizational structure and culture must also adapt and change. Some functions even need to be redesigned and recombined, which takes time and requires a methodology. If the traditional model is kept unchanged, the value of AI may be discounted.

Hou Hong: In addition to technology and products, there is another dimension: capital investment that drives faster user education. The field I am focusing on now is called the Internet of Agents, rather than a single agent. At the ecosystem and network levels, these reinforce each other.