AI application development should not rely on large model iteration

Written by
Jasper Cole
Updated on: June 30, 2025

New thinking on AI entrepreneurship: how to avoid the high-cost trap of large model iteration.

Core content:
1. High cost and diminishing marginal benefits of large model iteration
2. Real challenges and paradoxes of commercial implementation
3. Lightweight development and hybrid architecture deployment focusing on scenarios

Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)

While speaking with investors yesterday, this topic came to mind: large models iterate too quickly for entrepreneurs to keep up, and chasing that pace is too expensive. AI entrepreneurs need to filter out which large-model capabilities their product actually depends on; otherwise they will suffer for it. We have been clear about this from the very beginning.


1. Diminishing marginal benefits of technological iteration

In the development of large models, diminishing marginal returns on technology iteration have become increasingly pronounced. On one hand, computing costs are growing exponentially: the training cost of GPT-4 reportedly exceeded 100 million US dollars, an astronomical figure that keeps most small and medium-sized developers off the large-model track entirely. Such high compute costs not only limit the spread of large-model technology, but also put companies under enormous financial pressure when they invest in research and development.

On the other hand, the risk of data contamination has intensified, becoming another obstacle to the development of large models. With the widespread adoption of AI, AI-generated content is said to account for 32% of public data on the Internet. The quality of this generated content varies widely, and it can easily pollute the training data of large models, degrading their accuracy and reliability.

A large model trained on polluted data is like a skyscraper built on sand: the foundation is unstable and may collapse at any time.

On the MMLU benchmark, which measures model capability, accuracy rose from 70.0% for GPT-3.5 to 86.4% for GPT-4. That is real progress, but the rate of improvement has slowed markedly.

This suggests that it is becoming increasingly difficult for large models to achieve a qualitative leap within the current technical framework; development appears to have hit a bottleneck.

2. The paradox of commercial implementation

Data I have seen from several leading cloud vendors shows that 82% of enterprise large-model projects stall at the PoC (proof of concept) stage and never reach practical use. After pouring resources into these projects, enterprises fail to obtain the expected commercial returns, an enormous waste.

Take digital-human livestreaming as an example. We saw a brand invest 2 million to develop digital-human hosts, hoping for a breakthrough in livestreaming: lower labor costs and higher efficiency. Yet the actual conversion rate was only 15% of that of real-person livestreams, a disappointing result.

Although digital-human hosts theoretically offer advantages such as 24-hour uninterrupted broadcasting at relatively low cost, in practice they cannot build the emotional connection that a real-person host does; lacking authenticity and warmth, they leave audiences with little intention to buy.

Meanwhile, open-source digital-human systems iterate very quickly, far faster than the digital humans many AI entrepreneurs build themselves.

In this area, one developer survey found that 67% of developers believe the update speed of digital-human models has outpaced businesses' ability to adapt.

So our advice to new AI entrepreneurs is to avoid "large-model dependency".

Instead, focus on lightweight, scenario-focused application development. This approach emphasizes targeted optimization of models for the needs of a specific scenario, trading the pursuit of large, all-purpose models for precise application within that scenario, which lowers development costs and makes the application more practical.

Beyond scenario-focused lightweight development, flexible deployment on a hybrid architecture is another important path for AI application development. This approach combines the strengths of large and small models, choosing how each is deployed according to business needs and scenario, to achieve optimal resource allocation and better application performance.
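One common way to realize this kind of hybrid deployment is a complexity-based router: simple, high-volume requests go to a small local model, and only complex requests fall through to a larger cloud-hosted model. The sketch below is a minimal illustration under that assumption; the model calls, the complexity heuristic, and the threshold are all hypothetical placeholders, not any vendor's actual API.

```python
# Minimal sketch of hybrid-architecture routing between a small local
# model and a large cloud model. Model calls are stubbed placeholders.

def estimate_complexity(query: str) -> float:
    """Crude complexity proxy: longer, multi-clause queries score higher."""
    clauses = query.count(",") + query.count("?") + 1
    return min(1.0, len(query) / 200 + clauses * 0.1)

def call_small_model(query: str) -> str:
    # Stand-in for an on-premise small model (hypothetical).
    return f"[small-local-model] answer to: {query}"

def call_large_model(query: str) -> str:
    # Stand-in for a cloud-hosted large model (hypothetical).
    return f"[large-cloud-model] answer to: {query}"

def route(query: str, threshold: float = 0.5) -> str:
    # Cheap path first; escalate only when the query looks complex.
    if estimate_complexity(query) < threshold:
        return call_small_model(query)
    return call_large_model(query)
```

In a real system the threshold would be tuned against observed quality and cost, and the complexity estimate might itself be a small classifier rather than a length heuristic.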

Of course, AI application developers should also build an antifragile application ecosystem. This requires caution and rationality in technology selection, avoiding blind reliance on large models, and creating value along multiple dimensions so users get more efficient, higher-quality, and safer services.
At the same time, a continuously evolving agile development model lets developers respond quickly to market changes, keep optimizing application performance, and improve the user experience.

We also have advice on technology selection, summarized as the "three no's", intended to reduce technical risk and improve the stability and sustainability of applications.


The "three no's" are: do not blindly pursue parameter scale, do not rely on a single cloud service, and do not ignore traditional technologies.


Do not blindly pursue parameter scale. In the large-model wave, many assume that the larger the parameter count, the stronger the model. That is not always true: a 10B-parameter model, for example, reportedly reached 93.03% accuracy on a specific mathematical-reasoning task, far surpassing GPT-4.


This shows that model performance depends not only on parameter scale but also on architecture, training data, and algorithms. When choosing a model, developers should weigh these factors against the specific scenario and requirements rather than blindly chasing scale. Pursuing large-parameter models inflates development cost and compute consumption, and may even reduce generalization, failing to deliver optimal performance in practice.


"Do not rely on a single cloud service" is easy to understand: given compute costs, we will certainly not limit ourselves to one cloud provider. But why do I say we should not ignore traditional technologies?
Take Red Bear AI's own customer-service scenario as an example. When building a customer-service system for an enterprise, it adopted a hybrid rule-engine + AI architecture.
The rule engine handles large volumes of common questions quickly, with high efficiency and accuracy, while the AI handles complex, personalized questions and provides more intelligent service.
Through this hybrid architecture, the company achieved efficient problem handling and high-quality customer service, demonstrating the value of combining traditional technology with AI. When selecting technology, developers should not dismiss traditional techniques; instead, combine them organically with emerging AI so each plays to its strengths and users get a more complete service.

3. An AI development philosophy that returns to the essence of applications


The development of large models has undoubtedly advanced AI technology, but when we are caught up in the arms race of "more parameters, more data", we tend to overlook that real innovation comes from a deep understanding of the scenario and the pragmatic application of technology.

The future of AI application development belongs to the "scenario-deepeners" who can grasp technology trends while staying rooted in industry pain points. Shedding excessive reliance on large models and returning to the essence of applications is what will drive innovation and inject new vitality into AI. That is the fundamental point.