Live Review | No Longer Just "Paper Talk": How to Turn Large Model Capabilities into Real Business Value

Written by
Clara Bennett
Updated on: June 19, 2025

An in-depth live-broadcast analysis of how large models drive business efficiency, with insights from industry experts.

Core content:
1. How to select and evaluate application scenarios for large models
2. AI application practice: finance cases and ROI evaluation
3. How large models improve business efficiency, plus the conference schedule

Yang Fangxian
Founder of 53A / Tencent Cloud Valuable Professional (TVP)

As technology develops rapidly, the application potential of large models across industries has become increasingly clear, yet efficiently converting large model capabilities into real business value remains a core challenge for enterprises.

Recently, InfoQ's "Geek Meet" x AICon live broadcast column invited Zheng Yan, chief architect of AI applications at Huawei Cloud, to serve as host. Together with Yang Hao, senior technical expert at Ant Group, and Wu Haoyu, senior technical director at Minglu Technology, he discussed how large models can drive business efficiency, ahead of the AICon Global Artificial Intelligence Development and Application Conference 2025 Shanghai.

Some of the highlights are as follows:

  • When choosing a model, focus on three aspects: whether the scenario calls for reasoning or generation, the required context length, and response performance.

  • Developing AI applications is like running a factory. The work may look high-end, but in actual operation you still have to get into the "workshop" with customers and solve problems one by one.

  • An ideal AI agent should be similar to a living organism, with the ability to perceive, recognize and act, and be able to continuously iterate and provide feedback in practice.

At the AICon Global Artificial Intelligence Development and Application Conference, to be held in Shanghai on May 23-24, InfoQ has set up a special topic, "Practices for Improving Business Efficiency with Large Models". The topic focuses on key links such as model selection and optimization, application scenario implementation, and effect evaluation, sharing the hands-on experience of industry-leading companies.

At the conference, Wu Haoyu, senior technical director of Minglu Technology, will give a talk titled "Implementation of Generative Marketing Driven by Multimodal Large Models". You are welcome to attend in person.

Check out the conference schedule to unlock more exciting content: https://aicon.infoq.cn/2025/shanghai/schedule

The following content is based on the live broadcast transcript and has been edited by InfoQ.

Scenario Exploration

Zheng Yan: When exploring application scenarios for large models, companies often encounter demands that look attractive but are hard to implement. How do you judge whether a scenario is worth investing in as a real project?

Wu Haoyu: When enterprises apply AI, they need to focus on three key points: first, identify the most important problems that are worth solving; second, ensure there is high-quality, relevant data to support the AI application; third, target places where current efficiency is low or existing solutions perform poorly, so that AI as an auxiliary tool has clear room to improve efficiency.

When choosing AI application scenarios, enterprises should follow the principles of high frequency and high value. By identifying the most valuable and most frequent problems, they can define the scope of the solution and invest resources sensibly, ensuring visible results in the short term.

Yang Hao: AI applications in finance fall into three categories. First, improving the efficiency of basic operations: in the past it was hard to express audit rules clearly in hand-written code, and after applying AI the results in audit scenarios have improved significantly. Second, risk prevention and control: we build models around different indicators and use large models to analyze them and form SOPs. Third, creating incremental value: treasury investment scenarios in finance can use large models to optimize investment decisions.

When implementing specific scenarios, we focus on ROI: we evaluate the project requirements and the investment in people and GPUs, and ultimately determine whether the effect can cover the cost.

Zheng Yan: If we only count the cost of manpower and GPUs when estimating ROI, the investment is already very large. Will this limit our choice of scenarios?

Yang Hao: It does have an impact. For example, if we invest two people and two L20 inference GPUs and that saves the workload of five finance staff, then we consider the input-output ratio positive.

Although AI applications are not yet fully mature and initial technology costs are often higher than those of traditional approaches, in finance we prioritize scenarios across the three categories above.
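To make the arithmetic above concrete, here is a minimal sketch of this kind of ROI estimate; all cost figures are hypothetical placeholders, not numbers from the discussion.

```python
# Hypothetical ROI estimate for an AI audit assistant, following the
# "does the effect cover the investment?" logic described above.
# All monetary figures are illustrative placeholders.

def roi(annual_benefit: float, annual_cost: float) -> float:
    """Return ROI as (benefit - cost) / cost."""
    return (annual_benefit - annual_cost) / annual_cost

# Investment: two engineers plus two L20 inference GPUs (hypothetical costs).
engineer_cost = 2 * 400_000      # fully loaded annual cost per engineer
gpu_cost      = 2 * 60_000       # amortized annual cost per inference card
annual_cost   = engineer_cost + gpu_cost

# Benefit: workload of five finance staff saved (hypothetical salary).
annual_benefit = 5 * 250_000

print(f"ROI = {roi(annual_benefit, annual_cost):.0%}")  # positive -> worth doing
```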

Zheng Yan: The value and long-term trend of large models are indeed something we cannot ignore, but if we go all in and try to redo every scenario with large models, the cost will be very high. The key is to find a balance.

Internally, we have summarized a checklist for recognizing AI scenarios, called the "12 Questions for AI Scenarios". In short, it considers three dimensions. The first is business value, that is, commercial value: we don't measure ROI precisely to decide whether to proceed, but it is an important ranking factor. The next is maturity: as Mr. Wu mentioned, the readiness of the business, the data, and the technology.

Finally, we added one more dimension: whether there is the capability for continued operation. We generally find that after an AI application goes live, it cannot immediately match the performance of an ordinary employee, so continued investment in optimization and iteration is needed.

Wu Haoyu: Marketing work used to require heavy data support, mainly for writing reports and looking up data. We often used small models, which were cheap but inflexible: when we analyzed data in a new industry, the entities defined earlier could not adapt to the new needs, and we usually had to fall back on manpower and do a lot of manual labeling.

With a large model, things become simple. Business staff only need to define the entity terms for the new field, and the large model can identify them automatically. Social media insight reports can thus be customized by industry: the more detailed the customer's needs, the more detailed the report. Both the speed and the quality of reporting have improved significantly.

Zheng Yan: In the early stages of a project, how can we prove the cost-effectiveness of investing in a large model to decision-makers? What quantitative "value anchors" can you share?

Yang Hao: In finance, many issues can be measured by ROI. For efficiency-improvement scenarios, we measure by document volume: for example, if efficiency is improved through auxiliary tools or unattended mode, we calculate how many man-hours that mode saves.

Financial executives often care most about risk control rather than pure efficiency gains. In that case, we first measure the scenario's risk exposure and estimate what proportion of the risk can be covered after introducing the large model.

For incremental value creation, such as intelligent fund allocation, structured deposits, and quantitative investment, the gains directly appreciate the company's capital, so we can state clearly how much money was earned for the company.

In addition, scenarios such as tax planning can use large models to collect data that supports decisions. The benefits of these scenarios can be clearly measured, whether as reduced risk exposure or improved labor efficiency, and the investment cost can be roughly estimated up front. If the ROI is not negative, the boss is usually willing to invest.

Zheng Yan:  How are risks estimated?

Yang Hao: We usually scan the risk exposure to determine what proportion of the risk can be controlled. For example, during audits, finance sometimes reviews purchase orders or invoices blindly, and these can carry huge exposure, especially when a single item runs to hundreds of millions. When auditing with a large model, we audit these links one by one and use the model to control the corresponding share of the risk.

Zheng Yan:  In the end, manual verification is still required, right?

Yang Hao:  When the accuracy of the large model is high enough and stable, we can achieve unmanned operation in some scenarios.

Wu Haoyu: In marketing, we often don't focus only on money. We have worked with many multinational companies, and in these companies the China region pays particular attention to innovation. If a good AI application lands, it can become a chance to win recognition and greater support from headquarters.

We worked with a pharmaceutical client to help their internal consulting department improve the satisfaction of frontline staff. The company has a large number of business representatives who need to contact doctors, but because the medical field is highly specialized, frontline representatives often dare not ask questions directly, worried that asking too many will make them look unprofessional.

So we helped them build a knowledge-base application that, besides the query function, also includes internal training and exams. After this training and practice, their frontline representatives became more confident and dared to talk with doctors. Meeting frequency also increased, which greatly helped their sales work.

Technology Implementation

Zheng Yan: When choosing a large model technology route, different business scenarios may place completely different emphasis on model capabilities. Can you share your priorities in technology selection based on your own practice? And when transforming traditional systems, do you choose "disruptive reconstruction" or "gradual upgrade"?

Yang Hao: Different models suit different needs. When choosing one, we mainly consider three factors: first, whether the scenario centers on reasoning or on generation; second, context length, since some scenarios must process long contexts while others only need short ones; finally, response performance. Some scenarios require high-performance responses, and deep-thinking models in particular respond slowly, often taking seconds to minutes before they start returning results, which some applications cannot accept.
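As a rough illustration, these three selection factors can be captured in a small checklist; the thresholds and model names below are hypothetical, not from the discussion.

```python
from dataclasses import dataclass

@dataclass
class ScenarioProfile:
    """The three selection factors: task type, context, latency."""
    needs_reasoning: bool      # reasoning-heavy vs. generation-heavy
    context_tokens: int        # longest input the scenario must handle
    max_latency_s: float       # acceptable time to first result

def pick_model(p: ScenarioProfile) -> str:
    # Deep-thinking models respond slowly (seconds to minutes),
    # so rule them out first when the scenario is latency-sensitive.
    if p.max_latency_s < 5:
        return "fast-instruct-model"           # hypothetical name
    if p.needs_reasoning:
        return "deep-thinking-model"           # hypothetical name
    if p.context_tokens > 32_000:
        return "long-context-model"            # hypothetical name
    return "general-chat-model"

print(pick_model(ScenarioProfile(True, 8_000, 2.0)))  # -> fast-instruct-model
```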

In addition, whether to choose disruptive reconstruction or gradual upgrade depends on the specific scenario. There are three paradigms of AI application: AI Embedding, AI Copilot, and AI Agent. The first two tend toward gradual upgrades, while AI Agent tends toward disruptive reconstruction.

Especially in finance, the third paradigm (AI Agent) accounts for the majority, possibly more than 50%. Starting in the second half of 2023, we first did some AI Embedding work to integrate AI capabilities into the existing financial system.

Users do not perceive this as an AI application; they only see the automated process. For example, a robot pops up in the lower right corner of the interface, and users can interact with it to perform intelligent analysis, review, and other tasks. For AI Agent, we are defining digital employees, which in effect rebuilds the entrance to the entire financial system. That is a disruptive reconstruction.

Zheng Yan: The financial system is more digitally mature than other business areas. How can we adopt AI deeply and restructure more than 50% of such a mature digital system, and how can we make sure the business adapts to these changes?

Yang Hao: Finance does have many sub-fields, and each basically has its own backend management system. Under the new AI Agent model, we designed an AI Native financial system that provides a unified entry point and connects the various subsystems in the backend, with agents in different domains communicating through protocols.

From a business perspective, users no longer focus on the functions of each system but on their own business needs. Our internal slogan is "from function to service". Take expense reimbursement, for example, something every company deals with.

Traditional systems require users to manually walk through complex steps such as creating the voucher, review, and settlement. Now our system only requires the user to type a sentence such as "I want to file an expense report" and upload the invoice; the subsequent voucher creation and review are completed automatically. This is an important change in user experience.
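A minimal sketch of this "function to service" entry point follows, assuming a hypothetical intent classifier and agent registry; none of these names come from Ant's actual system.

```python
# Hypothetical unified entry: route a natural-language request to the
# domain agent that completes voucher creation and review automatically.

INTENT_AGENTS = {                      # hypothetical intent -> agent registry
    "expense_reimbursement": "reimbursement_agent",
    "tax_inquiry": "tax_agent",
}

def classify_intent(utterance: str) -> str:
    # Stand-in for an LLM intent classifier; simple keywords for the sketch.
    if "expense" in utterance.lower() or "reimburse" in utterance.lower():
        return "expense_reimbursement"
    return "tax_inquiry"

def handle(utterance: str, attachments: list[str]) -> str:
    agent = INTENT_AGENTS[classify_intent(utterance)]
    # The chosen agent would create the voucher, run the review,
    # and settle the payment without further user steps.
    return f"{agent} handling '{utterance}' with {len(attachments)} invoice(s)"

print(handle("I want to file an expense report", ["invoice_001.pdf"]))
```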

Wu Haoyu: Marketing clients care more about whether the model can mine relevant information from a wide variety of materials. For example, if a customer queries strawberry-related content but the historical reports only contain blueberry data, strictly excluding the blueberry content may mean no useful information can be provided. So the model needs a certain degree of flexibility.

Medical customers, by contrast, have very high requirements for accuracy, citation precision, and interpretability. In that case the model must answer strictly from the original text and cannot generate or cite other knowledge on its own.

When applying reasoning models to reports and Q&A, we first run divergent reasoning to explore the user's possible needs and related questions, but when answering we must ensure high accuracy and avoid over-reasoning. These differences mean that model selection has to balance the customer's needs.

In transforming the traditional system, we upgraded gradually. For example, we added a Copilot-like plug-in to the page, through which users can ask questions or operate directly. At the same time, we moved some traditional judgment logic into the large model, especially at key nodes in the workflow.

In the past, these nodes relied on code rules or small-model judgments; now large models can make better use of workflow context to give more accurate conclusions. Although the interface has not changed much, the system architecture has changed significantly.

Zheng Yan: When we actually select and adopt models, we also consider one issue: we don't want the set of model types to diverge too much. If the technical team has to be familiar with the capabilities, styles, architectures, and deployment methods of many different models, the cost is quite high.

Therefore, when conducting a POC (proof of concept) or prototype verification, we can diverge moderately, but in a production environment we tend to converge.

Zheng Yan: Since OpenAI introduced its Agent architecture, everyone has been innovating on agents. In your respective fields, what innovations or practices have you made in agent architecture?

Yang Hao: We defined our own agent system and divided it into four main parts: perception, decision-making, execution, and feedback.

Perception divides into active and passive. Passive perception is relatively simple: the information users give us through conversation. Active perception means labeling and profiling users by sensing their roles, positions, permissions, and tasks in internal enterprise applications.

The system recommends relevant tasks and operations based on this information. The decision-making part involves storage and various decision models, which help the agent decide what to do and how to do it. The execution part involves calling tools such as APIs and SQL; the agent completes the task by scheduling these tools.

We innovated on the feedback loop. For example, if a user finds the large model's answer wrong and nothing is adjusted, the model may still give the wrong answer the next time the same question is asked.

To address this, we built a feedback loop that lets users submit structured feedback and point out where the model falls short in certain scenarios.

We organize this feedback into a learning knowledge base and optimize model performance through dynamic adjustment. Through this dynamic feedback mechanism, the agent keeps learning and gradually improves.
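A schematic sketch of the four-part loop described above follows; every class, method, and data structure here is a hypothetical placeholder rather than Ant's actual implementation.

```python
class Agent:
    """Perception -> decision -> execution, with a feedback loop that
    grows a learning knowledge base (all components are placeholders)."""

    def __init__(self):
        self.feedback_kb: list[dict] = []   # learning knowledge base

    def perceive(self, user: dict, message: str) -> dict:
        # Passive perception: the conversation itself.
        # Active perception: role, position, permissions, current tasks.
        return {"message": message, "profile": user}

    def decide(self, context: dict) -> dict:
        # Decision models plus retrieved feedback decide what to do and how.
        hints = [f for f in self.feedback_kb
                 if f["scenario"] == context["message"]]
        return {"plan": "call_tool", "hints": hints}

    def execute(self, decision: dict) -> str:
        # Schedule tools such as APIs or SQL to complete the task.
        return f"executed {decision['plan']} with {len(decision['hints'])} hints"

    def feedback(self, scenario: str, correction: str) -> None:
        # Structured user feedback, manually confirmed, then stored.
        self.feedback_kb.append({"scenario": scenario, "fix": correction})

agent = Agent()
ctx = agent.perceive({"role": "accountant"}, "check tax rate vs contract")
print(agent.execute(agent.decide(ctx)))                    # no hints yet
agent.feedback("check tax rate vs contract", "rate is in clause 4, not 2")
print(agent.execute(agent.decide(ctx)))                    # uses stored hint
```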

Zheng Yan:  Can you give some examples of dynamic feedback?

Yang Hao: In the intelligent audit scenario, we track each audit point, such as whether the tax rate is consistent with the contract. If the tax rate extracted from the contract is wrong, the user can provide structured feedback, and the system automatically generates the feedback content.

After receiving feedback, we manually confirm its quality to ensure data accuracy, then add it to the knowledge base, which is updated regularly. The updated model is evaluated on historical data; if accuracy improves, it goes through gray-release testing before being put into official use. Ultimately the model can understand and respond to feedback like a human, achieving intelligent optimization.

Zheng Yan: So the large model not only "understands human language" but also lets users participate in its continuous evolution, forming a very valuable cycle.

Yang Hao: The key is to let the business side truly become the "teachers" of the large model application.

Zheng Yan: Users really become "AI trainers", constantly helping to train the AI.

Wu Haoyu:  In our content marketing system, we tend to view the entire system as an AI Agent, with the ultimate goal of achieving full automation of content production. We divide the content marketing agent into three parts: perception, cognition, and action.

For the perception system, we need to understand what is happening in the market to avoid creating content blindly. Before doing marketing we "look at five things": trends, the industry, target groups, competitors, and our own products. All of this information is collected by our "Magic Cube Pro" system, which extracts relevant market information as the basis and direction for content creation.

For the cognitive system, we built a system based on the Mingjing Supergraph multimodal large model that evaluates content by simulating people's subjective reactions; it can model the responses of people of different ages and genders. This lets us predict an ad's audience reaction in advance, avoid unnecessary content testing, reduce costs, and improve ROI.

The action system focuses on automated production of advertising content and on how to co-create content with humans. After an ad launches, data must be collected for iterative feedback so its ROI keeps improving; if an ad works, we can increase investment and push it further.

Overall, the core of the feedback and action system is content iteration and feedback, through which marketing activities are automated. Our ultimate goal is to integrate the entire marketing process, from perception and cognition to action, into one coherent system: without much human intervention, advertisers can hand content over to the AI Agent and expect returns with confidence.

Zheng Yan: MCP is very popular. How can applications on different technology stacks support it quickly, and how should we decide whether to support it at all?

Wu Haoyu: MCP is very useful when developing new AI applications, but for relatively mature products with fixed processes its advantages over traditional techniques are less obvious, and in some cases it is not yet mature enough.

So for old products we test and adopt it selectively based on the existing situation; for new products we adapt to it more. Previously, when calling internal tools, we usually used function calls, writing everything into a very long prompt and handing it to the large model for scheduling. Now we have built an MCP Server, which makes it easier for each team to integrate.

That said, we have also found that MCP itself is still changing a lot across new AI applications. So for now we use MCP with restrictions, and we hope the protocol matures soon so we can use it with more confidence.
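For reference, a minimal MCP server along these lines can be written with the official MCP Python SDK; the tool below is a hypothetical stand-in for an internal service.

```python
# Minimal MCP server sketch using the official MCP Python SDK
# (pip install mcp). The tool is a hypothetical internal service;
# previously this logic would have lived in a long function-calling prompt.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("internal-tools")

@mcp.tool()
def lookup_entity(industry: str, term: str) -> str:
    """Look up an entity definition for a given industry (placeholder)."""
    return f"definition of '{term}' in {industry}"

if __name__ == "__main__":
    mcp.run()   # serves over stdio by default; teams connect as MCP clients
```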

Yang Hao: MCP adoption inside Ant is quite aggressive. For example, Alipay's payment API can now be combined with MCP so that payment is completed directly inside an agent, which is very advanced for the payments field. In our AI applications we mostly act as a client, consuming various services within Ant, and that usage is the more frequent one.

In addition, many old systems in finance are built on Java architectures, and we are piloting MCP in these niche scenarios, supporting it through modules such as the server list. So, as consumers, we mainly call MCP servers inside Ant.

Over the past two years everyone focused on developing and improving models; recently MCP has begun to attract attention. You can see the industry shifting from competing on models to competing on applications. As a standardized communication protocol, MCP solves an engineering problem at the protocol layer; it is not an innovation at the model layer.

Zheng Yan: How did you get from laboratory results to stable performance in production? Can you share the design thinking behind the key evaluation steps?

Wu Haoyu: During the POC phase everyone thought everything was going well, but only when the system actually entered production and faced customers did we realize the work had just begun. When facing an uncertain system, the most important thing is to test more: testing should not only cover multiple scenarios, fields, and industries, but also be repeated rather than done once.

For example, when launching a knowledge-base system with a customer, we need to keep testing its materials. After the customer gives feedback, we verify the fix. Sometimes we even organize the materials together with the customer, because the quality of what they provide may be very poor; improving the material quality improves the quality of the final questions and answers.

Of course, customers have basic expectations and expect the desired results after a certain amount of testing and optimization. You can't keep modifying forever, so you need to set standards and keep working with the customer. Making AI applications is like running a factory: the work may look high-end, but in actual operation you still have to get into the "workshop" with customers and solve problems one by one.

Zheng Yan: When delivering the product, do you agree with the customer on a committed accuracy metric or similar standard?

Wu Haoyu: The committed accuracy metric is usually based on a data set. Customers provide the common questions and question types from their daily Q&A; we optimize against them and aim to solve 90% of daily problems. Once that goal is reached, we can deliver.

Zheng Yan: We cannot achieve "zero bugs" with technologies like large models, which means reliability assessment ultimately depends on the evaluation set. But how the evaluation set is designed affects how the metrics look, which is why the industry's various evaluation sets keep iterating and improving.

Wu Haoyu: Customers don't care about the evaluation metrics you provide; they care about their daily applications and business value. So each customer's evaluation set may differ, in document scope, content, and even the types of questions they want to ask. The design and use of the evaluation set genuinely differs per customer.

Yang Hao: In model work we often hear that "data determines the effect", and the rule also applies to keeping an application stable. In the POC stage the results may look great, but online, facing more uncontrollable factors, problems surface, essentially because the data set was not comprehensive enough.

So how do we solve it? In practice, first we set up a detailed indicator system for the scenario. In the audit scenario, for example, we design indicators such as precision, recall, and accuracy for different audit elements, audit points, and audit documents.

Second, during rollout we adopted two modes. Initially AI did not completely replace manual work; it served as an auxiliary reviewer, with the final decision made by humans. In this stage we worked closely with the business side, analyzing all the error cases every week and continuously optimizing the model.

In the early days of the launch, accuracy in the audit scenario was only 20%, almost unusable. After three months of tuning it rose above 90%, and at the audit-key-point level it reached four nines (99.99%).

Finance demands extremely high accuracy, so we start in assisted-audit mode, constantly aligning and adjusting. When a scenario's accuracy is high enough, for example when accuracy in a single category has held at 100% for three months, we switch that scenario to unattended mode and AI replaces the manual audit. But that does not mean people are absent entirely.

We have a follow-up inspection process that regularly spot-checks the documents audited by AI. If an AI audit is wrong, the system falls back to assisted mode. This process leaves room for error and lets us transition gradually to a fully unattended mode.
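The graduated rollout described above, assisted mode first, then unattended with spot checks and a fallback, can be sketched as a simple state machine. The promotion rule follows the 100%-for-three-months criterion mentioned above; everything else is hypothetical.

```python
class AuditMode:
    ASSIST = "assisted"        # AI suggests, human decides
    UNATTENDED = "unattended"  # AI decides, humans spot-check

class AuditRollout:
    def __init__(self):
        self.mode = AuditMode.ASSIST
        self.perfect_months = 0

    def monthly_review(self, accuracy: float) -> None:
        # Promote: a category stays at 100% accuracy for three months.
        if self.mode == AuditMode.ASSIST:
            self.perfect_months = self.perfect_months + 1 if accuracy == 1.0 else 0
            if self.perfect_months >= 3:
                self.mode = AuditMode.UNATTENDED

    def spot_check(self, ai_verdict_correct: bool) -> None:
        # Demote: an error found in sampling falls back to assisted mode.
        if self.mode == AuditMode.UNATTENDED and not ai_verdict_correct:
            self.mode = AuditMode.ASSIST
            self.perfect_months = 0

rollout = AuditRollout()
for acc in [1.0, 1.0, 1.0]:
    rollout.monthly_review(acc)
print(rollout.mode)                          # unattended after three months
rollout.spot_check(ai_verdict_correct=False)
print(rollout.mode)                          # back to assisted
```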

Zheng Yan: What was the most effective measure in raising accuracy from 20% to 90%?

Yang Hao: First, we designed a very detailed indicator system. Through these indicators we can trace the problem behind each case, and we align on those problems one by one with the business side. Injecting human experience into the model application this way is a very complex process.

Zheng Yan:  Do you align human experience into the model through prompt engineering or through training?

Yang Hao: We first use engineering methods to reach a certain level in the scenario, then train with high-quality data sets, and finally bake that experience into the model.

Audience: What do you think of A2A (agent-to-agent)?

Yang Hao: MCP solves the problem between a person and their skills, while A2A solves the problem between people. Finance involves some such scenarios: for example, when onboarding a new business, you need to evaluate how to account for it, what the tax rate is, whether it is a related-party transaction, and so on. These scenarios usually require communication and repeated collaboration between agents in different domains. However, we have not yet achieved direct agent-to-agent communication through A2A; we still mostly use MCP.

Wu Haoyu: A2A is essentially a way for multiple agents to communicate with each other, which brings more uncertainty. But I believe that once agent systems are rich or complex enough, how agents interact will be a focus of future research in the industry.

For now, though, our first priority is to make the single agent fully functional and let it exercise its skills fully, and only then consider A2A interaction.

Audience: How do you solve the problem of hallucinations?

Wu Haoyu: In building the knowledge base, we found that hallucination problems are often caused by the large model improvising. The fix is to keep adjusting the prompt to make the model follow the rules. For example, if it often gives nonexistent examples, you must explicitly forbid examples in the prompt and only allow it to quote the original text. Another common problem is that some models like to merge similar items, sometimes wrongly. In that case you need to instruct the model not to merge similar items but to answer directly from the original text.
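These constraints translate directly into prompt instructions. A hedged sketch of such a template follows; the wording is illustrative, not the team's actual prompt.

```python
# Illustrative prompt template encoding the two fixes described above:
# forbid invented examples, and forbid merging of similar items.
SYSTEM_PROMPT = """You answer strictly from the provided source text.
Rules:
1. Quote the original text; do not paraphrase beyond light connective words.
2. Do NOT give examples unless they appear verbatim in the source.
3. Do NOT merge similar items; answer for each item separately, as written.
4. If the source does not contain the answer, say "not found in the source".

Source:
{source}

Question:
{question}
"""

def build_prompt(source: str, question: str) -> str:
    return SYSTEM_PROMPT.format(source=source, question=question)

print(build_prompt("Blueberries grew 12% in Q1.", "How did strawberries do?"))
```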

Zheng Yan: Company-internal jargon and terms are a common cause of hallucinations. For example, many people don't know what "film" means, and neither does the large model; it actually refers to PPT slides.

Our industry is highly technical and uses abbreviations heavily, so we often need to help large models understand these abbreviations and terms to resolve their ambiguity. Overall, as the technology advances, the hallucination problem of large models has been shrinking steadily by the metrics.

Audience: Can prompting and model fine-tuning achieve four-nines accuracy?

Yang Hao: Prompts differ greatly in quality, and writing a good one is not easy. For specific task processes we have written a very strict expert framework, similar to an SOP.

When executing a task, the model must proceed step by step according to our requirements, and there may be dependencies between steps. Accuracy therefore has to be evaluated per scenario and cannot be generalized.
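A rough sketch of such an SOP-style pipeline follows, where later steps consume earlier outputs; `call_model` and the step definitions are hypothetical stand-ins, not the actual framework.

```python
# Hypothetical SOP-style pipeline: each step is a strict instruction,
# and later steps consume earlier outputs (dependencies between steps).

def call_model(instruction: str, context: dict) -> str:
    """Stand-in for the real model call."""
    return f"<output of: {instruction}>"

SOP_STEPS = [
    ("extract_tax_rate", "Extract the tax rate stated in the contract."),
    ("extract_invoice_rate", "Extract the tax rate on the invoice."),
    ("compare", "Compare {extract_tax_rate} with {extract_invoice_rate} "
                "and report PASS or FAIL with the quoted clauses."),
]

def run_sop(document: str) -> dict:
    results: dict = {"document": document}
    for name, instruction in SOP_STEPS:
        # Fill dependencies from earlier steps into the instruction.
        results[name] = call_model(instruction.format(**results), results)
    return results

print(run_sop("contract.pdf")["compare"])
```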

Future Outlook

Zheng Yan: There are now many excellent AI models. How should enterprise AI applications respond? And with everyone talking about AI Native, what qualities should the ideal "agent" have in your mind? How far are we from that goal?

Wu Haoyu:  The ideal AI agent should be similar to a living organism, with the ability to perceive, recognize and act, and be able to continuously iterate and provide feedback in practice.

In addition, the agent should be able to learn. Today, model evolution relies on enormous compute for training, but living things learn much faster than that. In the future, the ideal agent should evolve quickly from a small number of samples or some other learning method, rather than retraining from scratch as we do now.

Yang Hao: How companies respond to the development of large models can be discussed from two angles: models and applications. On underlying model training, companies need to quickly master model architectures, training methods, and optimization algorithms, especially reward function design, and focus on technical depth.

At the application level, the core is to quickly integrate, evaluate, and deploy new models and exploit their features. When swapping the underlying model, you must verify that the new model's accuracy beats the existing one, or the business will suffer.

Although an ideal agent could evolve itself, the intelligence of today's models is limited. When evaluating agents, attention should go to design, data, domain knowledge, and dynamics. AI Native application design differs from traditional GUI design: appropriate cards, workflows, and graphs need to be designed, and complex task-execution graphs are very different from traditional designs.

Enterprises need a deep understanding and command of their internal data so the model can understand and process it. Domain knowledge is crucial for expert systems, especially in finance, which involves accounting standards, tax law, and more. And models should be dynamic, self-learning from human feedback.

Zheng Yan: Large models are indeed developing fast, worthy of the phrase "changing by the day". So behind the changes, we need to grasp what stays constant in their development. I once summarized five constant trends: stronger, cheaper, faster, longer (context), and more modalities. These trends have basically held for the past three years.

From the perspective of AI engineering and applications, we should stay off the main advancement path of these large models and avoid "embroidering", that is, polishing marginal gains while base models are improving rapidly. After a version upgrade, you may find that the few percentage points you worked hard for are overtaken by the dozens of points a stronger base model delivers, and the earlier investment is wasted.

In addition, I think there is an issue many peers overlook: evaluation. Many people think evaluation is basic, low-level work, and that grinding through large numbers of evaluation cases seems meaningless. In fact, evaluation is the key to landing AI continuously.

If the test set is good enough, it captures the essence of the business faithfully. If evaluation engineering is done well, AI applications can iterate faster and optimization can be well targeted. If the evaluation direction is wrong or biased, a great deal of effort is wasted.
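To make this concrete, a tiny regression-style harness can replay a fixed evaluation set against every new model or prompt version; the `answer` function and the cases below are hypothetical, and the 90% bar echoes the delivery threshold mentioned earlier.

```python
# Minimal evaluation harness: replay a fixed eval set on each new
# model/prompt version and gate the release on the pass rate.

def answer(question: str) -> str:
    """Hypothetical stand-in for the AI application under test."""
    return "13%"

EVAL_SET = [  # built from real customer questions, per the discussion
    {"q": "What tax rate does contract A specify?", "expected": "13%"},
    {"q": "Which clause covers late payment?",      "expected": "clause 7"},
]

def evaluate(threshold: float = 0.9) -> bool:
    correct = sum(answer(case["q"]) == case["expected"] for case in EVAL_SET)
    pass_rate = correct / len(EVAL_SET)
    print(f"pass rate: {pass_rate:.0%}")
    return pass_rate >= threshold   # e.g. the 90% delivery bar noted above

evaluate()
```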

Zheng Yan: In terms of organizational capability building, what new types of positions have you observed emerging? What "superpowers" do traditional teams need to add?

Yang Hao: The first new position is the "enterprise knowledge manager". AI applications still follow the rule that more data brings more intelligence, so internal enterprise applications need high-quality data: the richer the knowledge, the more likely digital employees become truly intelligent.

Moreover, many internet companies do not actually have a complete knowledge base; especially when the business is growing fast, knowledge-base building is often deprioritized.

As for the superpowers traditional teams need to add: take an engineering team like ours, whose roles include front-end engineers, back-end engineers, algorithm engineers, data engineers, and quality engineers. Front-end engineers used to build traditional GUI applications, stacking navigation bars and input boxes.

Under the AI wave, however, the front-end architecture needs upgrading and can no longer rely on traditional frameworks. Back-end engineers used to work mainly with Java stacks and distributed architectures; AI applications now lean on Python stacks, and frameworks may shift to new tools such as LangChain. Algorithm engineers used to build small machine learning and deep learning models; now they work on large models, especially transformer models, with completely different training methods.

Data engineers used to process data mostly with SQL; now they need to do logical modeling and metric engineering and build data marts suited to natural-language interaction. Quality engineers used to focus on functional verification; now their core task is building evaluation sets and improving scenario metrics such as accuracy, recall, and precision. The key is to strengthen these skills.

Wu Haoyu: First of all, everyone needs to understand what AI can and cannot do, which comes from exploration through continuous use. For example, colleagues who write code need to know how to generate code with an AI code editor and understand which needs AI-written code can meet; they have to interact with the AI editor continuously to find the workflow that suits them best.

The second point is AI's role in daily work. For example, our team members now basically use AI to write PPTs, which has hugely changed PPT production. Even product documents are now written with AI's help.

Finally, there is the ability to understand AI Native products. How do you present inherently uncertain output to customers so that it looks certain? This is not only a product-design question; R&D colleagues must also think about how to control the uncertainty of the output as much as possible and deliver a dependable result on that basis. This is a capability we have been exploring and accumulating in our work.