Paradigm shift in AI service architecture: from “model as a service” to “agent as a service”

AI service architecture is undergoing a paradigm shift from "model as a service" to "agent as a service", and AI Agents are positioned to become the next major direction of AI technology.
Core content:
1. The evolution of AI service architecture: from MaaS to AaaS
2. The core features and advantages of AI Agent
3. The technical implementation path and application scenarios of AI Agent
In today's digital age, the rapid development of artificial intelligence (AI) technology is profoundly changing the way people live and work. From simple automated task processing to complex intelligent decision support, AI application scenarios are constantly expanding, and its service architecture is also undergoing unprecedented changes. In recent years, "Model as a Service" (MaaS), as an innovative service model, has laid a solid foundation for the widespread application of AI technology. However, with the continuous innovation and iteration of technology and the emergence of diversified and personalized market needs, AI service architecture is ushering in a profound paradigm shift - from "Model as a Service" to "Agent as a Service" (AaaS).
2025 is widely described as the "first year" of the AI Agent. AI Agents have moved from concept to reality, from single-function tools to multi-function integration, and from the laboratory to large-scale commercial application. Their emergence has not only changed the way people interact with technology, but also brought new opportunities and challenges to many industries. From smart homes to smart transportation, and from healthcare to financial services, the application scenarios of AI Agents keep expanding and their influence keeps growing.
An AI Agent, or artificial intelligence agent, is an intelligent entity or software system that is goal-driven, perceives its environment, makes autonomous decisions, executes tasks, and learns from experience. Unlike traditional AI models, which mainly rely on explicit instructions to perform tasks, AI Agents are characterized by goal orientation, environmental perception, autonomy, adaptability and scalability. Given a goal, they can reason independently, break complex tasks down, plan an execution path, adjust and optimize according to environmental feedback during execution, and call external tools or knowledge bases to help complete the task.
For example, Oracle describes an AI Agent as a digital assistant or robot that can autonomously perform tasks according to directions set by humans; Google Cloud emphasizes its reasoning, planning and memory capabilities, as well as its capacity for autonomous learning, adaptation and decision-making.
Currently, mainstream AI Agents mostly use a Large Language Model (LLM) as their core "brain", combined with modules for planning, memory, and tool use. This gives them capabilities beyond those of a single AI model operating on its own, and they place more emphasis on autonomous decision-making and task execution than on passive response.
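The combination described above (a core "brain" plus planning, memory, and tools) can be illustrated with a minimal sketch. Everything named here is hypothetical: `scripted_planner` is a hard-coded stand-in for an LLM call, and the two tools are toy functions rather than part of any real framework.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical toy tools; a real agent would call search engines, databases or APIs.
def add_numbers(arg: str) -> str:
    a, b = arg.split(",")
    return str(float(a) + float(b))

def lookup_capital(arg: str) -> str:
    capitals = {"France": "Paris", "Japan": "Tokyo"}
    return capitals.get(arg, "unknown")

TOOLS: dict[str, Callable[[str], str]] = {
    "add": add_numbers,
    "capital": lookup_capital,
}

def scripted_planner(goal: str, memory: list[str]) -> tuple[str, str]:
    """Stand-in for the LLM "brain": returns the next (tool, argument) pair.

    Assumption: a real system would prompt a language model with the goal
    and the memory; here the plan is hard-coded for illustration only.
    """
    if len(memory) == 0:
        return ("add", "2,3")
    if len(memory) == 1:
        return ("capital", "France")
    return ("finish", "; ".join(memory))

@dataclass
class Agent:
    goal: str
    memory: list[str] = field(default_factory=list)

    def run(self, max_steps: int = 5) -> str:
        for _ in range(max_steps):
            tool, arg = scripted_planner(self.goal, self.memory)  # plan
            if tool == "finish":
                return arg
            observation = TOOLS[tool](arg)   # act by calling a tool
            self.memory.append(observation)  # remember the feedback
        return "gave up"

agent = Agent(goal="Add 2 and 3, then find the capital of France")
answer = agent.run()
print(answer)  # prints: 5.0; Paris
```

The loop is deliberately small: the planner decides, a tool acts, and the observation is written back to memory so that the next decision can depend on it. Swapping the scripted planner for a model call would not change the structure of the loop.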
AI Agents can be classified from multiple dimensions to better understand their diversity and application potential in different usage scenarios.
1. Divided by technical implementation path
From the perspective of technical implementation path, AI Agent can be roughly divided into:
Rule-based Agents: This type of agent mainly relies on predefined rules and logic to make decisions and take actions. Early expert systems and some automated scripts can be classified into this category. Their behavior patterns are relatively fixed, and their adaptability and learning ability are limited, but they are still effective in scenarios with clear logic and stable environments. For example, some traditional intelligent customer service systems can answer user questions through preset processes and knowledge bases.
Traditional machine learning-based agents: These agents use machine learning algorithms (such as reinforcement learning and supervised learning) to learn decision-making strategies from data. For example, in game-playing AI, agents trained by reinforcement learning can achieve goals in complex environments. They are more adaptable than rule-based agents, but usually require large amounts of training data and task-specific model design.
LLM-based Agents: This type of agent is the mainstream direction of current AI agent development. Relying on the natural language understanding, generation, reasoning and planning capabilities of the LLM, these agents can handle a wider range of more complex tasks. They can interact with users through natural language, understand ambiguous instructions, autonomously plan task steps, and call a variety of tools (such as search engines, databases, APIs, etc.) to achieve goals. For example, they can help users plan trips, manage schedules, write emails, and even write code.
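To make the second category above more concrete, the following is a minimal sketch of an agent that learns a strategy by reinforcement learning instead of following predefined rules. It uses tabular Q-learning on a toy five-cell corridor; the environment, the reward and all hyperparameters are assumptions chosen for illustration, not taken from any particular system.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # index 0 = move left, index 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # illustrative hyperparameters

def step(state: int, action: int) -> tuple[int, float, bool]:
    """Toy environment: walls at both ends, reward 1.0 only on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def choose(q: list[list[float]], state: int, rng: random.Random) -> int:
    """Epsilon-greedy action index, breaking ties at random."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([i for i, v in enumerate(q[state]) if v == best])

def train(episodes: int = 500, seed: int = 0) -> list[list[float]]:
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _step in range(200):  # cap on steps per episode
            a = choose(q, state, rng)
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
            if done:
                break
    return q

q_table = train()
policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(policy)
```

No rule ever tells the agent to walk right; the preference emerges from reward feedback alone. This is the adaptability the text attributes to learning-based agents, and the hundreds of training episodes needed even for this tiny task hint at the data cost it also mentions.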
2. Classification by product usage function
From the perspective of product functions, AI Agent can be divided into:
Information acquisition and analysis: Focus on extracting, integrating, and analyzing information from massive amounts of data and presenting it in a user-friendly manner. For example, an agent that can monitor specific industry dynamics, analyze market trends, and generate research report summaries.
Task automation: Aims to automatically perform repetitive or process-based tasks to improve work efficiency. Examples include email classification and automatic replies, data entry and cleaning, software testing, and IT operation and maintenance management.
Personal assistant: As the user's intelligent assistant, it provides personalized services, such as schedule management, meeting arrangements, information reminders, smart home control, etc.
Decision support: Provides a basis and suggestions for user decision-making by analyzing data, simulating scenarios, and evaluating options. Examples include Agents in the fields of financial investment, medical diagnosis, and supply chain optimization.
Creation and Generation: Assist or independently complete content creation tasks, such as writing articles, designing images, composing music, generating code, etc.
Entertainment interaction: In games, virtual social scenarios, etc., the Agent can play the role of an intelligent NPC (non-player character) or virtual companion to provide a richer and more immersive interactive experience.
3. Classification by terminal application scenario
AI Agents are now applied across a wide range of enterprise and consumer scenarios:
Customer Service: Intelligent customer service can be online 24/7 to handle user inquiries, answer questions, provide after-sales support, and even proactively perform customer maintenance to improve customer satisfaction and operational efficiency.
Financial services: In the financial field, it can be used for smart investment consulting, credit approval, risk management, automated trading, personalized financial product recommendations, etc., to improve the intelligence level of financial services and risk control capabilities.
Education and training: Personalized tutoring agents can provide customized learning plans and educational resources based on students’ learning characteristics and progress, as well as real-time intelligent Q&A to enhance personalized education capabilities.
Healthcare: It can be used for auxiliary diagnosis (such as medical image analysis), personalized treatment plan recommendation, drug development, patient management and health consultation, etc., to alleviate the pressure on medical resources and improve service quality.
Retail and e-commerce: Smart shopping guides can recommend products based on user preferences, automatically process orders, optimize inventory management and logistics distribution, and improve shopping experience and operational efficiency.
Content creation and media: It can assist in news writing, image generation, video editing, marketing copywriting, etc., improving content production efficiency and creative diversity.
Software development and IT operation and maintenance: It can assist in writing code, automate testing, monitor system operation status, and predict and handle faults, improving software development efficiency and system stability.
Smart Manufacturing: In the industrial field, AI Agent can be used for predictive maintenance of equipment, production process optimization, quality control, supply chain collaboration, etc., to promote the transformation of manufacturing industry towards intelligence.
Every link of the industry chain, from basic research and technology development to application implementation, contributes to the rapid development of the AI Agent industry. Internet giants, AI technology companies, cloud computing vendors, industry solution providers and many start-ups are all actively positioning themselves, forming a diversified and highly coordinated ecosystem.
1. Infrastructure Layer
This is the cornerstone that supports the operation of AI Agent, mainly including:
Computing infrastructure: Represented by high-performance computing chips (such as GPU, TPU, NPU, etc.), servers, and data centers, this provides the computing power for AI agent training and inference. Cloud computing platforms (such as AWS, Azure, Google Cloud, Alibaba Cloud, Tencent Cloud, etc.) play a key role in providing elastic computing power.
Data resources: High-quality, large-scale datasets are a prerequisite for training powerful AI agents. This includes general datasets, industry-specific datasets, and data generated by user interactions. Data collection, cleaning, annotation, management, and security also constitute an important part of the infrastructure.
Network and storage: High-speed, low-latency network connections and efficient, scalable storage systems ensure efficient data transmission and rapid response of AI Agents.
2. Core Layer: Algorithms & Large Language Models
This layer is the "brain" and core driving force of the AI Agent:
Large Language Models (LLMs): As the core engine of current AI Agents, LLMs (such as GPT series, LLaMA series, Claude series, Wenxin Yiyan, Tongyi Qianwen, etc.) provide powerful natural language understanding, generation, reasoning, knowledge integration and a certain degree of planning capabilities. The scale of the model, the quality and diversity of training data, and fine-tuning techniques all directly affect the performance of the Agent.
Core algorithms: In addition to the LLMs themselves, this layer includes agent-specific algorithms, such as planning and reasoning methods (e.g., ReAct, Tree of Thoughts), memory mechanisms (e.g., long-term and short-term memory management), tool calling and orchestration, multi-agent collaboration, and reinforcement learning (used to optimize agent behavior).
AI framework and development platform: Provides tools and platforms for model training, fine-tuning, deployment, and agent building, such as TensorFlow, PyTorch, LangChain, AutoGen, MetaGPT, etc., which lowers the threshold for agent development and accelerates application innovation.
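The memory mechanisms mentioned in this layer can be sketched as follows. This is an illustrative assumption rather than the design of any specific framework: short-term memory is modeled as a fixed-size window of recent entries, and long-term memory as an archive searched by simple word overlap. Production systems typically rely on more capable retrieval, such as embedding similarity, which is not shown here.

```python
from collections import deque

class AgentMemory:
    """Toy short-term / long-term memory; class name and scoring are hypothetical."""

    def __init__(self, short_term_size: int = 3) -> None:
        # Short-term memory: only the most recent entries are kept.
        self.short_term: deque[str] = deque(maxlen=short_term_size)
        # Long-term memory: every entry is archived and searched on demand.
        self.long_term: list[str] = []

    def remember(self, text: str) -> None:
        self.short_term.append(text)
        self.long_term.append(text)

    def recall(self, query: str, k: int = 1) -> list[str]:
        """Return up to k archived entries that share the most words with the query."""
        query_words = set(query.lower().split())

        def overlap(entry: str) -> int:
            return len(query_words & set(entry.lower().split()))

        ranked = sorted(self.long_term, key=overlap, reverse=True)
        return [entry for entry in ranked[:k] if overlap(entry) > 0]

memory = AgentMemory(short_term_size=2)
for note in [
    "user prefers window seats",
    "flight to Tokyo booked",
    "hotel near station reserved",
]:
    memory.remember(note)

recent = list(memory.short_term)
hits = memory.recall("which seats does the user prefer")
print(recent)
print(hits)
```

The oldest note has already dropped out of the short-term window, yet it can still be recalled from the long-term archive when a related question arrives. That separation is what allows an agent to keep its working context small without losing earlier information.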
3. Middle-tier Agent Components & Platforms
This layer connects core technologies with specific applications and provides modular capabilities for building and operating Agents:
Agent component vendors: provide specific functional modules that make up the Agent, such as more sophisticated perception modules, more powerful planning modules, more reliable memory modules, and rich tool set interfaces.
Agent operation and integration platform: provides platform services for agent creation, deployment, management, monitoring and iteration. These platforms may be aimed at developers or business personnel in specific industries, support low-code/no-code agent creation, and integrate them into existing business processes.
4. End Layer: Products/Applications
This is the final user-oriented form of AI Agent, which is reflected in various specific products and services:
General AI Agent products: such as personal intelligent assistants (integrated in operating systems, smart speakers, and mobile phone apps), general task processing platforms, etc., designed to meet users' diverse daily and work needs.
Vertical industry AI Agent applications: specialized AI Agent solutions developed to address the pain points and needs of specific industries (such as finance, retail, education, healthcare, manufacturing, etc.). For example, intelligent customer service, financial risk control agents, medical auxiliary diagnosis, etc.
Embedded AI Agent: embed agent capabilities into existing software, hardware or services to improve their intelligence level. For example, embed writing assistant agents into office software and embed smart shopping guide agents into e-commerce platforms.
The development of AI Agents can be traced back to the early days of artificial intelligence research, and it has continued to evolve with advances in computer science, machine learning, natural language processing and other technologies. It has been a continuous evolution from theory to practice, from special-purpose to general-purpose, and from assistive to autonomous. At present, we are in a critical period of rapid development and adoption of AI Agents driven by large models, and their capabilities and application scenarios are expected to keep expanding.
1. The embryonic stage and theoretical exploration period (1950s-1980s)
Founding ideas: Pioneering research by Alan Turing and others laid the theoretical foundation for the concept of intelligent agents. In the mid-1950s, John McCarthy coined the term "artificial intelligence", and researchers began to explore how machines could simulate human intelligence. Early researchers began to think about computing entities that could act autonomously and interact with their environment.
Early Agent Concept: During this period, Agent was more of a theoretical concept and philosophical discussion. For example, some researchers introduced Agent into the field of artificial intelligence to explore its autonomy, responsiveness and other characteristics.
Landmark event: The emergence of early logical reasoning systems and expert systems, although not fully in line with the definition of modern agents, reflected the initial attempt of machines to perform complex tasks.
2. The development period of symbolism and connectionism (1980s-early 2000s)
Technology development path: Symbolic AI emphasizes knowledge expression and logical reasoning, which promotes the development of rule-based agent systems. At the same time, connectionism (neural networks) began to revive, laying the foundation for learning-based agents.
The rise of multi-agent systems (MAS): Researchers began to focus on how multiple agents interact, collaborate, and negotiate to solve complex problems, and the theory and application of multi-agent systems began to develop.
Product application stage: Some agent applications emerged in specific fields, such as distributed computing, information retrieval, and simple robot control. For example, in 1997, IBM's "Deep Blue" defeated world chess champion Garry Kasparov, demonstrating the capabilities of AI in a specific domain. Although its characteristics differ from the modern definition of an agent, it represented a breakthrough for AI in complex decision-making tasks.
Stage characteristics: Agents had limited autonomy and intelligence, and mainly relied on manually designed knowledge and rules, or acquired capabilities through learning in relatively simple environments.
3. Machine learning and Internet-driven period (early 2000s - late 2010s)
Technology development path: The rapid development of machine learning, especially reinforcement learning and deep learning, has given agents stronger learning and adaptability. The popularization of the Internet has generated massive amounts of data, providing the possibility of training smarter agents.
Product application stage: AI Agents began to be applied in a wider range of fields, such as personalized recommendations in search engines, intelligent customer service in e-commerce, early exploration of autonomous driving, and various intelligent assistants (such as Siri, Alexa, Google Assistant, etc.). Although these assistants were still limited in autonomy, they demonstrated the potential of Agents as user interaction interfaces.
Stage characteristics: Agents began to have stronger environmental perception and data-driven decision-making capabilities, and human-computer interaction became more natural. However, their versatility and task generalization capabilities still needed improvement.
4. The explosion of agents driven by large language models (early 2020s-present)
Technology development path: Large language models (LLMs) based on the Transformer architecture have made breakthrough progress, demonstrating powerful natural language understanding, generation, reasoning and learning capabilities. This provides a core engine for building more general and intelligent AI agents.
Product application stage: AI agents based on LLM have emerged rapidly, such as AutoGPT, MetaGPT, BabyAGI, and agent platforms and applications launched by major technology companies. They can autonomously decompose complex tasks, plan execution steps, call external tools (APIs, databases, code interpreters, etc.), and reflect and learn, demonstrating unprecedented autonomy and task completion capabilities.
Landmark event: OpenAI's release of the GPT series of models and its application exploration in the agent field have triggered widespread global attention and research and development enthusiasm for AI agents. Industry leaders such as Bill Gates also highly praised the potential of AI agents, believing that they will completely change the way people interact with computers.
Stage characteristics: The autonomy, versatility, natural interaction, and task complexity handled by AI Agents have all reached new heights. Multimodal capabilities (processing text, images, audio, video, and other information) have also become an important direction for Agent development. The industry has begun to evolve from "model as a service" to "agent as a service".
With the rapid development of AI Agent technology and fierce competition in the domestic market, more and more Chinese AI Agent companies are turning to overseas markets in search of new growth opportunities and better profit margins. These companies have gained a foothold in the global AI Agent market through product and technology innovation, a deep understanding of specific scenarios, and flexible and diverse business models.
1. HeyGen (formerly Shiyun Technology)
Company profile and core product introduction: HeyGen, formerly known as Shiyun Technology (founded in China in 2020), is a startup focusing on AI video generation technology. Its core product is an AI video generation tool with several notable functions: users can generate virtual digital humans with realistic lip synchronization by uploading photos or video clips; text can be quickly converted into talking-head videos presented by virtual anchors; users can create personalized avatars; and multilingual voice translation and video localization are supported, for example naturally converting the lip movements and voice of a speaker in an English video into Chinese or other languages. HeyGen's technology has broad application prospects in content creation, marketing, online education, corporate training and other fields.
Comparison of domestic and overseas business
HeyGen started in China but later made a strategic adjustment. In 2023, the company deregistered its domestic entity, moved its headquarters to the United States, and turned to overseas markets. The main considerations behind this decision included users' higher willingness to pay in overseas markets, more mature SaaS consumption habits, higher profit margins, and a more favorable valuation environment. Compared with the increasingly fierce competition and price sensitivity of the domestic market, overseas markets offer HeyGen broader commercialization prospects.
A. Profit model: HeyGen mainly adopts a paid subscription model in overseas markets. Users can choose different levels of subscription packages according to their needs, and enjoy different amounts of video generation time, advanced features (such as higher resolution, exclusive digital human images, API access, etc.) and customer support services. This model helps the company obtain stable and predictable recurring revenue. The company has revealed that it has been profitable since the second quarter of 2023 and has more than 40,000 paying customers.
B. Revenue share: According to public information, after HeyGen turned fully to overseas markets, its annual recurring revenue (ARR) grew from about $1 million to more than $35 million within a year. Since the transformation, domestic business has essentially stopped, and overseas markets contribute almost all of the company's revenue.
Financing: HeyGen announced the completion of a $60 million Series A financing in early 2024, led by Benchmark Capital, a top Silicon Valley venture capital firm, with participation from Conviction, Thrive Capital, and Bond Capital. After this round of financing, the company's valuation reached approximately $500 million, a significant increase from the previous round of $75 million, reflecting the capital market's high recognition of its technological strength and overseas market prospects.
2. Laiye Tech
Company profile and core product introduction: Laiye Technology was founded in 2015 and is China's leading artificial intelligence and robotic process automation (RPA+AI) solution provider. The company is committed to helping companies achieve business process automation and intelligent transformation through AI technology. Its core products and services include:
Conversational AI platform: Build intelligent customer service, voice assistants, chatbots, etc. for scenarios such as customer service and internal employee support.
Intelligent Document Processing (IDP): Use AI technology to automatically extract, understand, and process information from various types of documents (such as contracts, invoices, and reports).
RPA (Robotic Process Automation): Provides software robots to simulate human operations and automatically perform repetitive, rule-based computer tasks.
Enterprise Smart Assistant: Combines the above capabilities to create a unified smart work portal that helps employees complete their work efficiently.
Laiye Technology's solutions are widely used in finance, insurance, energy, manufacturing, retail, healthcare and other industries, helping enterprises reduce costs, increase efficiency and enhance competitiveness.
Comparison of domestic and overseas business
Laiye Technology has had a global vision since its inception, and began to expand into overseas markets on a large scale around 2021. The company has established offices in the United States, Europe and elsewhere, and is actively exploring emerging markets such as Southeast Asia, Latin America (such as Brazil), and the Middle East. Laiye Technology believes that, compared with the domestic market, overseas markets have advantages in profit margins and the maturity of the business environment. The domestic market is extremely competitive, with frequent price wars, while overseas customers place a higher value on software and services, are more willing to pay, and can accept a more reasonable price system.
A. Profit model:
Software licensing and subscription fees: Selling licenses for its RPA platform, conversational AI platform and other products to corporate customers or providing subscription-based services.
Solution and project implementation fees: Provide customized AI automation solutions for large corporate customers and charge corresponding project consulting, development and implementation fees.
Partner ecosystem revenue: By establishing partnerships with consulting companies, system integrators and others around the world, the company jointly expands the market and shares revenue with partners.
In overseas markets, because customers are more willing to pay for the value of software, it is easier for the company to establish a sustainable profit model.
B. Revenue share: Although Laiye Technology is actively expanding its overseas business, as a company that grew up in China, its early revenue came mainly from the domestic market. As overseas business continues to expand, the proportion of overseas revenue is expected to keep rising. The exact split between domestic and overseas revenue has not been publicly disclosed. Fan Lihong, partner and senior vice president, has said that once the company has entered and established itself in an overseas market, it can obtain relatively stable income.
Financing: Laiye Technology has completed multiple rounds of financing, attracting well-known domestic and foreign investors including Sequoia China, Tiger Global Management, Lightspeed China, and Microsoft. According to disclosed information, in 2021 it completed a "C++" round of US$70 million, bringing its cumulative financing to more than US$160 million at that time.
3. Waveform AI
Company profile and core product introduction: Waveform Intelligence is an AI startup focused on large models for long-text content generation. Its core technology lies in the development of AI models that can understand and generate high-quality long-form texts (such as novels, scripts, in-depth articles, etc.). The company's main product is an AI creation tool called "WaWa Writing", which aims to assist writers, screenwriters, content creators and other users to improve their creative efficiency and expand their creative boundaries. The tool can provide story ideas, plot development suggestions, character creation assistance, text polishing and even complete chapter generation, and is particularly suitable for scenarios that require a lot of creative writing.
Comparison of domestic and overseas business
Waveform Intelligence is currently preparing for and exploring overseas markets. According to the observations of its founding team, users in overseas markets (especially in certain language markets) are relatively more willing and accustomed to paying for content, which provides more favorable conditions for commercializing AI writing tools. The company has begun training multilingual versions of its model (covering Spanish, French, Japanese, and at least 13 other languages), has preliminarily verified product-market fit (PMF) in some smaller-language markets, and plans to focus on these directions in the future.
A. Profit model: Waveform Intelligence's profit model is expected to revolve around its core product "WaWa Writing":
Subscription service (SaaS): Provides different levels of subscription packages for individual users and professional creators, and charges differentiated fees based on functional permissions, word count limits, supported languages, etc.
API authorization: Provide API interface services to enterprises or platforms with content generation needs, allowing them to integrate Waveform Intelligence's long text generation capabilities into their own products or workflows, and charge according to the number of calls or specific agreements.
Customized model service: Provide customized long text generation model training and deployment services for large customers in specific fields or with special needs.
Its business model is still in the exploration and verification stage, especially in overseas markets.
B. Revenue share: As Waveform Intelligence is still in the early stages of development and market expansion, and its overseas business in particular is still in its infancy, detailed data on its domestic and overseas revenue split is not yet public. The company is currently focusing on product refinement, user accumulation and business model verification.
1. Difficulties in the development of AI Agents
The main challenges in the development of AI Agents come from the computing power layer:
High training and inference costs: Current advanced AI agents, especially those based on large language models, place huge demands on computing resources (especially high-end GPUs) during training and inference. This not only leads to high hardware procurement and maintenance costs, but also makes cloud computing fees a major expense, limiting the participation of small and medium-sized enterprises and research institutions.
Insufficient computing power supply: The tight supply of high-end AI chips worldwide has further exacerbated the computing power bottleneck problem, making obtaining sufficient computing power a constraint on the development of many AI Agent projects.
Energy consumption issues: The huge energy consumption caused by large-scale models and high-intensity computing has also raised concerns about environmental sustainability, and the demand for green computing power is becoming increasingly urgent.
2. AI Agent Industry Solutions
In response to the above computing power issues, the industry and academia are exploring solutions from multiple aspects:
Algorithm and model optimization: Research more efficient model compression techniques, such as pruning, quantization, knowledge distillation and sparsification, to reduce model size and the amount of computation required for inference. Develop more efficient training algorithms to shorten training time.
Dedicated AI chips and hardware acceleration: Continue to develop and promote dedicated chips (ASICs) for AI computing characteristics to improve energy efficiency. Use programmable hardware such as FPGA to accelerate specific agent tasks.
Edge computing and terminal intelligence: Deploy some agent computing tasks to edge devices or terminal devices to reduce dependence on cloud computing power, reduce latency, and protect user privacy.
Develop green computing power: adopt more energy-efficient computing architectures and cooling technologies, and use renewable energy to power data centers.
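Two of the compression techniques named above, quantization and pruning, can be shown on a toy weight list. This is a minimal sketch in plain Python under simplifying assumptions: symmetric per-tensor int8 quantization and global magnitude pruning. The function names are hypothetical, and real toolchains operate on tensors and usually apply finer-grained schemes.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric quantization: map floats to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [v * scale for v in q]

def prune_by_magnitude(weights: list[float], keep_ratio: float) -> list[float]:
    """Zero out the smallest-magnitude weights, keeping roughly keep_ratio of them."""
    keep = max(1, int(len(weights) * keep_ratio))
    threshold = sorted((abs(w) for w in weights), reverse=True)[keep - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]

weights = [0.82, -0.05, 0.31, -1.27, 0.002, 0.64]

q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
pruned = prune_by_magnitude(weights, keep_ratio=0.5)

print(q_weights)
print(max_error <= scale / 2 + 1e-12)
print(pruned)
```

Each weight is stored as a small integer instead of a floating-point number, at the price of a rounding error of at most half the scale, while pruning trades accuracy for sparsity by discarding the weights that contribute least. Both reduce the memory and computation needed for inference, which is the cost the preceding section identifies as the main bottleneck.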