Concepts you must understand before building AI applications: agents, LLMs, RAG, and prompt engineering

Written by
Silas Grey
Updated on: July 16, 2025
Recommendation

Explore new perspectives on AI application development and gain an in-depth understanding of core concepts such as agents and LLMs.

Core content:
1. The definition of an agent and how it differs from traditional AI
2. The key capabilities and application scenarios of a general agent platform
3. How large language models (LLMs) work and how they are trained and used


What is an Agent?

    An AI agent is an autonomous system based on an LLM (large language model) that can perceive its environment, make decisions, and take actions to achieve specific goals. Unlike traditional AI, an AI agent imitates human behavior patterns to solve problems: it works toward a given goal step by step, thinking independently and calling tools, and in this way operates autonomously.
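    To make the perceive-decide-act loop concrete, here is a minimal, hypothetical Python sketch; `call_llm` is a stub standing in for a real model call, and the tools are placeholders, not any specific framework's API.

```python
# Minimal sketch of an agent's perceive-decide-act loop (hypothetical,
# not any specific framework). The LLM call is stubbed out.

def call_llm(prompt: str) -> str:
    # Stub: a real agent would send `prompt` to a model and parse the reply,
    # e.g. "search: hotels in Kyoto" or "FINISH: <final answer>".
    return "FINISH: (stubbed answer)"

TOOLS = {
    "search": lambda q: f"search results for {q!r}",   # placeholder tool
    "echo": lambda s: s,                               # placeholder tool
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = [f"Goal: {goal}"]                        # perceived state so far
    for _ in range(max_steps):
        decision = call_llm("\n".join(history))        # decide what to do next
        if decision.startswith("FINISH:"):             # goal reached
            return decision[len("FINISH:"):].strip()
        tool, _, arg = decision.partition(":")         # act via a tool
        result = TOOLS[tool.strip()](arg.strip())
        history.append(f"Action: {decision}\nObservation: {result}")
    return "Stopped: step limit reached."

print(run_agent("Plan a weekend trip"))
```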

General Agent Platform

    With the agent as the core technology, a general intelligent-agent platform can be built. By tackling key problems in the agent's perception, memory, planning, and execution capabilities, such a platform can adapt to ever-changing business and daily office needs, provide more personalized and precise services, and free engineers' minds and hands so they can think more comprehensively and act more accurately, pushing agents into more complex scenarios.



What is an LLM (Large Language Model)?

    Large language models are a type of artificial intelligence model based on deep learning, designed to process and generate natural language text. By training on large-scale text data, they learn to understand and generate human-like text and to perform a wide range of natural language processing tasks.

Training and use of LLM

    An LLM's application scenarios include, but are not limited to, text generation, machine translation, summarization, dialogue systems, and sentiment analysis. LLMs have strong generalization ability and can handle a wide variety of tasks.

LLM Training

The training process of an LLM is divided into two stages: pre-training and fine-tuning.

  • Pre-training phase

    The model performs self-supervised learning on large-scale unlabeled text data to learn universal language representations.

  • Fine-tuning phase

    The model performs supervised learning on labeled data for specific tasks, adjusting its parameters to fit each task's requirements.
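    As a toy illustration of the pre-training objective (self-supervised next-token prediction), here is a PyTorch sketch; the tiny model and random token IDs are stand-ins for a real Transformer and corpus.

```python
# Toy sketch of the self-supervised pre-training objective:
# predict token t+1 from tokens up to t (next-token prediction).
import torch
import torch.nn as nn

vocab_size, d_model = 100, 32
model = nn.Sequential(
    nn.Embedding(vocab_size, d_model),
    nn.Linear(d_model, vocab_size),   # a real LLM has Transformer blocks here
)

tokens = torch.randint(0, vocab_size, (8, 16))   # stand-in for tokenized text
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # shift by one position

logits = model(inputs)                           # (batch, seq, vocab)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()   # gradients like these drive parameter updates at scale
print(loss.item())
```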

Use of LLM

    On the one hand, in everyday interactive use, the user types in a question (a prompt) and the large model returns an answer.

    On the other hand, in AI application programming based on an LLM, answers are obtained by calling the LLM's API in a specified format.
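    For the programmatic route, here is a sketch using the OpenAI Python SDK's chat-completions interface as one concrete example; the model name is an assumption, and the API key is read from the environment.

```python
# Sketch: asking an LLM a question through an API (OpenAI-style).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name; substitute your own
    messages=[{"role": "user", "content": "What is an AI agent?"}],
)
print(response.choices[0].message.content)
```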

Agent framework based on LLM

  • LLM: the counterpart of the human brain; it thinks about how to solve the problem and what kind of answer to give.

  • Memory: long-term plus short-term memory, i.e., the history the agent draws on, system data, and the various intermediate information generated while the agent executes.

  • Planning: prompt orchestration, intent understanding, task decomposition, and self-reflection.

  • Tool use: the various tool interfaces the agent may call while performing tasks (a sketch mapping these four components onto code follows this list).
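    One hypothetical way to map these four components onto code; the method names and the decomposition format are illustrative assumptions, not a specific framework's design.

```python
# Hypothetical sketch: the four components of an LLM-based agent.
class Agent:
    def __init__(self, llm, tools):
        self.llm = llm            # LLM: the "brain" that decides what to do
        self.tools = tools        # tool interfaces available during tasks
        self.short_term = []      # short-term memory: this run's steps
        self.long_term = {}       # long-term memory: history, system data

    def plan(self, goal: str) -> list[str]:
        # Planning: ask the LLM to decompose the goal into sub-tasks,
        # one per line (an assumed output format).
        return self.llm(f"Decompose into steps, one per line: {goal}").splitlines()

    def act(self, step: str) -> str:
        # Tool use: dispatch to a matching tool, otherwise ask the LLM.
        name = step.split(":", 1)[0].strip()
        tool = self.tools.get(name)
        result = tool(step) if tool else self.llm(step)
        self.short_term.append((step, result))   # remember intermediate info
        return result
```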



Transformer Architecture

    The core technical architecture of LLMs is the Transformer, a deep learning model built on the self-attention mechanism. The Transformer's key property is that it processes sequence data in parallel, which greatly improves training efficiency and model performance.
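    The heart of that mechanism is scaled dot-product attention, softmax(QKᵀ/√d_k)·V; below is a minimal NumPy sketch on random toy data, with all positions computed at once.

```python
# Minimal self-attention sketch: softmax(Q K^T / sqrt(d_k)) V.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])    # every token vs. every token
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)         # row-wise softmax
    return w @ V                               # mix values by attention weight

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))        # toy token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)     # (4, 8), computed in parallel
```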

Parameter scale

    LLMs usually use large-scale neural networks, with parameter counts ranging from millions to billions. For example, Tongyi Qianwen (Qwen-7B) has 7 billion parameters. Training requires high-quality, pre-processed (and increasingly multimodal) data. Larger parameter scales give a model stronger learning and generalization capabilities, letting it handle complex language tasks, but they also bring a significant increase in computational cost and resource requirements.
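    As a rough back-of-envelope illustration of that cost: just storing the weights of a 7-billion-parameter model in 16-bit precision takes about 14 GB, before activations, optimizer state, or KV cache.

```python
# Back-of-envelope: memory to hold a 7B-parameter model's weights in fp16.
params = 7_000_000_000
bytes_per_param = 2                      # 16-bit floats
print(params * bytes_per_param / 1e9)    # ≈ 14.0 GB, weights only
```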



What is RAG

    When an LLM answers a user's question, it relies on the text data it was trained on. Faced with questions about knowledge it has never seen, it cannot answer correctly and tends to produce wrong results; this is the large model's hallucination problem.


    RAG (Retrieval-Augmented Generation) is a natural language querying approach in which a retrieval component fetches additional information from external knowledge sources and feeds it into the LLM's prompt, so that the question can be answered more accurately. Augmenting the LLM with this extra knowledge reduces its tendency to hallucinate.

Reducing hallucinations with RAG

    With RAG, a knowledge base can be built so that the LLM grounds its answers in that knowledge base and gains the ability to answer questions about its contents.

RAG Advantages

    A knowledge base built with RAG makes it easy to add, delete, and modify documents, and supports far more frequent updates than retraining the model would.

RAG's overall process

The overall process of RAG is divided into two steps:

  • The first is prior indexing: the process of building a knowledge base from private documents.

  • The second is on-demand querying: the process of retrieving from the constructed knowledge base and answering the question. First retrieve, then generate.
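    Here is a self-contained toy sketch of both steps; the word-count "embedding" stands in for a real embedding model, and `ask_llm` is a hypothetical callable.

```python
# Toy RAG sketch: (1) prior indexing, (2) retrieve-then-generate.
from collections import Counter
import math

docs = ["Our refund window is 30 days.", "Support hours are 9am-6pm CET."]

def embed(text):
    # Stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

index = [(embed(d), d) for d in docs]          # step 1: build the knowledge base

def answer(question, ask_llm):                 # step 2: first retrieve...
    q = embed(question)
    best = max(index, key=lambda item: cosine(q, item[0]))[1]
    prompt = f"Answer using this context:\n{best}\n\nQuestion: {question}"
    return ask_llm(prompt)                     # ...then generate

print(answer("what is the refund window", ask_llm=lambda p: p))  # echo stub
```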

Effects of RAG

  • First, it gives the LLM the ability to answer questions about a private knowledge base, reducing hallucination;

  • Second, it cites the original sources in its answers, which improves retrieval efficiency and makes it easy to compare the answer directly with the original text, ensuring the accuracy of the LLM's answer. This plays an important role in intelligent question answering, document summarization, data organization, and other fields.



What is a prompt (prompt engineering)

    Prompts are text or instructions that provide input to the LLM to guide it to generate specific outputs.

Prompt types

    There are two types of prompts: system prompts and user prompts. A user prompt is the user's question; a system prompt is a set of initial instructions or background information, built into the AI application and passed to the LLM, that guides the LLM's behavior and response style.

    Generally speaking, "prompt" refers more to the user prompt, that is, the question the user sends to the LLM.
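    In API terms, the two types typically map onto message roles; the snippet below uses the OpenAI-style convention as one common example, and the wording of both prompts is illustrative.

```python
# System vs. user prompt, expressed as OpenAI-style chat messages.
messages = [
    {   # system prompt: baked into the application, steers behavior
        "role": "system",
        "content": "You are a concise travel assistant. Answer in bullet points.",
    },
    {   # user prompt: the question the user actually typed
        "role": "user",
        "content": "Plan a weekend trip to Kyoto.",
    },
]
```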

How prompts influence the LLM

    When generating text, the LLM tries to understand the prompt and respond based on that understanding. The quality of the LLM's answers is therefore shaped by the user's prompt: a more complete prompt lets the LLM better understand the user's intent and give a more appropriate and complete answer.

How to optimize prompts

    When posing a question, express the instruction clearly and specifically and state concrete requirements; if you have requirements for the LLM's output format, it is best to provide reference text as an example.

How to write better prompts

    The basic components of a complete prompt:

  • Instruction: the action the model should take on the text.

  • Instruction object: the text the model should process.

  • Example: a sample case or pattern for the model to imitate.

  • Output requirements: requirements on the content and format of the output.

  • Exceptions: how the model should behave when it cannot carry out the instruction or information is missing.

    Consider a concrete example: asking about travel plans. When directly using an LLM such as those provided by OpenAI, obtaining a good question-answering experience takes considerable time and effort spent writing a better, more complete prompt, and the user experience may suffer as a result; a sketch of such a prompt follows.
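    Below is a hypothetical reconstruction of such a travel-plan prompt with all five components labeled; the destination, dates, and figures are made up for illustration.

```python
# A travel-plan prompt assembled from the five components above.
prompt = """\
Instruction: Create a 3-day travel itinerary from the notes below.
Input: Destination: Kyoto; dates: Oct 3-5; budget: $800 total.
Example: Day 1 - morning: Fushimi Inari; afternoon: Nishiki Market.
Output requirements: Return a Markdown table with columns
Day, Morning, Afternoon, Evening, Estimated cost.
Exceptions: If any required detail is missing, ask one clarifying
question instead of guessing.
"""
print(prompt)
```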