RAG - prompt

Written by Silas Grey
Updated on: July 15, 2025

How does RAG empower large language models to improve the accuracy and timeliness of their answers?

Core content:
1. How RAG works in combination with large language models (LLMs)
2. The advantages of a RAG system in improving answer quality
3. An overview of RAG's core technologies and application scenarios


Introduction

One way to understand an LLM (Large Language Model): imagine a human brain that could be emptied and then copied losslessly. Take such an emptied brain, keep its neurons, and train it continuously with data so that it learns the relationships in the data and records them in its neural network. When training is complete, take it off the trainer and copy it infinitely. Once powered on, each copy can understand what is said to it and respond based on the knowledge in its "brain".


However, its knowledge is limited to the data it was trained on. If you ask it what is happening at this moment, it cannot answer. This is where RAG comes in: it supplies the LLM with additional knowledge, so the LLM can answer questions based on its existing knowledge plus the content provided to it.

About RAG

The RAG system first retrieves documents or information snippets related to the user's query from an external knowledge base and provides this information as context to the LLM. With this additional information, the LLM is able to generate more accurate, informative, and contextually relevant responses.
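The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration, not a real implementation: the "knowledge base" is an in-memory list, retrieval is keyword-overlap scoring rather than a vector database, and the LLM call itself is left out. All names here are illustrative.

```python
import re

# Toy knowledge base standing in for an external document store.
KNOWLEDGE_BASE = [
    "Return policy: unopened goods can be returned within 7 days of receipt.",
    "Shipping: standard delivery takes 3-5 business days.",
    "Warranty: headphones carry a 12-month manufacturer warranty.",
]

def tokens(text: str) -> set[str]:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by how many words they share with the query (toy retriever)."""
    q = tokens(query)
    return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Hand the retrieved snippets to the LLM as context, then ask the question."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only the context below.\n<context>\n{ctx}\n</context>\nQuestion: {query}"

query = "Can I return my headphones?"
prompt = build_prompt(query, retrieve(query, KNOWLEDGE_BASE))
```

The resulting prompt contains the return-policy snippet as context, which is exactly the "extra knowledge" that lets the LLM answer beyond its training data.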

Compared with a plain LLM, a RAG system has significant advantages in answering factual questions, generating domain-specific content, and keeping information current. By integrating retrieved knowledge into the generation process, a RAG system greatly reduces the chance of LLM hallucination (an LLM is essentially a statistical model that returns the most probable output; if no relevant content exists, even a 1%-probability continuation may be emitted, producing irrelevant or fabricated content) and improves the reliability of the generated results.
In addition, the RAG system also supports easy updating of the knowledge base, allowing LLM to obtain the latest information in a timely manner and keep its knowledge current.

What are RAG’s core technologies?

RAG's core technologies include: prompt engineering, text segmentation, vector database, content retrieval/recall, effect evaluation, LLM model optimization, etc.
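As one illustration of the "text segmentation" step listed above, here is a minimal chunking sketch: long documents are split into overlapping chunks before being embedded into a vector database. The chunk size and overlap values are illustrative, not recommendations.

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap, so that a
    sentence cut at a chunk boundary still appears whole in the next chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Real systems usually split on sentence or paragraph boundaries rather than raw character counts, but the overlap idea is the same.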

Prompts

Why do we need prompt engineering? In daily use of GPT, Kimi, or DeepSeek, we just ask questions directly, and they seem to give reasonable answers. Is "engineering" still necessary? Let's look at an answer given by DeepSeek.

The answers to direct questions seem pretty good. If you only want a general idea, they are enough, and there seems to be no need to solve the problem through "engineering".
Now take another example: an e-commerce customer service scenario. You bought a pair of headphones and want to apply for a return and refund. If you ask directly, "Can I return the headphones with order number 20250101293910101?", what do you think the AI will answer? Below is DeepSeek's answer.

Hello, we recommend that you provide more detailed order information, such as purchase channel, product model, etc., so that we can better help you inquire about the return policy.


Generally speaking, a return policy for headphones may include the following:


**Return conditions**: The earphones must be unused, well packaged and with all accessories.

**Return Time Limit**: You can usually apply for a return within 7-15 days after signing for the product. The specific time limit is subject to the after-sales policy of the purchase channel.

**Return process**: Submit a return application through the official customer service or order page of the purchase channel and return the product according to the instructions.

**Refund method**: The refund will be returned to your payment account via the original route. The specific arrival time depends on the payment method.


It is recommended that you contact the customer service of the purchase channel as soon as possible to confirm the specific return policy so that it can be processed in a timely manner.

This answer is generic. The model understands what a refund is, but it does not know the specific context or order number; a general-purpose large model cannot know an enterprise's internal business rules. To solve this, you can use RAG to supply background knowledge to the model (or use an agent, which will be discussed later).

Prompt Engineering

Prompt engineering optimizes prompts through systematic design, guiding the LLM to generate outputs closer to the target. Its core is to convert vague intentions into executable instructions and to improve result quality through experiments and iteration. The "engineering" is mainly reflected in the following aspects:

1. System Design

  • Defining the prompt structure

    • Well-known prompt frameworks:

      CO-Star (Context, Objective, Style, Tone, Audience, Response)

      https://towardsdatascience.com/how-i-won-singapores-gpt-4-prompt-engineering-competition-34c195a93d41/
    • CoT (Chain of Thought): simply append a sentence such as "Let's think step by step" to the end of your prompt

      Guides the model to break a task down and reason step by step

      https://www.promptingguide.ai/techniques/cot
    • RTCF (role, task, constraints, output format)

    • ...

  • Use special symbols such as <> or "" to delimit different contents, so the model can more accurately understand what each passage means

  • Few-shot: provide examples so the model learns how to answer


2. Process Standardization

  • Add prompt version control, A/B testing, and evaluation metrics (such as accuracy and relevance)


3. Scalability

  • Support dynamic parameter input (such as user history, real-time data) through templating


4. Risk Control

  • Design safety guardrails to prevent hallucinations or harmful content. For example, in RAG you can explicitly tell the LLM that if no relevant content is found in the provided background knowledge, it should reply that it cannot answer the question.
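The risk-control point above can also be enforced in code, not only in the prompt. A minimal sketch (an illustrative pattern, not an API from any particular library): if retrieval returns nothing relevant, short-circuit with a fixed refusal instead of letting the model guess; otherwise, instruct the model to answer strictly from the retrieved context.

```python
NO_ANSWER = "Sorry, I could not find relevant information in the knowledge base."

def answer(query: str, retrieved: list[str], llm=None) -> str:
    """Guardrail: only call the LLM when there is retrieved context to ground it."""
    if not retrieved:
        return NO_ANSWER
    prompt = (
        "Answer strictly from the context below; if it does not cover the "
        "question, say you cannot answer.\n"
        "<context>\n" + "\n".join(f"- {r}" for r in retrieved) + "\n</context>\n"
        f"Question: {query}"
    )
    # With no LLM client wired in, return the prompt so it can be inspected.
    return llm(prompt) if llm else prompt
```

Combining the in-prompt instruction with the hard short-circuit gives two layers of defense against fabricated answers.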

Prompt optimization - e-commerce customer service example

Let's optimize the prompt step by step. The question is still: "Can I return the headphones with order number 20250101293910101?"


1. Add roles and constraints

You are an e-commerce customer service assistant and must answer strictly according to the <Knowledge Base>:

<Knowledge Base>
- Return conditions: unopened goods can be returned within 7 days
- Order 20250101293910101: signed for on 2023-10-05, the product has been unpacked
</Knowledge Base>

Question: Can I return the headphones with order number 20250101293910101?

Answer: According to our return policy, unopened items can be returned within 7 days. But you mentioned that the headphones have been unpacked, so according to the policy this product cannot be returned.

2. Add step-by-step thinking and format output

# Role
You are a professional customer service representative and must handle the problem in steps:
1. Search the knowledge base for the order number and return policy
2. Compare the time and the product's status
3. If the conditions are not met, offer alternatives (such as an exchange)

# Constraints
- Do not guess information outside the knowledge base
- The answer must include the order number and the policy clause

<Knowledge Base>
- Return conditions: unopened goods can be returned within 7 days
- Order 20250101293910101: signed for on 2023-10-05, the product has been unpacked
</Knowledge Base>

Question: Can I return the headphones with order number 20250101293910101?

Answer: According to the knowledge base, order 20250101293910101 was signed for on 2023-10-05 and the goods have been unpacked. The return policy states that only unopened items can be returned within 7 days. Since your headphones have been unpacked, they cannot be returned. Alternative: you can consider exchanging the product. If you have other needs, please let us know.

3. Few-shot & templating

# Role
You are a professional customer service representative and must handle the problem in steps:
1. Search the knowledge base for the order number and return policy
2. Compare the time and the product's status
3. If relevant information is found, answer according to the rules; if not found, clearly inform the user and offer alternatives

# Constraints
- Do not guess information outside the knowledge base
- If no relevant content is found, tell the user clearly and give general suggestions
- Responses must include the order number (if available) and the policy clause (if applicable)

# Examples
<Example 1>
User asked: Can the headphones in order 20231001 be returned?
Knowledge base: relevant information found
Answer: Order 20231001 (receipt date: 2023-10-05) has been opened and is not returnable according to Article 2.1 of the policy. We recommend applying for an exchange and keeping the original packaging.
</Example 1>

<Example 2>
User asked: Can the headphones in order 20231002 be returned?
Knowledge base: no relevant information found
Answer: Sorry, no relevant information was found for order 20231002. We recommend that you contact customer service or check the latest return policy on the official website.
</Example 2>

<Knowledge Base>
- Return conditions: unopened goods can be returned within 7 days
- Order 20231001: signed for on 2023-10-05, the product has been unpacked
</Knowledge Base>

# Question
User asked: {{query}}

The prompt above is templated, and this is also the basic structure of RAG. In an actual RAG implementation, however, the knowledge-base content is filled in from a vector database, or from a search engine such as Elasticsearch.
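That templating step, filling in the {{query}} placeholder and the knowledge-base section at request time, can be sketched with plain string formatting (real systems often use a template engine such as Jinja2). The template text below is abridged from the example above; the function names are illustrative.

```python
# Abridged version of the templated prompt from the example above.
PROMPT_TEMPLATE = (
    "# Role\n"
    "You are a professional customer service representative.\n"
    "<Knowledge Base>\n{knowledge}\n</Knowledge Base>\n"
    "# Question\nUser asked: {query}"
)

def render_prompt(query: str, retrieved: list[str]) -> str:
    """Fill the template with the user's question and the retrieved snippets
    (which, in a real RAG system, come from a vector database or search engine)."""
    knowledge = "\n".join(f"- {r}" for r in retrieved)
    return PROMPT_TEMPLATE.format(knowledge=knowledge, query=query)
```

Each incoming question is rendered into a fresh prompt, so the static instructions stay fixed while the dynamic parts (query, retrieved context) change per request.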


Compared with the original bare question, adopting a structured prompt, providing a knowledge base inside the prompt, and giving examples makes the large model's answers progressively better.

END