Reasoning models have not made persona design in system prompts obsolete

Written by Audrey Miles
Updated on: July 8, 2025

Recommendation: In the era of reasoning models, system prompts still play an irreplaceable role.

Core content:
1. The role and definition of system prompts in AI systems
2. Analysis of the impact of the reasoning model R1 on system prompts
3. The relationship between the MoE architecture and system prompts, with a comparison in practice

Yang Fangxian, Founder of 53AI / Tencent Cloud Most Valuable Expert (TVP)

1. Do reasoning models such as R1 make persona prompts obsolete?

AI system prompts are structured instruction sets placed ahead of the conversation. They serve as the AI's "personality manual" and "work guide", complementing the prompts typed directly by the user. They generally include rules such as role definition, task objectives, behavior boundaries, and output formats. For example, here is a sample system prompt design:

  • Role definition: "You are a senior nutritionist who needs to provide dietary advice based on the user's physical examination data and avoid recommending unverified health products."
  • Process control: "First, confirm whether the user has a history of food allergies, then analyze the BMI index step by step, and finally generate a personalized recipe."
  • Style constraints: "Reply in simplified Chinese, with clear and concise language, and no more than three sentences per paragraph."
  • Safety restrictions: "If the user raises questions about suicidal tendencies, immediately provide a psychological helpline and terminate the conversation."
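
As a concrete illustration, here is a minimal sketch of how such a system prompt travels alongside the user prompt through an OpenAI-compatible chat API. The model name and prompt wording are placeholders of my own choosing, not tied to any specific product:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The system prompt bundles the role, process, style, and safety rules;
# the user prompt carries only the actual question.
SYSTEM_PROMPT = (
    "You are a senior nutritionist. Base dietary advice on the user's "
    "physical examination data and never recommend unverified health products. "
    "First confirm food allergies, then analyze BMI step by step, then "
    "generate a personalized recipe. Reply concisely, with no more than "
    "three sentences per paragraph. If the user mentions suicidal thoughts, "
    "provide a psychological helpline and end the conversation."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "I am 175 cm and 82 kg. What should I eat this week?"},
    ],
)
print(response.choices[0].message.content)
```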

Persona setting is an important part of system prompts. After the DeepSeek R1 reasoning model appeared, some AI creators claimed there was no longer any need to write persona prompts. I think they may have confused [user prompts] with [system prompts].

Regarding user prompts, their argument is that users can directly state the task and goal to be completed; there is no need to write a persona or to prescribe the workflow too rigidly, since doing so would hinder the reasoning model's creativity. The main reason given is that reasoning models are trained with reinforcement learning (RL).

However, only the R1-Zero variant was trained purely with RL. The final R1 release also incorporates supervised fine-tuning (SFT) with human-curated data, so it is not a pure RL reasoning model. Moreover, GPT-4o and Claude 3.5 also have reasoning capabilities, even though their reasoning process is not exposed.

So I don't have a definite answer on whether a persona should be written for a reasoning model, because in my tests it is hard to tell the difference between outputs with and without one.

As for system prompts, I am still using my previous structured prompts when building Buddha-like agents on the Button AI application development platform, and R1's responses are still very good; I have not yet found a system prompt format better suited to the R1 reasoning model. Given the same input, "What to do if you have no money in life", the answers from V3 and R1 are essentially the same.

2. The difference between the MoE architecture and system prompts

I used to have a misunderstanding: I thought a model using the MoE architecture would follow system prompts better, assuming a direct causal relationship. After checking the literature, I realized I had taken this for granted.

1. The MoE (Mixture of Experts) architecture:

  • What is it? An internal structural design of a large model. Instead of using one huge network to process all information, as a traditional "dense" model does, it contains multiple relatively small "expert" sub-networks and a "gating" network.
  • How does it work? When the model receives input (such as your prompt), the gating network determines which expert or experts are best suited to process each part of the input (such as a token or a short span), and activates only the selected experts for computation. A minimal sketch follows this list.
  • What is it for? Mainly to improve computational efficiency: with the same total parameter count, fewer parameters are activated per inference step, which can be faster and more resource-efficient. Under a controllable compute budget, the total parameter count can be greatly increased, expanding model capacity.
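
To make the division of labor concrete, here is a minimal, hypothetical PyTorch sketch of an MoE layer. The expert count, top-k value, and layer sizes are illustrative assumptions, not the configuration of DeepSeek or any real model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal MoE sketch: a gating network routes each token to its
    top-k experts, so only a fraction of total parameters runs per token."""

    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.gate = nn.Linear(dim, num_experts)  # the "gating" network
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). The gate scores every expert for each token.
        scores = F.softmax(self.gate(x), dim=-1)
        # Keep only the top-k experts per token (weights left unnormalized
        # for simplicity; real implementations vary here).
        weights, indices = scores.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = indices[:, k] == e
                if mask.any():
                    # Only the selected expert runs on these tokens.
                    out[mask] += weights[mask, k:k + 1] * self.experts[e](x[mask])
        return out

layer = MoELayer(dim=64)
tokens = torch.randn(10, 64)  # 10 token embeddings
print(layer(tokens).shape)    # torch.Size([10, 64])
```

Note that the gate scores each token embedding individually; it never sees an abstract "persona" concept, which is exactly why routing and persona-following are independent, as discussed below.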

MoE operates at the model's underlying computation and architecture level: it is concerned with how the model processes information internally and how the experts divide the work efficiently. System prompts operate at the model's high-level behavioral guidance and interaction level, focusing on what the output should contain, in what style, and how to interact with users.

2. The lack of a direct connection between MoE and the persona in system prompts

Just because a model uses the MoE architecture does not mean it will follow the persona set in the system prompt well. Likewise, a non-MoE dense model may follow the persona very well. The choice of architecture and the ability to follow instructions are two different things.

  • The MoE gating network selects experts based on local features of the input, not directly on the abstract "persona" concept defined in the system prompt. (Although the system prompt is also part of the input and the gating network does see it, routing is mainly driven by lower-level pattern matching.)
  • The MoE architecture makes it feasible to build models with a much larger total parameter count. A highly capable model built on MoE may therefore play the persona set in the system prompt more accurately and stably, simply because it is "smarter". But this is not a direct effect of the MoE architecture itself; MoE merely helped build such a high-capability model.
  • Possible specialization (theoretical): if, during training or fine-tuning, the model often handles tasks requiring a certain persona or style, some experts may in theory gradually become better at patterns related to that style or task. But this is a potential side effect of the training data and task distribution, not a designed property.

Summary:

MoE is an architectural choice focused on computational efficiency and model scalability. The persona in the system prompt is a high-level instruction that guides the model's behavior and output style. They operate at different levels and are basically independent of each other. The improvement in overall model capability brought by the MoE architecture can indirectly give the model a better foundation for understanding and executing the persona set in the system prompt.

  • You can use the MoE architecture with the persona "professional scientific assistant".
  • You can use the MoE architecture with the persona "friendly chat partner".
  • You can use a non-MoE dense architecture with the persona "professional scientific assistant".
  • You can use a non-MoE dense architecture with the persona "friendly chat partner".

Changing one dimension (e.g., switching from another architecture to MoE) does not dictate what the other dimension (the persona) must be. Conversely, changing the persona (from assistant to chat partner) does not require changing the underlying architecture.

3. Reasoning models still need system prompts

The impact of not writing a system prompt:

  1. Reliance on the model's default behavior: the model falls back on whatever default was baked in during pre-training or set by the developer (usually a generic, helpful assistant), which may not meet your specific needs.
  2. Unstable/inconsistent output: the model's tone, persona, and response style may drift across multiple turns or different conversations, because it has no explicit "personality" to guide it.
  3. Difficulty controlling output: if you require a specific format, strict constraints, or a particular processing flow, the model can hardly "guess" your intentions without instructions.

If you don't write a system prompt, you have to repeat the same instructions and constraints in every user prompt, which increases communication cost and the probability of errors, as the schematic below shows.
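
The extra cost is easy to see in the message structure. A schematic comparison (the prompt text is made up for illustration):

```python
# Without a system prompt: every request must restate the constraints.
request_1 = [
    {"role": "user", "content": "Reply in simplified Chinese, max three sentences. What is BMI?"},
]
request_2 = [
    {"role": "user", "content": "Reply in simplified Chinese, max three sentences. Is a BMI of 27 high?"},
]

# With a system prompt: the constraints are stated once and govern every turn.
conversation = [
    {"role": "system", "content": "Reply in simplified Chinese, with no more than three sentences per paragraph."},
    {"role": "user", "content": "What is BMI?"},
    # ...assistant reply, then further user turns, all under the same rules.
]
```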

Of course, writing a good system prompt takes skill and experience, and usually requires multiple rounds of trial and optimization. Overly lengthy or overly strict system prompts can sometimes limit the model's creativity or its ability to handle complex instructions.

However, there is currently no clear criterion for what counts as "too lengthy or too strict", and this question still needs to be explored.