Prompt words

Written by
Audrey Miles
Updated on: July 1, 2025
Recommendation

Explore prompt techniques for AI language models and unlock the secrets of efficient communication and creation.

Core content:
1. The definition of prompts and their application in large models
2. An in-depth look at prompt engineering and tokenization
3. How large language models (LLMs) work and how they interact with prompts


Knowledge is the beginning of action, and action is the completion of knowledge. - Wang Yangming

1. General Outline

2. Step-by-Step Explanation

1. What is a prompt?

A prompt is a text instruction that tells a large model (such as ChatGPT) what you want it to do.

2. What is prompt engineering?

Prompt Engineering is the practice of designing and optimizing input instructions (prompts) to guide a Large Language Model (LLM) toward output that better matches expectations. Its core goals are:

  • Controlling output: reducing the model's randomness through structured instructions.
  • Unleashing capabilities: unlocking the model's potential on specific tasks (e.g., reasoning, creation, analysis).
  • Aligning intent: translating human needs into expressions the model can understand.
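As a minimal sketch (plain Python, with purely illustrative wording), compare an unstructured request with a structured prompt that pins down role, task, constraints, and output format:

```python
# A vague request leaves the model free to guess at length, form, and tone.
vague_prompt = "Write something about autumn."

# A structured prompt constrains role, task, format, and style up front.
structured_prompt = """\
Role: You are a classical Chinese poetry assistant.
Task: Write a five-character quatrain about autumn.
Constraints:
- Exactly 4 lines, 5 characters each.
- Theme: autumn scenery, with a quiet, melancholy tone.
Output format: the poem only, no explanation."""

print(structured_prompt)
```

The same request, restructured, typically yields far more predictable output.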

3. What is a token?

In natural language processing (NLP), a token is the smallest unit of text that a model processes.

1. Tokenization

  • Tokenization splits the input text into discrete units (e.g., words, subwords, symbols) that the model can process.
  • For example, the sentence "ChatGPT is powerful!" may be split into ["Chat", "G", "PT", " is", " powerful", "!"] (the exact split depends on the model's tokenizer).
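As a hands-on illustration, here is a small sketch using OpenAI's open-source tiktoken library (pip install tiktoken); the exact token IDs and splits depend on the encoding chosen:

```python
import tiktoken

# cl100k_base is the encoding used by GPT-4-era models.
enc = tiktoken.get_encoding("cl100k_base")

text = "ChatGPT is powerful!"
token_ids = enc.encode(text)                   # text -> list of integer token IDs
pieces = [enc.decode([t]) for t in token_ids]  # map each ID back to its text piece

print(token_ids)  # integer IDs (encoding-specific)
print(pieces)     # e.g. ['Chat', 'GPT', ' is', ' powerful', '!']
```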

2. Token meaning

  • The model understands the context through the sequence of tokens and predicts the next token based on probability.

3. Token restrictions

  • The model has an upper limit on the total number of input plus output tokens (for example, GPT-4 variants support 8K/32K/128K-token context windows).
  • The number of tokens directly affects the model's computational workload and the cost of API calls.
  • Different tokens carry different amounts of semantic information (e.g., punctuation marks vs. technical terms).
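A small sketch of counting tokens and estimating the corresponding API fee with tiktoken; the per-token price below is a made-up placeholder, not a real rate:

```python
import tiktoken

PRICE_PER_1K_INPUT_TOKENS = 0.01  # hypothetical USD price, for illustration only

def estimate_cost(prompt: str, encoding_name: str = "cl100k_base") -> None:
    enc = tiktoken.get_encoding(encoding_name)
    n_tokens = len(enc.encode(prompt))  # tokens the prompt will consume
    cost = n_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS
    print(f"{n_tokens} tokens -> estimated input cost ${cost:.5f}")

estimate_cost("Write a five-character quatrain about autumn.")
```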

4. The relationship between tokens and prompts

1. A prompt is a sequence of tokens

  • The prompt is converted into a token sequence by the tokenizer, and the model generates output based on those tokens.
  • For example, the prompt "Write a five-character quatrain about autumn" becomes a token sequence such as ["Write", " a", " five", "-character", " quatrain", " about", " autumn"] (the exact split depends on the tokenizer).

2. The number of tokens determines the “horizon” of the model

Context Window:

The total number of tokens the model can process is limited (e.g., 4,096 tokens); anything beyond the limit is truncated. The longer the prompt, the more tokens it occupies, and the fewer tokens remain for the output.

Position Sensitivity:

The model is sensitive to the positional encoding of tokens, so key instructions should be placed near the front (to avoid being truncated).

Attention Weights:

In the self-attention mechanism, different tokens receive different weights. Example: repeating key tokens (such as "code, Python, efficient") in the prompt can strengthen the model's focus on them.
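A toy sketch of the context-window budget described above (all numbers illustrative):

```python
CONTEXT_WINDOW = 4096  # e.g., a 4k-token model

def output_budget(prompt_tokens: int, window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for the model's output once the prompt is counted."""
    return max(window - prompt_tokens, 0)

print(output_budget(500))   # 3596 tokens left for the answer
print(output_budget(4000))  # only 96 tokens left
print(output_budget(5000))  # 0 -- the prompt itself would be truncated
```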

5. What are LLMs?

1. What is a large language model (LLM)?

Large models are language models with a huge number of parameters (typically billions to hundreds of billions). Built on deep learning (especially the Transformer architecture), they can understand and generate human language.

Typical representatives: OpenAI's GPT series (such as GPT-3, GPT-4), Google's PaLM, Meta's LLaMA, Anthropic's Claude, etc.

2. Transformer Architecture

  • The self-attention mechanism enables the model to capture long-distance dependencies between tokens.
  • The Transformer's parallel computing capability enables it to process large-scale data efficiently.
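A minimal NumPy sketch of scaled dot-product self-attention, the core operation behind both points above:

```python
import numpy as np

def self_attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                              # weighted mix of value vectors

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))    # 6 tokens, 8-dimensional embeddings
out = self_attention(x, x, x)  # self-attention: Q, K, V all derive from x
print(out.shape)               # (6, 8): one contextualized vector per token
```

Every token attends to every other token in one parallel matrix operation, which is what gives the Transformer both its long-range reach and its efficiency.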

3. Large-scale pre-training

  • Large models are pre-trained through large-scale self-supervised learning: repeatedly predicting the next token on vast text corpora.
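A tiny sketch of this objective: at each position, the input is the tokens seen so far, and the training target is the very next token:

```python
tokens = ["Knowledge", " is", " the", " beginning", " of", " action"]

# Next-token prediction: every prefix of the sequence becomes a training example.
for i in range(1, len(tokens)):
    context, target = tokens[:i], tokens[i]
    print(f"input: {context} -> predict: {target!r}")
```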

4. Parameter scale

  • The number of parameters of large models usually ranges from billions to hundreds of billions.

For example, GPT-3 has 175 billion parameters. A larger number of parameters means that the model can store more knowledge.

5. Context Window

  • Large models can process long text sequences. The context window determines the number of tokens that the model can process simultaneously.

For example, the context window of GPT-4 is extended to 32K tokens. Long context windows enable the model to better understand complex tasks and long documents.

6. Fine-tuning and alignment

  • After pre-training, large models can be adapted to specific tasks or human preferences through fine-tuning and alignment.
  • Fine-tuning: supervised learning on a task-specific dataset to optimize model performance.
  • Alignment: making the model's output more consistent with human values through reinforcement learning from human feedback (RLHF).
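As an illustration, here is a supervised fine-tuning example in the chat-style JSONL format popularized by OpenAI's fine-tuning API (one JSON object per line; field names may differ across providers, and the content is made up):

```python
import json

example = {
    "messages": [
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "What is a token?"},
        {"role": "assistant", "content": "The smallest unit of text an NLP model processes."},
    ]
}

# Each line of the training file is one conversation like the above.
with open("train.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(example, ensure_ascii=False) + "\n")
```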

6. What are the steps from prompt to output?

  • 1. User inputs a command (prompt)
  • 2. Instruction preprocessing

Text cleaning: remove garbled characters/sensitive words

  • 3. Vector encoding

Word segmentation: break the sentence into tokens (e.g., "deep learning" → "deep" + "learning"); vectorization: convert each token into an n-dimensional vector; position encoding: mark the order of the words.

  • 4. LLM computation

1. Attention mechanism: find the keywords (similar to highlighting key points when reading). 2. Knowledge retrieval: activate related memory blocks (e.g., loading the physics knowledge tree when asked about "quantum computing"). 3. Logical reasoning: perform if-then judgments (e.g., if a "compare" instruction is detected, start the comparison module).

  • 5. Content Generation Layer

Text decoding: Converting mathematical vectors back into text

  • 6. Result optimization layer

Formatting: Automatically add Markdown

  • 7. Delivery Response Layer

Interaction design: Add action buttons (such as "Refine answer"/"Expand case")
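Putting the steps together, here is a highly simplified, purely hypothetical Python sketch of the pipeline; every function is a stand-in for the real stage, not an actual LLM component:

```python
import re

def preprocess(prompt: str) -> str:
    """Step 2: text cleaning (collapse whitespace; real systems also filter junk)."""
    return re.sub(r"\s+", " ", prompt).strip()

def encode(prompt: str) -> list[str]:
    """Step 3: tokenize (real systems also vectorize and add position encodings)."""
    return prompt.split(" ")

def llm_compute(tokens: list[str]) -> list[str]:
    """Step 4: stand-in for attention, knowledge retrieval, and reasoning."""
    return tokens + ["[generated", "continuation]"]

def decode_and_format(tokens: list[str]) -> str:
    """Steps 5-6: decode back to text and apply Markdown formatting."""
    return "**Answer:** " + " ".join(tokens)

prompt = "  Explain   quantum   computing  "
print(decode_and_format(llm_compute(encode(preprocess(prompt)))))
```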

7. Prompt engineering techniques

An excellent prompt engineering reference: https://www.promptingguide.ai/zh/techniques/cot
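As one example from that guide's chain-of-thought (CoT) section, a few-shot CoT prompt includes a worked reasoning step so the model reasons step by step before answering (the arithmetic example below follows the classic CoT demonstration):

```python
cot_prompt = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls.
5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more,
how many apples do they have?
A:"""

print(cot_prompt)  # the model is expected to continue with step-by-step reasoning
```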

8. Note

1. Prompts themselves are not remembered

Each input is an independent event.

2. Conversational short-term memory

Context is retained automatically during a continuous conversation (e.g., up to roughly 4,000 words).

3. Long-term memory must be custom-built

Implemented through a "memory library + vector search" (requires developing an interface), as sketched below.
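A minimal sketch of this idea: store text snippets as vectors and retrieve the most similar one for the current query. The hashing-based embedding below is a toy stand-in; a real system would use a proper embedding model and a vector database:

```python
import re
import numpy as np

memory = [
    "The user's name is Alice.",
    "Alice prefers Python over Java.",
    "The project deadline is Friday.",
]

def embed(text: str, dim: int = 512) -> np.ndarray:
    """Toy bag-of-words hashing embedding, normalized to unit length."""
    vec = np.zeros(dim)
    for word in re.findall(r"[a-z]+", text.lower()):
        vec[hash(word) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

vectors = np.stack([embed(m) for m in memory])  # the "memory library"

def recall(query: str) -> str:
    """Vector search: return the stored memory most similar to the query."""
    scores = vectors @ embed(query)  # cosine similarity (vectors are unit-norm)
    return memory[int(np.argmax(scores))]

print(recall("Does Alice prefer Python or Java?"))  # -> "Alice prefers Python over Java."
```

The retrieved memory is then prepended to the prompt so the model can use it as context.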

3. Summary of Prompt Practice

Related prompts and large-model project development material will be released later; stay tuned.