Pre-training, fine-tuning, prompt engineering, and RAG that everyone can understand (I promise)

Written by
Caleb Hayes
Updated on: June 21, 2025
Recommendation

Using the analogy of studying for an exam, you can easily grasp the core concepts of AI language models.

Core content:
1. Pre-training: how AI learns, like studying from textbooks
2. Prompt engineering: how to get more accurate answers from AI
3. Fine-tuning and RAG: how AI picks up new knowledge and improves answer quality

Yang Fangxian
Founder of 53A, Tencent Cloud Most Valuable Expert (TVP)


In this article, we use a familiar case to help you thoroughly understand the "high-level" concepts of large language models:

  • Pre-training
  • Fine-tuning
  • Prompt engineering
  • RAG (Retrieval-Augmented Generation)

That case is the process of studying for and taking exams, something every one of us has been through.

After reading this article, you will have a new understanding of how AI works and will be more comfortable using it in the future.

Note: This article was edited and polished by DeepSeek from my voice recordings; I only made minor adjustments.

Pre-training

At the beginning of each semester, the teacher hands us a stack of textbooks and then explains the knowledge points in class.

Through this process we gradually internalize, understand, and absorb a semester's worth of knowledge. This is exactly the pre-training process of a large language model.

There is a key distinction here: rote memorization vs. true understanding.

  • Rote memorization: You only remember specific questions and their model answers. You can answer those exact questions on the exam, but you are helpless if the question is changed even slightly.
  • True understanding: You can integrate what you have learned and apply it to different question types and scenarios. This is the core strength of modern large language models.

AI answering = taking an exam

When you see the test questions, you will:

  1. Understand what the question is asking
  2. Recall the relevant knowledge
  3. Reason through and integrate that knowledge
  4. Finally, write the answer down word by word

This is exactly how AI answers questions !

When you ask the AI a question, it does the same:

  1. It understands your question (the prompt)
  2. It draws on the knowledge learned during pre-training
  3. It generates the answer word by word through internal reasoning and integration (sketched below)
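
To make "word by word" concrete, here is a toy sketch of the loop a model runs when it answers. The `next_token` function is a hypothetical stand-in for a real model's forward pass; everything around it is just bookkeeping.

```python
# Toy sketch of autoregressive generation: the model writes its answer
# one token at a time. `next_token` is a hypothetical stand-in for a real
# model's forward pass: given everything so far, predict the next token.
def generate(prompt_tokens, next_token, max_new_tokens=100, eos="<eos>"):
    tokens = list(prompt_tokens)      # start from the question (the prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)      # recall + reason, then pick one token
        if tok == eos:                # the model decides it is finished
            break
        tokens.append(tok)            # the answer grows word by word
    return tokens
```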

Prompts vs. exam questions

Sometimes you have learned so much, across so many areas, that when you face even a simple problem you are not sure which knowledge point it relates to.

At that point, the clarity of the question is crucial.

If, when setting the question, the teacher clearly states:

  • Which chapter and section are being tested
  • Which specific knowledge needs to be applied
  • What format the answer should take

then no matter how jumbled your knowledge is, as long as the question is clear enough, you can accurately call up the relevant knowledge to answer it.

The essence of prompt engineering

This is the essence of prompt engineering!

When you ask the AI:

  • The simpler and vaguer the question, the more likely the AI is to give a "random" answer (in reality, it is pulling in loosely related knowledge at random)
  • The more detailed the question, the clearer the direction, and the more explicit the format requirements, the higher the quality of the AI's answer

Tip: The AI does not answer at random; its knowledge is simply too vast. When the question is unclear, it can only pick from the related knowledge more or less at random.
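
Here is what that difference looks like in practice. Both prompts below ask about the same topic; the second pins down audience, scope, and format the way a well-set exam question does (the wording is only an illustration, not a recipe).

```python
# The same question asked two ways. The vague prompt leaves the model to
# guess which "knowledge points" you mean; the detailed prompt names the
# chapter, the scope, and the required answer format.
vague_prompt = "Tell me about transformers."

detailed_prompt = """You are a tutor for first-year computer science students.
Explain the Transformer neural network architecture (not the electrical
device or the movie franchise). Cover: (1) self-attention, (2) positional
encoding, (3) why it replaced recurrent networks for language tasks.
Format: three short paragraphs, no equations."""
```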

Fine-tuning = practicing past exam papers

Many people mistakenly believe that "the AI doesn't know this, so we can fine-tune the knowledge into it." That is a misunderstanding of fine-tuning!

Fine-tuning is more like a teacher walking you through past exam papers before the exam:

  • You don't know what questions will be on the college entrance exam
  • But the teacher explains the correct answers and solution approaches from previous years' papers
  • In this way, you learn how to answer better

The essence of fine-tuning is to teach the AI to answer better, not to teach it new knowledge.

If the AI never learned a knowledge point in the first place, no amount of past-paper practice (fine-tuning) will help!
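
Concretely, fine-tuning data is just a stack of worked "past papers": prompt/response pairs that demonstrate the desired style of answer. A minimal sketch, assuming the common JSONL instruction-tuning layout (field names vary by framework):

```python
import json

# Each example shows *how* to answer (tone, format, length), not new facts.
examples = [
    {"prompt": "Summarize this support ticket in one sentence: ...",
     "response": "Customer cannot log in after the 2.3 update and needs a password reset."},
    {"prompt": "Summarize this support ticket in one sentence: ...",
     "response": "Invoice #1042 was charged twice; the customer requests a refund."},
]

# One worked "past exam question" per line of the training file.
with open("finetune.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```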

Continued pre-training = repeating a school year

The only way to make the AI master knowledge it never had is continued pre-training (the equivalent of a student going back to repeat a year of classes):

  • Prepare a corpus containing the proprietary knowledge (such as internal company documents)
  • The corpus should explain the knowledge and the relationships within it
  • The AI acquires the new knowledge by continuing to train on this material (a minimal sketch follows)
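
Assuming the internal documents live in a folder of plain-text files (an assumption for illustration), preparing such a corpus can be as simple as cleaning and concatenating them. Unlike fine-tuning pairs, this is raw running text on which the model keeps doing next-token prediction:

```python
from pathlib import Path

# Gather raw text for continued pre-training. There are no prompt/response
# pairs here: the model simply keeps learning to predict the next token on
# the new material. Paths and the length threshold are illustrative.
docs = []
for path in Path("internal_docs").glob("**/*.txt"):
    text = " ".join(path.read_text(encoding="utf-8").split())  # collapse whitespace
    if len(text) > 200:  # skip fragments too short to teach anything
        docs.append(text)

Path("corpus.txt").write_text("\n".join(docs), encoding="utf-8")
```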

By the same token, if the college entrance exam tests a concept you have never studied, even an open reference book will not let you grasp it and answer correctly in the time available.

This is the limitation of RAG (Retrieval-Augmented Generation).

RAG = an open-book exam

RAG is like an open-book exam. The key is how you prepare and organize your "cheat sheet":

  • Bad practice: carrying the whole textbook into the exam room
  • Right approach: organize knowledge points and keyword indexes in advance, so you can quickly locate the relevant content during the exam

The same goes for building a RAG system:

  • Chunks must be carefully cleaned and split so that each knowledge point stays complete
  • Keep chunks as short as possible, so they are easy to retrieve and quote

The currently popular "personal knowledge base" products often split documents naively (say, into fixed 2,000-character chunks), which can cut knowledge points into fragments and ultimately lowers the quality of the AI's answers.
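
To make this concrete, here is a minimal, dependency-free sketch of the RAG loop. Real systems use embeddings and a vector store for retrieval; the word-overlap scoring below is a deliberately simple stand-in. Note that the splitter keeps paragraphs whole instead of cutting every N characters.

```python
def split_into_chunks(text):
    # Split on blank lines so each chunk is a complete paragraph,
    # keeping knowledge points intact instead of cutting mid-sentence.
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def retrieve(question, chunks, top_k=3):
    # Score chunks by word overlap with the question (a toy stand-in
    # for embedding similarity) and keep the best few.
    q_words = set(question.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(question, chunks):
    # Paste the retrieved passages into the prompt: the open-book exam.
    context = "\n---\n".join(retrieve(question, chunks))
    return (f"Answer using only the reference material below.\n\n"
            f"Reference material:\n{context}\n\nQuestion: {question}")
```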

A good student doesn't always score well

A large language model is like a "good student with excellent grades", but a good student does not always get high marks.

That is because it lacks the discernment and judgment of a truly top student:

  1. It needs "good questions" (clear prompts); otherwise it picks knowledge points to answer with more or less at random
  2. If the material supplied for the open-book exam is wrong or incomplete, it will not correct the material; it will copy it
  3. For knowledge it never learned, no amount of "past papers" (fine-tuning) will help

For example, I asked the Claude model "What is MCP?" (a protocol officially released by Claude's own maker), and it made up complete nonsense! Because:

  • By the time MCP was released, the model had already "graduated" (its training was complete)
  • It doesn't know the answer, but it will make one up anyway

Summary

Humans and large language models follow surprisingly similar logic in how they learn knowledge and produce answers. Once you understand this, you can:

  • Use AI tools more effectively
  • Hold reasonable expectations about AI's capabilities
  • Choose the right method (pre-training, fine-tuning, or RAG) for different needs

Now, do you have a new understanding of how large language models work?