Don’t just think “Please help me write…” anymore. See how Google deconstructs Prompt Engineering (full of useful information)

Written by
Silas Grey
Updated on: July 2, 2025
Recommendation

Dive deep into Google's white paper and explore the core skills and applications of Prompt Engineering.

Core content:
1. Analyze the essence of the prediction engine of the Large Language Model (LLM)
2. Improve Prompt writing skills and guide the model to produce expected results
3. Master the key parameters that control LLM output: output length, temperature, Top-K and Top-P




The internet is full of prompt "best practices" guides, many of them superficial. You have probably seen plenty: they read like recipes, telling you to add a few spoonfuls of this and a few spoonfuls of that. Today, let's get a little more hardcore and look at how the search giant Google systematically understands and practices prompt engineering. Google's white paper peels away the shell of prompt engineering and gets to its core. Get ready: this is something worth saving and pondering.

Why is Prompt Engineering so important?

First, you need to understand what an LLM (Large Language Model) is. Don't be fooled by the fancy terminology: it is essentially a prediction engine. You give it a piece of text, and it predicts the next most likely word (or token) based on the massive amount of data it has "seen". It then repeats this process over and over, appending the predicted word to the input and predicting the next one.
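As a toy illustration of this predict-append-predict-again loop (the "model" below is just a canned stand-in, purely to show the mechanism, not a real LLM):

```python
# A toy picture of autoregressive generation: the "model" is just a
# canned next-token predictor standing in for a real LLM.
def predict_next_token(tokens: list[str]) -> str:
    canned = {"The": "cat", "cat": "sat", "sat": "on", "on": "the", "the": "mat"}
    return canned.get(tokens[-1], "<eos>")

tokens = ["The"]
while tokens[-1] != "<eos>" and len(tokens) < 10:   # a crude output-length limit
    tokens.append(predict_next_token(tokens))       # append the prediction, then predict again
print(" ".join(tokens[:-1]))                        # -> The cat sat on the mat
```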

The prompt you write sets the initial state for this prediction engine and steers it toward the sequence of predictions you expect. It is like assigning a task to an extremely smart apprentice who lacks specific goals: the clearer and cleverer your instructions, the better the result. Vague instructions, on the other hand, produce only mediocre or even wrong results.

Prompt engineering, therefore, is not magic. It is a craft: how to guide this prediction engine accurately and efficiently toward the output we want. Anyone can write a prompt, just as anyone can write a few lines of code, but writing a good prompt, one that solves the problem reliably, requires understanding the mechanism behind it and mastering some techniques.

Controlling LLM output: the knobs you need to know

If you interact with the model directly through an API or a platform like Vertex AI (rather than a simple chatbot), you will find there are many parameters you can adjust. They are like knobs you can turn to control how the LLM generates its output:

  1. Output length (Max Tokens):
     This determines the maximum amount of content the model can generate. Note that it does not make the model more concise; generation simply stops when the token limit is reached. Set it too short and the output may be cut off mid-sentence; set it too long and you increase cost and response time, and the model may even keep producing meaningless "filler" after the task is done.
  2. Temperature:
     Controls the randomness of the output. With a low temperature (e.g. close to 0), the model tends to pick the most likely word, so the output is more stable and deterministic, which suits tasks that need factual, fixed answers. With a high temperature (e.g. close to 1), the model considers more possibilities, so the output is more diverse and "creative", but can also go off the rails. Temperatures that are too low or too high can both trigger the "repetition loop" bug, where the model gets stuck repeating itself.
  3. Top-K & Top-P:
     Both of them are used to restrict the model to select from the words with the highest probability. Top-K  only considers the words with the top K probability. Top-P  (Nucleus Sampling) selects those words with cumulative probability reaching P. They can both adjust the diversity and randomness of the output. Usually you can use them together, for example, first filter the candidate words by Top-K and Top-P, and then use Temperature to sample from them.

The key is that these parameters affect each other. Pushing one to an extreme can make the others irrelevant (for example, at Temperature = 0, Top-K and Top-P no longer matter). There is no universal setting; you need to experiment and tune for the specific task. Want stable results? Try a low temperature. Want creativity? Raise temperature, K, and P.
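To make these knobs concrete, here is a small self-contained Python sketch of one common way temperature, Top-K and Top-P interact when picking the next token. It illustrates the sampling logic itself, not any particular vendor's API; the toy vocabulary and logits are made up.

```python
import numpy as np

def sample_next_token(logits, temperature=0.7, top_k=40, top_p=0.95, rng=None):
    """Toy next-token sampler: Top-K and Top-P narrow the candidates,
    temperature reshapes the distribution before the final draw."""
    rng = rng or np.random.default_rng()

    # Temperature = 0 degenerates to greedy decoding; Top-K/Top-P no longer matter.
    if temperature == 0:
        return int(np.argmax(logits))

    # Softmax over temperature-scaled logits.
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()

    # Top-K: keep only the K most probable tokens.
    order = np.argsort(probs)[::-1]
    keep = order[:top_k]

    # Top-P (nucleus): within those, keep the smallest set whose cumulative probability >= top_p.
    cum = np.cumsum(probs[keep])
    cutoff = int(np.searchsorted(cum, top_p)) + 1
    keep = keep[:cutoff]

    # Renormalise and sample from the surviving candidates.
    final = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=final))

# Toy vocabulary and logits, purely for illustration.
vocab = ["the", "cat", "sat", "on", "mat", "quantum"]
logits = np.array([2.0, 1.5, 1.2, 0.8, 0.5, -1.0])
print(vocab[sample_next_token(logits, temperature=0.2)])   # almost always "the"
print(vocab[sample_next_token(logits, temperature=1.0)])   # more varied
```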

Core Prompting Techniques: From Entry-Level to Advanced Moves

After understanding the basic principles and parameters, the next step is to learn the specific prompting techniques. This is where the real substance is. Mastering these techniques can significantly improve how efficiently you collaborate with an LLM:

  1. Zero-shot:
     The simplest and most direct way is to describe the task or question directly without giving any examples. For example: "Classify this movie review as positive, neutral, or negative: [movie review text]". This is sometimes enough for simple tasks or powerful models.
  2. Few-shot / One-shot:
     This is a key step for improving performance. Give the LLM one or a few examples that demonstrate the input and output format or pattern you expect. Just like teaching a child, show rather than tell; the model will imitate your examples to complete the task. The examples should be high quality, diverse, and ideally cover edge cases.
  3. Role/System/Contextual Prompting:
  • Role Prompting:
     Have the LLM play a specific role, such as "You are now an experienced Python programmer" or "Explain black holes to a 5-year-old." This effectively sets the tone, style, and intellectual scope of the output.
  • System Prompting:
     Give more explicit instructions or rules, such as "answers must be in JSON format", "language style should be humorous", "answers should be respectful to others".
  • Contextual Prompting:
     Provide background information relevant to the current task. For example, when asking for article suggestions, first tell it: "You are writing for a niche blog about retro arcade games from the 80s."
  • All three are often used in combination to guide the model precisely (the first sketch after this list combines them with a few-shot example).
  4. Chain of Thought (CoT):
     A great tool for getting an LLM through complex reasoning tasks. Instead of asking for the answer directly, ask it to "think step by step": the LLM outputs its reasoning process first and then gives the final answer. This is particularly effective for math and logic problems and can significantly improve accuracy; it works even better when combined with few-shot CoT examples.
  5. Self-consistency:
     An advanced version of CoT. For the same problem, generate several different CoT reasoning paths at a higher temperature, then see which final answer appears most often and choose that one. It is "collective voting": multiple samples improve the stability and accuracy of the result, especially when the reasoning path is not unique (sketched in code after this list).
  6. Step-back Prompting:
     When facing a complex problem, instead of asking it head-on, first have the LLM consider a more general or higher-level question or principle related to it, then feed that "step back" insight in as context for solving the original, specific problem. This can activate the model's deeper knowledge and lead to more insightful answers.
  7. ReAct (Reason & Act):
     Let the LLM not only think but also "act". Acting here usually means calling external tools: searching the web, running a code interpreter, and so on. The LLM generates its reasoning and the next action to perform (say, searching for a keyword), the action is executed, and the observation is fed back to it. It then continues to think and act based on the observations, forming a "think-act-observe" loop until the problem is solved. This is the foundation for building more capable agents (a loop of this kind is sketched after the list).
  8. Code Prompting:
     LLMs are also good at programming. You can use them to:
    • Write code:
       "Write a Python script that reads all the .txt files in a folder and adds a 'DRAFT_' prefix to the beginning of each file name."
    • Explain code:
       "Explain what this Bash script does."
    • Translate code:
       "Translate this Bash script into Python code."
    • Debug and review code:
       "This Python code reports an error [error message]. Please help me find out what is wrong and suggest a fix."
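To make items 2 and 3 concrete, here is a minimal sketch of how a role, a system-style rule, task context, and a couple of few-shot examples can be assembled into a single prompt. Everything here, including the send_to_llm helper, is an assumption for illustration rather than any particular product's API.

```python
# Hypothetical helper -- swap in whichever client or SDK you actually use.
def send_to_llm(prompt: str, temperature: float = 0.2) -> str:
    raise NotImplementedError("plug in your model client here")

# Role + system prompting: who the model is and the rule it must follow.
system = ("You are a senior game reviewer. Reply with exactly one word: "
          "POSITIVE, NEUTRAL or NEGATIVE.")

# Contextual prompting: background relevant to this task.
context = "The reviews come from a blog about retro arcade games from the 80s."

# Few-shot: examples demonstrating the exact input/output pattern we expect.
examples = (
    'Review: "An instant classic, I could not put the joystick down."\n'
    "Sentiment: POSITIVE\n\n"
    'Review: "It runs, I suppose."\n'
    "Sentiment: NEUTRAL"
)

review = "The controls feel janky and the scoring makes no sense."
prompt = f'{system}\n\n{context}\n\n{examples}\n\nReview: "{review}"\nSentiment:'
# print(send_to_llm(prompt, temperature=0.1))  # low temperature: we want a stable label
```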
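Self-consistency translates naturally into code: sample several chain-of-thought answers at a higher temperature, pull out the final answer from each, and keep the majority. The sketch below assumes the same hypothetical send_to_llm helper and a simple "Answer:" convention for the final line.

```python
from collections import Counter

# Hypothetical helper -- reuse whatever client you actually call the model with.
def send_to_llm(prompt: str, temperature: float = 0.2) -> str:
    raise NotImplementedError("plug in your model client here")

def self_consistent_answer(question: str, samples: int = 5) -> str:
    # Chain of Thought: ask for the reasoning first, then a clearly marked final answer.
    cot_prompt = (
        f"{question}\n"
        "Let's think step by step. Finish with a line that starts with 'Answer:'."
    )
    finals = []
    for _ in range(samples):
        # Higher temperature so the reasoning paths actually differ between samples.
        reply = send_to_llm(cot_prompt, temperature=0.9)
        for line in reply.splitlines():
            if line.strip().lower().startswith("answer:"):
                finals.append(line.split(":", 1)[1].strip())
                break
    # Self-consistency: majority vote over the final answers.
    return Counter(finals).most_common(1)[0][0]

# Usage (once send_to_llm is wired up):
# print(self_consistent_answer("A train leaves at 9:40 and arrives at 11:05. How long is the trip?"))
```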
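The ReAct "think-act-observe" cycle can likewise be sketched as a loop that parses the model's proposed action, runs a tool, and feeds the observation back into the prompt. The action format, the web_search tool, and the helper are all simplified assumptions meant to show the control flow, not a production agent framework.

```python
# Hypothetical helpers -- replace with your real model client and tool.
def send_to_llm(prompt: str, temperature: float = 0.2) -> str:
    raise NotImplementedError("plug in your model client here")

def web_search(query: str) -> str:
    raise NotImplementedError("plug in a real search tool here")

def react(question: str, max_steps: int = 5) -> str:
    transcript = (
        "Answer the question. At each step write either\n"
        "Action: search[<query>]  or  Final: <answer>\n"
        f"Question: {question}\n"
    )
    for _ in range(max_steps):
        reply = send_to_llm(transcript)                      # Think: the model reasons and proposes an action
        transcript += reply + "\n"
        if "Final:" in reply:
            return reply.split("Final:", 1)[1].strip()
        if "Action: search[" in reply:
            query = reply.split("Action: search[", 1)[1].split("]", 1)[0]
            observation = web_search(query)                  # Act: call the external tool
            transcript += f"Observation: {observation}\n"    # Observe: feed the result back in
    return "No answer within the step budget."
```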

Best Practices for Becoming a Prompt Master (Condensed Version)

Do you feel overwhelmed by the amount of information after reading about so many techniques? Don't worry. Just remember the following core principles and keep applying and internalizing them in practice:

  • Provide Examples:
     Few-shot prompts usually work much better than zero-shot ones. Clear examples are the best teacher.
  • Design with Simplicity:
     Prompt language should be direct and clear, avoiding ambiguity and unnecessary complexity. If you find it confusing, the LLM probably will too. Use verbs to give clear instructions.
  • Be Specific:
     Don't just say "write a blog post"; say "write a three-paragraph blog post about the top five gaming consoles, in an informative and engaging style." Be explicit about length, format, style, and key points.
  • Instructions over Constraints:
     Try to tell the model what to do rather than what not to do. Positive instructions are usually more effective and flexible than a pile of negative constraints. That said, constraints for safety or a fixed format are sometimes necessary.
  • Control Output Formats:
     For tasks such as extracting information, classification, and sorting, outputting a structured format such as JSON or XML is usually more stable, more reliable, and reduces hallucinations. Take care to handle possible JSON truncation (for example, when the output-length limit is hit). Input can also be normalized using a schema. (A small sketch of prompting for JSON follows this list.)
  • Iterate & Document:
     Prompt engineering is an experimental science. Keep trying, adjusting, and evaluating the results. Most importantly, record every attempt in detail: which model you used, which parameters, the complete prompt, the output, and an assessment of how well it worked. This makes review and debugging easier, and it is also the key to keeping results stable across model versions. A simple table works well for this.
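As a concrete illustration of structured output, here is a hedged sketch of a prompt that asks for JSON and then defensively validates what comes back. The schema, the wording, and the send_to_llm helper are assumptions for the example; real responses can still be truncated or wrapped in extra text, which is why the parsing is guarded.

```python
import json

# Hypothetical helper -- replace with your real model client.
def send_to_llm(prompt: str, temperature: float = 0.2) -> str:
    raise NotImplementedError("plug in your model client here")

prompt = """Extract the order details from the text below.
Return ONLY valid JSON matching this schema:
{"customer": string, "items": [{"name": string, "quantity": number}]}

Text: Alice ordered two keyboards and one monitor."""

raw = send_to_llm(prompt)
try:
    data = json.loads(raw)
    # Minimal structural check before trusting the result.
    assert isinstance(data, dict) and isinstance(data.get("items"), list)
except (json.JSONDecodeError, AssertionError):
    data = None   # e.g. truncated JSON when the output-length limit was hit; retry or repair here
```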

Conclusion

This white paper from Google is less a step-by-step guide than a thinking framework and a toolbox. It tells us that effective prompt engineering is not achieved in a day: it requires understanding how an LLM works, mastering the various guidance techniques, and honing them through plenty of practice and iteration.

Every technique and best practice here has real potential to improve how efficiently you collaborate with AI. Save this article, revisit it from time to time, and try things out in your own scenarios. Mastering this craft will pay off, whether you want to improve your productivity or explore new AI applications and business opportunities.

 

If you'd like the large model to write high-quality short stories, or are interested in related content, you are also welcome to follow the link below. It works well, and many readers have said so.

Click here: Super Writing Prompts and the Strongest Writing Guidance  
