AI also needs to "take notes": Karpathy sees the future from Claude's 16,000-word prompt

Written by Iris Vance
Updated on: June 22, 2025
Recommendation

Karpathy digs into Claude's system prompt and points to where AI development is heading.

Core content:
1. Claude's system prompt is far longer than OpenAI's, at 16,739 words
2. The prompt covers tool definitions, user preferences, style rules, and more
3. Numerous ad-hoc patches in the prompt show that AI needs continuous iteration and updating

Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)

A few days ago, Claude's new system prompt was leaked. At 16,739 words, it is remarkably long.

In comparison, OpenAI’s o4-mini system prompt in ChatGPT is 2,218 words, only about 13% of Claude’s.


What is a system prompt?

An LLM's system prompt is the "one-page instruction manual" handed to the AI at the start of a conversation: it tells the model what role to play, what rules to follow, and how to answer the user.
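
To make that concrete, here is a minimal sketch of where a system prompt sits in an API call, assuming an OpenAI-style chat completions interface; the model name and the prompt wording are placeholders, not anything taken from Claude's actual prompt.

```python
# Minimal sketch: the system prompt is just the first message in the request,
# assuming an OpenAI-style chat API. Model name and wording are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a careful assistant. "
    "Answer concisely, cite sources when you search, "
    "and ask for clarification when a request is ambiguous."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},  # the "instruction manual"
        {"role": "user", "content": "Summarize this article in three bullets."},
    ],
)
print(response.choices[0].message.content)
```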

Let's take a look at what the main contents of such a long prompt are:

  • Tool definitions take up the largest share, spelling out the 14 MCP tools Claude can call. The shortest is only a dozen lines, while the longest, such as the description of Google Drive search, runs to more than 1,700 words.
  • Next comes the user preferences and style section, which lays out in detail how Claude should behave and respond to user requests, and what to do and what not to do: for example, how to handle design and calculation questions, issues involving the knowledge cutoff, and how to write when a user asks for poetry.
  • Finally there are citation instructions, artifact instructions, search instructions, and Google integration notes. These are also about tool usage, but they are not tied to MCP, so they are called out separately. (A rough sketch of how such sections might be stitched together follows this list.)
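
This is not Anthropic's actual setup, but a hypothetical sketch of how a composite prompt like this might be assembled: each section lives in its own file and the files are concatenated in a fixed order, which also makes the maintenance and version-control problem mentioned below more tractable. Section names and file layout are invented for illustration.

```python
# Hypothetical sketch: build one long system prompt from named section files.
# Section names and directory layout are invented, not Anthropic's.
from pathlib import Path

SECTION_FILES = [            # one file per logical section, easy to diff and review
    "tool_definitions.md",   # e.g. the 14 MCP tool descriptions
    "user_preferences.md",   # behaviour, tone, what to do / not do
    "citation_rules.md",
    "artifact_rules.md",
    "search_and_google.md",
]

def build_system_prompt(prompt_dir: str = "prompt_sections") -> str:
    """Concatenate the section files, in order, into one system prompt string."""
    parts = []
    for name in SECTION_FILES:
        path = Path(prompt_dir) / name
        parts.append(f"<!-- section: {name} -->\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)

if __name__ == "__main__":
    prompt = build_system_prompt()
    print(f"{len(prompt.split())} words across {len(SECTION_FILES)} sections")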

Moreover, the entire prompt is full of traces of ad-hoc patches. These additions are usually not structured as XML or Markdown lists; they are just a paragraph dropped in to handle some recent incident or bug fix, like this one:

If you are using any gmail tools and the user has instructed you to find messages for a particular person, do NOT assume that person's email. Since some employees and colleagues share first names, DO NOT assume the person who the user is referring to shares the same email as someone who shares that colleague's first name that you may have seen incidentally (eg through a previous email or calendar search). Instead, you can search the user's email with the first name and then ask the user to confirm if any of the returned emails are the correct emails for their colleagues.

Claude's entire system prompt is so long that maintenance, updates, and even version control presumably need a dedicated process, or problems would easily slip out after release. I don't know what that process actually looks like.


Beyond learning from how Claude writes its prompt, I'm sharing this mainly because of a point Karpathy made today.

Inspired by Claude’s system prompt, he said that the way large language models (LLMs) currently learn is still missing an important paradigm, which he calls “system prompt learning.”

At present, the two mainstream ways LLMs learn - pretraining and fine-tuning (including supervised learning, SL, and reinforcement learning, RL) - both rely on updating model parameters, and that is not entirely how humans learn.


Current LLM Learning Paradigm

  • Pretraining: mainly lets the model acquire broad knowledge. Through large-scale corpus training, the model learns language, common sense, and world knowledge.
  • Fine-tuning (SL/RL): lets the model form "habitual behavior," such as following instructions better or adopting a particular conversation style. This is also achieved by adjusting model parameters.

However, when humans learn new knowledge or solve new problems, they usually do not "rewrite their brain parameters" directly; instead, they keep experience and strategies in explicit form by "taking notes" or reminding themselves.

For example, after running into a certain type of problem, you summarize the lesson: "next time I hit a similar situation, this is what to do." That is more like continually editing your own "system prompt" than retraining your brain each time.

So "system prompt learning" is a mechanism that sits between model parameters and external memory. Karpathy believes LLMs should have a similar "note-taking" capability: storing problem-solving strategies, experience, and general knowledge as explicit text rather than relying entirely on parameter updates.
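
As a toy illustration of that idea (not Karpathy's or Anthropic's implementation), lessons could be kept as plain text in a small store and injected into the system prompt on every call. The file name and schema below are assumptions made for the sketch.

```python
# Toy sketch of the "explicit notes" layer: strategies live as plain text the
# model reads on every call, instead of being baked into the weights.
# File name and schema are assumptions for illustration.
import json
from pathlib import Path

NOTES_PATH = Path("lessons.json")

def load_lessons() -> list[str]:
    """Read previously saved lessons (empty list if none exist yet)."""
    if NOTES_PATH.exists():
        return json.loads(NOTES_PATH.read_text(encoding="utf-8"))
    return []

def add_lesson(lesson: str) -> None:
    """Append one new lesson and persist the list."""
    lessons = load_lessons()
    lessons.append(lesson)
    NOTES_PATH.write_text(
        json.dumps(lessons, ensure_ascii=False, indent=2), encoding="utf-8"
    )

def render_system_prompt(base_prompt: str) -> str:
    """Inject the accumulated lessons into the system prompt as an explicit section."""
    lessons = load_lessons()
    if not lessons:
        return base_prompt
    notes = "\n".join(f"- {l}" for l in lessons)
    return f"{base_prompt}\n\nLessons learned from earlier tasks:\n{notes}"
```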

For example, all of Claude’s system prompt above is written by humans, which is inefficient and hard to scale. Karpathy argues that, ideally, the model should generate and optimize these prompts itself through “system prompt learning,” just as humans summarize their own experience.


Advantages of system prompt learning

  • More efficient data use: through explicit “review” or “summarizing,” the model can absorb feedback more efficiently; this signal is higher-dimensional and richer than a single reward scalar.
  • Stronger generalization: explicit strategies and summarized experience help the model transfer what it knows to new tasks.

Karpathy compares today's LLMs to the protagonist of the movie "Memento", who has no "memo pad" or "scratchpad" of his own and can only rely on parameters to remember everything. In fact, Master Zang's web-page prompts already play this role, and a considerable part of them are co-created with AI.

He also said that if “system prompt learning” can be realized, it will become a powerful new learning paradigm in the LLM field.

However, there are still many problems that need to be solved, such as:

  • How to automatically edit and optimize system prompts?
  • Is it necessary to design a learning mechanism for the "editing system" itself?
  • How to gradually transform explicit knowledge into the "habitual" parameters of the model?


System prompts are like a personal assistant's instruction manual: the more detailed they are, the more accurate the results.

When communicating with AI, it’s better to be clear than vague. Learn to be specific about requirements and boundaries, just like Claude’s prompts.

Use lists, formatting, and examples. AI is more likely to understand structured instructions.

"Prompt Engineering" is not a high-level technology, but an extension of daily communication skills. Ordinary people can also master it.

If you are interested, you can read Claude's system prompt in detail and learn from it.

Moreover, Karpathy's "system prompt learning" can already be partially implemented in a semi-automatic way today; practitioners can use the sketch below as a reference.
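
Here is a rough sketch of such a semi-automatic loop, assuming an OpenAI-style chat API and the note-taking helpers sketched earlier (saved, hypothetically, as prompt_notes.py): after each task the model proposes one reusable lesson, and a human decides whether to keep it.

```python
# Rough sketch of a semi-automatic "system prompt learning" loop: answer a task,
# ask the model to distill one lesson, let a human approve it, and append it to
# the notes injected into future system prompts. Assumes an OpenAI-style chat
# API and the helpers from the earlier sketch in a hypothetical prompt_notes.py;
# the model name is a placeholder.
from openai import OpenAI

from prompt_notes import add_lesson, render_system_prompt  # hypothetical module

client = OpenAI()
BASE_PROMPT = "You are a careful assistant."

def chat(system_prompt: str, user_msg: str) -> str:
    """One round-trip to the model with the given system prompt."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_msg},
        ],
    )
    return resp.choices[0].message.content

def solve_and_reflect(task: str) -> str:
    """Answer the task, then ask the model for one lesson to reuse next time."""
    answer = chat(render_system_prompt(BASE_PROMPT), task)
    lesson = chat(
        "You maintain a list of short, general lessons for future tasks.",
        f"Task: {task}\nAnswer: {answer}\n"
        "Write ONE reusable lesson for similar tasks, as a single sentence.",
    )
    # Human review is what keeps this "semi-automatic" rather than fully automatic.
    if input(f"Keep this lesson? [y/N]\n{lesson}\n> ").strip().lower() == "y":
        add_lesson(lesson)
    return answer
```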