90% of AI conversations are stupid, and the core reason is a memory problem

This article explores the core obstacles to conversational AI and unpacks the memory module behind it.
Core content:
1. Why AI conversations seem "stupid": the problem of blending models with domain knowledge
2. Two requirements for an AI clone: consistency of thinking and consistency of style
3. Why the memory module matters: parametric memory and contextual unstructured memory
Whether in conventional AI applications or in the Agent frameworks everyone is now talking about,
one problem has always been hard to solve: how to blend the model with domain knowledge (personal knowledge).
Most companies use models crudely, feeding them prompts directly, as in the opinion-generation part of our earlier article "Why AI multi-round conversations are so stupid".
Output produced this way relies, in essence, on the model's own knowledge, so it cannot be called a qualified avatar. For example, my AI avatar once produced this line:
Your hyena philosophy is quite good! But Huawei's plan succeeded because top management broke the veterans' monopoly. Forcing directors to decentralize power while letting the board form cliques is no different from asking a cripple to run a marathon.
Logically, I would never say such a thing. The core reason: I am not familiar with Huawei, and all my cases come from my own daily work. That is why the mismatch is so obvious to anyone who reads it.
The point is that every time the AI speaks, it must meet expectations: it must carry my knowledge and my habits. That imposes two requirements:
1. When generating opinions, it thinks the way I think;
2. When expressing opinions, it writes the way I write.
All of this actually requires only one thing: the model needs a memory function...
LLM Memory
Memory has been a research focus throughout the Agent era, and it remains a stumbling block that current AI applications struggle to get past.
In fact, many companies that lack traffic are happy about this: in the AI era, applications hold few technical secrets, and data assets may be their last moat.
On the other hand, every model release these days can upend a batch of startups; a recent GPT release, for example, brought many text-to-image teams to a standstill.
But memory is a little different: hallucination is a problem the model itself can hardly solve, so building RAG on top of a knowledge base is never a wrong bet.
To understand memory more deeply, we can look at it from two angles:
1. Storage: how should a large model's data (memory) be stored?
2. Application: how can that data (memory) enhance the model's contextual understanding?
Beyond basic storage and recall, this also involves updating, forgetting, and the completeness of memory.
A recent paper offers a basic classification of memory that I find solid enough to use directly:
https://arxiv.org/pdf/2505.00675
1. Parametric Memory
So-called parametric memory is the model's built-in memory, what we usually call the model's own knowledge base. It is formed through pre-training and fine-tuning, including RL.
This built-in knowledge is an immediate, long-term, persistent memory that enables fast, context-free retrieval of facts and common sense.
In other words: a fine-tuned model generalizes better than a prompt.
The problems are equally clear: the knowledge lags in time, and, more importantly, domain knowledge of all kinds is missing. A model running on parametric memory alone is like an employee still on probation.
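Since we will later lean on a "light fine-tuning layer" to implant style, here is a minimal sketch of what such fine-tuning data might look like, assuming the common OpenAI-style chat JSONL format. The file name and the question/answer pair are illustrative placeholders, not real course content.

```python
# Hypothetical sketch: writing style fine-tuning data as chat JSONL.
# The example content below is a placeholder, not actual course material.
import json

examples = [
    {
        "messages": [
            {"role": "system", "content": "Answer in the author's voice."},
            {"role": "user", "content": "How should an OKR cycle be reviewed?"},
            {"role": "assistant",
             "content": "Remember, management is a craft. First, you must..."},
        ]
    },
]

with open("style_sft.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")
```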
2. Contextual Unstructured Memory
Contextual unstructured memory can be understood as multimodal information: text, images, audio, and video.
It gives the model the ability to read, see, and hear, and it exists precisely to solve the agent perception problem.
3. Contextual Structured Memory
Contextual structured memory is the knowledge structure we encounter most often:
knowledge graphs, relational tables, or ontologies, all of which remain easy to query. These structures support symbolic reasoning and precise querying, and they often complement the associative capabilities of pre-trained language models.
PS: the taxonomy is directly usable, but honestly, the nutritional value of AI papers is pretty low these days...
About the processing of knowledge
| Operation | Representative system |
| --- | --- |
| Consolidation | MemoryBank |
| Indexing | HippoRAG |
| Updating | NLI‑transfer |
| Forgetting | |
| Retrieval | LoCoMo |
| Compression | xR |
I won't dissect the paper piece by piece; I'll interpret it in my own terms. A so-called memory operation converts volatile short-term context into persistent long-term memory. Its core difficulties are:
Which content to keep? In what format to save it? And how to make the LLM truly able to "remember" it later?
For example, I wrote a 40-lesson management course and now want to build an AI clone from it. How should I consolidate that knowledge? How do I make the LLM both find and "remember" my content with the least amount of work?
1. What content to store?
The pipeline has three layers, in ascending order of value: external RAG layer → structured layer → light fine-tuning layer. The first step is content selection, which can be done like this:
Each time a lesson is uploaded, run an "extraction + summary" script that writes content for the three layers above into a layered store, then do a small amount of manual proofreading to keep the key concepts accurate.
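A minimal sketch of that per-lesson script, assuming a hypothetical call_llm() helper that wraps whatever model API you actually use, and assuming lessons live as text files under lessons/:

```python
# Hypothetical sketch: one "extraction + summary" pass per lesson,
# routing the output into the three layers named above.
import json
from pathlib import Path

PROMPT = """From the lesson below, produce JSON with three keys:
"chunks":  passages worth retrieving verbatim (external RAG layer)
"records": concept records with title/summary/keywords (structured layer)
"qa":      question/answer pairs in the author's voice (fine-tuning layer)

Lesson:
{text}"""

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # plug in your model API call here

def process_lesson(path: Path, store: Path) -> None:
    raw = path.read_text(encoding="utf-8")
    result = json.loads(call_llm(PROMPT.format(text=raw)))
    for layer in ("chunks", "records", "qa"):
        out = store / layer / f"{path.stem}.json"
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(json.dumps(result.get(layer, []), ensure_ascii=False, indent=2),
                       encoding="utf-8")
    # Key concepts still get a quick manual proofread after this step.

for lesson in sorted(Path("lessons").glob("L*.txt")):
    process_lesson(lesson, Path("memory_store"))
```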
2. What format to save in?
This part is actually quite simple: just use a structured knowledge base together with RAG. A processed record looks like this:
{
  "id": "L17-okr-loop",
  "type": "concept",
  "title": "OKR Cycle",
  "summary": "Set goals → Key results → Align → Check → Review",
  "keywords": ["goal management", "OKR", "cycle"],
  "lesson": 17,
  "timestamp": "2025-05-10T12:00:00Z",
  "importance": 0.9
}
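For a sense of how such records become searchable, here is a minimal indexing sketch. embed() stands in for whatever embedding model you use (hypothetical), and the index is just an in-memory list for illustration:

```python
# Hypothetical sketch: indexing processed records for vector retrieval.
import math

def embed(text: str) -> list[float]:
    raise NotImplementedError  # e.g. an embedding API or a local model

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

index: list[dict] = []

def add_record(record: dict) -> None:
    # Embed title + summary + keywords so both wording and topic can match.
    text = " ".join([record["title"], record["summary"], *record["keywords"]])
    index.append({"vector": embed(text), "record": record})

def search(query: str, top_k: int = 5) -> list[dict]:
    q = embed(query)
    ranked = sorted(index, key=lambda e: cosine(q, e["vector"]), reverse=True)
    return [e["record"] for e in ranked[:top_k]]
```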
In practice, this is also where the structured knowledge base brings in the knowledge graph (more on that below).
3. Making the LLM remember
The goal here is high recall, and there are many strategies. For example, first filter the relevant material across the 40 lessons using the explicit terms in the question (narrowing the vector search range and cutting latency by 40-60%).
In other words, use the model to optimize the question first: extract keywords, then search.
Second, keep only the knowledge items (≤ 500 of them) with a high-frequency hit rate above 30% whose answers can be explained in one step.
That is, use a few strategies to discard most of the unnecessary returns.
This is all standard RAG practice and simple enough to explain; a minimal sketch follows...
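Continuing the indexing sketch above (reusing index, embed, and cosine), the keyword rewrite plus metadata prefilter might look like this; extract_keywords() is a trivial stand-in for what would normally be a small LLM call:

```python
# Hypothetical sketch: rewrite the question into keywords, prefilter by
# record metadata, then vector-search only the surviving candidates.
def extract_keywords(question: str) -> set[str]:
    # In practice: ask the model to list the key terms in the question.
    return {w.strip("?,.").lower() for w in question.split() if len(w) > 2}

def filtered_search(question: str, top_k: int = 5) -> list[dict]:
    keys = extract_keywords(question)
    # Metadata prefilter: keep only records sharing at least one keyword.
    candidates = [e for e in index
                  if keys & {k.lower() for k in e["record"]["keywords"]}]
    if not candidates:
        candidates = index  # fall back to a full search if the filter is too strict
    q = embed(question)
    ranked = sorted(candidates, key=lambda e: cosine(q, e["vector"]), reverse=True)
    return [e["record"] for e in ranked[:top_k]]
```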
Finally, let’s talk about the issue of knowledge graphs.
Knowledge Graph
On how knowledge graphs can enhance large models, there are earlier articles covering it: Knowledge Graphs
Today we continue the earlier case study: building my own AI avatar. The most critical challenge is making the model truly "inherit" my knowledge system and way of thinking.
Here is how to transform 40 management lessons into a knowledge graph that works deeply with a large model, creating an AI avatar that truly "understands you".
1. Knowledge Extraction
Converting 40 management lessons into a knowledge graph is not a simple text conversion; it requires a three-level knowledge representation of a concept layer, a relationship layer, and a case layer:
Concept layer: extract the core management theories, methodologies, and tool frameworks from the course
- Node examples: OKR cycle, 5W2H analysis, Drucker's five tasks
- Attributes: definition, proposer, applicable scenarios, pros and cons
Relationship layer: establish multi-dimensional associations between concepts
- "OKR cycle" → "derived from" → "MBO theory"
- "5W2H analysis" → "can be used for" → "problem diagnosis scenarios"
Case layer: connect abstract theory with concrete practice
- "Huawei department-wall case" → "verifies" → "barriers to cross-department collaboration"
- "Netflix culture change" → "embodies" → "situational leadership theory"
In fact, knowledge organization directly determines the quality of the model's later answers, so it is worth spending real effort here!
2. Graph Construction
There are many frameworks for graph construction; we will only sketch it briefly here. Taking the "goal management" module as an example, its knowledge graph fragment might include:
{
  "nodes": [
    {
      "id": "MBO",
      "type": "concept",
      "label": "Management by Objectives (MBO)",
      "properties": {
        "definition": "a goal-oriented management approach proposed by Peter Drucker in 1954",
        "core_principles": ["goal setting", "self-control", "results orientation"],
        "lesson_reference": ["L03", "L17"]
      }
    },
    {
      "id": "OKR",
      "type": "concept",
      "label": "OKR goal management method",
      "properties": {
        "derived_from": ["MBO", "SMART principle"],
        "implementation_steps": ["Goal setting", "Key result definition", "Regular review"],
        "case_studies": ["Google 2018 OKR Implementation", "ByteDance Bimonthly OKR"]
      }
    }
  ],
  "edges": [
    {
      "source": "OKR",
      "target": "MBO",
      "type": "derived_from",
      "weight": 0.9
    },
    {
      "source": "OKR",
      "target": "SMART",
      "type": "enhanced_by",
      "weight": 0.7
    }
  ]
}
This structured representation makes knowledge traceable (each conclusion has a course source) and composable (different concepts can be freely associated).
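To make that concrete, here is a minimal sketch of loading such a fragment into a graph library. networkx is used purely for illustration (a production setup might use Neo4j or similar), and goal_management.json is a hypothetical file holding the fragment above:

```python
# Hypothetical sketch: loading the knowledge graph fragment into networkx.
import json
from pathlib import Path

import networkx as nx

graph_json = json.loads(Path("goal_management.json").read_text(encoding="utf-8"))

G = nx.DiGraph()
for node in graph_json["nodes"]:
    G.add_node(node["id"], label=node["label"], **node["properties"])
for edge in graph_json["edges"]:
    G.add_edge(edge["source"], edge["target"],
               type=edge["type"], weight=edge["weight"])

# Traceable: conclusions can point back to their lesson sources.
print(G.nodes["MBO"].get("lesson_reference"))  # ['L03', 'L17']
# Composable: typed relations between concepts can be walked freely.
print(list(G.out_edges("OKR", data="type")))   # e.g. [('OKR', 'MBO', 'derived_from'), ...]
```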
3. Search
Next comes the key retrieval-augmentation phase. When the AI avatar answers a user question, it runs a three-stage process of graph retrieval → vector screening → context construction:
1. Graph pattern matching: convert the natural-language question into a graph query, for example:
Question: "How do OKR and KPI work together?" → match the "OKR" and "KPI" nodes and the paths between them
2. Subgraph extraction and vector screening: in effect, pull out all the knowledge whose concepts are closely related:
(OKR Basic Principles)
├── includes: [Challenging goal setting] (weight 0.9)
├── conflicts with: [Feasibility assessment] (needs balancing)
└── applies to: [Google's 2014 OKR reform]

(Goal Setting Theory)
├── source: [Drucker's MBO theory]
└── tool: [SMART principle]
3. Context construction: convert the retrieval results into a natural-language prompt.
This part is relatively simple, for example: "According to lessons L17 and L23: 1) OKR focuses on goal orientation, KPI focuses on metric measurement... 2) Huawei's practice shows that..."
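Here is a sketch of that three-stage flow, continuing with the networkx graph G from earlier. The node matching is deliberately naive; real systems map the question to graph entities with an LLM, and the vector-screening step is noted but omitted for brevity:

```python
# Hypothetical sketch: graph retrieval -> subgraph extraction -> context text.
def match_nodes(question: str) -> list[str]:
    # Stage 1 (naive): find node ids mentioned verbatim in the question.
    q = question.lower()
    return [n for n in G.nodes if n.lower() in q]

def build_context(question: str) -> str:
    hits = match_nodes(question)
    # Stage 2: extract each hit plus its direct neighbours.
    # (A vector-screening pass would prune this subgraph in practice.)
    keep = {m for n in hits for m in [n, *G.successors(n), *G.predecessors(n)]}
    sub = G.subgraph(keep)
    # Stage 3: flatten the subgraph into prompt-ready lines.
    lines = []
    for u, v, data in sub.edges(data=True):
        source = G.nodes[u].get("lesson_reference", [])
        lines.append(f"{u} --{data['type']}--> {v} (source lessons: {source})")
    return "\n".join(lines)

print(build_context("How do OKR and MBO work together?"))
```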
4. Strengthening the chain of thought
When generating a response, the AI avatar simulates a professional consultant's thought process:
- Concept positioning: this question involves comparing OKR and KPI within goal management
- Knowledge extraction: recall three relevant cases and two theoretical frameworks from the course
- Viewpoint integration: my typical angle is to distinguish the applicable scenarios first, then discuss how to combine them
- Style adaptation: use the three-part problem → root cause → solution structure
You are a senior management consultant. Please answer according to the following structure:
1. The essence of the problem: use one sentence to point out the core contradiction
2. Theoretical basis: cite 2-3 key concepts from the course
3. Practical case: briefly describe a relevant business case
4. Personal opinion: Use "I think" to express a clear position
Current question: {User question}
Related knowledge: {retrieved subgraph information}
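At answer time, the template just gets filled with the question and the retrieved subgraph. A quick sketch, reusing build_context() from the retrieval snippet and assuming the template text above is saved to a hypothetical consultant_prompt.txt:

```python
# Hypothetical sketch: assembling the final prompt from the template above.
from pathlib import Path

TEMPLATE = Path("consultant_prompt.txt").read_text(encoding="utf-8")

def make_prompt(question: str) -> str:
    return (TEMPLATE
            .replace("{User question}", question)
            .replace("{retrieved subgraph information}", build_context(question)))

print(make_prompt("How do OKR and MBO work together?"))
```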
5. Case
Finally, one more example. A follower asked today: how can a parachuted-in executive quickly establish prestige with the team?
The AI clone's processing flow should be:
First, knowledge retrieval
- Matches the "new official's first three moves" case (L12)
- Relates the "sources of power theory" (L08) and the "situational leadership model" (L15)
- Recalls the "three-month survival rule" I mentioned in internal training
Second, opinion generation
1. The essence of the problem: This is about the balance between leadership legitimacy and change management
2. Theoretical basis:
- According to course L08, sources of power include position power and personal power
- Situational leadership theory emphasizes that different leadership styles are needed at different stages
3. Typical cases:
- The Zhang Yong joins Alibaba case from lesson L12: he did only three things in his first 30 days...
4. My opinion:
I believe parachuted-in leaders should avoid the "prove themselves" trap and should instead:
- Spend 70% of the time listening and diagnosing
- Solve 1-2 obvious pain points first to build trust
- Accumulate change capital through quick wins
Third, style adjustment
- Add the catchphrase: "Remember, management is a craft"
- Use the usual parallel structure: "First, you must... Second, you must... Third, you must..."
- Keep the signature 70% theory + 30% case narrative ratio
With this deep integration of knowledge graphs and large models, the AI clone stops being a simple repeater and becomes a digital twin with a consistent worldview and professional judgment...
Conclusion
Everything above revolves around model memory, and by now the idea should be clear: so-called model memory is the application of a knowledge base, where external RAG provides short-term perception, the knowledge graph builds the long-term index, and lightweight fine-tuning implants the personality.