AI Memory Augmentation Beyond RAG: Accelerating Contextual Understanding for Conversational Agents by 2025

Written by
Caleb Hayes
Updated on: June 30, 2025
Recommendation: Explore how AI can move beyond RAG, using memory-augmentation techniques to give conversational agents more accurate contextual understanding.

Core content:
1. The wide adoption of RAG in AI and its limitations
2. The fundamental differences between human episodic memory and RAG
3. Moving beyond RAG to build conversational AI agents with human-like memory capabilities


Introduction

Every week, we see new papers and new approaches to Retrieval-Augmented Generation (RAG). RAG architectures are everywhere: GraphRAG, HybridRAG, HippoRAG, and countless other variants. The AI community has embraced RAG as a potential solution to many of the limitations of large language models (LLMs). However, as we build more complex AI systems, especially conversational agents that interact with users in complex ways, we find that RAG alone is not enough.

This article explores why, despite its usefulness, RAG is fundamentally different from a true memory system, and why we need to go beyond it to develop AI with more human-like memory capabilities. As we'll see, memory is more than just retrieving information: it is also about understanding context, making associations, and, perhaps most importantly, knowing what to forget.

Current status of RAG

Retrieval-augmented generation has become an indispensable component of modern AI systems. The core RAG workflow is as follows (a minimal code sketch follows the list):

  1. Receive a query or prompt
  2. Search the knowledge base to retrieve relevant documents or passages
  3. Incorporate the retrieved information into the language model's context window
  4. Generate a response based on the query and the retrieved information
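
To make these steps concrete, here is a minimal sketch of the pipeline in Python. The `embed` and `call_llm` functions are hypothetical stand-ins for a real embedding model and a real language-model API; only the retrieve-then-generate flow is the point.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical stand-in for a sentence-embedding model (deterministic noise)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a language-model API call."""
    return f"[answer conditioned on {len(prompt)} characters of context]"

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rag_answer(query: str, knowledge_base: list[str], top_k: int = 3) -> str:
    q_vec = embed(query)  # step 1: receive the query
    # Step 2: retrieve the most relevant passages by vector similarity.
    ranked = sorted(knowledge_base, key=lambda doc: cosine(q_vec, embed(doc)),
                    reverse=True)
    # Step 3: fold the retrieved passages into the model's context window.
    prompt = "Context:\n" + "\n".join(ranked[:top_k]) + f"\n\nQuestion: {query}\nAnswer:"
    return call_llm(prompt)  # step 4: generate a grounded response
```

A query like the Paris Climate Agreement example below would simply rank every passage in the knowledge base against the question and hand the top few to the model.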

Consider a simple example: a user asks: "What are the main outcomes of the Paris Climate Agreement?" A standard LLM might provide a general answer based on its training data, which may be outdated or incomplete. However, the RAG-enhanced system first searches its knowledge base for specific documents about the Paris Climate Agreement, extracts relevant passages about its outcomes, and then uses this retrieved information to generate a more accurate and up-to-date response.

RAG has proven very effective in many applications, especially when building AI-driven search or feeding large document collections to LLMs. It helps ground the AI's responses in factual information and reduces hallucinations: instances where the model generates plausible-sounding but incorrect information.

For example, when Meta launched its RAG-based Llama 2 system, they reported a 20% to 30% reduction in hallucination rates compared to the base model. Similarly, companies that implemented the RAG architecture for customer service applications have seen significant improvements in the accuracy of responses to specific product or policy questions.

As one RAG developer put it: “If you’re building AI-driven search, maybe RAG is a good approach, but if you’re building an agent or a conversational agent or something more complex that interacts with a user, RAG is not enough.”

This limitation becomes apparent as we move beyond simple question and answer into more complex scenarios that require continuous interaction, personalization, and adaptation to changing context.

Why RAG is not real memory

Despite its usefulness, RAG differs from human memory in several fundamental ways:

1. Lack of episodic context

In human memory, information does not exist in isolation. Facts are always intertwined with related events, life experiences, and other contextual elements. This context gives facts richer meaning and helps us understand them more fully.

Think about the difference in understanding between historical facts you learn from a textbook and those you experience through a powerful documentary or museum visit. Emotional impact, visual context, and narrative structure all shape how you store and recall this information. This rich interweaving of contextual elements is what makes human memory so powerful and nuanced.

For example, if you learn about the fall of the Berlin Wall through a documentary that includes interviews with families who were reunited after decades of separation, your memory of this historical event will be intertwined with the emotional stories you heard. When you later recall facts about the Berlin Wall, these emotional elements may also be activated, providing a richer, more contextual understanding.

As explained in the presentation:

“Information doesn’t exist in isolation—it’s not encyclopedic knowledge. It’s always intertwined with related events, life experiences, and other things. You need episodic memory and context to have a richer interpretation of the facts.”

RAG systems typically retrieve information based solely on semantic similarity or relevance ranking, stripping it of this critical episodic context. A RAG system might retrieve a text describing when the Berlin Wall fell, but it cannot capture the emotional or experiential context that shapes how humans understand and relate to this information.

Current RAG implementations treat documents as isolated units of information, failing to capture how they relate to a specific experience, conversation, or interaction. This means that while RAG can provide factual information, it lacks the emotional resonance and personal relevance that make human memory so effective for learning and decision-making.

2. Limited associative construction

Our minds organize memories through complex networks of associations. When your brain retrieves information about the color red, it may automatically retrieve related concepts such as orange or other semantically related items. These associative mechanisms are fundamental to our thinking and reasoning.

Consider what happens when you think of the word "beach." Your mind likely activates a whole network of related concepts: sand, ocean, sunscreen, vacation, concrete memories of beaches you've visited, the feel of the sun on your skin, the sound of waves. These associations are not just semantic—they span sensory modalities, emotions, personal experiences, and abstract concepts.

"Your semantic memory has a group of memories that are retrieved together because they are semantically close. If your brain retrieves something about the color red, it will also retrieve something about orange and other things related to it - that's just the way our brains are organized."

This associative structure allows for creative connections and insights that go far beyond simple information retrieval. You might make unexpected connections between the rhythm of ocean waves and a piece of music you’re composing, or between beach erosion patterns and a business problem you’re trying to solve.

Current RAG systems have difficulty replicating this rich associative structure. Even graphical approaches that attempt to capture relationships between documents or concepts typically rely on predefined connection types or simple co-occurrence statistics. They lack the multi-dimensional, cross-modal associations that are characteristic of human memory.

For example, a RAG system might associate documents containing the words “beach” and “ocean” based on their semantic similarity, but it would not automatically make connections to related sensory experiences, emotional states, or abstract concepts unless explicitly programmed.

Although graph-based RAG methods attempt to simulate this associative structure, they still fall short of the rich, multi-dimensional associations found in human memory. Building effective associativity into AI systems remains a major challenge.

3. Searching without understanding

Perhaps the most significant limitation of RAG is that it retrieves without understanding. As mentioned in the presentation: "You can retrieve some documents, you can build BM25 ranking or other ways of ranking information and relevance, but you still don't understand what you retrieved."

Consider a RAG system that answers questions about climate change. It might retrieve and combine information from multiple scientific papers, policy documents, and news articles. The system can find documents containing relevant keywords and even rank them by relevance using sophisticated algorithms like BM25 or neural embeddings. However, it does not truly understand the scientific concepts, causal relationships, or policy implications discussed in these documents.
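
As a concrete illustration of retrieval without understanding, here is a minimal from-scratch sketch of Okapi BM25 scoring (whitespace tokenization and the default parameters k1 = 1.5, b = 0.75 are simplifying assumptions). Every quantity involved is a term statistic; nothing in the computation models meaning.

```python
import math
from collections import Counter

def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]        # crude tokenizer
    avgdl = sum(len(d) for d in tokenized) / len(tokenized)  # average doc length
    n = len(tokenized)
    q_terms = query.lower().split()
    df = {t: sum(1 for d in tokenized if t in d) for t in q_terms}  # document frequency
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for t in q_terms:
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1)
            denom = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

The ranking comes entirely from term frequencies and document lengths: two documents can receive identical scores while making opposite claims.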

This lack of understanding is particularly evident when dealing with nuanced topics. For example, if asked about the relationship between climate change and extreme weather events, a RAG system might retrieve paragraphs stating statistical correlations without grasping the underlying causal mechanisms or the scientific debate about attribution. It might juxtapose contradictory information from different sources without being aware of the contradiction.

Let’s look at a more concrete example: If a RAG system is asked “How does carbon pricing affect industrial competitiveness?” it will likely retrieve documents that mention carbon pricing and industrial competitiveness. However, without true understanding, it cannot independently assess the methodological quality of different studies, identify possible biases in industry-funded studies, or understand the underlying economic mechanisms. It is merely matching patterns, without understanding.

RAG systems can find and rank information based on a variety of metrics, but they do not understand what they retrieve in the way humans do. Natural language understanding remains a huge challenge, and without it, RAG merely shuffles symbols without grasping their meaning.

4. No forgetting mechanism

Counterintuitively, one of the most important aspects of human memory is our ability to forget. "Our brains aren't designed to remember things — our brains are more like forgetting things. We are forgetting machines."

This forgetting mechanism is critical for mental health and cognitive function. When it fails, we may experience PTSD and other mental health challenges. Current RAG systems often do not incorporate principled forgetting mechanisms to prioritize important information while discarding irrelevant information.

Consider what happens when you move to a new city. Over time, you gradually forget the detailed layout of your old neighborhood—the location of each store, the names of minor streets—while retaining important information and emotional memories. This selective forgetting is crucial; without it, navigating your new environment will be filled with irrelevant information from the past.

Similarly, when you change jobs, you gradually forget the specific details of the daily processes of your old workplace while retaining the valuable skills and knowledge you acquired there. This forgetting is adaptive, allowing you to focus on your new role without being overwhelmed by outdated procedures.

In contrast, current RAG systems typically retain all information indefinitely. They may assign lower relevance scores to some documents over time, but they have no mechanism to truly “forget” outdated or irrelevant information. This can lead to several problems:

  1. Information overload: As the knowledge base grows, retrieval becomes increasingly challenging and computationally expensive.
  2. Outdated information: If not actively forgotten, stale information persists in the system. For example, a RAG system may retrieve old product specifications or deprecated API documentation mixed in with current information.
  3. Contextual confusion: For dialogue systems, the inability to forget can lead to confusion as the system tries to stay consistent with everything it has ever been told, even when the context has changed.

“Imagine having a friend who never forgets anything — having a relationship with them would be quite complicated.” Similarly, conversational agents that lack forgetting mechanisms may become unwieldy over time because they retain every piece of information, regardless of its current relevance.

For example, imagine a customer service AI that remembers every interaction a customer has had with a company, going back many years. Without a proper forgetting mechanism, it might continually refer to resolved issues from the distant past or apply outdated policies. This would create a frustrating experience for customers and reduce the effectiveness of the AI system.

Why RAG is not enough to support advanced AI systems

In addition to its conceptual limitations as a memory system, RAG faces several practical challenges that limit its effectiveness for building truly advanced AI:

1. Context Window Limitation

Even with retrieval, LLMs are limited by their context windows. Complex reasoning requires synthesizing information from many sources, which may exceed those limits.

Consider a legal assistant AI that helps prepare cases involving hundreds of prior cases, statutes, and legal opinions. Even if the RAG system can identify the relevant documents, the language model's context window may only accommodate a small portion of them at a time. This forces the information to be chunked, potentially destroying important connections and limiting the system's ability to reason across the full range of relevant material.

Similarly, imagine a medical diagnostic system that needs to consider a patient’s entire medical history, relevant medical literature, similar case studies, and current symptoms. The fragmentation imposed by context window limitations may prevent the system from making connections between distant but related pieces of information.

These limitations become increasingly problematic as we move from simple fact retrieval to complex reasoning tasks. Although context windows have grown substantially—from 2,048 tokens in early GPT models to 32,000 or more in more recent systems—they still impose artificial limitations that human memory does not face. Humans can seamlessly integrate decades of experiential information when needed, without arbitrary limits on the amount of context we can consider simultaneously.

Some researchers have proposed sliding window methods or recursive summarization techniques to address these limitations, but these workarounds introduce their own problems, including potential loss of detail and increased computational overhead.
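
The sketch below shows, under the simplifying assumption of a whitespace token counter, how a fixed context budget forces this chunking: passages are packed greedily by rank, and whatever does not fit is silently dropped along with any connections it carried.

```python
def pack_context(passages: list[str], budget_tokens: int) -> list[str]:
    """Greedily pack ranked passages into a fixed token budget.

    Counting tokens by whitespace is a crude stand-in for a real tokenizer.
    Anything past the budget is dropped -- the fragmentation described above.
    """
    packed, used = [], 0
    for passage in passages:  # assumed sorted by relevance, best first
        cost = len(passage.split())
        if used + cost > budget_tokens:
            continue  # this passage never reaches the model
        packed.append(passage)
        used += cost
    return packed
```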

2. Retrieval quality bottleneck

The effectiveness of RAG depends on its retrieval component. Poor retrieval—whether due to insufficient indexing, imprecise queries, or insufficient knowledge base content—will result in poor quality results.

This is particularly evident when dealing with nuanced queries. For example, if a user asks: “What are the ethical implications of using predictive algorithms for criminal sentencing?”, the quality of the response depends entirely on whether the system is able to retrieve documents specifically addressing the ethical dimension, rather than just general information about technical aspects of predictive algorithms or criminal sentencing.

Retrieval challenges are exacerbated when dealing with:

  • Implied information needs: When a user's query does not explicitly mention all relevant aspects of their information need. For example, "Is this investment a good idea?" implies a need for information about risk, return, market conditions, and the user's financial goals, none of which are explicitly mentioned.
  • Evolving topics: Emerging topics whose terminology has not yet standardized, or whose relevant information is scattered across documents that use different terms.
  • Conceptual queries: Questions that require conceptual understanding rather than fact retrieval, such as "How does confirmation bias affect scientific research?"

Current RAG systems often struggle with these challenges. Vector similarity search, while powerful, can miss relevant information that uses different terminology or approaches a topic from a different angle. Hybrid retrieval systems that combine semantic search and keyword matching help address some of these issues, but still fall short of human understanding of information needs.
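
A common way to build such a hybrid is a weighted blend of the two score lists after min-max normalization. The sketch below is one simple formulation, not a specific system's method; the weight alpha is a tunable assumption.

```python
import numpy as np

def hybrid_scores(semantic: list[float], keyword: list[float], alpha: float = 0.5) -> list[float]:
    """Blend semantic-similarity scores with keyword scores (e.g. BM25).

    alpha = 1.0 gives pure vector search; alpha = 0.0 gives pure keyword matching.
    """
    def normalize(xs: list[float]) -> np.ndarray:
        arr = np.asarray(xs, dtype=float)
        span = arr.max() - arr.min()
        return (arr - arr.min()) / span if span > 0 else np.zeros_like(arr)

    return list(alpha * normalize(semantic) + (1 - alpha) * normalize(keyword))
```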

3. Static Knowledge Representation

Most RAG implementations use fixed vector representations that do not evolve based on new understanding or connections between information.

Consider how human understanding of concepts develops over time. When you first learn a complex topic like quantum physics, you probably form a basic mental model. As you learn more, that model becomes more nuanced and interconnected with other knowledge. As you encounter terms like “superposition” or “entanglement” in different contexts and applications, your understanding evolves.

In contrast, most RAG systems represent documents as static vectors that do not change once created. If new information emerges that changes the meaning or importance of existing documents, the system does not automatically update its representation to reflect these changes.

For example, a RAG system containing medical information might hold documents about a particular treatment. If new research shows that the treatment is ineffective or harmful, the system would not automatically update its representations of the existing documents to mark that information as outdated or contested.

4. Limited self-reflection

As mentioned in the presentation: “RAG has no self-reflection and self-improvement capabilities.” Without the ability to evaluate the quality and relevance of the information retrieved, the RAG system cannot iteratively improve its own performance.

Human memory and cognition involve constant self-monitoring and evaluation. When we attempt to recall information, we sense whether our recollection is accurate or complete. We can identify gaps in our knowledge or recognize when we need to seek additional information. This metacognitive awareness is essential for effective learning and problem solving.

For example, when doctors are diagnosing a patient with unusual symptoms, they may realize that the pattern does not fit neatly into any of the conditions they are familiar with. This realization prompts them to consult additional resources, seek expert advice, or consider rare conditions that they would not normally include in their differential diagnosis.

Current RAG systems lack this self-awareness. When they retrieve information that is only indirectly related to a query, they typically have no mechanism to recognize this mismatch or adjust their retrieval strategy accordingly. They cannot identify gaps in the knowledge base or situations where the information retrieved is insufficient to answer the question.

Some advanced RAG implementations incorporate relevance feedback or uncertainty estimation, but these approaches are still far from human metacognitive capabilities. Without this self-reflection, RAG systems cannot effectively learn from mistakes or adjust their strategies based on past performance.

What is real memory?

To move beyond RAG and toward more human-like AI memory, we need systems with the following characteristics:

1. Multimodal memory structure

Human memory operates across different modalities, languages, and information types. Advanced AI memory systems require "rich structured representations and understandings of things," potentially including multilingual structures where different languages can represent the same concept.

Our memories are not limited to a single format or modality. We remember faces, voices, smells, emotions, facts, procedures, and narratives—all through interconnected yet unique memory systems. Each of these modalities contributes to a rich, multidimensional representation of our experiences and knowledge.

Think about how you remember your childhood birthday parties. You might recall:

  • Visual elements: decorations, cakes, people's faces
  • Sounds: laughter, the birthday song, snippets of dialogue
  • Emotions: excitement, happiness, and perhaps some moments of disappointment
  • Procedural memory: how to play the games you played
  • Semantic information: who was present, how old you were, what gifts you received

These different types of information are stored and accessed through different but interconnected memory systems. When one element is activated, it often triggers related memories across modalities.

Current AI systems, including RAG, typically operate within a single modality — usually text. Even when they process multiple modalities, such as images and text, they typically convert everything into a single representation format (such as an embedding vector), which loses the unique characteristics of different types of information.

For example, a memory of a sunset at the beach contains a visual component (the color of the sky, the texture of the sand), an auditory component (the sound of the waves), an emotional component (a feeling of calm or awe), and possibly a semantic component (knowledge about why the sunset has a particular color). A truly multimodal memory system would preserve these unique aspects while maintaining the interconnections between them.

Advanced AI memory systems should support:

  • Unique but interrelated representations of different types of information
  • Cross-modal associations, allowing activation to propagate between modalities
  • Modality-specific processing that respects the unique characteristics of each type of information
  • Unified access, enabling retrieval of relevant information regardless of its original modality

Promising research in this direction includes multimodal transformers, cross-modal retrieval systems, and neuro-symbolic architectures that combine neural representations with symbolic reasoning.
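
As a toy illustration of what "unique but interrelated representations" might look like in code, the sketch below stores one experience as separate modality fields plus explicit links, so a cue in any modality can surface the whole trace. The field names and the string-based matching are illustrative assumptions, not a proposal for real perceptual encoding.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryTrace:
    """One experience, stored as distinct but linked modality-specific parts."""
    visual: str | None = None     # e.g. "orange sky over wet sand"
    auditory: str | None = None   # e.g. "slow rhythm of breaking waves"
    emotional: str | None = None  # e.g. "calm, slight awe"
    semantic: str | None = None   # e.g. "low sun reddens via light scattering"
    links: dict[str, float] = field(default_factory=dict)  # trace ids -> strength

store: dict[str, MemoryTrace] = {
    "beach_sunset": MemoryTrace(
        visual="orange sky over wet sand",
        auditory="slow rhythm of breaking waves",
        emotional="calm, slight awe",
        semantic="low sun reddens via light scattering",
    )
}

def recall_by_cue(cue: str) -> list[str]:
    """Unified access: a cue in any one modality retrieves the whole trace."""
    return [trace_id for trace_id, t in store.items()
            if any(part and cue in part
                   for part in (t.visual, t.auditory, t.emotional, t.semantic))]

print(recall_by_cue("waves"))  # ['beach_sunset']
```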

2. Active Reconstruction

"Recollection is active construction—we retrieve memories and construct them on the fly." For example, the hippocampus is responsible for the reconstruction of timelines and timing. This active construction process allows us to simulate the future and think about possibilities.

Human memory is not like a video recording that plays back exactly what is stored. Instead, it is a process of reconstruction. When we memorize, we do not simply retrieve a complete memory; we reconstruct it from fragments, filling in the gaps based on schemas, expectations, and related memories.

Consider what happens when you recall a conversation you had last week. You don’t remember every word verbatim. Instead, you reconstruct the gist of what was said, perhaps accurately remembering a few key phrases. You fill in the gaps based on your understanding of the person you were speaking to, the context of the conversation, and your knowledge of the topic being discussed.

This reconstructive nature of memory has several important implications:

  1. Memory is creative: We don't just retrieve; we recreate memories every time we recall them. This allows us to tailor our memories to current needs and situations.
  2. Memory is plastic: Our memories can change over time as we reconstruct them in slightly different ways, incorporating new information or perspectives.
  3. Memory supports imagination: The same mechanisms that help us reconstruct past events also enable us to imagine future situations or counterfactuals.

Current RAG systems lack this reconstructive property. They retrieve existing text paragraphs, but do not actively reconstruct the information to fit the current context or fill in the gaps between retrieved fragments. They can combine or summarize retrieved paragraphs, but this is far from true reconstructive memory.

For example, if asked about a specific aspect of climate change that is not directly covered in any single document in its knowledge base, a RAG system might retrieve several relevant documents but have difficulty reconstructing the specific information needed. In contrast, a human expert might reconstruct an answer by combining fragments of knowledge from different sources, filling in the gaps based on an overall understanding of the field.

While this can sometimes lead to false memories, this active reconstruction is essential for a truly advanced memory system. Without it, we only have simple retrieval, not true memory.

An advanced AI memory system should include:

  • Schema-based reconstruction, using general knowledge structures to fill in gaps in specific memories
  • Context-sensitive recall, adapting reconstructed memories to current needs and situations
  • Creative simulation abilities that go beyond retrieval to support imagination and counterfactual reasoning
  • Confidence estimates for reconstructed elements, distinguishing directly retrieved information from inferred or reconstructed components

Recent work on generative retrieval models and neural schema networks shows promise in this regard, but much work is still needed to achieve truly human-like reconstructive memory.
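
One way to make the last bullet above concrete is to tag every recalled element with its provenance. In this hypothetical sketch, slots filled from retrieved fragments carry high confidence, while gaps filled from a general schema are marked as reconstructed with lower confidence; the slot names and confidence values are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class RecalledElement:
    content: str
    source: str        # "retrieved" or "reconstructed"
    confidence: float  # lower for schema-filled gaps

def reconstruct(fragments: dict[str, str], schema: dict[str, str]) -> list[RecalledElement]:
    """Assemble a recollection from retrieved fragments, filling gaps from a schema."""
    recalled = []
    for slot, default in schema.items():
        if slot in fragments:
            recalled.append(RecalledElement(fragments[slot], "retrieved", 0.9))
        else:
            recalled.append(RecalledElement(default, "reconstructed", 0.4))
    return recalled

# A conversation recalled a week later: key phrases survive, the rest is schema.
fragments = {"key_phrase": 'she said "the deadline moved to March"'}
schema = {"key_phrase": "(no verbatim memory)", "setting": "a routine weekly meeting"}
for element in reconstruct(fragments, schema):
    print(element)
```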

3. Associative Memory Network

Memory should be associative, creating rich semantic connections between related concepts. These associations make reasoning more flexible and more similar to the way humans think.

Human memory is associative in nature. Concepts, experiences, and facts are interconnected through complex associative networks, enabling flexible retrieval and creative connections of information. Unlike the rigid hierarchical organization of traditional computer storage systems, associative memory enables us to access information through multiple paths and make unexpected connections, leading to insights and innovations.

Consider how you retrieve information about "Paris". Depending on the context, you might access this information in the following ways:

  • Geographical associations (the capital of France, a major European city)
  • Cultural associations (the Eiffel Tower, French cuisine, art museums)
  • Personal associations (a holiday you took there, a friend who lives there)
  • Historical associations (the French Revolution, World War II events)
  • Literary or artistic associations (Hemingway's A Moveable Feast, Impressionist paintings)

These rich, multidimensional associations make the retrieval of information more flexible and contextually appropriate. If you are planning a trip, your brain may activate associations related to travel; if you are discussing art history, it may activate associations related to art.

Current RAG systems typically rely on similarity-based retrieval, which lacks this rich associative structure. Documents may be retrieved based on their overall similarity to the query, but the system does not maintain explicit associations between different pieces of information, which could enable more flexible and creative retrieval paths.

For example, if a user asks for “innovative urban mobility solutions,” a traditional RAG system might retrieve documents containing similar terms. An associative memory system can activate a network of related concepts—bike-sharing programs, congestion pricing, self-driving cars, urban planning principles—even if these are not explicitly mentioned in the query.

Advanced AI memory systems should include:

  • Explicit associations between concepts, facts, and experiences
  • Multiple types of associations (semantic, temporal, causal, etc.) that capture the different ways information can be related
  • Associative strengths reflecting how tightly different pieces of information are connected
  • Contextual activation that depends on the current focus and goals
  • A spreading-activation mechanism that lets retrieval proceed along associative chains

Research on graph neural networks, associative memory models, and knowledge graphs provides promising directions for building such systems, but creating truly human-like associative memory remains a major challenge.
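
To ground the spreading-activation idea mentioned above, here is a minimal sketch over a weighted association graph. The concepts, weights, decay factor, and hop count are all illustrative assumptions.

```python
def spread_activation(graph: dict[str, dict[str, float]],
                      seeds: dict[str, float],
                      decay: float = 0.5, hops: int = 2) -> dict[str, float]:
    """Propagate activation from seed concepts along weighted associations."""
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(hops):
        next_frontier: dict[str, float] = {}
        for node, act in frontier.items():
            for neighbor, strength in graph.get(node, {}).items():
                gain = act * strength * decay  # activation weakens with distance
                if gain > activation.get(neighbor, 0.0):
                    activation[neighbor] = gain
                    next_frontier[neighbor] = gain
        frontier = next_frontier
    return activation

associations = {
    "urban mobility": {"bike sharing": 0.8, "congestion pricing": 0.7},
    "bike sharing": {"urban planning": 0.6},
}
print(spread_activation(associations, {"urban mobility": 1.0}))
# "urban planning" receives activation even though the query never named it.
```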

4. Adaptive Forgetting

“Without forgetting, memory simply cannot function.” Forgetting is closely related to attention—deciding which information deserves attention and which can be discarded.

Contrary to popular belief, forgetting is not simply a failure of memory—it is an adaptive trait that helps us function effectively in a complex and changing world. By selectively retaining important information while discarding irrelevant information, our memory system optimizes utility rather than perfect recall.

Consider what would happen if you memorized every detail of your daily commute for the past five years—every car you passed, every pedestrian you saw, every slight change in your route. This information would overwhelm your memory system and interfere with your ability to recall the truly important information. Instead, your brain cleverly retains the important features that remain stable (general routes, notable landmarks) while discarding irrelevant details.

Forgetting has several important functions:

  1. Reducing interference: By discarding outdated or irrelevant information, forgetting helps prevent interference with new learning and with recall of important information.
  2. Promoting generalization: Forgetting specific details helps extract general patterns and principles from experience.
  3. Adapting to a changing environment: As circumstances change, forgetting allows us to update our memories and behaviors accordingly.
  4. Emotion regulation: The ability to forget painful experiences (or at least reduce their emotional impact) is critical to mental health.

Current RAG systems often lack principled forgetting mechanisms. They may implement simple time-based decay or relevance thresholds, but these fall far short of the complex, context-sensitive forgetting processes in human memory.

For example, a conversational AI using RAG might retain every detail of past conversations indefinitely, leading to increasingly irrelevant retrieval as the knowledge base grows. Without adaptive forgetting, the system might continue to retrieve information about the user’s past interests or needs, even if that information has changed significantly.

Advanced AI memory systems should include:

  • Importance-based retention, prioritizing information by its usefulness, emotional significance, and relevance to current goals
  • Context-sensitive forgetting, adjusting forgetting rates as the environment or the user's needs change
  • Interference-based forgetting, accounting for possible interference between different memories
  • A strategic consolidation process that strengthens important memories while allowing less important ones to fade

Recent research on neural networks with controlled forgetting, adaptive memory networks, and reinforcement learning methods for memory optimization shows promise in this direction.
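
As a minimal stand-in for the mechanisms listed above, the sketch below combines importance-weighted exponential decay with rehearsal (recall refreshes a trace) and threshold-based eviction. The half-life, threshold, and scoring rule are illustrative assumptions, not a specific published method.

```python
import time

class ForgettingStore:
    """Memories decay exponentially unless importance or recent use keeps them alive."""

    def __init__(self, half_life_s: float = 86_400.0, threshold: float = 0.1):
        self.half_life_s = half_life_s  # time for retention to halve
        self.threshold = threshold      # eviction cutoff
        self.items: dict[str, dict] = {}

    def add(self, key: str, content: str, importance: float) -> None:
        self.items[key] = {"content": content, "importance": importance,
                           "last_used": time.time()}

    def retention(self, key: str, now: float | None = None) -> float:
        item = self.items[key]
        age = (now if now is not None else time.time()) - item["last_used"]
        return item["importance"] * 0.5 ** (age / self.half_life_s)

    def recall(self, key: str) -> str | None:
        if key not in self.items:
            return None
        self.items[key]["last_used"] = time.time()  # rehearsal refreshes the trace
        return self.items[key]["content"]

    def consolidate(self) -> None:
        """Evict anything whose retention has decayed below the threshold."""
        now = time.time()
        self.items = {k: v for k, v in self.items.items()
                      if self.retention(k, now) >= self.threshold}
```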

5. Hierarchical organization

Memory systems should be hierarchical to represent our semantic hierarchy, allowing for different levels of abstraction and conceptualization.

Human memory operates at multiple levels of abstraction simultaneously. We can zoom in to recall specific details, or zoom out to access general principles and categories. This hierarchical organization allows for efficient storage, flexible retrieval, and powerful conceptualization capabilities.

Consider how you organize your knowledge about animals:

  • At the highest level, the general concept of "animal"
  • Below that, categories like "mammals," "birds," and "reptiles"
  • Within each category, subcategories (e.g., "mammals" includes "primates," "carnivores," and "rodents")
  • At a lower level, specific species (e.g., "tiger," "elephant")
  • At the most detailed level, specific instances (e.g., "the tiger I saw at the zoo last year")

This hierarchy enables you to:

  • Make inferences about new instances based on category membership
  • Access information at the level of detail appropriate for a given task
  • Generalize knowledge from specific instances to broader categories
  • Navigate efficiently between different levels of abstraction

Current RAG systems often lack this rich hierarchical organization. While they may use clustering or classification schemes to organize documents, these approaches often fail to capture the nested multi-level hierarchical structures that are unique to human conceptual knowledge.

For example, if a user asks "transportation options in European cities," a traditional RAG system might retrieve documents containing these terms. A hierarchically organized memory system can recognize the relationships between specific modes of transportation (bus, tram, subway), specific cities (Paris, Berlin, Barcelona), and the general concepts they embody, allowing for a more nuanced and comprehensive response.

An advanced AI memory system should include:

  • Explicit representation of hierarchical relationships between concepts at different levels of abstraction
  • Flexible navigation between levels of the hierarchy based on current needs
  • An inheritance mechanism that allows attributes and relationships to propagate through the hierarchy
  • Hierarchical reasoning abilities for reasoning across levels of abstraction

Research on hierarchical neural networks, concept lattices, and ontology-based knowledge representation provides promising directions for building such systems.
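
The inheritance mechanism above can be sketched in a few lines: attribute lookups that miss at one level fall back to the parent concept, so properties propagate down the hierarchy. The taxonomy reuses the article's animal example; everything else is an assumption.

```python
class Concept:
    """A node in a concept hierarchy with attribute inheritance from its parent."""

    def __init__(self, name: str, parent: "Concept | None" = None, **attrs):
        self.name, self.parent, self.attrs = name, parent, attrs

    def get(self, attribute: str):
        node = self
        while node is not None:
            if attribute in node.attrs:
                return node.attrs[attribute]
            node = node.parent  # fall back to the more abstract level
        return None

animal = Concept("animal", breathes=True)
mammal = Concept("mammal", parent=animal, has_fur=True)
tiger = Concept("tiger", parent=mammal, striped=True)

# "striped" is stored on tiger; "has_fur" and "breathes" are inherited.
print(tiger.get("striped"), tiger.get("has_fur"), tiger.get("breathes"))
```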

Promising directions for AI memory systems

Several approaches show potential for developing more sophisticated AI memory systems:

  1. Hierarchical memory architectures that operate at different timescales and levels of abstraction. Recent research has explored multi-level memory architectures that combine rapidly changing working-memory components with more stable long-term memory systems. These approaches typically use different encoding mechanisms and update rules to process different types of information at different timescales. For example, systems like Hierarchical Transformer Memory (HTM) use nested attention mechanisms to capture information at different levels of abstraction and different timescales. Some systems use rapidly changing memory buffers to process recent context while using slower-changing components to handle stable knowledge, mimicking the complementary learning systems theory of human memory.
  2. Event-based memory systems that use temporal graphs to capture relationships between events. Event-based approaches organize memory around discrete events and their temporal and causal relationships, capturing narrative structure and episodic context that simple document retrieval cannot. For example, research on event- and entity-centric knowledge graphs builds explicit representations of events, participants, and their temporal and causal relationships. These structured representations can support more complex reasoning about event sequences, their causes and consequences, and their relationship to broader historical or personal narratives (a small sketch follows this list).
  3. Associative memory networks that establish rich semantic connections between related concepts. Associative memory networks explicitly represent connections between different pieces of information, allowing flexible multi-path retrieval and creative connections between seemingly unrelated concepts. Recent approaches include associative-memory-based knowledge graphs, which combine semantic network structures with neural embeddings to capture explicit and implicit relationships between concepts. Some systems implement spreading-activation mechanisms, allowing retrieval to proceed along associative chains, mimicking the associative nature of human recall.
  4. Neuro-symbolic approaches that combine neural networks with symbolic reasoning to better represent and manipulate information. Neuro-symbolic systems pair neural network methods (which excel at pattern recognition and generalization) with symbolic reasoning (which excels at explicit representation and logical inference). These hybrid approaches can capture both statistical patterns in data and the structured, compositional nature of human knowledge. For example, systems like the Neuro-Logic Machine and the Neuro-Symbolic Concept Learner use neural networks to learn representations of concepts and relations, combined with symbolic reasoning mechanisms that manipulate these representations according to logical rules. This approach can support both flexible pattern recognition and precise logical reasoning.
  5. Forgetting mechanisms that strategically decide what information to keep and what to discard. Adaptive forgetting methods implement principled mechanisms for deciding what to retain, based on factors such as relevance, importance, and potential interference. Recent work includes adaptive forgetting neural networks, which learn forgetting rates that depend on the importance and usefulness of different pieces of information. Some methods use reinforcement learning to optimize forgetting strategies, maximizing the usefulness of retained information while minimizing memory and computational costs.
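
For item 2, here is a tiny sketch of an event-centric store: each event records its time, participants, and causal links to earlier events, so the system can reconstruct why something happened rather than just retrieve text. The customer-support episode is a hypothetical example.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    description: str
    timestamp: float
    participants: list[str]
    causes: list[str] = field(default_factory=list)  # ids of earlier events

episodes: dict[str, Event] = {
    "e1": Event("user reported a login failure", 1.0, ["user", "support agent"]),
    "e2": Event("password reset was issued", 2.0, ["support agent"], causes=["e1"]),
    "e3": Event("user confirmed access restored", 3.0, ["user"], causes=["e2"]),
}

def causal_chain(event_id: str) -> list[str]:
    """Walk causal links backwards to reconstruct how an outcome came about."""
    chain, frontier = [], [event_id]
    while frontier:
        event = episodes[frontier.pop()]
        chain.append(event.description)
        frontier.extend(event.causes)
    return list(reversed(chain))  # earliest cause first

print(causal_chain("e3"))
```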

Conclusion

RAG represents an important step toward more powerful AI systems that can access external knowledge. However, equating RAG with memory fails to capture the rich, dynamic, and comprehensive nature of the human memory system.

As we work to develop more advanced AI, especially conversational agents and systems that interact with humans in complex ways, we need to move beyond simple retrieval. We need to develop systems that can truly remember, reflect, and learn from experience — systems that understand the importance of context, associations, and even forgetting.

Consider the difference between:

  • A search engine that retrieves documents containing the term "climate change adaptation"
  • A colleague who can discuss climate adaptation strategies based on an understanding of the field, connect relevant concepts, recall relevant examples, and construct new insights by combining information from different sources

The first is retrieval; the second is actual memory. As AI systems become more integrated into our lives and work, we need them to act more like colleagues than search engines.

This does not mean abandoning RAG—it remains a valuable tool for integrating AI systems with factual information. But it does mean recognizing its limitations and working to supplement it with more sophisticated memory mechanisms that capture the active, constructive, associative, and adaptive properties of human memory.

The message is clear: if you want to build something truly advanced, think about memory systems, not just retrieval systems. RAG is not enough — it’s just the beginning of our journey toward AI systems with truly human-like memory capabilities.

This article combines insights from technical presentations on AI memory systems with additional research on the limitations of RAG and the need for more sophisticated AI memory. The field is still evolving rapidly, and new memory approaches may emerge that address the limitations discussed.