Dynamic data too much of a hassle? When static RAG can't keep up, try ZEP and let your Agent query a real-time knowledge graph

Written by Caleb Hayes
Updated on: June 13, 2025
Recommendation

A new breakthrough in dynamic data management, the ZEP system allows Agents to call real-time knowledge graphs, making it easier to cope with changes.

Core content:
1. The limitations of traditional RAG systems in dynamic information processing
2. The time-aware knowledge graph and three-layer graph structure design of the ZEP system
3. Dual-timeline modeling and an intelligent edge invalidation mechanism that handle information conflicts and temporal reasoning

Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)

 

Have you ever run into this problem: you have built a seemingly perfect RAG system, yet the Agent keeps answering with outdated information or gets confused when a user's preferences change over time?

Three months ago, a user said they liked aggressive investment strategies; two weeks ago they changed their mind and wanted a conservative allocation; today they want to try emerging markets. A traditional RAG system can only retrieve document fragments and has no way to follow this evolution.

This is not a problem with your system, but rather a natural limitation of static RAG.

Traditional RAG is not suitable for dynamic scenes

Three Achilles' heels for static document retrieval

The traditional RAG system is essentially a "document library" that assumes that knowledge is fixed and unchanging, which makes it incapable of handling dynamic business scenarios.

First, when new information conflicts with old information, the system cannot judge which is more credible and often returns both contradictory statements to the user at once.

Second, with no understanding of the time dimension, the system cannot distinguish "the user's preferences last year" from "the user's current needs", so its recommendations drift away from the actual situation.

The pain points in enterprise scenarios are more obvious

In enterprise applications, this limitation is greatly magnified.

Suppose you are building a customer service agent. Over the past year, customer A has grown from a startup into a medium-sized enterprise, and its needs have shifted from cost control to efficiency improvement. A traditional RAG system will still recommend cost optimization solutions, because that is what the historical documents say.

This disconnect not only affects user experience, but may also cause business losses.

ZEP: Time-aware Knowledge Graph

Core innovation: three-layer graph structure design

The core of the ZEP system is the Graphiti engine, which uses an ingenious three-layer knowledge graph architecture to solve the pain points of traditional RAG:

First layer: Episode subgraph

  • Function: Lossless storage of raw conversations, text, or JSON data
  • Feature: Retains complete contextual information, much like human episodic memory

Second layer: Semantic Entity subgraph

  • Function: Extracts entities and relationships from the raw data
  • Feature: Merges new and old information through entity resolution

Third layer: Community subgraph

  • Function: Clusters related entities using a label propagation algorithm
  • Feature: Forms a high-level conceptual understanding

This design allows the system to retain details while still being able to perform abstract reasoning.
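To make the three layers concrete, here is a minimal Python sketch. The class and function names (`Episode`, `EntityEdge`, `label_propagation`) are illustrative assumptions rather than Graphiti's actual schema, and the tie-breaking rule is chosen only to keep the example deterministic.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Episode:
    """Layer 1: a raw message, text, or JSON payload stored verbatim."""
    episode_id: str
    content: str


@dataclass
class EntityEdge:
    """Layer 2: a fact linking two entities, pointing back to its source episodes."""
    source: str
    target: str
    fact: str
    episode_ids: list[str] = field(default_factory=list)


def label_propagation(nodes: list[str], edges: list[EntityEdge],
                      max_rounds: int = 10) -> dict[str, str]:
    """Layer 3: group entities into communities. Each node repeatedly adopts
    the most common label among its neighbours (ties go to the smallest label)."""
    neighbours: dict[str, set[str]] = {n: set() for n in nodes}
    for e in edges:
        neighbours[e.source].add(e.target)
        neighbours[e.target].add(e.source)

    labels = {n: n for n in nodes}  # every node starts in its own community
    for _ in range(max_rounds):
        changed = False
        for n in sorted(nodes):
            if not neighbours[n]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[m] for m in neighbours[n])
            best = min(counts, key=lambda lab: (-counts[lab], lab))
            if labels[n] != best:
                labels[n] = best
                changed = True
        if not changed:
            break
    return labels


episode = Episode("ep1", "Alice from Acme asked about Project X. Bob works at Globex.")
edges = [
    EntityEdge("alice", "acme", "Alice works at Acme", ["ep1"]),
    EntityEdge("alice", "project_x", "Alice leads Project X", ["ep1"]),
    EntityEdge("acme", "project_x", "Acme funds Project X", ["ep1"]),
    EntityEdge("bob", "globex", "Bob works at Globex", ["ep1"]),
]
communities = label_propagation(["alice", "acme", "project_x", "bob", "globex"], edges)
print(communities)
```

Because every edge keeps its `episode_ids`, a high-level community can always be traced back to the raw text it came from, which is how detail and abstraction coexist in one graph.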

Dual timeline modeling: solving the fundamental problem of information updating

ZEP's most distinctive innovation is its dual-timeline modeling mechanism:

| Timeline type | What it records | Purpose |
| --- | --- | --- |
| Event timeline (T) | When the event occurred in the real world | Accurately reflects the temporal order of facts |
| Transaction timeline (T') | When the system received and processed the information | Tracks how information was acquired and updated |

This design allows the system to accurately handle complex time relationships such as "the project mentioned by the user two weeks ago was actually started three months ago."
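The two timelines can be sketched as separate timestamps on each fact. This is a minimal illustration, not ZEP's storage format; the field names `valid_at`, `invalid_at`, and `created_at` and the helper functions are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TemporalFact:
    fact: str
    valid_at: datetime              # event timeline T: when the fact became true
    invalid_at: Optional[datetime]  # event timeline T: when it stopped being true (None = still true)
    created_at: datetime            # transaction timeline T': when the system ingested it


def true_at(facts, moment):
    """Facts that held in the real world at `moment` (event timeline)."""
    return [f for f in facts
            if f.valid_at <= moment and (f.invalid_at is None or moment < f.invalid_at)]


def known_at(facts, moment):
    """Facts the system had already ingested by `moment` (transaction timeline)."""
    return [f for f in facts if f.created_at <= moment]


# "The project the user mentioned two weeks ago actually started three months ago."
project = TemporalFact(
    fact="The user's project is under way",
    valid_at=datetime(2025, 3, 13),    # started three months before 2025-06-13
    invalid_at=None,
    created_at=datetime(2025, 5, 30),  # mentioned two weeks before 2025-06-13
)

one_month_ago = datetime(2025, 5, 13)
print(true_at([project], one_month_ago))   # the project was already running
print(known_at([project], one_month_ago))  # but the system had not heard of it yet
```

Keeping the two clocks apart lets the system answer both "what was true then?" and "what did we know then?" without confusing the two.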

Intelligent edge invalidation mechanism

Traditional systems often have no answer to conflicting information. ZEP handles it with an LLM-driven edge invalidation mechanism:

  1. Conflict detection: the system detects when a new fact semantically conflicts with information already in the knowledge graph
  2. Automatic marking: the conflicting old information is marked as invalid
  3. Time recording: the specific point in time at which the old information became invalid is recorded

This mechanism allows the agent to accurately answer complex questions involving temporal reasoning, such as "when did the user change his preference?"
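The three steps above can be sketched as follows. In ZEP the conflict check is performed by an LLM; here a simple rule (`same_slot_conflict`, a hypothetical stand-in) plays that role so the example runs on its own. All names are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass
class Edge:
    subject: str
    relation: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means the fact still holds


def same_slot_conflict(old: Edge, new: Edge) -> bool:
    """Stand-in for the LLM judgement: same subject and relation, different object."""
    return (old.subject == new.subject
            and old.relation == new.relation
            and old.obj != new.obj)


def add_edge(graph: list[Edge], new: Edge,
             conflicts: Callable[[Edge, Edge], bool] = same_slot_conflict) -> None:
    """Insert `new`, invalidating any still-valid older edge it contradicts."""
    for old in graph:
        if old.invalid_at is None and old.valid_at <= new.valid_at and conflicts(old, new):
            old.invalid_at = new.valid_at  # record when the old fact stopped holding
    graph.append(new)


def change_points(graph: list[Edge], subject: str, relation: str) -> list[datetime]:
    """Answer "when did the user change their preference?" from invalidation times."""
    return sorted(e.invalid_at for e in graph
                  if e.subject == subject and e.relation == relation
                  and e.invalid_at is not None)


graph: list[Edge] = []
add_edge(graph, Edge("user", "prefers", "aggressive strategy", datetime(2025, 3, 13)))
add_edge(graph, Edge("user", "prefers", "conservative allocation", datetime(2025, 5, 30)))
add_edge(graph, Edge("user", "prefers", "emerging markets", datetime(2025, 6, 13)))
print(change_points(graph, "user", "prefers"))
```

Old edges are never deleted, only closed off with an end time, so the full history of preferences stays queryable.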

Three-step memory retrieval

Step 1: Hybrid Search Strategy

ZEP's retrieval system uses three complementary search methods to maximize recall:

  • Cosine similarity search: captures semantic relevance
  • BM25 full-text search: handles keyword matching
  • Breadth-first search: discovers implicit connections in the graph structure

This combination is particularly well suited to reference resolution, for example when a user asks "how is that project going?" without naming the project.
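Each of the three search methods can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions: the embeddings are hand-written vectors, tokenization is a whitespace split, and the function names are made up for the example. A production system would use a real embedding model and a search index instead.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bm25(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document in `docs` (id -> text) for `query`.
    Assumes `docs` contains at least one non-empty document."""
    tokens = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    avg_len = sum(len(t) for t in tokens.values()) / len(tokens)
    scores = {}
    for doc_id, toks in tokens.items():
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for t in tokens.values() if term in t)  # document frequency
            idf = math.log((len(tokens) - df + 0.5) / (df + 0.5) + 1)
            tf = toks.count(term)  # term frequency in this document
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(toks) / avg_len))
        scores[doc_id] = score
    return scores


def bfs(adjacency, seeds, max_hops=2):
    """Nodes reachable from `seeds` within `max_hops` edges, in visit order."""
    seen = set(seeds)
    queue = deque((s, 0) for s in seeds)
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, depth + 1))
    return found


docs = {"f1": "project atlas is behind schedule", "f2": "user prefers email updates"}
print(bm25("atlas progress", docs))

graph = {"user": ["project_atlas"], "project_atlas": ["milestone_2"], "milestone_2": ["vendor"]}
print(bfs(graph, ["user"], max_hops=2))
```

The breadth-first step is what resolves "that project": starting from the user node, it reaches the project the user is connected to even though the query never names it.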

Step 2: Intelligent re-ranking

After retrieving candidate results, ZEP applies several re-ranking strategies to improve precision:

  • RRF (Reciprocal Rank Fusion) and MMR (Maximal Marginal Relevance): classic re-ranking methods
  • Graph-distance re-ranking: takes into account how closely entities are connected
  • Frequency weighting: gives higher priority to information the user mentions often
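The two classic methods can be sketched as below. This is a generic illustration of RRF and MMR rather than ZEP's implementation; the constant `k=60`, the `lam` weight, and the toy vectors are assumptions chosen for the example.

```python
import math


def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per item."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, candidates, lam=0.5, top_k=3):
    """Maximal Marginal Relevance over `candidates` (id -> embedding).
    Balances relevance to the query against similarity to already-picked items."""
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < top_k:
        def score(cid):
            relevance = cosine(query_vec, candidates[cid])
            redundancy = max((cosine(candidates[cid], candidates[s]) for s in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Fuse the three result lists from cosine, BM25, and breadth-first search.
print(rrf([["f1", "f2", "f3"], ["f3", "f1"], ["f3"]]))

# "a" and "b" are duplicates; a low `lam` pushes the distinct item "c" ahead of "b".
vectors = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.6, 0.8]}
print(mmr([1.0, 0.0], vectors, lam=0.3))
```

RRF needs only ranks, not comparable scores, which makes it a natural way to merge results from search methods as different as BM25 and graph traversal.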

Step 3: Context Construction

The final step converts the retrieved and re-ranked nodes and edges into an LLM-friendly text format:

  • Label each fact with its valid time range
  • Provide a concise summary for each entity
  • Ensure the Agent understands the timeliness and importance of the information when generating a response

An example of a ZEP context construction template, clearly marking the time range and entity information of the fact.
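A context template of this kind can be sketched as follows. The tag names, the "Date range" wording, and the `build_context` helper are assumptions made for illustration; the point is only that every fact carries its validity window and every entity a short summary.

```python
from datetime import date
from typing import Optional


def fmt(d: Optional[date]) -> str:
    """Render a date, or 'present' when the fact has no end date."""
    return d.isoformat() if d else "present"


def build_context(facts, entities) -> str:
    """facts: list of (text, valid_from, valid_to); entities: list of (name, summary)."""
    fact_lines = [f"- {text} (Date range: {fmt(start)} - {fmt(end)})"
                  for text, start, end in facts]
    entity_lines = [f"- {name}: {summary}" for name, summary in entities]
    return "\n".join(["<FACTS>", *fact_lines, "</FACTS>",
                      "<ENTITIES>", *entity_lines, "</ENTITIES>"])


context = build_context(
    facts=[
        ("User prefers aggressive investment strategies", date(2025, 3, 13), date(2025, 5, 30)),
        ("User prefers a conservative allocation", date(2025, 5, 30), None),
    ],
    entities=[("User", "Retail investor whose risk appetite has changed over time")],
)
print(context)
```

With the date ranges in the prompt, the model can tell that the first preference has expired and the second is current, instead of seeing two contradictory statements.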

Significantly exceeds existing best practices

DMR benchmark: a narrow lead

In the Deep Memory Retrieval benchmark:

  • ZEP accuracy: 94.8%
  • MemGPT accuracy: 93.4%
  • Improvement: 1.4 percentage points. The margin looks small, but given the small size of the DMR test set, the result is still respectable

Comparison of Deep Memory Retrieval benchmark results. ZEP achieved the best performance on multiple models.

LongMemEval: where ZEP's strength shows

On the more challenging LongMemEval benchmark, ZEP's advantages are much clearer:

| Metric | Change | Notes |
| --- | --- | --- |
| Accuracy | +18.5% | On long conversations averaging 115,000 tokens |
| Response latency | -90% | Substantially faster system responses |

LongMemEval benchmark results show that ZEP improves accuracy while sharply reducing latency.

Performance analysis of different question types

Detailed classification results show that ZEP has the most significant improvement on complex reasoning tasks:

| Task type | Baseline accuracy | ZEP accuracy | Improvement (percentage points) |
| --- | --- | --- | --- |
| Single-session preference understanding | 30% | 53.3% | +23.3 |
| Temporal reasoning | 36.5% | 54.1% | +17.6 |

LongMemEval performance by question type. ZEP has a clear advantage on complex reasoning tasks.


A complete solution for production


Architecture design: modular system composition

Core components:

  • Engine: the core engine, implemented in Python
  • Storage: backend storage on the Neo4j graph database
  • Interface: a REST API service (built on FastAPI) and an MCP server

Deployment advantages:

  • Can be deployed as a standalone service
  • Can be embedded into an existing AI application architecture

Multi-model support: adapting to different technology stacks

Supported LLM providers:

  • OpenAI
  • Google Gemini
  • Anthropic Claude
  • Groq
  • Other models accessed through an OpenAI-compatible API

Optimized features:

  • Optimized specifically for models that support structured output
  • Ensures accurate entity and relation extraction
  • Lets you choose the most suitable LLM service based on cost, performance, and compliance requirements

Performance Optimization: From the Lab to Production

Key optimization measures:

| Optimization area | Technique | Effect |
| --- | --- | --- |
| Query acceleration | Neo4j parallel runtime | Speeds up execution of complex queries |
| Graph structure optimization | Dynamic community update algorithm | Reduces how often the graph structure must be rebuilt |
| Search optimization | Hybrid search strategy | Minimizes computational overhead while preserving recall |

Performance indicators:

  • Responds to user queries within milliseconds
  • Meets the needs of real-time interaction

Intelligent customer service system based on ZEP

To demonstrate the practical value of ZEP, I built an example intelligent customer service system on top of it, using Jina as the embedding model.

Scenario 1: VIP customer complaint handling chain

Customer background: Mr. Li, a Diamond member and CEO of a large enterprise, who expects a quick response

History:

  • 15 days ago: network problem complaint (third time this month)
  • 8 days ago: network issue still unresolved, affecting meetings

Current complaint: "I need to communicate directly with your technical director. This quality of service is unacceptable!"

System behavior:

  • Accurately cites the historical complaint records
  • Prioritizes the case based on Diamond membership status
  • Proactively arranges a meeting with the technical director

Scenario 2: Family customer group relationship network

Customer relationships:

  • Mr. Zhang (FAM001): family master account, Gold member, focused on cost-effectiveness
  • Mrs. Zhang (FAM002): ordinary member, often travels abroad and needs international roaming

Interaction scenario:

  • Mr. Zhang asks about the family package
  • Mrs. Zhang asks about international roaming discounts

System behavior:

  • Automatically identifies the two customers as members of one family
  • Puts together a comprehensive service plan for the whole family
  • Cross-recommends related services

Scenario 3: Early warning of long-term customer churn

Customer background: Mr. Wang, a loyal customer of 5 years and Silver member whose usage has recently dropped

Risk signal: "I've seen other operators offer more favorable promotions recently. Do you have any retention policies?"

System behavior:

  • Identifies the churn risk signal
  • Takes the customer's 5 years of loyalty into account
  • Proactively offers retention policies and discount plans

Core functional modules:

1. Dynamic customer profile management

  • Builds and updates complete customer profiles in real time
  • Covers basic information, VIP level, service preferences, and complaint history
  • Stores profiles as a knowledge graph, so associations are discovered automatically

2. Time-aware conversation memory

  • Records every customer interaction in a temporal knowledge graph
  • Accurately resolves time-related references
  • Enables coherent conversations across sessions

3. Intelligent question classification and routing

  • Automatic classification: product consultation, technical support, account management, complaints and suggestions, business handling
  • Invokes the corresponding knowledge base and processing flow based on the classification result
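The classify-then-route flow can be sketched as below. This is only a stand-in: a production system would more likely ask an LLM to classify the message, whereas this example uses keyword matching so it runs on its own. The keyword lists, the queue names, and the functions `classify` and `route` are all hypothetical.

```python
import re

# Hypothetical keyword lists, one per category named in the list above.
KEYWORDS = {
    "product consultation": {"package", "plan", "price", "roaming"},
    "technical support": {"network", "signal", "outage", "slow"},
    "account management": {"password", "login", "account"},
    "complaints and suggestions": {"unacceptable", "complaint", "director"},
    "business handling": {"cancel", "upgrade", "transfer"},
}

# Hypothetical mapping from category to the flow that should handle it.
ROUTES = {
    "product consultation": "product_kb",
    "technical support": "tech_support_flow",
    "account management": "account_flow",
    "complaints and suggestions": "escalation_flow",
    "business handling": "service_desk_flow",
    "unclassified": "human_agent",
}


def classify(message: str) -> str:
    """Pick the category with the most keyword hits, or 'unclassified' if none match."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    hits = {category: len(words & keywords) for category, keywords in KEYWORDS.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "unclassified"


def route(message: str) -> tuple[str, str]:
    """Return (category, handling flow) for an incoming customer message."""
    category = classify(message)
    return category, ROUTES[category]


print(route("I need to communicate directly with your technical director. "
            "This quality of service is unacceptable!"))
print(route("Do you have any discounts on international roaming?"))
```

Routing unmatched messages to a human agent is a design choice in this sketch: a wrong automated answer costs more than a handoff.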

4. Relationship network mining

  • Automatically identifies relationships between customers (family members, business colleagues, and so on)
  • Supports cross-selling and family package recommendations

5. Customer churn warning

  • Analyzes historical behavior patterns and recent interactions
  • Identifies potential churn risks
  • Triggers the corresponding retention strategies

Traditional RAG vs ZEP Intelligent Customer Service: Core Differences

| Dimension | Traditional RAG customer service | ZEP customer service |
| --- | --- | --- |
| Memory mechanism | Static document retrieval | Dynamic knowledge graph |
| Time understanding | No concept of time | Dual-timeline modeling |
| Relationship mining | Cannot understand entity relationships | Automatically builds relationship networks |
| Personalization | Based on keyword matching | Learns in depth from historical behavior |
| Contextual coherence | Single-turn conversation | Cross-session context understanding |
| Information update | Requires re-indexing or retraining | Real-time incremental updates |
| Conflict management | Cannot handle conflicting information | Intelligent edge invalidation mechanism |

Performance and technical advantages

Response quality improvement

  • Personalization: significantly improved, with accurate references to the customer's history
  • Classification accuracy: over 95%, significantly reducing misleading responses
  • Conversational naturalness: cross-session context understanding makes conversations flow better

System scalability

  • Real-time integration: new customer information and conversation records are merged into the knowledge graph in real time
  • Continuous learning: the system keeps improving as data accumulates
  • Flexible configuration: supports configuring and extending a variety of customer service scenarios

Challenges: The gap between ideal and reality

LLM Dependency: Balancing Cost and Accuracy

Main challenges:

  • Key steps such as entity extraction, relationship identification, time parsing, and conflict detection all rely on an LLM
  • This raises operating costs and introduces potential accuracy risks
  • Smaller models may produce malformed output and fail at extraction
  • The cost of large models may limit commercial use of the system

Solution:

  • Use specialized models fine-tuned for specific tasks
  • This can significantly reduce inference cost and latency while maintaining accuracy

Graph structure complexity: scalability and maintenance challenges

Problems:

  • As the knowledge graph grows, the complexity of the graph structure increases sharply
  • This strains the system's query performance and maintenance efficiency
  • After long-term operation, a periodic full-graph rebuild is still needed to keep community assignments accurate

Technical requirements:

  • Graph database index optimization
  • Expert tuning of query plan selection
  • Both add to the complexity of operating and maintaining the system

The lack of evaluation benchmarks: how to measure real results

Current issues:

  • Existing benchmarks for evaluating memory systems have significant limitations
  • Most test sets are small and focus on simple fact retrieval tasks
  • They fail to fully reflect the value of dynamic knowledge graphs in real business scenarios

Impact:

  • Performance comparisons between different systems become difficult
  • It is harder to judge which technical improvements to pursue