HippoRAG 2 Released: GraphRAG Steps Aside

Written by Silas Grey
Updated on: July 14, 2025
Recommendation

HippoRAG 2 marks a new breakthrough in RAG systems, taking the simulation of human long-term memory a step further.

Core content:
1. The innovation of the HippoRAG 2 framework and its improvement over existing RAG systems
2. Evaluation and performance of HippoRAG 2 in key dimensions
3. Comparison of HippoRAG 2 with baseline methods and details of performance improvement

To address the limitations of existing retrieval-augmented generation (RAG) systems in simulating the dynamic and associative nature of human long-term memory, a new framework, HippoRAG 2, is proposed and will be open-sourced.
Continual learning capability is evaluated along three key dimensions: factual memory, sense-making, and associativity. HippoRAG 2 outperforms the other methods (RAPTOR, GraphRAG, LightRAG, HippoRAG) in all benchmark categories, bringing it closer to a true long-term memory system.
The core idea of the HippoRAG 2 framework: HippoRAG 2 builds on HippoRAG's Personalized PageRank algorithm and pushes the RAG system closer to the effectiveness of human long-term memory through deeper passage integration and more effective online use of the LLM.

Offline indexing:

  • Use an LLM to extract triples from passages and integrate them into an open knowledge graph (KG).
  • Detect synonyms with an embedding model and add synonym edges to the KG.
  • Combine the original passages with the KG, yielding an open KG that holds both concepts and contextual information (a minimal code sketch follows this list).
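
Below is a minimal sketch of this indexing stage, assuming an OpenAI-style chat client and a sentence-transformers embedding model. The prompt, model names, the "triples" JSON key, and the 0.9 synonym threshold are illustrative assumptions, not the paper's actual configuration.

```python
import itertools
import json

import networkx as nx
from openai import OpenAI
from sentence_transformers import SentenceTransformer, util

client = OpenAI()
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

def extract_triples(passage: str) -> list[tuple[str, str, str]]:
    """Ask the LLM for (subject, relation, object) triples as JSON."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption; the paper uses its own prompt/model
        messages=[{
            "role": "user",
            "content": 'Extract facts from the passage. Return JSON like '
                       '{"triples": [["subject", "relation", "object"], ...]}\n\n'
                       + passage,
        }],
        response_format={"type": "json_object"},
    )
    return [tuple(t) for t in json.loads(resp.choices[0].message.content)["triples"]]

def build_kg(passages: list[str], syn_threshold: float = 0.9) -> nx.Graph:
    kg = nx.Graph()
    for pid, text in enumerate(passages):
        kg.add_node(("passage", pid), text=text)  # keep the original context in the graph
        for subj, rel, obj in extract_triples(text):
            kg.add_edge(subj, obj, relation=rel)   # concept-concept edge
            kg.add_edge(subj, ("passage", pid))    # concept-context edges
            kg.add_edge(obj, ("passage", pid))
    # Synonym detection: connect phrase nodes whose embeddings nearly coincide.
    phrases = [n for n in kg.nodes if isinstance(n, str)]
    emb = embedder.encode(phrases, convert_to_tensor=True, normalize_embeddings=True)
    sims = util.cos_sim(emb, emb)
    for i, j in itertools.combinations(range(len(phrases)), 2):
        if sims[i, j] >= syn_threshold:
            kg.add_edge(phrases[i], phrases[j], relation="synonym")
    return kg
```

Keeping passage nodes linked to their concepts is what lets the later graph search return full paragraphs rather than bare triples.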
Online search:
  • Use an embedding model to link the query to triples and passages in the KG, determining seed nodes for the graph search.
  • Filter the retrieved triples with the LLM so that only the relevant ones are kept.
  • Apply the Personalized PageRank algorithm for context-aware retrieval, ultimately surfacing the most relevant passages for the downstream question-answering task (see the retrieval sketch after this list).
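
A minimal sketch of the retrieval side, reusing `embedder` and the KG from the indexing sketch above; the LLM triple-filtering step is elided, and the seed count and weighting scheme are illustrative assumptions.

```python
import networkx as nx
from sentence_transformers import util

def retrieve(query: str, kg: nx.Graph, top_k: int = 5) -> list[str]:
    """Rank passages by Personalized PageRank seeded from query-similar phrases."""
    phrases = [n for n in kg.nodes if isinstance(n, str)]
    q_emb = embedder.encode(query, convert_to_tensor=True, normalize_embeddings=True)
    p_emb = embedder.encode(phrases, convert_to_tensor=True, normalize_embeddings=True)
    scores = util.cos_sim(q_emb, p_emb)[0]
    # Seed the walk with the phrases closest to the query (weights kept positive).
    k = min(10, len(phrases))
    seeds = {phrases[i]: max(float(scores[i]), 1e-6)
             for i in scores.topk(k).indices.tolist()}
    ppr = nx.pagerank(kg, personalization=seeds)
    # Read off the passage nodes with the highest stationary probability.
    passage_nodes = [n for n in kg.nodes if isinstance(n, tuple) and n[0] == "passage"]
    ranked = sorted(passage_nodes, key=lambda n: ppr.get(n, 0.0), reverse=True)
    return [kg.nodes[n]["text"] for n in ranked[:top_k]]
```

Because the walk restarts at the seed phrases, probability mass flows through synonym and concept-context edges into the passages that best cover the query's entities, which is what makes the retrieval context-aware rather than purely lexical.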
  • Baseline methods: classic retrievers (BM25, Contriever, GTR), large embedding models (GTE-Qwen2-7B-Instruct, GritLM-7B, NV-Embed-v2), and structure-augmented RAG methods (RAPTOR, GraphRAG, LightRAG, HippoRAG).
  • Evaluation metrics: the question-answering tasks use the F1 score, and the retrieval tasks use passage recall@5 (both illustrated in the sketch after this list).
  • Performance improvement: HippoRAG 2 outperforms the other methods in all benchmark categories, with an average F1 score about 7 percentage points above standard RAG, most markedly on the associative memory tasks.
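
For concreteness, here are straightforward implementations of the two reported metrics; they follow the usual definitions (token-overlap F1 and recall among the top five retrieved passages), not the paper's exact evaluation code.

```python
from collections import Counter

def qa_f1(prediction: str, gold: str) -> float:
    """Token-level F1 between a predicted answer and the gold answer."""
    pred_toks, gold_toks = prediction.lower().split(), gold.lower().split()
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

def recall_at_5(retrieved: list[str], gold_passages: set[str]) -> float:
    """Fraction of gold passages found among the top five retrieved results."""
    hits = sum(1 for p in retrieved[:5] if p in gold_passages)
    return hits / len(gold_passages)

# Example: one of two gold passages appears in the top five -> recall@5 = 0.5.
print(recall_at_5(["p3", "p1", "p7"], {"p1", "p2"}))  # 0.5
```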
A HippoRAG 2 pipeline example

Code: https://github.com/OSU-NLP-Group/HippoRAG
Paper: From RAG to Memory: Non-Parametric Continual Learning for Large Language Models, https://arxiv.org/pdf/2502.14802