GraphRAG cost cut to one-tenth: KET-RAG, a multi-granularity indexing framework, is open source

Written by
Caleb Hayes
Updated on: July 12, 2025
Recommendation

KET-RAG: The new framework significantly reduces the cost of knowledge retrieval and opens a new era of efficient generation.

Core content:
1. The trade-off that existing Graph-RAG systems face between cost and retrieval quality
2. The three major innovations of the KET-RAG framework: knowledge graph skeleton, text-keyword bipartite graph, and dual-channel retrieval
3. The performance advantages and cost reductions of KET-RAG on real datasets

Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)
Quick Summary

Research pain point: Existing graph-based retrieval-augmented generation (Graph-RAG) systems face a dilemma when processing large-scale documents.

  • On the one hand, the KNN graph method based on text block similarity is low-cost, but it cannot capture the entity relationships within the text, resulting in poor retrieval and generation quality;

  • On the other hand, the knowledge graph-based (KG-RAG) method improves retrieval quality by extracting entities and relationships, but its high indexing cost makes it difficult to apply at scale. For example, indexing 5GB of legal documents may cost as much as $33,000.

Innovation breakthrough:
The paper proposes KET-RAG (Knowledge-Entity-Text Retrieval-Augmented Generation), a multi-granularity indexing framework. KET-RAG achieves efficient and low-cost knowledge retrieval through the following innovations:
  • Knowledge graph skeleton: Build the knowledge graph only from core text blocks, greatly reducing indexing costs.
  • Text-keyword bipartite graph: As a lightweight alternative to the knowledge graph, it achieves efficient retrieval by associating keywords with text blocks.
  • Dual-channel retrieval strategy: Combine the advantages of the knowledge graph skeleton and the text-keyword bipartite graph to balance retrieval quality and cost.
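To make the text-keyword bipartite graph more concrete, here is a minimal sketch in Python. It is not the KET-RAG implementation: the keyword extractor (lowercased word tokens minus a tiny stopword list), the function names, and the overlap-count scoring are all assumptions made for illustration. The only idea taken from the description above is that keywords are linked to the text blocks that contain them, so retrieval can follow those links without any LLM-based entity extraction.

```python
import re
from collections import defaultdict

# Assumption: a toy stopword list; a real system would use a proper keyword extractor.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "by", "with", "was"}


def extract_keywords(text):
    """Hypothetical keyword extractor: lowercase word tokens minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def build_bipartite(blocks):
    """One side of the graph is keywords, the other is block ids; edges mean 'contains'."""
    keyword_to_blocks = defaultdict(set)
    for block_id, text in enumerate(blocks):
        for keyword in extract_keywords(text):
            keyword_to_blocks[keyword].add(block_id)
    return keyword_to_blocks


def retrieve(query, keyword_to_blocks, top_k=2):
    """Rank blocks by how many query keywords link to them (ties broken by block id)."""
    scores = defaultdict(int)
    for keyword in extract_keywords(query):
        for block_id in keyword_to_blocks.get(keyword, ()):
            scores[block_id] += 1
    ranked = sorted(scores, key=lambda b: (-scores[b], b))
    return ranked[:top_k]


blocks = [
    "The court ruled on the patent dispute.",
    "Patent law covers inventions and licensing.",
    "The weather was sunny in Paris.",
]
graph = build_bipartite(blocks)
top = retrieve("patent licensing", graph)
print(top)  # block 1 matches both query keywords, block 0 matches one
```

Building this index needs only tokenization and set operations, which is why it can serve as a cheap substitute where a full knowledge graph would be too expensive.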
Application value: Eight solutions are evaluated on two real-world datasets, and the results show that KET-RAG outperforms all competitors (Text-RAG, KNNG-RAG, KG-RAG, Hybrid-RAG, Skeleton-RAG) in terms of indexing cost, retrieval quality, and generation quality.
It is worth noting that the retrieval quality of KET-RAG is comparable to or even better than that of Microsoft's Graph-RAG, while the indexing cost is reduced by more than an order of magnitude.
Method details

At its core, the KET-RAG framework combines index structures of several granularities. It consists of the following parts:

  1. Skeleton-RAG: Select important text blocks from the KNN graph with the PageRank algorithm, and build a knowledge graph only for these core text blocks to reduce indexing costs.

  2. Text-keyword bipartite graph (Keyword-RAG): Split all text blocks into sub-blocks and construct a graph of keywords and sub-blocks. Keywords and their neighboring text blocks are used as candidate entities and relations for lightweight retrieval.

  3. Dual-channel retrieval: In the retrieval stage, KET-RAG draws on both the knowledge graph skeleton and the text-keyword bipartite graph, and balances their contributions through a retrieval ratio parameter to improve retrieval quality.

  4. Parameter optimization: Tuning parameters such as the input text block size (ℓ) and the number of segmentation levels further improves retrieval and generation performance.
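The skeleton selection in step 1 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: Jaccard overlap of word sets stands in for whatever similarity measure the real KNN graph uses, PageRank is a plain power iteration, and the names `knn_graph`, `pagerank`, `select_skeleton`, and the `budget` fraction are hypothetical. The point it shows is that only the top-ranked blocks would be passed on to the expensive knowledge graph extraction.

```python
def jaccard(a, b):
    """Assumption: word-set overlap as a stand-in for embedding similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def knn_graph(blocks, k=2):
    """Link each block to its k most similar other blocks (undirected edges)."""
    n = len(blocks)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        candidates = [j for j in range(n) if j != i]
        candidates.sort(key=lambda j: (-jaccard(blocks[i], blocks[j]), j))
        for j in candidates[:k]:
            if jaccard(blocks[i], blocks[j]) > 0:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def pagerank(neighbors, damping=0.85, iters=50):
    """Power-iteration PageRank on an undirected graph given as adjacency sets."""
    n = len(neighbors)
    rank = {v: 1.0 / n for v in neighbors}
    for _ in range(iters):
        # Rank held by isolated nodes is spread evenly over all nodes.
        dangling = sum(rank[v] for v in neighbors if not neighbors[v])
        rank = {
            v: (1 - damping) / n
            + damping * (sum(rank[u] / len(neighbors[u]) for u in neighbors[v]) + dangling / n)
            for v in neighbors
        }
    return rank


def select_skeleton(blocks, budget=0.5, k=2):
    """Return ids of the highest-ranked blocks; only these would get KG extraction."""
    rank = pagerank(knn_graph(blocks, k))
    keep = max(1, int(len(blocks) * budget))
    return sorted(sorted(rank, key=lambda v: (-rank[v], v))[:keep])


blocks = [
    "contract breach damages remedy",
    "contract formation",
    "breach notice",
    "damages cap",
    "remedy injunction",
]
core = select_skeleton(blocks, budget=0.2, k=1)
print(core)  # block 0 is linked to every other block, so it ranks highest
```

In this toy corpus block 0 shares a term with every other block, so it becomes the hub of the KNN graph and is the one block selected when the budget allows a single block.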

Through this multi-granularity indexing and dual-channel retrieval strategy, KET-RAG significantly reduces the indexing cost while ensuring the retrieval quality, providing an efficient and low-cost solution for large-scale knowledge retrieval and generation tasks.
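One way to picture the dual-channel retrieval step is as a budget split between two ranked result lists. The sketch below is an assumption-laden illustration: the function name, the `budget` and `ratio` arguments, and the convention that `ratio` is the share given to the skeleton channel are all hypothetical, and the source does not specify how KET-RAG actually merges the two channels.

```python
def dual_channel_retrieve(skeleton_hits, keyword_hits, budget, ratio):
    """Merge two ranked lists of block ids under a shared budget.

    Assumption: `ratio` in [0, 1] is the fraction of the budget reserved for
    the knowledge graph skeleton channel; the rest is filled from the
    text-keyword channel, skipping duplicates.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    n_skeleton = int(budget * ratio)
    merged = list(skeleton_hits[:n_skeleton])
    for block_id in keyword_hits:
        if len(merged) >= budget:
            break
        if block_id not in merged:
            merged.append(block_id)
    return merged


skeleton_hits = [3, 1, 7]    # hypothetical ranked results from the skeleton channel
keyword_hits = [1, 4, 5, 6]  # hypothetical ranked results from the keyword channel
balanced = dual_channel_retrieve(skeleton_hits, keyword_hits, budget=4, ratio=0.5)
keyword_only = dual_channel_retrieve(skeleton_hits, keyword_hits, budget=4, ratio=0.0)
print(balanced, keyword_only)
```

With this convention, setting the ratio to 0 falls back to keyword-only retrieval, and any skeleton budget that goes unused is filled from the keyword channel.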

Paper: KET-RAG: A Cost-Efficient Multi-Granular Indexing Framework for Graph-RAG
https://arxiv.org/pdf/2502.09304