GraphRAG indexing cost cut to roughly one-tenth: KET-RAG, a multi-granularity indexing framework, is open source

Written by

Caleb Hayes

Updated on: July 12, 2025

Quick Summary

Research pain point: Existing graph-based retrieval-augmented generation (Graph-RAG) systems face a dilemma when processing large document collections.

On the one hand, the KNN-graph approach, which links text blocks by similarity, is cheap to build but cannot capture the entity relationships within the text, so retrieval and generation quality suffer.
On the other hand, the knowledge-graph approach (KG-RAG) improves retrieval quality by extracting entities and relationships, but its high indexing cost makes it hard to apply at scale. For example, indexing 5GB of legal documents may cost as much as $33,000.

Key innovation:

The paper proposes KET-RAG (Knowledge-Entity-Text Retrieval-Augmented Generation), a multi-granularity indexing framework. KET-RAG achieves efficient, low-cost knowledge retrieval through three ideas:

Knowledge graph skeleton: Build the knowledge graph only from the core text blocks, which greatly reduces indexing cost.
Text-keyword bipartite graph: A lightweight alternative to the knowledge graph that supports efficient retrieval by linking keywords to the text blocks that contain them.
Dual-channel retrieval strategy: Combine the strengths of the knowledge graph skeleton and the text-keyword bipartite graph to balance retrieval quality against cost.
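
To make the text-keyword bipartite graph more concrete, here is a minimal Python sketch of the general idea: keywords on one side, text blocks on the other, and an edge wherever a block contains a keyword. This is an illustration only, not the paper's implementation. The function names (`extract_keywords`, `build_bipartite_index`, `retrieve`), the tiny stop-word list, and the overlap-count scoring are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; a real system would use a proper keyword extractor.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "by"}

def extract_keywords(text):
    """Return lowercase word tokens minus stop words (illustrative stand-in)."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def build_bipartite_index(chunks):
    """Map each keyword to the set of chunk ids that contain it."""
    keyword_to_chunks = defaultdict(set)
    for chunk_id, text in enumerate(chunks):
        for kw in extract_keywords(text):
            keyword_to_chunks[kw].add(chunk_id)
    return keyword_to_chunks

def retrieve(query, index, top_k=2):
    """Rank chunks by the number of query keywords they share (assumed scoring)."""
    scores = defaultdict(int)
    for kw in extract_keywords(query):
        for chunk_id in index.get(kw, ()):
            scores[chunk_id] += 1
    # Highest score first; ties broken by chunk id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [chunk_id for chunk_id, _ in ranked[:top_k]]

chunks = [
    "The court ruled on the contract dispute.",
    "The contract was signed by both parties in 2020.",
    "Weather was sunny during the hearing.",
]
index = build_bipartite_index(chunks)
top = retrieve("contract dispute ruling", index)
print(top)  # [0, 1]
```

No LLM call is needed to build this index, which is why a structure of this kind is so much cheaper than full entity and relation extraction.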

Application value: Eight solutions are evaluated on two real-world datasets, and the results show that KET-RAG outperforms all competitors (Text-RAG, KNNG-RAG, KG-RAG, Hybrid-RAG, Skeleton-RAG) in indexing cost, retrieval quality, and generation quality.

Notably, the retrieval quality of KET-RAG matches or exceeds that of Microsoft's Graph-RAG, while the indexing cost is lower by more than an order of magnitude.

Method details

At its core, KET-RAG combines a multi-granularity index structure with the following components:

Skeleton-RAG: Use the PageRank algorithm to select the most important text blocks from the KNN graph, and build a knowledge graph only for these core blocks to reduce indexing cost.
Text-Keyword Bipartite Graph (Keyword-RAG): Split all text blocks into sub-blocks and construct a graph linking keywords to sub-blocks. Keywords and their neighboring text blocks then serve as candidate entities and relations for lightweight retrieval.
Dual-channel retrieval: At retrieval time, KET-RAG draws on both the knowledge graph skeleton and the text-keyword bipartite graph, and balances their contributions through a retrieval ratio parameter to improve retrieval quality.
Parameter optimization: Tuning parameters such as the input text block size (ℓ) and the number of segmentation levels further improves retrieval and generation performance.
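
The skeleton selection step described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the authors' code: similarity is word-set Jaccard rather than embedding similarity, the KNN graph is treated as undirected, and the names `knn_graph`, `pagerank`, `select_skeleton` and the `budget` parameter (the fraction of blocks kept) are hypothetical.

```python
def jaccard(a, b):
    """Word-set overlap; a stand-in for embedding similarity (assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def knn_graph(chunks, k=2):
    """Link every chunk to its k most similar chunks (undirected edges)."""
    tokens = [set(c.lower().split()) for c in chunks]
    neighbors = {i: set() for i in range(len(chunks))}
    for i in range(len(chunks)):
        sims = sorted(
            ((jaccard(tokens[i], tokens[j]), j) for j in range(len(chunks)) if j != i),
            key=lambda s: (-s[0], s[1]),
        )
        for sim, j in sims[:k]:
            if sim > 0:  # skip chunks with no overlap at all
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors

def pagerank(neighbors, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected adjacency dict."""
    n = len(neighbors)
    rank = {v: 1.0 / n for v in neighbors}
    for _ in range(iterations):
        # Rank held by isolated nodes is redistributed uniformly.
        dangling = sum(rank[v] for v in neighbors if not neighbors[v])
        new_rank = {}
        for v in neighbors:
            incoming = sum(rank[u] / len(neighbors[u]) for u in neighbors[v])
            new_rank[v] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank

def select_skeleton(chunks, budget=0.5, k=2):
    """Return ids of the top-ranked chunks; only these would get KG extraction."""
    ranks = pagerank(knn_graph(chunks, k))
    keep = max(1, round(budget * len(chunks)))
    best = sorted(ranks, key=lambda v: (-ranks[v], v))[:keep]
    return sorted(best)

docs = ["a b c", "a b d", "a c d", "x y z"]
print(select_skeleton(docs, budget=0.5))
```

In this toy input the last block shares no words with the others, so it ends up isolated in the KNN graph and is the first to be left out of the skeleton. The expensive entity and relation extraction would then run only on the selected ids.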

Through this multi-granularity indexing and dual-channel retrieval strategy, KET-RAG significantly reduces the indexing cost while ensuring the retrieval quality, providing an efficient and low-cost solution for large-scale knowledge retrieval and generation tasks.
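
One simple way to picture the dual-channel balance is as a split of a fixed context budget between the two channels. The sketch below is an assumption about how such a split could work, not the paper's exact procedure: `dual_channel_retrieve`, `budget`, and `keyword_ratio` are hypothetical names, and treating the ratio as the keyword channel's share of the budget is an illustrative choice.

```python
def dual_channel_retrieve(kg_hits, keyword_hits, budget, keyword_ratio):
    """Fill a context budget from two ranked candidate lists.

    keyword_ratio is assumed to be the share of the budget given to the
    keyword channel; the remainder goes to the knowledge graph skeleton.
    """
    if not 0.0 <= keyword_ratio <= 1.0:
        raise ValueError("keyword_ratio must be between 0 and 1")
    keyword_quota = round(budget * keyword_ratio)
    kg_quota = budget - keyword_quota
    selected = []
    # Skeleton hits first, then keyword hits, dropping duplicates.
    for hit in kg_hits[:kg_quota] + keyword_hits[:keyword_quota]:
        if hit not in selected:
            selected.append(hit)
    return selected

kg_hits = ["c1", "c2", "c3"]       # ranked results from the skeleton channel
keyword_hits = ["c2", "c4", "c5"]  # ranked results from the keyword channel
print(dual_channel_retrieve(kg_hits, keyword_hits, budget=4, keyword_ratio=0.5))
# ['c1', 'c2', 'c4']
```

Setting `keyword_ratio` to 0 falls back to skeleton-only retrieval and setting it to 1 uses only the keyword channel, so the two single-channel behaviors are the endpoints of the same knob.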

Paper: KET-RAG: A Cost-Efficient Multi-Granular Indexing Framework for Graph-RAG
https://arxiv.org/pdf/2502.09304