72 RAG Implementation Scenarios - There's One for Everyone!

Looking back at 2024, large language models changed by the day, with hundreds of contenders competing. As a key part of AI applications, RAG saw its own crowd of heroes vying for the throne. At the start of the year, Modular RAG kept heating up and GraphRAG shone; mid-year, open-source tools flourished and knowledge graphs opened new opportunities; at year's end, graph understanding and multimodal RAG set out on a new journey. With everyone going their own way and novel techniques everywhere, the list goes on and on!
Here we have selected typical RAG systems and papers from 2024 (with AI-generated notes, sources, and abstracts), and attached RAG survey and benchmark materials at the end of the article. At roughly sixteen thousand words, we hope this piece helps you get up to speed on RAG quickly.
1. GraphReader [Graph Expert]
GraphReader: like a tutor who is good at drawing mind maps, it transforms long text into a clear knowledge network, letting the AI explore along the map and easily find the key points needed for an answer, effectively overcoming the problem of "getting lost" in long texts.- Time: 01.20
- Paper: GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models
GraphReader is a graph-based agent system that processes long text by structuring it into a graph and having an agent explore that graph autonomously. Upon receiving a question, the agent first performs a step-by-step analysis and formulates a plan. It then calls a set of predefined functions to read node contents and neighbors, exploring the graph from coarse to fine. Throughout the exploration, the agent continuously records new insights and reflects on its current state to optimize the process, until it has gathered enough information to generate an answer.
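To make that loop concrete, here is a minimal runnable sketch of a GraphReader-style exploration loop. The `llm` helper is a stub standing in for a real model call, and the toy graph with its node format (summary, content, neighbors) is an illustrative assumption, not the paper's implementation.

```python
# Sketch of a GraphReader-style agent loop: plan, explore coarse-to-fine,
# record notes, reflect, stop when the notes suffice.
def llm(prompt: str) -> str:
    # Stub standing in for a real model call: judges everything relevant
    # and echoes the accumulated notes when asked to answer.
    if prompt.startswith("Answer"):
        return prompt
    return "yes"

def graph_reader(question: str, graph: dict, start: str, max_steps: int = 10) -> str:
    notebook = []                        # insights recorded during exploration
    frontier, visited = [start], set()
    for _ in range(max_steps):
        if not frontier:
            break
        node = frontier.pop(0)
        if node in visited:
            continue
        visited.add(node)
        # Coarse-to-fine: check the node summary first, read content if promising.
        if llm(f"Is '{graph[node]['summary']}' relevant to '{question}'?") == "yes":
            notebook.append(graph[node]["content"])     # fine-grained read
            frontier.extend(graph[node]["neighbors"])   # expand along edges
        # Reflect: stop once the notebook is judged sufficient.
        if llm(f"Do these notes answer '{question}'? {notebook}") == "yes":
            break
    return llm(f"Answer '{question}' from notes: {notebook}")

graph = {
    "n1": {"summary": "Capitals of Europe", "content": "Paris is the capital of France.",
           "neighbors": ["n2"]},
    "n2": {"summary": "Rivers of Europe", "content": "The Seine flows through Paris.",
           "neighbors": []},
}
print(graph_reader("What is the capital of France?", graph, "n1"))
```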
2. MM-RAG [All-Rounder]
All-rounder: like a generalist proficient in vision, hearing, and language at once, it not only comprehends different forms of information but also switches and correlates between them with ease. Through comprehensive understanding of varied information, it can provide smarter and more natural services in fields such as recommendation, assistants, and media.- Time: 01.22
The development of multimodal machine learning is presented, including contrastive learning, any-modality search via multimodal embeddings, multimodal retrieval-augmented generation (MM-RAG), and how to build multimodal production systems with vector databases. Future trends in multimodal AI are also discussed, emphasizing application prospects in areas such as recommender systems, virtual assistants, media, and e-commerce.
3. CRAG [Self-Correction]
Self-correction: like an experienced editor, it first screens the preliminary information quickly, then expands it with web search, and finally decomposes and reorganizes the content to ensure the final presentation is both accurate and reliable. It is like fitting RAG with a quality-control system, making the content it produces more trustworthy.- Time: 01.29
- Paper: Corrective Retrieval Augmented Generation
- Project: https://github.com/HuskyInSalt/CRAG
CRAG improves the quality of retrieved documents by designing a lightweight retrieval evaluator and introducing large-scale web search, and it further refines the retrieved information through a decompose-then-recompose algorithm to enhance the accuracy and reliability of the generated text. CRAG is a useful supplement to existing RAG techniques, strengthening the robustness of generation by self-correcting the retrieval results.
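A hedged sketch of that corrective flow follows. The word-overlap `score_relevance`, the thresholds, and the `web_search` placeholder are illustrative stand-ins (the paper trains a lightweight evaluator), but the Correct/Incorrect/Ambiguous branching mirrors the described logic.

```python
# Sketch of CRAG's corrective retrieval flow with stubbed components.
def score_relevance(query: str, doc: str) -> float:
    # Stub evaluator: crude word overlap instead of CRAG's trained model.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def web_search(query: str) -> list[str]:
    return [f"(web result for: {query})"]   # placeholder external search

def decompose_recompose(docs: list[str]) -> str:
    # CRAG splits documents into strips, keeps the relevant ones, reassembles.
    return " ".join(d.strip() for d in docs)

def crag(query: str, retrieved: list[str], hi: float = 0.5, lo: float = 0.2) -> str:
    best = max(score_relevance(query, d) for d in retrieved)
    if best >= hi:            # "Correct": refine the retrieved knowledge
        docs = retrieved
    elif best <= lo:          # "Incorrect": discard and fall back to the web
        docs = web_search(query)
    else:                     # "Ambiguous": combine both sources
        docs = retrieved + web_search(query)
    return decompose_recompose(docs)

print(crag("capital of France", ["Paris is the capital of France.", "Bordeaux wine."]))
```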
4. RAPTOR [Hierarchical Summarization]
Hierarchical summarization: like a librarian skilled at organizing, it builds document content bottom-up into a tree structure, so that retrieval can move flexibly between levels, seeing both the overall summary and the fine details.- Time: 01.31
- Paper: RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
- Project: https://github.com/parthsarthi03/raptor
RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) introduces a new method of recursively embedding, clustering, and summarizing blocks of text to build a tree with different levels of summarization from the bottom up. At inference time, the RAPTOR model retrieves from this tree, integrating information from long documents at different levels of abstraction.
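Below is a minimal sketch of RAPTOR's bottom-up tree construction. The `summarize` and `cluster` helpers are simplified stand-ins (the paper uses GMM clustering over real embeddings and an LLM summarizer); the recursive structure is the point.

```python
# Sketch of RAPTOR's recursive cluster-and-summarize tree building.
def summarize(texts: list[str]) -> str:
    return " / ".join(t[:30] for t in texts)      # stub LLM summary

def cluster(items: list[str], size: int = 2) -> list[list[str]]:
    # Stub grouping; RAPTOR clusters by embedding similarity (GMM).
    return [items[i:i + size] for i in range(0, len(items), size)]

def build_raptor_tree(chunks: list[str]) -> list[list[str]]:
    levels = [chunks]                 # level 0: raw text chunks
    while len(levels[-1]) > 1:        # recurse until a single root summary
        parents = [summarize(group) for group in cluster(levels[-1])]
        levels.append(parents)
    return levels                     # retrieval can draw from any level

tree = build_raptor_tree(["chunk A ...", "chunk B ...", "chunk C ...", "chunk D ..."])
for depth, level in enumerate(tree):
    print(depth, level)
```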
5. T-RAG [Private Consultant]
Private consultant: like an in-house advisor familiar with the organizational structure, adept at organizing information with a tree structure, providing localized service efficiently and economically while protecting privacy.- Time: 02.12
- Paper: T-RAG: Lessons from the LLM Trenches
T-RAG (Tree Retrieval-Augmented Generation) combines RAG with a fine-tuned open-source LLM, using a tree structure to represent entity hierarchies within an organization to augment the context. It relies on locally hosted open-source models to address data privacy concerns, while also tackling inference latency, token usage costs, and regional availability issues.
6. RAT [Thinker]
Thinker: like a reflective tutor, it does not jump to a conclusion in one go; it forms initial thoughts and then uses retrieved relevant information to continuously review and refine each step of the reasoning process, making the chain of thought more rigorous and reliable.- Time: 03.08
- Paper: RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
- Project: https://github.com/CraftJarvis/RAT
RAT (Retrieval Augmented Thoughts) first generates an initial zero-shot chain of thought (CoT), then revises each thought step individually using retrieved information relevant to the task query and to the current and past thought steps. RAT significantly improves performance on a wide variety of long-horizon generation tasks.
7. RAFT [Open-Book Master]
Open-book master: like a strong test taker, it not only finds the right references but also accurately quotes the key passages and clearly explains its reasoning, so that answers are both evidence-based and sensible.- Time: 03.15
- Paper: RAFT: Adapting Language Model to Domain Specific RAG
RAFT aims to improve the model's ability to answer questions in domain-specific "open-book" settings by training the model to ignore irrelevant documents and to answer by quoting verbatim the correct sequences from the relevant documents; combined with chain-of-thought responses, this significantly improves the model's reasoning ability.
8. Adaptive-RAG [Adaptive Routing]
Adaptive routing: facing questions of different difficulty, it intelligently chooses the most suitable way to answer. Simple questions get a direct answer, while complex questions trigger more retrieval or step-by-step reasoning, just like an experienced teacher who adjusts the teaching method to each student's specific problem.- Time: 03.21
- Paper: Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity
- Project: https://github.com/starsuzi/Adaptive-RAG
Adaptive-RAG dynamically selects the most appropriate retrieval-augmentation strategy for the LLM based on query complexity, ranging from the simplest to the most complex. The selection is implemented by a small language-model classifier that predicts query complexity, with labels collected automatically to optimize the process. This provides a balanced strategy that adapts seamlessly between iterative retrieval-augmented LLMs, single-step retrieval, and no-retrieval approaches across a range of query complexities.
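The routing idea fits in a few lines. In this sketch the keyword heuristic stands in for the trained complexity classifier, and the three route handlers are placeholders for real no-retrieval, single-step, and iterative RAG pipelines.

```python
# Sketch of Adaptive-RAG routing: classify query complexity, pick a strategy.
def classify_complexity(query: str) -> str:
    # Toy heuristic standing in for the trained classifier.
    if any(w in query.lower() for w in ("and", "compare", "before", "after")):
        return "complex"                       # multi-hop signal
    if "?" in query and len(query.split()) > 6:
        return "moderate"
    return "simple"

def llm_only(q): return f"[direct answer to: {q}]"
def single_step_rag(q): return f"[answer with one retrieval for: {q}]"
def iterative_rag(q): return f"[answer with multi-step retrieval for: {q}]"

def answer(query: str) -> str:
    route = classify_complexity(query)
    if route == "simple":
        return llm_only(query)                 # parametric knowledge suffices
    if route == "moderate":
        return single_step_rag(query)          # one retrieval round
    return iterative_rag(query)                # retrieve-reason loop

print(answer("Who directed the film released after Parasite won Best Picture?"))
```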
9. HippoRAG [Hippocampus]
HippoRAG: like the hippocampus in the human brain, it skillfully weaves new and old knowledge into a web. Rather than simply piling up information, each new piece of knowledge finds its most appropriate place.- Time: 03.23
- Paper: HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
- Project: https://github.com/OSU-NLP-Group/HippoRAG
HippoRAG is a novel retrieval framework inspired by the hippocampal indexing theory of human long-term memory, aiming at deeper and more efficient knowledge integration over new experiences. It synergistically orchestrates LLMs, knowledge graphs, and the Personalized PageRank algorithm to mimic the different roles of the neocortex and hippocampus in human memory.
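Personalized PageRank is the retrieval workhorse here, and it is easy to demonstrate with networkx. In this sketch the entity graph and the substring-matching `query_entities` extractor are toy assumptions; HippoRAG builds the graph offline with an LLM and extracts query entities the same way.

```python
# Sketch of HippoRAG-style retrieval: seed Personalized PageRank at the
# entities mentioned in the query, then rank graph nodes by score.
import networkx as nx

G = nx.Graph()
G.add_edges_from([("Einstein", "relativity"), ("relativity", "physics"),
                  ("Einstein", "Nobel Prize"), ("Curie", "Nobel Prize")])

def query_entities(query: str) -> list[str]:
    return [n for n in G.nodes if n.lower() in query.lower()]   # stub extractor

def hippo_retrieve(query: str, top_k: int = 3) -> list[tuple[str, float]]:
    seeds = query_entities(query)
    # Bias the random walk toward entities mentioned in the query.
    personalization = {n: (1.0 if n in seeds else 0.0) for n in G.nodes}
    scores = nx.pagerank(G, personalization=personalization)
    return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]

print(hippo_retrieve("What did Einstein win the Nobel Prize for?"))
```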
10. RAE [Intelligent Editor]
Intelligent editor: like a careful news editor, RAE not only digs up the relevant facts but also uncovers, through chained reasoning, key information that is easily overlooked, while knowing how to cut redundant content, ensuring the final information is both accurate and concise and avoiding the problem of "saying a lot but not reliably".- Time: 03.28
- Paper: Retrieval-enhanced Knowledge Editing in Language Models for Multi-Hop Question Answering
- Project: https://github.com/sycny/RAE
RAE (a retrieval-augmented model-editing framework for multi-hop question answering) first retrieves edited facts and then refines the language model through in-context learning. Its retrieval approach, based on mutual-information maximization, leverages the reasoning power of large language models to identify chained facts that traditional similarity-based search would miss. In addition, the framework includes a pruning strategy that removes redundant information from the retrieved facts, which improves editing accuracy and mitigates hallucination.
11. RAGCache [Warehouse Keeper]
Warehouse keeper: like a large logistics center, it puts frequently used knowledge on the most accessible shelves. It knows how to maximize picking efficiency by placing popular packages at the door and rarely used ones in the back bins.- Time: 04.18
- Paper: RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation
RAGCache is a novel multi-level dynamic caching system tailored for RAG. It organizes the intermediate states of retrieved knowledge in a knowledge tree and caches them across the GPU and host-memory hierarchy. RAGCache proposes a replacement policy that accounts for LLM inference characteristics and RAG retrieval patterns, and it dynamically overlaps the retrieval and inference steps to minimize end-to-end latency.
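A toy illustration of the two-tier idea, under heavy simplification: real RAGCache stores transformer KV tensors keyed by document prefixes and uses a policy aware of inference patterns; here the "state" is a placeholder string, the tiers are plain dicts, and eviction is simple LRU demotion.

```python
# Toy two-tier prefix cache in the spirit of RAGCache.
from collections import OrderedDict

class PrefixCache:
    def __init__(self, fast_capacity: int = 2):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()   # simulated GPU tier, LRU-ordered
        self.slow = {}              # simulated host-memory tier

    def get(self, doc_ids: tuple):
        if doc_ids in self.fast:
            self.fast.move_to_end(doc_ids)   # mark as recently used
            return self.fast[doc_ids]
        return self.slow.get(doc_ids)        # slower tier hit (or miss)

    def put(self, doc_ids: tuple, state):
        self.fast[doc_ids] = state
        if len(self.fast) > self.fast_capacity:
            old_key, old_state = self.fast.popitem(last=False)
            self.slow[old_key] = old_state   # demote instead of dropping

cache = PrefixCache()
cache.put(("doc1",), "kv(doc1)")
cache.put(("doc1", "doc3"), "kv(doc1+doc3)")
cache.put(("doc2",), "kv(doc2)")   # evicts ("doc1",) to the slow tier
print(cache.get(("doc1",)))        # still served, now from "host memory"
```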
12. GraphRAG [Community Summary]
Community summary: first map out the relationship network of the residents in a community, then write a profile for each neighborhood circle. When someone asks for directions, each circle contributes clues, which are finally integrated into the most complete answer.- Time: 04.24
- Paper: From Local to Global: A Graph RAG Approach to Query-Focused Summarization
- Project: https://github.com/microsoft/graphrag
GraphRAG constructs graph-based text indexes in two stages: first, an entity knowledge graph is derived from the source documents, and then community summaries are pre-generated for all groups of closely related entities. Given a problem, each community summary is used to generate a partial response, and then all partial responses are summarized again in the final response to the user.
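The query-time side is a map-reduce over community summaries, sketched below. The `llm` stub is a placeholder for a real model call; real GraphRAG also scores and filters partial responses, which this sketch omits.

```python
# Sketch of GraphRAG's map-reduce answering over community summaries.
def llm(prompt: str) -> str:
    return f"[LLM output for: {prompt[:60]}...]"   # placeholder model call

def graphrag_answer(question: str, community_summaries: list[str]) -> str:
    partials = [
        llm(f"Using this community summary, partially answer '{question}': {s}")
        for s in community_summaries                          # map step
    ]
    return llm(f"Combine these partial answers to '{question}': {partials}")  # reduce

summaries = ["Community 1: characters and their alliances ...",
             "Community 2: locations and the events at each ..."]
print(graphrag_answer("What are the main themes?", summaries))
```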
13. R4 [Orchestration Master]
Orchestration master: like a master typographer, it improves output quality by optimizing the order and presentation of material, making content more organized and focused without changing the core model.- Time: 05.04
- Paper: R4: Reinforced Retriever-Reorder-Responder for Retrieval-Augmented Large Language Models
R4 learns document ordering for retrieval-augmented large language models, further enhancing the generative power of LLMs whose parameters remain frozen. The reordering learning process is divided into two steps based on the quality of the generated responses: document order adjustment and document representation enhancement. Document order adjustment organizes the retrieved documents into beginning, middle, and end positions via graph attention learning, so as to maximize the reinforcement reward for response quality. Document representation enhancement further refines the representations of retrieved documents for low-quality responses through document-level gradient adversarial learning.
14. IM-RAG [Self-Talk]
Self-talk: when facing a problem, it works out in its head "what information do I need to look up" and "is this information enough", improving the answer through continuous internal dialogue. This "monologue" ability is like that of a human expert, allowing it to deepen its thinking step by step and solve complex problems.- Time: 05.15
- Paper: IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues
IM-RAG supports multi-round retrieval-augmented generation by connecting IR systems with LLMs through learned Inner Monologues. During the inner monologue, the large language model acts as the core reasoning model: it can either pose a query through the retriever to gather more information or provide a final answer based on the dialogue context. An optimizer is also introduced to improve the retriever's output, effectively bridging the gap between reasoning and retrieval modules of differing capability and facilitating multi-round communication. The entire inner-monologue process is optimized via reinforcement learning (RL), with a progress tracker providing intermediate step rewards, while answer prediction is further optimized separately via supervised fine-tuning (SFT).
15. AntGroup-GraphRAG [A Hundred Strengths]
A hundred strengths: bringing together the best of the industry, it specializes in using multiple methods to locate information quickly, offering both precise retrieval and natural-language query understanding, making complex knowledge retrieval cost-effective and efficient.- Time: 05.16
- Project: https://github.com/eosphoros-ai/DB-GPT
The Ant TuGraph team built an open-source GraphRAG framework on DB-GPT, compatible with vector, graph, full-text, and other knowledge-base index backends. It supports low-cost knowledge extraction, document structure mapping, graph community summarization, and hybrid retrieval to solve QFS (query-focused summarization) Q&A, and it also provides diverse retrieval capabilities such as keyword, vector, and natural-language queries.
16. Kotaemon [Lego]
Lego: a ready-made Q&A building-block set that works out of the box yet can be freely taken apart and reassembled. End users can use it directly, and developers can reshape it as they wish, flexible without losing structure.- Time: 05.15
- Project: https://github.com/Cinnamon/kotaemon
An open-source, clean, and customizable RAG UI for building and customizing your own document Q&A system, taking into account the needs of both end users and developers.
17. FlashRAG [Treasure Chest]
Treasure chest: packages various RAG artifacts into one toolkit, letting researchers pick components like building blocks to assemble their own retrieval pipelines.- Time: 05.22
- Paper: FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research
- Project: https://github.com/RUC-NLPIR/FlashRAG
FlashRAG is an efficient and modular open-source toolkit designed to help researchers reproduce existing RAG methods and develop their own RAG algorithms within a unified framework. Our toolkit implements 12 state-of-the-art RAG methods and collects and organizes 32 benchmark datasets.
18. GRAG [Detective]
Detective: not satisfied with surface clues, it digs deeper into the network of connections between texts, tracking the truth behind each piece of information like solving a case, making answers more accurate.- Time: 05.26
- Paper: GRAG: Graph Retrieval-Augmented Generation
- Project: https://github.com/HuieL/GRAG
Traditional RAG models ignore the connections between texts and the topological information of the database when dealing with complex graph-structured data, which leads to performance bottlenecks. GRAG significantly improves performance and reduces hallucination in both retrieval and generation by emphasizing the importance of subgraph structure.
19. Camel-GraphRAG [Ambidexterity]
Ambidexterity: one eye scans text with Mistral to extract intelligence, while the other uses Neo4j to weave a web of relationships. The two then work together, finding similarities as well as tracking along the clue graph, making search more comprehensive and precise.- Time: 05.27
- Project: https://github.com/camel-ai/camel
Camel-GraphRAG relies on the Mistral model to provide support for extracting knowledge from given content and constructing knowledge structures, and then storing this information in a Neo4j graph database. A hybrid approach combining vector retrieval with knowledge graph retrieval is then used to query and explore the stored knowledge.
20. G-RAG [Social Connector]
Social connector: instead of searching for information in isolation, it builds a relationship network around every knowledge point. Like a socialite, it not only knows each friend's specialty but also who drinks with whom, finding answers by simply following the connections.- Time: 05.28
- Paper: Don't Forget to Connect! Improving RAG with Graph-based Reranking
RAG still struggles with the relationships between documents and the context of a question: when a document's relevance to the question is not obvious or it contains only partial information, the model may fail to use it effectively, and reasonably inferring connections between documents remains an open issue. G-RAG implements a graph-neural-network (GNN)-based reranker between the RAG retriever and reader. The method combines connection information between documents with semantic information (via abstract meaning representation graphs) to provide a context-informed ranker for RAG.
21. LLM-Graph-Builder [Porter]
Porter: gives confusing text an orderly home. Not simply carrying things over, but like a compulsive organizer, it labels each knowledge point, draws the relationship lines, and finally builds a well-organized knowledge edifice in a Neo4j database.- Time: 05.29
- Project: https://github.com/neo4j-labs/llm-graph-builder
Neo4j's open-source, LLM-based knowledge-graph generator converts unstructured data into a knowledge graph in Neo4j, using large models to extract nodes, relationships, and their attributes from unstructured data.
22. MRAG [Octopus]
Octopus: rather than growing a single head to grind away at a problem, it grows multiple tentacles like an octopus, each responsible for grasping one angle. Simply put, this is the AI version of "multitasking".- Time: 06.07
- Paper: Multi-Head RAG: Solving Multi-Aspect Problems with LLMs
- Project: https://github.com/spcl/MRAG
Existing RAG solutions do not focus on queries that may require fetching multiple documents with significantly different content. Such queries arise frequently but are challenging because the embeddings of these documents can be far apart in embedding space, making it difficult to retrieve them all. This paper introduces Multi-Head RAG (MRAG), a novel scheme that fills this gap with a simple yet powerful idea: using the activations of the Transformer's multi-head attention layer, rather than the decoder layer, as keys for fetching multi-aspect documents. The driving motivation is that different attention heads can learn to capture different aspects of the data. Using the corresponding activations produces embeddings that represent the various facets of data items and queries, improving retrieval accuracy for complex queries.
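The retrieval side can be illustrated with one embedding space per head and a vote merge across spaces. The hash-based `head_embed` below is a deliberately crude stand-in for real per-head attention activations, and the Borda-style merge is one plausible aggregation, not the paper's exact scheme.

```python
# Toy sketch of multi-head retrieval: search per-head spaces, merge rankings.
import hashlib

def head_embed(text: str, head: int, dim: int = 8) -> list[float]:
    # Deterministic pseudo-embedding per head (illustrative only).
    h = hashlib.sha256(f"{head}:{text}".encode()).digest()
    return [b / 255 for b in h[:dim]]

def score(q: list[float], d: list[float]) -> float:
    return -sum((a - b) ** 2 for a, b in zip(q, d))   # closer = higher

def mrag_retrieve(query: str, docs: list[str], n_heads: int = 4, top_k: int = 2):
    votes = {d: 0.0 for d in docs}
    for head in range(n_heads):                       # search every head's space
        q = head_embed(query, head)
        ranked = sorted(docs, key=lambda d: -score(q, head_embed(d, head)))
        for rank, d in enumerate(ranked[:top_k]):
            votes[d] += top_k - rank                  # Borda-style vote merge
    return sorted(votes, key=votes.get, reverse=True)[:top_k]

print(mrag_retrieve("multi-aspect query", ["doc on cars", "doc on law", "doc on art"]))
```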
23. PlanRAG [Strategist]
Strategist: first draws up a complete battle plan, then analyzes the situation based on rules and data, and finally makes the best tactical decision.- Time: 06.18
- Paper: PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers
- Project: https://github.com/myeon9h/PlanRAG
PlanRAG investigates how large language models can solve complex data-analysis decision problems by defining the Decision QA task: determining the best decision given a decision question Q, business rules R, and a database D. PlanRAG first generates a decision plan, and a retriever then generates the queries for data analysis.
24. FoRAG [Writer]
Writer: first outlines the framework of the article, then expands and refines the content paragraph by paragraph. It also comes with an "editor" that helps polish every detail through careful fact-checking and revision suggestions, ensuring the quality of the work.- Time: 06.19
- Paper: FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering
FoRAG proposes a novel outline augmented generator, where the generator uses outline templates to draft answer outlines based on the user query and context in the first stage, and extends each point based on the generated outline to construct the final answer in the second stage. A factual optimization approach based on a well-designed dual fine-grained RLHF framework is also proposed, which provides denser reward signals by introducing fine-grained design in the two core steps of factual evaluation and reward modeling.
25. Multi-Meta-RAG [Meta-Filter]
Meta-filter: like an experienced data steward, it pinpoints the most relevant content from a huge amount of information through multiple filtering mechanisms. It does not just look at the surface but analyzes the documents' "identity tags" (metadata), ensuring every piece of information found is truly on topic.- Time: 06.19
- Paper: Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata
- Project: https://github.com/mxpoliakov/multi-meta-rag
Multi-Meta-RAG uses database filtering and LLM-extracted metadata to improve RAG for selecting relevant documents related to a problem from various sources.
26. RankRAG [All-Rounder]
All-rounder: with a little training, it can play both the "judge" and the "contestant". Like a naturally gifted athlete, a bit of coaching lets it outperform specialists across multiple events and handle every skill with ease.- Time: 07.02
- Paper: RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
RankRAG instruction-fine-tunes a single LLM to perform the dual functions of context ranking and answer generation. By adding a small amount of ranking data to the training mix, the instruction-tuned LLM works surprisingly well, even outperforming existing expert ranking models, including the same LLM fine-tuned specifically on large amounts of ranking data. This design not only removes the complexity of juggling multiple models in traditional RAG systems, but also improves contextual-relevance judgment and information-utilization efficiency through shared model parameters.
27. GraphRAG-Local-UI [Modifier]
Modifier: converts a sports car into a practical vehicle for local roads, adding a friendly dashboard so that everyone can drive it easily.- Time: 07.14
- Project: https://github.com/severian42/GraphRAG-Local-UI
GraphRAG-Local-UI is a local-model adaptation of Microsoft's GraphRAG, with a rich ecosystem of interactive user interfaces.
28. ThinkRAG [Little Secretary]
Little secretary: condenses a huge body of knowledge into a pocket edition; like a portable personal secretary, it can help you find answers anytime without large-scale equipment.- Time: 07.15
- Project: https://github.com/wzdavid/ThinkRAG
ThinkRAG is a large-model retrieval-augmented generation system that can be easily deployed on a laptop for intelligent Q&A over a local knowledge base.
29. Nano-GraphRAG [Traveling Light]
Traveling light: like an athlete carrying minimal gear, it strips away all the complicated equipment while retaining the core capabilities.- Time: 07.25
- Project: https://github.com/gusye1234/nano-graphrag
Nano-GraphRAG is a smaller, faster, and more concise GraphRAG while retaining the core functionality.
30. RAGFlow-GraphRAG [Navigator]
Navigator: opens up shortcuts in the labyrinth of Q&A by first drawing a map that marks all the knowledge points, merging duplicate signposts, and slimming the map down so that people asking for directions never take the long way around.- Time: 08.02
- Project: https://github.com/infiniflow/ragflow
RAGFlow draws on GraphRAG's implementation and introduces knowledge-graph construction as an optional step in the document preprocessing stage to serve QFS Q&A scenarios, with improvements such as entity deduplication and token optimization.
31. Medical-Graph-RAG [Digital Doctor]
Digital doctor: like an experienced medical consultant, it organizes complex medical knowledge clearly with graphs; its diagnostic suggestions are not conjured out of thin air but well grounded, so both doctors and patients can understand the basis behind each diagnosis.- Time: 08.08
- Paper: Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation
- Project: https://github.com/SuperMedIntel/Medical-Graph-RAG
MedGraphRAG is a framework designed to address the challenges of applying LLM in medicine. It uses a graph-based approach to improve diagnostic accuracy, transparency and integration into clinical workflows. The system improves diagnostic accuracy by generating responses supported by reliable sources, addressing the difficulty of maintaining context in large amounts of medical data.
32. HybridRAG [Combined Formula]
Combined formula: like compounding in traditional Chinese medicine, where a single herb is less effective than several combined, the vector database handles rapid retrieval while the knowledge graph supplies relational logic, each complementing the other.- Time: 08.09
- Paper: HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction
HybridRAG, a new approach combining knowledge-graph RAG (GraphRAG) and VectorRAG techniques to enhance Q&A systems for extracting information from financial documents, is shown to generate accurate and contextually relevant answers. Retrieving context from both the vector database and the knowledge graph, HybridRAG outperforms traditional VectorRAG and GraphRAG in both retrieval accuracy and answer generation.
33. W-RAG [Evolutionary Search]
Evolutionary search: like a search engine good at self-evolution, it learns what makes a good answer by having a large model score article passages, gradually improving its ability to find key information.- Time: 08.15
- Paper: W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering
- Project: https://github.com/jmnian/weak_label_for_rag
W-RAG exploits the ranking capability of large language models to create weakly labeled data for training dense retrievers in open-domain question answering. The top-K passages retrieved via BM25 are re-ranked by the probability that the LLM generates the correct answer given the question and each passage; the highest-ranked passages are then used as positive examples for dense-retrieval training.
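The weak-labeling step is compact enough to sketch. `answer_logprob` below is a stand-in for a real log P(answer | question, passage) from an LLM; here it just rewards passages containing the answer tokens so the example runs.

```python
# Sketch of W-RAG weak labeling: re-score BM25 candidates by how likely the
# LLM is to produce the gold answer, then keep the best as training positives.
def answer_logprob(question: str, passage: str, answer: str) -> float:
    # Stub scorer standing in for an LLM's answer log-probability.
    return sum(passage.lower().count(tok) for tok in answer.lower().split())

def weak_labels(question: str, answer: str, bm25_top_k: list[str], n_pos: int = 1):
    scored = sorted(bm25_top_k,
                    key=lambda p: answer_logprob(question, p, answer),
                    reverse=True)
    positives = scored[:n_pos]        # positives for dense-retriever training
    negatives = scored[n_pos:]        # remaining candidates serve as negatives
    return positives, negatives

pos, neg = weak_labels("Where is the Louvre?", "Paris",
                       ["The Louvre is in Paris.", "The Prado is in Madrid."])
print(pos, neg)
```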
34. RAGChecker [QC]
QC: instead of simply judging whether an answer is right or wrong, RAGChecker thoroughly checks the whole answering pipeline, from finding information to generating the final answer. Like a strict examiner, it not only issues a detailed grading report but also points out where specific improvements are needed.- Time: 08.15
- Paper: RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation
- Project: https://github.com/amazon-science/RAGChecker
RAGChecker is a diagnostic tool that provides fine-grained, comprehensive, and reliable diagnostic reports for RAG systems, and provides actionable directions for further performance improvement. It not only evaluates the overall performance of the system, but also analyzes in-depth the performance of the two core modules, retrieval and generation.
35. Meta-Knowledge-RAG [Scholar]
Scholar: like a senior academic researcher, it not only collects information but actively thinks about the problem, writing annotations and summaries for each document and even anticipating possible questions in advance. It links related knowledge points into a network so that queries gain depth and breadth, as if a scholar were preparing a research overview for you.- Time: 08.16
- Paper: Meta Knowledge for Retrieval Augmented Large Language Models
Meta-Knowledge-RAG (MK Summary) introduces a novel data-centric RAG workflow that transforms the traditional "retrieve-read" pipeline into a more advanced "prepare-rewrite-retrieve-read" framework, achieving domain-expert-level understanding of the knowledge base. The approach generates metadata and synthetic Q&A pairs for each document and introduces the new concept of meta-knowledge summaries for metadata-based document clustering. The proposed innovations enable personalized user-query augmentation and deep information retrieval across the knowledge base.
36. CommunityKG-RAG [Community Explorer]
Community explorer: like a guide familiar with community relationship networks, it adeptly uses the associations between knowledge and group structure to accurately find relevant information and verify its reliability, without any special training.- Time: 08.16
- Paper: CommunityKG-RAG: Leveraging Community Structures in Knowledge Graphs for Advanced Retrieval-Augmented Generation in Fact-Checking
CommunityKG-RAG is a novel zero-shot framework that combines community structure in knowledge graphs with a RAG system to enhance fact-checking. It can adapt to new domains and queries without additional training, exploiting the multi-hop nature of community structure in knowledge graphs to significantly improve the accuracy and relevance of information retrieval.
37. TC-RAG [Memory Warlock]
Memory warlock: a brain with a self-cleaning function for the LLM. Just as we jot the important steps of a problem on scratch paper and cross them out when done, it is not rote memorization: what should be remembered is remembered, and what should be forgotten is cleared away in time, like a top student who keeps a tidy desk.- Time: 08.17
- Paper: TC-RAG: Turing-Complete RAG's Case study on Medical LLM Systems
- Project: https://github.com/Artessay/TC-RAG
TC-RAG achieves more efficient and accurate knowledge retrieval by introducing a Turing-complete system to manage state variables. Using a memory stack with adaptive retrieval, reasoning, and planning capabilities, it not only ensures the retrieval process halts in a controlled way but also mitigates the accumulation of erroneous knowledge through push and pop operations.
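The push/pop discipline can be shown with a tiny stack. The `verify` callback below is a stand-in for the system's evaluation of its own state; the example rolls back one erroneous fact so errors do not accumulate in working memory.

```python
# Toy sketch of TC-RAG's stack-managed working memory.
class MemoryStack:
    def __init__(self):
        self.stack: list[str] = []

    def push(self, fact: str) -> None:
        self.stack.append(fact)          # record an intermediate conclusion

    def pop_if_wrong(self, verify) -> None:
        # Roll back the latest fact when the verifier rejects it.
        if self.stack and not verify(self.stack[-1]):
            self.stack.pop()

mem = MemoryStack()
mem.push("Aspirin inhibits COX enzymes.")
mem.push("Aspirin is an antibiotic.")               # erroneous retrieval
mem.pop_if_wrong(lambda f: "antibiotic" not in f)   # stub verifier rejects it
print(mem.stack)                                    # only the supported fact remains
```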
38. RAGLAB [Arena]
Arena: lets various algorithms compete and be compared fairly under the same rules, like a standardized testing protocol in a science lab, ensuring each new method is evaluated objectively and transparently.- Time: 08.21
- Paper: RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation
- Project: https://github.com/fate-ubw/RAGLab
There is a growing lack of comprehensive, fair comparison between novel RAG algorithms, and the high-level abstractions of open-source tools lead to a lack of transparency that limits the development of new algorithms and evaluation metrics. RAGLAB is a modular, research-oriented open-source library that reproduces six algorithms and builds a complete research ecosystem. With RAGLAB, the six algorithms can be compared fairly on ten benchmarks, helping researchers evaluate and innovate efficiently.
39. MemoRAG
MemoRAG: it does not just look up information on demand; it has already understood and memorized the entire knowledge base in depth. When you ask a question, it quickly retrieves the relevant memories from this "super brain" and gives an accurate, insightful answer, like a knowledgeable expert.- Time: 09.01
- Project: https://github.com/qhjqhj00/MemoRAG
MemoRAG is an innovative Retrieval Augmented Generation (RAG) framework built on top of an efficient ultra-long memory model. Unlike standard RAGs that primarily deal with queries with explicit information needs, MemoRAG utilizes its memory model to achieve a global understanding of the entire database. By recalling query-specific cues from memory, MemoRAG enhances evidence retrieval, resulting in more accurate and context-rich response generation.
40. OP-RAG [Attention Management]
Attention management: like reading a particularly thick book, you cannot memorize every detail; the masters are the ones who know how to mark the key chapters. Rather than browsing aimlessly, it reads like a seasoned reader, underlining the key passages as it goes, and when needed turns straight to the marked page.- Time: 09.03
- Paper: In Defense of RAG in the Era of Long-Context Language Models
Extremely long contexts cause LLMs to focus less on the relevant information, potentially degrading answer quality. Revisiting RAG for long-context answer generation, the paper proposes an order-preserving retrieval-augmented generation mechanism, OP-RAG, which significantly improves RAG performance in long-context Q&A applications.
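Order preservation is a one-line change to standard top-k retrieval: select chunks by score, but feed them to the LLM in their original document order. The overlap scorer in this sketch is an illustrative stand-in for a real retriever.

```python
# Sketch of order-preserving retrieval: top-k by score, original order kept.
def relevance(query: str, chunk: str) -> float:
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c)                     # stub scorer (word overlap)

def op_rag_context(query: str, chunks: list[str], k: int = 3) -> str:
    top = sorted(range(len(chunks)), key=lambda i: -relevance(query, chunks[i]))[:k]
    ordered = sorted(top)                 # restore original document order
    return "\n".join(chunks[i] for i in ordered)

chunks = ["Intro ...", "The treaty was signed in 1648.", "Background ...",
          "Its terms ended the war.", "Epilogue ..."]
print(op_rag_context("When was the treaty signed and what did it do?", chunks))
```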
41. AgentRE [Intelligent Extraction]
Intelligent extraction: like a sociologist skilled at observing relationships, it not only remembers key information but also actively verifies and reflects, accurately understanding complex relationship networks. Even when relationships are intricate, it can analyze them from multiple perspectives and avoid misreading them.- Time: 09.03
- Paper: AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction
- Project: https://github.com/Lightblues/AgentRE
By integrating the memory, retrieval, and reflection capabilities of large language models, AgentRE effectively addresses the challenges of diverse relation types and ambiguous entity relations within single sentences in complex relation-extraction scenes. AgentRE consists of three main modules that help the agent acquire and process information efficiently, significantly improving RE performance.
42. iText2KG [Architect]
Architect: like a methodical engineer, it gradually turns fragmented documents into a systematic knowledge network by distilling, extracting, and integrating information step by step. It needs no detailed blueprints prepared in advance and can be flexibly extended and refined as needed.- Time: 09.05
- Paper: iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models
- Project: https://github.com/AuvaLab/itext2kg
iText2KG (Incremental Knowledge Graphs Construction) utilizes Large Language Models (LLMs) to construct knowledge graphs from original documents and achieves incremental knowledge graphs construction through four modules (Document Refiner, Incremental Entity Extractor, Incremental Relationship Extractor, and Graph Integrator) without the need for prior definition of ontologies or extensive supervised training.
43. GraphInsight [Graph Interpreter]
GraphInsight: like an expert at infographic analysis, it knows to put important information in the most prominent position while consulting references for details when needed, and it can reason about complex graphs step by step, letting the AI grasp the big picture without missing the details.- Time: 09.05
- Paper: GraphInsight: Unlocking Insights in Large Language Models for Graph Structure Understanding
GraphInsight is a new framework for improving LLMs' understanding of macro- and micro-level graphical information, based on two key strategies: 1) placing key graphical information in locations where LLMs' memory performance is strong; and 2) borrowing the idea of Retrieval-Augmented Generation (RAG) to introduce lightweight external knowledge bases for regions where memory performance is weak. In addition, GraphInsight explores the integration of these two strategies into the LLM agent process for composite graph tasks that require multi-step reasoning.
44. LA-RAG [Dialect Expert]
Dialect expert: like a linguist well versed in regional dialects, it not only recognizes standard Mandarin accurately but also understands accents with local flavor through careful speech analysis and contextual understanding, letting the AI communicate with people from different regions without barriers.- Time: 09.13
- Paper: LA-RAG: Enhancing LLM-based ASR Accuracy with Retrieval-Augmented Generation
LA-RAG is a novel retrieval-augmented generation (RAG) paradigm for LLM-based ASR. It uses fine-grained token-level speech datastores and a speech-to-speech retrieval mechanism to improve ASR accuracy through LLM in-context learning (ICL).
45. SFR-RAG [Streamlined Retrieval]
Streamlined retrieval: like a compact reference advisor, small in size but precise in function, it understands the need and knows when to seek external help, ensuring answers are both accurate and efficient.- Time: 09.16
- Paper: SFR-RAG: Towards Contextually Faithful LLMs
SFR-RAG is a small, instruction-fine-tuned language model focused on context-grounded generation and minimizing hallucination. By reducing parameter count while maintaining high performance, the SFR-RAG model includes function-calling capability, allowing it to interact dynamically with external tools to retrieve high-quality contextual information.
46. FlexRAG [Compression Expert]
Compression expert: condenses long text into an essential summary, with a compression ratio that can be flexibly adjusted as needed, preserving key information while cutting storage and processing costs. It is like distilling a thick book into concise reading notes.- Time: 09.24
- Paper: Lighter And Better: Towards Flexible Context Adaptation For Retrieval Augmented Generation
FlexRAG compresses retrieved context into compact embeddings before it is encoded by the LLM, and these compressed embeddings are optimized to enhance downstream RAG performance. A key feature of FlexRAG is its flexibility: it efficiently supports different compression ratios and selectively preserves important context. Thanks to these designs, FlexRAG achieves superior generation quality while significantly reducing running costs; comprehensive experiments on various Q&A datasets validate it as a cost-effective and flexible solution for RAG systems.
47. CoTKR [Graph Translator]
Graph translator: like a patient teacher, it first understands the ins and outs of a piece of knowledge and then explains it step by step, relaying it in depth rather than simply repeating it. At the same time, it keeps collecting "student" feedback to improve the way it explains, making the transfer of knowledge clearer and more effective.- Time: 09.29
- Paper: CoTKR: Chain-of-Thought Enhanced Knowledge Rewriting for Complex Knowledge Graph Question Answering
- Project: https://github.com/wuyike2000/CoTKR
The CoTKR (Chain-of-Thought Enhanced Knowledge Rewriting) approach alternately generates reasoning paths and corresponding knowledge, thus overcoming the limitation of single-step knowledge rewriting. In addition, to bridge the preference difference between the knowledge rewriter and the QA model, we propose a training strategy that aligns preferences from Q&A feedback to further optimize the knowledge rewriter by exploiting feedback from the QA model.
48. Open-RAG [Think Tank]
Think tank: breaks a huge language model down into expert groups that can think independently yet work together; it is also particularly good at telling true information from false and knows when to go check a source at the critical moment, like an experienced advisory board.- Time: 10.02
- Paper: Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models
- Project: https://github.com/ShayekhBinIslam/openrag
Open-RAG improves RAG reasoning with open-source large language models by converting an arbitrary dense LLM into a parameter-efficient sparse mixture-of-experts (MoE) model capable of handling complex reasoning tasks, including both single- and multi-hop queries. Open-RAG uniquely trains the model to cope with challenging distractors that appear relevant but are misleading.
49. TableRAG [Excel Expert]
Excel expert: it does not simply look at table data; it knows how to understand and retrieve data along both the header and cell dimensions, like using a pivot table proficiently to quickly locate and extract the key information needed.- Time: 10.07
- Paper: TableRAG: Million-Token Table Understanding with Language Models
TableRAG designs a retrieval-augmented generation framework specifically for table understanding. It combines schema retrieval and cell retrieval through query expansion to pinpoint the key data before handing information to the language model, enabling more efficient data encoding and precise retrieval, dramatically shortening prompt length and reducing information loss.
50. LightRAG [Spider-Man]
Spider-Man: weaves nimbly through the web of knowledge, able both to grasp the fine threads between knowledge points and to swing along them from clue to clue. Like a librarian with X-ray vision, he not only knows where each book sits but also which books should be read together.- Time: 10.08
- Paper: LightRAG: Simple and Fast Retrieval-Augmented Generation
- Project: https://github.com/HKUDS/LightRAG
This framework incorporates graph structures into text indexing and retrieval. It employs a two-tier retrieval system that strengthens comprehensive information retrieval across both low-level and high-level knowledge discovery. Combining graph structures with vector representations makes retrieving relevant entities and their relationships efficient, significantly improving response time while maintaining contextual relevance. An incremental update algorithm further ensures the timely integration of new data, keeping the system effective and responsive in rapidly changing data environments.
51. AstuteRAG [Wise Judge]
Wise judge: it stays alert to external information, never blindly trusting search results, and makes good use of its accumulated knowledge, checking the authenticity of information like a senior judge weighing evidence from multiple sources before reaching a verdict.- Time: 10.09
- Paper: Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models
Astute RAG improves the robustness and trustworthiness of the system by adaptively eliciting information from the internal knowledge of LLMs, combining it with external retrieval results, and finalizing the answer according to the reliability of each source.
52. TurboRAG [Shorthand Master]
Shorthand master: does the homework ahead of time and writes all the answers in a little notebook. Like a top student before an exam, it does not cram at the last minute but organizes the common questions into a review notebook in advance; when needed, it simply looks them up, saving a fresh derivation on the spot every time.- Time: 10.10
- Paper: TurboRAG: Accelerating Retrieval-Augmented Generation with Precomputed KV Caches for Chunked Text
- Project: https://github.com/MooreThreads/TurboRAG
TurboRAG optimizes the inference paradigm of RAG systems by offline precomputing and storing KV caches of documents. Unlike traditional approaches, TurboRAG no longer computes these KV caches at each inference, but retrieves pre-computed caches for efficient pre-population, thus eliminating the need for repetitive online computation. This approach significantly reduces computational overhead and speeds up response time while maintaining accuracy.
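The offline/online split is the essence, sketched conceptually below. `compute_kv` stands in for an expensive prefill forward pass returning a transformer KV cache; real TurboRAG must also fix up positional encodings and attention masks when concatenating caches, which this toy dict-based version deliberately ignores.

```python
# Conceptual sketch of TurboRAG's precompute-then-reuse pattern.
KV_STORE: dict[str, str] = {}          # chunk_id -> precomputed "KV cache"

def compute_kv(text: str) -> str:
    return f"kv({text[:20]})"          # placeholder for an expensive prefill

def offline_index(chunks: dict[str, str]) -> None:
    for cid, text in chunks.items():   # done once, ahead of any query
        KV_STORE[cid] = compute_kv(text)

def online_answer(query: str, retrieved_ids: list[str]) -> str:
    # Prefill is skipped: cached KV states are loaded instead of recomputed.
    kvs = [KV_STORE[cid] for cid in retrieved_ids]
    return f"decode('{query}', caches={kvs})"

offline_index({"c1": "TurboRAG precomputes KV caches ...", "c2": "Chunked text ..."})
print(online_answer("How does TurboRAG cut latency?", ["c1", "c2"]))
```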
53. StructRAG [Organizer]
Organizer: sorts cluttered information into categories like tidying a closet. Like a top student who mimics how humans think: not rote memorization, but drawing a mind map first.- Time: 10.11
- Paper: StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
- Project: https://github.com/Li-Z-Q/StructRAG
Inspired by the cognitive theory that humans convert raw information into structured knowledge when dealing with knowledge-intensive reasoning, the framework introduces a hybrid information structuring mechanism that constructs and utilizes structured knowledge in the most appropriate format according to the specific requirements of the task at hand. By mimicking human-like thought processes, it improves LLM performance on knowledge-intensive reasoning tasks.
54. VisRAG [Piercing Eye]
Piercing eye: it finally realizes that text is just a special rendering of images. Like a reader whose eyes have been opened, it no longer insists on parsing word by word but directly "sees" the whole page, using a camera instead of OCR and grasping the essence of "a picture is worth a thousand words".- Time: 10.14
- Paper: VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
- Project: https://github.com/openbmb/visrag
VisRAG builds a vision-language-model (VLM)-based RAG pipeline that embeds and retrieves documents directly as images rather than parsing them into text first. Compared with traditional text-based RAG, VisRAG avoids the information loss of the parsing stage and preserves the original document's information more completely. Experiments show that VisRAG outperforms traditional RAG in both the retrieval and generation phases, with a 25-39% end-to-end performance improvement. VisRAG makes effective use of training data and shows strong generalization, making it an ideal choice for RAG over multimodal documents.
55. AGENTiGraph [Knowledge Manager]
Knowledge manager: like a librarian who is good at conversation, it helps you organize and present knowledge through everyday communication, with a team of assistants on hand to answer questions and update information, making knowledge management simple and natural.- Time: 10.15
- Paper: AGENTiGraph: An Interactive Knowledge Graph Platform for LLM-based Chatbots Utilizing Private Data
AGENTiGraph is a platform for knowledge management through natural-language interaction. It integrates knowledge extraction, integration, and real-time visualization, and employs a multi-agent architecture to dynamically interpret user intent, manage tasks, and integrate new knowledge, ensuring it adapts to changing user needs and data contexts.
56. RuleRAG [Rule Follower]
Rule follower: teaches the AI to work by the rules, like handing a new employee the staff handbook on day one. Rather than learning aimlessly, it is like a strict teacher who first explains the rules with worked examples and then lets the students practice on their own. With enough practice, the rules become muscle memory, and the next time a similar problem appears it naturally knows what to do.- Time: 10.15
- Paper: RuleRAG: Rule-guided retrieval-augmented generation with language models for question answering
- Project: https://github.com/chenzhongwu20/RuleRAG_ICL_FT
RuleRAG proposes rule-guided retrieval-augmented generation with language models: it explicitly introduces symbolic rules as in-context learning examples (RuleRAG-ICL) to guide the retriever toward logically related documents along the direction of the rules, and uniformly guides the generator to produce informed answers following the same set of rules. In addition, query-rule combinations can be used as supervised fine-tuning data to update the retriever and generator (RuleRAG-FT), achieving better rule-following behavior, which in turn retrieves more supportive results and generates more acceptable answers.
57. Class-RAG [Judge]
Judge: rather than deciding cases by rigid statute, it studies each case through an ever-expanding library of precedents. Like an experienced judge with a loose-leaf codex in hand, it can always turn to the latest cases and deliver judgments that are both humane and measured.- Time: 10.18
- Paper: Class-RAG: Content Moderation with Retrieval Augmented Generation
Content-moderation classifiers are critical to the safety of generative AI. However, the nuances between safe and unsafe content are often hard to distinguish, and as the technology is deployed more widely, continuously fine-tuning models to address risks becomes increasingly difficult and expensive. Class-RAG achieves immediate risk mitigation by dynamically updating its retrieval base. Compared with traditional fine-tuned models, Class-RAG is more flexible and transparent, performs better in classification and robustness to attack, and expanding the retrieval base is shown to improve moderation performance at low cost.
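A toy illustration of the core loop: label a new input by retrieving the nearest labeled precedents from an updatable store and voting. The overlap similarity and the tiny precedent base are illustrative; Class-RAG feeds the retrieved examples to an LLM classifier rather than voting directly.

```python
# Sketch of retrieval-augmented moderation: nearest labeled precedents decide.
from collections import Counter

PRECEDENTS = [("how to make a cake", "safe"),
              ("how to make a weapon", "unsafe")]

def similarity(a: str, b: str) -> float:
    return len(set(a.split()) & set(b.split()))   # stub similarity

def classify(text: str, k: int = 1) -> str:
    nearest = sorted(PRECEDENTS, key=lambda p: -similarity(text, p[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Immediate risk mitigation: update the retrieval base, no fine-tuning needed.
PRECEDENTS.append(("new jailbreak phrasing xyz", "unsafe"))
print(classify("how to make a sponge cake"))
```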
58. Self-RAG [Reflector]
Reflector: when answering a question, it not only consults sources but also keeps checking whether its answer is accurate and complete. By "thinking while talking", it acts like a prudent scholar, making sure each claim is supported by reliable evidence.- Time: 10.23
- Paper: Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
- Project: https://github.com/AkariAsai/self-rag
Self-RAG improves the quality and factuality of a language model through retrieval and self-reflection. The framework trains a single arbitrary language model to adaptively retrieve passages on demand and to generate and reflect on the retrieved passages and its own output using special tokens called reflection tokens. Generating reflection tokens makes the model controllable at inference time, letting it adapt its behavior to different task requirements.
59. SimRAG [Self-Taught]
Self-taught: when facing a specialized field, it poses questions to itself and then answers them itself, improving its professional knowledge through continuous practice, just as students master a subject by repeatedly doing exercises.- Time: 10.23
- Paper: SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains
SimRAG is a self-training method that equips the LLM with the joint abilities of question answering and question generation to adapt to specific domains; only by truly understanding the knowledge can good questions be asked, and the two capabilities reinforce each other, helping the model understand specialized knowledge better. The LLM is first fine-tuned on instruction following, Q&A, and retrieval-related data. The same LLM is then prompted to generate varied domain-relevant questions from an unlabeled corpus, with filtering strategies to retain high-quality synthetic examples. Using these synthetic examples, the LLM improves its performance on domain-specific RAG tasks.
60. ChunkRAG [Excerpt Master]
Excerpt master: first divides a long article into small paragraphs, then picks out the most relevant passages with a professional eye, neither missing the point nor getting distracted by irrelevant content.- Time: 10.23
- Paper: ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems
ChunkRAG proposes an LLM-driven chunk-filtering framework that enhances RAG systems by evaluating and filtering retrieved information at the chunk level, where a "chunk" is a small, coherent part of a document. The approach uses semantic chunking to divide documents into coherent sections and LLM-based relevance scoring to judge how well each chunk matches the user query. Filtering out weakly relevant chunks before generation significantly reduces hallucination and improves factual accuracy.
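The pipeline is short enough to sketch end to end. Both the sentence-split `semantic_chunks` and the overlap-based `llm_relevance` are toy stand-ins (the paper uses semantic chunking and an actual LLM scorer); the chunk-then-filter shape is the point.

```python
# Sketch of ChunkRAG: chunk the document, score each chunk, filter, generate.
def semantic_chunks(doc: str) -> list[str]:
    return [s.strip() for s in doc.split(".") if s.strip()]   # naive chunking

def llm_relevance(query: str, chunk: str) -> float:
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / max(len(q), 1)          # stand-in for an LLM score

def filter_chunks(query: str, doc: str, threshold: float = 0.2) -> list[str]:
    return [c for c in semantic_chunks(doc)
            if llm_relevance(query, c) >= threshold]

doc = "The Nile is in Africa. Pizza was invented in Naples. The Nile floods yearly."
print(filter_chunks("Where is the Nile and when does it flood?", doc))
```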
61. FastGraphRAG [Radar]
Radar: like Google's PageRank, it gives knowledge points a popularity ranking. Like an opinion leader in a social network, the more attention a node gets, the more likely it is to be seen. Rather than searching aimlessly, it is like a scout with radar, looking wherever the signal is strongest.- Time: 10.23
- Project: https://github.com/circlemind-ai/fast-graphrag
FastGraphRAG provides an efficient, interpretable and highly accurate Fast Graph Retrieval Augmented Generation (FastGraphRAG) framework. It applies the PageRank algorithm to the knowledge graph traversal process to quickly locate the most relevant knowledge nodes. By calculating the importance score of a node, PageRank enables GraphRAG to filter and sort information in the knowledge graph more intelligently. This is like equipping GraphRAG with an "Importance Radar" that can quickly locate key information in a vast amount of data.
62. AutoRAG [Tuner]
Tuner: an experienced tuner who relies not on guesswork but on scientific testing to find the best sound. It automatically tries many RAG combinations, like a tuner testing different audio setups, and finally settles on the most harmonious configuration.- Time: 10.28
- Paper: AutoRAG: Automated Framework for optimization of Retrieval Augmented Generation Pipeline
- Project: https://github.com/Marker-Inc-Korea/AutoRAG_ARAGOG_Paper
The AutoRAG framework automatically identifies suitable RAG modules for a given dataset, and explores and approximates the optimal combination of RAG modules for that dataset. By systematically evaluating different RAG settings to optimize the technology selection, the framework is similar to the practice of AutoML in traditional machine learning, which optimizes the selection of RAG technologies through extensive experiments to improve the efficiency and scalability of the RAG system.
63. Plan×RAG [Project Manager]
Project manager: plans first and acts second, breaking a big task into small ones and assigning multiple "experts" to work in parallel. Each expert handles their own part, and the project manager finally coordinates and merges the results. This approach is not only faster and more accurate, but also makes the source of each conclusion clearly accountable.- Time: 10.28
- Paper: Plan×RAG: Planning-guided Retrieval Augmented Generation
Plan×RAG is a novel framework that extends the "retrieve-then-reason" paradigm of existing RAG frameworks to a "plan-then-retrieve" paradigm. Plan×RAG formulates the reasoning plan as a directed acyclic graph (DAG) that decomposes the query into interrelated atomic sub-queries. Answer generation follows the DAG structure, and parallelizing retrieval and generation significantly improves efficiency. Whereas state-of-the-art RAG solutions require extensive data generation and fine-tuning of language models (LMs), Plan×RAG incorporates frozen LMs as plug-and-play experts to generate high-quality answers.
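The DAG execution pattern is sketched below. The hard-coded `PLAN` and the `retrieve_and_answer` stub replace the LLM-generated plan and the real sub-query RAG calls; independent nodes run in parallel, and children receive their parents' answers as context.

```python
# Sketch of plan-then-retrieve: execute a DAG of atomic sub-queries.
from concurrent.futures import ThreadPoolExecutor

PLAN = {                       # node -> (sub-query, parent nodes)
    "q1": ("Who founded the company?", []),
    "q2": ("When was it founded?", []),
    "q3": ("What did the founder do in the founding year?", ["q1", "q2"]),
}

def retrieve_and_answer(subquery: str, context: list[str]) -> str:
    return f"[answer to '{subquery}' given {context}]"       # stub RAG call

def run_plan(plan: dict) -> dict:
    done: dict[str, str] = {}
    with ThreadPoolExecutor() as pool:
        while len(done) < len(plan):
            ready = [n for n, (_, ps) in plan.items()
                     if n not in done and all(p in done for p in ps)]
            futures = {n: pool.submit(retrieve_and_answer, plan[n][0],
                                      [done[p] for p in plan[n][1]])
                       for n in ready}        # independent nodes in parallel
            for n, f in futures.items():
                done[n] = f.result()
    return done

print(run_plan(PLAN)["q3"])
```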
64. SubgraphRAG [Locator]
Locator: instead of aimlessly searching for a needle in a haystack, it draws an accurate mini-map of the knowledge so the AI can find answers quickly.- Time: 10.28
- Paper: Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation
- Project: https://github.com/Graph-COM/SubgraphRAG
SubgraphRAG extends the KG-based RAG framework by retrieving subgraphs and using LLMs for reasoning and answer prediction. A lightweight multilayer perceptron is combined with a parallel triple-scoring mechanism for efficient and flexible subgraph retrieval, while directional structural distances are encoded to improve retrieval effectiveness. The size of the retrieved subgraph can be flexibly adjusted to match the query requirements and the capability of the downstream LLM. This design strikes a balance between model complexity and reasoning capability, enabling a scalable and generalizable retrieval process.
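A hedged sketch of the parallel triple-scoring idea: a small MLP scores every candidate (head, relation, tail) triple against the query in one batched pass. The embedding dimension and architecture are assumptions, not SubgraphRAG's published model:

```python
# Lightweight MLP scoring candidate triples in parallel (illustrative).
import torch
import torch.nn as nn

class TripleScorer(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        # Query, head, relation, and tail embeddings are concatenated.
        self.mlp = nn.Sequential(nn.Linear(4 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, query, head, rel, tail):
        # All inputs have shape (num_triples, dim); scoring is one batched pass.
        return self.mlp(torch.cat([query, head, rel, tail], dim=-1)).squeeze(-1)

scorer = TripleScorer()
q = torch.randn(1, 128).expand(1000, 128)       # one query vs. 1000 candidate triples
h, r, t = (torch.randn(1000, 128) for _ in range(3))
subgraph = scorer(q, h, r, t).topk(20).indices  # keep the 20 best triples
```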
65. RuAG [Alchemist]
Alchemist: like an alchemist, it distills huge amounts of data into clear logic rules and expresses them in easy-to-understand language, making the AI more intelligent in practical applications.- Time: 11.04
- Paper: RuAG: Learned-rule-augmented Generation for Large Language Models
RuAG aims to enhance the reasoning capabilities of large language models (LLMs) by automatically distilling large amounts of offline data into interpretable first-order logic rules and injecting them into the LLMs. The framework uses Monte Carlo Tree Search (MCTS) to discover logical rules and translates them into natural language, enabling knowledge injection and seamless integration into LLM downstream tasks. The paper evaluates the framework on public and private industrial tasks, demonstrating its potential to enhance LLM capabilities across diverse tasks.
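Once rules are discovered, the injection step itself is simple prompt assembly, as in this sketch; the example rules and the template are invented for illustration, not RuAG's learned rules:

```python
# Translating first-order rules to text and injecting them into a prompt.
rules = [
    ("delayed_shipment(X) and premium_customer(X)", "offer_compensation(X)"),
    ("payment_failed(X) and retries(X) >= 3", "escalate_to_human(X)"),
]

def rule_to_text(premise: str, conclusion: str) -> str:
    # Render a logic rule as a plain-language instruction.
    return f"If {premise}, then {conclusion}."

def build_prompt(question: str) -> str:
    rule_text = "\n".join(f"- {rule_to_text(p, c)}" for p, c in rules)
    return (f"Apply these domain rules when answering:\n{rule_text}\n\n"
            f"Question: {question}\nAnswer:")

print(build_prompt("A premium customer's shipment is delayed. What should we do?"))
```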
66. RAGViz [X-Ray Eyes]
X-Ray Eyes: makes the RAG system transparent so you can see exactly which sentences the model is reading. Like a doctor reading an X-ray, you can spot where things went wrong at a glance.- Time: 11.04
- Paper: RAGViz: Diagnose and Visualize Retrieval-Augmented Generation
- Project: https://github.com/cxcscmu/RAGViz
RAGViz provides visualization of retrieved documents and model attention, helping users understand the interaction between generated tokens and retrieved documents; it can be used to diagnose and visualize RAG systems.
67. AgenticRAG [Intelligent Assistant]
Intelligent Assistant: no longer simple find-and-copy, but an assistant that works like a confidential secretary. Like a capable executive assistant, it not only looks up information, but also knows when to make a call, when to set up a meeting, and when to escalate to the boss.- Time: 11.05
AgenticRAG describes RAG implementations based on AI agents. Specifically, it incorporates AI agents into the RAG pipeline to orchestrate its components and perform additional actions beyond simple information retrieval and generation, overcoming the limitations of non-agentic pipelines.
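The core of the pattern is a decision loop in which the model chooses an action at each step instead of always retrieving exactly once, roughly as sketched below; the action format and the tool set are assumptions, not any specific framework's API:

```python
# Minimal agentic RAG loop (illustrative action protocol).
def agent_answer(llm, tools: dict, question: str, max_steps: int = 5) -> str:
    scratchpad = f"Question: {question}\n"
    for _ in range(max_steps):
        decision = llm(scratchpad +
                       f"Choose one: {', '.join(tools)} <input>, or FINISH <answer>.\nAction:")
        action, _, arg = decision.strip().partition(" ")
        if action == "FINISH":
            return arg
        if action in tools:
            # The agent decides *when* to retrieve, search, or calculate.
            scratchpad += f"Action: {action} {arg}\nObservation: {tools[action](arg)}\n"
        else:
            scratchpad += "Observation: unknown action, try again.\n"
    return "No answer within the step budget."

# Toy tools; a real deployment would wire in retrievers, calendars, or APIs.
tools = {"retrieve": lambda q: f"top passage for '{q}'",
         "calculate": lambda e: str(eval(e))}  # eval is for demo only, unsafe in production
```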
68. HtmlRAG [Typesetter]
Typesetter: treats knowledge not as plain running text but like a magazine layout, bolding what should be bold and highlighting in red what should be red. Like a fussy art editor, it believes content alone isn't enough; presentation matters too, so the key points stand out at a glance.- Time: 11.05
- Paper: HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
- Project: https://github.com/plageon/HtmlRAG
HtmlRAG uses HTML instead of plain text as the format for retrieved knowledge in RAG. HTML models the knowledge in external documents better than plain text, and most LLMs already understand HTML well. HtmlRAG proposes HTML cleaning, compression, and pruning strategies to shorten the HTML while minimizing information loss.
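A sketch of the cleanup-and-compress step using BeautifulSoup; the tag list, attribute stripping, and character budget are illustrative choices, while the paper's pruning operates on the HTML tree more carefully:

```python
# Cleaning and compressing retrieved HTML before handing it to an LLM.
from bs4 import BeautifulSoup

NOISE_TAGS = ["script", "style", "nav", "footer", "iframe", "svg"]

def clean_html(raw_html: str, max_chars: int = 4000) -> str:
    soup = BeautifulSoup(raw_html, "html.parser")
    for tag in soup(NOISE_TAGS):       # drop tags carrying no textual knowledge
        tag.decompose()
    for tag in soup.find_all(True):    # strip attributes to compress, keep structure
        tag.attrs = {}
    # Naive budget cut; HtmlRAG prunes whole subtrees to minimize information loss.
    return str(soup)[:max_chars]
```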
69. M3DocRAG [Sensory All-Rounder]
Sensory All-Rounder: it doesn't just read text; it can also read and recognize pictures and charts. Like an all-around contestant on a variety show, it reads images, understands text, thinks laterally when needed and zooms in on details when needed, handling all kinds of challenges with ease.- Time: 11.07
- Paper: M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding
M3DocRAG is a novel multimodal RAG framework that flexibly adapts to various document contexts (closed- and open-domain), question hops (single- and multi-hop), and evidence modalities (text, charts, figures, etc.). M3DocRAG uses a multimodal retriever and a multimodal language model (MLM) to find relevant documents and answer questions, so it can efficiently handle single or multiple documents while preserving visual information.
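A hedged sketch of the page-retrieval half of the pipeline, using an off-the-shelf CLIP-style encoder from sentence-transformers as a stand-in for M3DocRAG's own multimodal retriever:

```python
# Retrieving document pages as images via a joint image/text embedding space.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")  # stand-in multimodal embedder

def retrieve_pages(page_paths: list, question: str, k: int = 3):
    # Embed every rendered page image and the question in the same space,
    # then return the k pages most similar to the question.
    page_emb = model.encode([Image.open(p) for p in page_paths])
    q_emb = model.encode(question)
    scores = util.cos_sim(q_emb, page_emb)[0]
    top = scores.topk(min(k, len(page_paths)))
    return [(page_paths[int(i)], float(s)) for s, i in zip(top.values, top.indices)]

# The selected pages would then be passed, as images, to a multimodal LM that
# reads both their text and their figures to produce the final answer.
```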
70. KAG [Master of Logic]
Master of Logic: it doesn't just find similar-looking answers by feel; it understands the cause-and-effect relationships between pieces of knowledge. Like a rigorous math teacher, it not only knows what the answer is, but can also explain clearly how that answer is derived step by step.- Time: 11.10
- Paper: KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation
- Project: https://github.com/OpenSPG/KAG
In RAG, the gap between vector similarity and the relevance required for knowledge reasoning, together with insensitivity to knowledge logic (e.g., numerical values, temporal relations, expert rules), hinders the effectiveness of professional-domain knowledge services. KAG is designed to leverage the strengths of knowledge graphs (KGs) and vector retrieval to address these challenges, bidirectionally augmenting large language models (LLMs) and knowledge graphs through five key aspects to improve generation and reasoning performance: (1) LLM-friendly knowledge representation, (2) cross-indexing between knowledge graphs and raw chunks, (3) a logical-form-guided hybrid inference engine, (4) knowledge alignment with semantic reasoning, and (5) model-capability enhancement for KAG.
71. FILCO [Filter]
Filter: like a rigorous editor, adept at identifying and retaining the most valuable information from large amounts of text, ensuring that each piece of content delivered to the AI is accurate and relevant.- Time: 11.14
- Paper: Learning to Filter Context for Retrieval-Augmented Generation
- Project: https://github.com/zorazrw/filco
FILCO improves the quality of the context provided to the generator by identifying useful context through lexical and information-theoretic measures, and by training context-filtering models that filter the retrieved passages.
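One of the lexical measures can be sketched as a simple overlap filter over sentences; the threshold and tokenization are assumptions here, and FILCO additionally trains learned filtering models:

```python
# Sentence-level context filtering by lexical overlap with the query.
import re

def lexical_precision(query: str, sentence: str) -> float:
    # Fraction of sentence words that also appear in the query.
    q = set(re.findall(r"\w+", query.lower()))
    words = re.findall(r"\w+", sentence.lower())
    return sum(w in q for w in words) / len(words) if words else 0.0

def filter_context(query: str, passage: str, threshold: float = 0.2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", passage)
    return " ".join(s for s in sentences
                    if lexical_precision(query, s) >= threshold)
```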
72. LazyGraphRAG [Actuary]
Actuary: saves every step that can be saved, spending the expensive large language model only where it really counts. Like a thrifty homemaker, it doesn't buy just because the supermarket has a sale; it compares first and then decides where the money delivers the best value.- Time: 11.25
- Project: https://github.com/microsoft/graphrag
LazyGraphRAG is a novel graph-enhanced retrieval-augmented generation (RAG) approach. It significantly reduces indexing and querying costs while matching or outperforming competitors in answer quality, making it highly scalable and efficient across a wide range of use cases. LazyGraphRAG defers the use of the LLM: during the indexing phase, it relies only on lightweight NLP techniques to process text, delaying LLM invocation until an actual query arrives. This "lazy" strategy avoids high upfront indexing costs and achieves efficient resource utilization.
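The division of labor can be sketched as follows: indexing uses only cheap text processing, and the LLM appears for the first time at query time. Keyword co-occurrence here is a naive stand-in for LazyGraphRAG's noun-phrase graph:

```python
# "Lazy" split: LLM-free indexing, LLM invoked only at query time (illustrative).
import re
from collections import defaultdict

def build_light_index(docs: list) -> dict:
    # Index time: no LLM calls, just cheap keyword extraction.
    index = defaultdict(list)
    for i, doc in enumerate(docs):
        for word in set(re.findall(r"[a-z]{4,}", doc.lower())):
            index[word].append(i)
    return index

def query(index: dict, docs: list, llm, question: str) -> str:
    # Query time: rank candidates by keyword hits, then call the LLM once.
    hits = defaultdict(int)
    for word in re.findall(r"[a-z]{4,}", question.lower()):
        for i in index.get(word, []):
            hits[i] += 1
    top = sorted(hits, key=hits.get, reverse=True)[:3]
    context = "\n".join(docs[i] for i in top)
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```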
RAG Surveys
- A Survey on Retrieval-Augmented Text Generation
- Retrieving Multimodal Information for Augmented Generation: A Survey
- Retrieval-Augmented Generation for Large Language Models: A Survey
- Retrieval-Augmented Generation for AI-Generated Content: A Survey
- A Survey on Retrieval-Augmented Text Generation for Large Language Models
- RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing
- A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models
- Evaluation of Retrieval-Augmented Generation: A Survey
- Retrieval-Augmented Generation for Natural Language Processing: A Survey
- Graph Retrieval-Augmented Generation: A Survey
- A Comprehensive Survey of Retrieval-Augmented Generation (RAG): Evolution, Current Landscape and Future Directions
- Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make Your LLMs Use External Data More Wisely
RAG Benchmarks
- Benchmarking Large Language Models in Retrieval-Augmented Generation
- RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge
- ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems
- RAGAS: Automated Evaluation of Retrieval Augmented Generation
- CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models
- FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation
- CodeRAG-Bench: Can Retrieval Augment Code Generation?
- Long2RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall
This article was written by Zhidong Fan, Open Source Leader of Ant Graph Computing and Graph Computing Evangelist.