LongRefiner: A new approach to retrieval-augmented generation over long documents

Written by

Iris Vance

Updated on: June 19, 2025

Large language models (LLMs) and retrieval-augmented generation (RAG) are increasingly widely used, but they still struggle when processing long documents. Today we will talk about a new method for addressing this problem: LongRefiner.

Background Issues: Two Major Challenges in Processing Long Documents

There are two main pain points when using a retrieval-augmented generation (RAG) system to process long documents:

Information clutter: Long documents often contain a lot of content that is irrelevant to the user's question. Finding the useful part is like looking for a needle in a haystack, and the model has difficulty locating it accurately.
High computational cost: Feeding in a complete long document greatly increases the input length, which consumes more computing resources and slows the system's response. This is especially noticeable in practical applications.

LongRefiner: Three-step strategy

As shown in the figure, to address these problems, researchers proposed LongRefiner, a plug-and-play document refining system. It improves the efficiency of long document processing through three key steps:

1. Two-layer query analysis

Different questions require different depths of information. LongRefiner divides queries into two types:

Local queries: can be answered from a single part or fragment of the document
Global queries: require a comprehensive understanding of the entire document to answer

The system will first determine what type of question the user has and then decide how much information to extract.
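
To make the routing idea concrete, here is a minimal sketch of how a query type could be mapped to an extraction budget. It is only an illustration: the cue words, the keep_ratio values, and the names QueryType, RefinePlan, and plan_for_query are all assumptions made for this example, and the keyword check is a simple stand-in for whatever query-analysis model the actual system uses.

```python
from dataclasses import dataclass
from enum import Enum


class QueryType(Enum):
    LOCAL = "local"    # answerable from a fragment of the document
    GLOBAL = "global"  # needs a view of the whole document


# Hypothetical cue words; a keyword check is only a stand-in for a real classifier.
GLOBAL_CUES = ("summarize", "summary", "overall", "overview", "main themes", "compare")


@dataclass
class RefinePlan:
    query_type: QueryType
    keep_ratio: float  # fraction of the document to keep (assumed values)


def plan_for_query(query: str) -> RefinePlan:
    """Guess the query type, then decide how much of the document to keep."""
    text = query.lower()
    if any(cue in text for cue in GLOBAL_CUES):
        return RefinePlan(QueryType.GLOBAL, keep_ratio=0.5)
    return RefinePlan(QueryType.LOCAL, keep_ratio=0.1)


print(plan_for_query("Summarize the main findings of this report"))
print(plan_for_query("In which year was the bridge completed?"))
```

The point of the sketch is the two-stage shape: classify first, then let the classification control how aggressively the document is trimmed.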

2. Document Structuring

This step turns a disorganized long document into an organized, structured one. It mainly involves:

Designing an XML-based document structure representation that uses special tags (such as <section> and <subsection>) to mark the hierarchical structure of the document
Using Wikipedia web page data to build document structure trees for subsequent processing
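
As a rough illustration of what such a representation might look like, the sketch below parses a small XML-tagged document into a nested tree using Python's standard library. The title attribute, the <document> root tag, the toy content, and the helper names build_tree and outline are assumptions for this example, not the exact schema used by LongRefiner.

```python
import xml.etree.ElementTree as ET

# Toy document; the tag layout and "title" attribute are assumed for illustration.
doc_xml = """
<document title="Golden Gate Bridge">
  <section title="History">
    <subsection title="Construction">Work began in 1933 and finished in 1937.</subsection>
    <subsection title="Opening">The bridge opened to traffic in May 1937.</subsection>
  </section>
  <section title="Design">
    <subsection title="Color">The bridge is painted International Orange.</subsection>
  </section>
</document>
"""


def build_tree(element):
    """Convert an XML element into a nested dict: tag, title, own text, children."""
    return {
        "tag": element.tag,
        "title": element.get("title", ""),
        "text": (element.text or "").strip(),
        "children": [build_tree(child) for child in element],
    }


def outline(node, depth=0):
    """Return one indented line per node, showing the document hierarchy."""
    lines = ["  " * depth + f"{node['tag']}: {node['title']}"]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


tree = build_tree(ET.fromstring(doc_xml))
print("\n".join(outline(tree)))
```

Once the document is in tree form, later steps can score and keep whole sections or individual leaf nodes instead of treating the text as one flat string.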

3. Adaptive document refinement

Depending on the type of question, the system will evaluate the importance of each part of the document from two perspectives:

Local perspective: starting from the smallest units of the document (such as paragraphs), calculate each unit's relevance to the query
Global perspective: starting from the overall structure of the document, ensure that the document as a whole is still covered

Finally, the system combines the scores from these two perspectives and keeps only the content most relevant to answering the question.
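
The combination of the two perspectives can be sketched as follows. This is a simplified illustration under stated assumptions: word overlap stands in for the learned relevance scores, the weight alpha and the word budget are made-up parameters, and the function names tokens, overlap, and refine are hypothetical rather than taken from the LongRefiner code.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric words in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query, text):
    """Fraction of query words that appear in text (a stand-in relevance score)."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def refine(query, sections, alpha=0.5, budget_words=20):
    """sections: list of (title, [paragraphs]). Returns kept paragraphs in document order."""
    scored = []
    position = 0
    for title, paragraphs in sections:
        # Global view: how relevant is the section as a whole?
        global_score = overlap(query, title + " " + " ".join(paragraphs))
        for paragraph in paragraphs:
            # Local view: how relevant is this paragraph on its own?
            local_score = overlap(query, paragraph)
            combined = alpha * local_score + (1 - alpha) * global_score
            scored.append((combined, position, paragraph))
            position += 1
    kept, used = [], 0
    # Greedily take the highest-scoring paragraphs that still fit in the budget.
    for combined, position, paragraph in sorted(scored, key=lambda s: (-s[0], s[1])):
        n_words = len(paragraph.split())
        if used + n_words <= budget_words:
            kept.append((position, paragraph))
            used += n_words
    return [paragraph for _, paragraph in sorted(kept)]


sections = [
    ("History", ["Work on the bridge began in 1933.",
                 "The bridge opened to traffic in 1937."]),
    ("Design", ["The towers are painted orange.",
                "The main span is suspended from two cables."]),
]
query = "When did the bridge open to traffic?"
print(refine(query, sections, budget_words=10))
```

Because each paragraph inherits part of its score from its section, a paragraph in a highly relevant section can outrank a paragraph elsewhere that merely shares a few words with the query.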

Experimental results: Facts speak louder than words

The researchers tested it on a variety of question-answering datasets, and the results were quite impressive:

LongRefiner achieves the best performance on all tested datasets while maintaining low latency.
Compared with existing methods, performance improves by more than 9%.
Compared with methods that feed in the full document directly, LongRefiner uses about 10 times fewer tokens, cuts latency by about 4 times, and still performs better on most datasets.

Key Findings

The experimental analysis also revealed several interesting findings:

All three components of the system (two-layer query analysis, document structuring, and adaptive refinement) are indispensable. Removing any one of them causes a significant drop in performance.
As the number of model parameters increases, the performance gains become smaller.
LongRefiner is particularly good at processing longer documents.
The method performs stably across different base generators.

Summary

LongRefiner provides an efficient solution for RAG systems that work with long documents. By combining query type analysis, document structuring, and adaptive refinement, it reduces computational cost while maintaining high performance. This research offers new ideas for how future large language models can handle long documents.

For application scenarios that need to process a large number of long documents, such as intelligent customer service, document retrieval systems, and knowledge base question answering, LongRefiner is a technology worth watching.