Combining LlamaIndex and Ragflow to build high-performance large-model RAG applications

Combine LlamaIndex's data framework with Ragflow's workflow orchestration to build efficient large language model applications.
Core content:
1. LlamaIndex's data connection and indexing features
2. Ragflow's workflow management and multi-task execution capabilities
3. How LlamaIndex and Ragflow work together to enable multi-functional large language model application development
Together, LlamaIndex and Ragflow form a powerful combination for large language model applications.
LlamaIndex and Ragflow are two open-source tools that bring great convenience to developers. As a data framework, LlamaIndex makes it easy to connect large language models with a wide variety of external data sources, whether structured data (such as SQL or NoSQL databases), unstructured data (such as documents and web pages), or private data (accessed through APIs). Ragflow, as a workflow orchestration tool, focuses on managing the execution of complex large language model pipelines so that the whole process runs in an orderly manner.
The two complement each other, together providing a comprehensive solution for building powerful, highly scalable large language model applications and helping developers innovate and experiment more efficiently in this field.
1 Definition
1.1 LlamaIndex
LlamaIndex lets developers connect large language models to a variety of external data sources, including structured data (SQL and non-relational databases), unstructured data (documents, web pages), and private data (APIs). With it, developers can build large language model applications that retrieve information broadly and reason over it.
LlamaIndex has many features:
Convenient data connectors: it ships with a library of pre-built data connectors for common data sources, so developers do not need to write custom code when connecting a new source.
Efficient data indexing: external data can be indexed so that information in large datasets can be searched and retrieved quickly.
Smart Q&A: it can answer questions based on external data sources, making it easy to build Q&A applications for specific topics or documents.
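These three features compose in just a few lines of code. Here is a minimal sketch of the connector → index → Q&A flow (assuming OPENAI_API_KEY is set and a local data/ directory holds some documents):
# Minimal connector -> index -> Q&A sketch; assumes OPENAI_API_KEY is set
# and that ./data contains a few documents.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()  # pre-built data connector
index = VectorStoreIndex.from_documents(documents)     # index for fast retrieval
query_engine = index.as_query_engine()                 # Q&A over the indexed data
print(query_engine.query("What topics do these documents cover?"))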
1.2 Ragflow
As a workflow orchestration tool, Ragflow effectively manages the execution of complex large language model pipelines. This makes it a strong foundation for building large language model applications that execute multiple kinds of tasks, including:
Data retrieval: Ragflow can retrieve data from external data sources.
Data processing: Ragflow can process data, for example by cleaning, transforming, and aggregating it.
Large language model reasoning: Ragflow can run large language model inference tasks.
Output generation: Ragflow can generate output in a variety of formats, such as text, tables, or charts.
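To make the pipeline idea concrete, here is a framework-agnostic sketch of the four stages listed above. This is plain Python illustrating the pattern, not Ragflow's actual API; every function name is a hypothetical stand-in for one stage:
# Plain-Python illustration of the retrieve -> process -> reason -> generate
# pipeline pattern. NOT Ragflow's API; all names are hypothetical stand-ins.
def retrieve(query: str) -> list[str]:
    # Stage 1: fetch raw passages from an external data source.
    return [f"  raw passage about {query}  "]

def process(passages: list[str]) -> list[str]:
    # Stage 2: clean / transform / aggregate the retrieved data.
    return [p.strip() for p in passages]

def reason(query: str, passages: list[str]) -> str:
    # Stage 3: run large language model inference over the processed context.
    return f"Answer to '{query}' grounded in {len(passages)} passage(s)."

def generate_output(answer: str) -> str:
    # Stage 4: render the result in the desired format (plain text here).
    return answer

print(generate_output(reason("Llama 2", process(retrieve("Llama 2")))))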
1.3 LlamaIndex and Ragflow Collaboration
LlamaIndex and Ragflow can be used together to build powerful large language model applications.
LlamaIndex handles data interaction: it connects large language models with a range of data sources, and can index and query that data, widening the channels through which the model obtains information. Ragflow focuses on workflow orchestration, managing the execution of complex large language model pipelines.
Working together, the two make it possible to develop multi-functional large language model applications that handle tasks such as question answering, text generation, and data analysis, meeting the needs of different scenarios and helping large language models see broader use.
2 Code Implementation
Next, we implement the LlamaIndex and Ragflow code step by step:
Step 1: Install the library, initialize the API key and download the data
!pip install -U llama-index

# Initialize the API key
import os
os.environ["OPENAI_API_KEY"] = "sk-proj-..."

# Download the data
!mkdir -p data
!wget --user-agent "Mozilla" "https://arxiv.org/pdf/2307.09288.pdf" -O "data/llama2.pdf"
Step 2: Workflow Events
from llama_index.core.workflow import Event
from llama_index.core.schema import NodeWithScore

class RetrieverEvent(Event):
    """Result of running retrieval."""
    nodes: list[NodeWithScore]

class RerankEvent(Event):
    """Result of reranking the retrieved nodes."""
    nodes: list[NodeWithScore]
Step 3: Complete Workflow
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.response_synthesizers import CompactAndRefine
from llama_index.core.postprocessor.llm_rerank import LLMRerank
from llama_index.core.workflow import (
    Context,
    Workflow,
    StartEvent,
    StopEvent,
    step,
)
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
class RAGWorkflow(Workflow):
    @step(pass_context=True)
    async def ingest(self, ctx: Context, ev: StartEvent) -> StopEvent | None:
        """Entry point for ingesting documents, triggered by a StartEvent containing `dirname`."""
        dirname = ev.get("dirname")
        if not dirname:
            return None
        documents = SimpleDirectoryReader(dirname).load_data()
        ctx.data["index"] = VectorStoreIndex.from_documents(
            documents=documents,
            embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
        )
        return StopEvent(result=f"Indexed {len(documents)} documents.")

    @step(pass_context=True)
    async def retrieve(
        self, ctx: Context, ev: StartEvent
    ) -> RetrieverEvent | None:
        """Entry point for RAG, triggered by a StartEvent containing a `query`."""
        query = ev.get("query")
        if not query:
            return None
        print(f"Query the database with: {query}")
        # Store the query in the global context
        ctx.data["query"] = query
        # Get the index from the global context
        index = ctx.data.get("index")
        if index is None:
            print("Index is empty, load some documents before querying!")
            return None
        retriever = index.as_retriever(similarity_top_k=2)
        nodes = retriever.retrieve(query)
        print(f"Retrieved {len(nodes)} nodes.")
        return RetrieverEvent(nodes=nodes)

    @step(pass_context=True)
    async def rerank(self, ctx: Context, ev: RetrieverEvent) -> RerankEvent:
        # Rerank the retrieved nodes
        ranker = LLMRerank(
            choice_batch_size=5, top_n=3, llm=OpenAI(model="gpt-4o-mini")
        )
        print(ctx.data.get("query"), flush=True)
        new_nodes = ranker.postprocess_nodes(
            ev.nodes, query_str=ctx.data.get("query")
        )
        print(f"Reranked nodes to {len(new_nodes)}")
        return RerankEvent(nodes=new_nodes)

    @step(pass_context=True)
    async def synthesize(self, ctx: Context, ev: RerankEvent) -> StopEvent:
        """Return a streaming response built from the reranked nodes."""
        llm = OpenAI(model="gpt-4o-mini")
        summarizer = CompactAndRefine(llm=llm, streaming=True, verbose=True)
        query = ctx.data.get("query")
        response = await summarizer.asynthesize(query, nodes=ev.nodes)
        return StopEvent(result=response)
Step 4: Run the workflow
w = RAGWorkflow()

# Ingest documents
await w.run(dirname="data")

# Run a query
result = await w.run(query="How was Llama2 trained?")
async for chunk in result.async_response_gen():
    print(chunk, end="", flush=True)
Query the database with: How was Llama2 trained?
Retrieved 2 nodes.
Llama 2 was trained through a multi-step process that began with pretraining using publicly available online sources. This was followed by the creation of an initial version of Llama 2-Chat through supervised fine-tuning. The model was then iteratively refined using Reinforcement Learning with Human Feedback (RLHF) methodologies, which included techniques like rejection sampling and Proximal Policy Optimization (PPO).
During pretraining, the model utilized an optimized auto-regressive transformer architecture, incorporating robust data cleaning, updated data mixes, and training on a significantly larger dataset of 2 trillion tokens. The training process also involved increased context length and the use of grouped-query attention (GQA) to enhance inference scalability.
The training employed the AdamW optimizer with specific hyperparameters, a cosine learning rate schedule, and gradient clipping. The models were pretrained on Meta's Research SuperCluster and internal production clusters, utilizing NVIDIA A100 GPUs for the training process.
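As an optional aside, LlamaIndex also ships a small helper for visualizing workflows. A minimal sketch, assuming the optional llama-index-utils-workflow package is installed:
# Optional: render every possible path through the workflow as an HTML file.
# Assumes: pip install llama-index-utils-workflow
from llama_index.utils.workflow import draw_all_possible_flows

draw_all_possible_flows(RAGWorkflow, filename="rag_workflow.html")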
3 Conclusion
LlamaIndex and Ragflow play an important role in the development of large language model (LLM) applications. Each of these open-source tools has unique strengths for building applications on top of large language models.
LlamaIndex can connect to data sources and process data, and Ragflow can efficiently orchestrate workflows. The two work together to provide a comprehensive solution for building powerful and scalable large language model applications.
For practitioners, exploring LlamaIndex and Ragflow and using them to build applications is a good way to keep up with technology trends and sharpen development skills. We hope you will tap their potential in practice and push the innovative development of large language model applications forward.