RAG system development 01: Using rig to call Ollama models

This series explores RAG system development in the Rust ecosystem, from basic environment setup to model calling, taking you step by step to a deeper understanding.
Core content:
1. How to set up the Rust development environment
2. Use RsProxy to configure the Rustup and crates.io mirrors
3. Install Ollama, download a model, and choose project development tools
This is a series of articles that will introduce how to develop a RAG system based on the Rust language ecosystem. This article is the first one and mainly introduces how to use rig [1] to call the ollama model.
Project Preparation
Setting up your Rust development environment
It is recommended to use RsProxy to set up the Rust development environment. The steps are very simple:
1. Set up the Rustup mirror by adding the following to ~/.zshrc or ~/.bashrc:
export RUSTUP_DIST_SERVER="https://rsproxy.cn"
export RUSTUP_UPDATE_ROOT="https://rsproxy.cn/rustup"
2. Install Rust (complete step 1 first, then source the rc file or restart the terminal so the environment variables take effect):
curl --proto '=https' --tlsv1.2 -sSf https://rsproxy.cn/rustup-init.sh | sh
3. Set up the crates.io mirror by editing ~/.cargo/config.toml:
[source.crates-io]
replace-with = 'rsproxy-sparse'
[source.rsproxy]
registry = "https://rsproxy.cn/crates.io-index"
[source.rsproxy-sparse]
registry = "sparse+https://rsproxy.cn/index/"
[registries.rsproxy]
index = "https://rsproxy.cn/crates.io-index"
[net]
git-fetch-with-cli = true
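Once these three steps are done, you can quickly verify that the toolchain is available:
rustc --version
cargo --version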
Install Ollama and download the model
For detailed installation and usage, please refer to my previous article: Running deepseek-r1 locally, LLM installation concise guide
Create a project
We recommend using VSCode as the development tool, with the rust-analyzer plugin installed. Execute the following commands in a terminal to create a Rust project and add the necessary crates:
cargo new fusion-rag
cd fusion-rag
cargo add rig-core --features derive
cargo add tokio --features full
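After these commands, the [dependencies] section of Cargo.toml should look roughly like this (the exact version numbers depend on what cargo add resolves when you run it; the ones below are only illustrative):

[dependencies]
rig-core = { version = "0.6", features = ["derive"] }
tokio = { version = "1", features = ["full"] }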
Now that the project has been created, open it in VSCode:
code .
The default main.rs should compile and run successfully.
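For reference, the main.rs generated by cargo new is the standard hello-world program:

fn main() {
    println!("Hello, world!");
}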
Using rig-core
Accessing the Ollama API in OpenAI-compatible mode
Edit the main.rs file and modify it to the following code:
use rig::{completion::Prompt, providers};

#[tokio::main]
async fn main() -> Result<(), Box<dyn core::error::Error>> {
    let client = providers::openai::Client::from_url("ollama", "http://localhost:11434/v1");
    let v1 = client
        .agent("qwen2.5:latest") // or .agent("deepseek-r1:latest")
        // preamble sets the `system` part of the conversation, usually the prompt of the chat context
        .preamble("You are an AI assistant. You are better at logical reasoning and conversations in Chinese and English.")
        .build();
    // prompt sets the `user` part of the conversation, providing the content of each turn
    let response = v1.prompt("Which is bigger, 1.1 or 1.11?").await?;
    println!("Answer: {}", response);
    Ok(())
}
Running the program will give you the following output:
Answer: In numerical comparison, when 1.1 and 1.11 are compared, it can be seen that 1.11 is greater than 1.1.
The specific mathematical comparison process is as follows:
- First compare the first digit after the decimal point. In this case both are "1", so this digit is equal.
- Then continue with the next digit, the second digit after the decimal point. For 1.1 there is no digit here, so we treat it as 0 (it is usually padded with zeros in practice), which means 1.1 is equivalent to 1.10. Comparing "1.10" and "1.11", "1.11" is clearly greater than "1.10".
So, the conclusion is: 1.11 is greater than 1.1.
Tip: Using the deepseek-r1:latest model gives more detailed answers (including the thought process), but it requires more resources and the output will be longer. Readers can choose the model that suits them.
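If you want to switch models without editing the source each time, one simple variant (not from the original code; the OLLAMA_MODEL variable name is just an example) is to read the model name from an environment variable:

// Hypothetical tweak: choose the chat model via an environment variable,
// falling back to qwen2.5:latest when OLLAMA_MODEL is not set.
let model_name = std::env::var("OLLAMA_MODEL")
    .unwrap_or_else(|_| "qwen2.5:latest".to_string());
let v1 = client
    .agent(&model_name)
    .preamble("You are an AI assistant. You are better at logical reasoning and conversations in Chinese and English.")
    .build();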
Implementing RAG via Embedding Model
The nomic-embed-text model
nomic-embed-text is a model specifically designed for generating text embeddings. Text embedding is the process of converting text data into vector representations. These vectors capture the semantic information of the text and are very useful in many natural language processing tasks, such as information retrieval (finding documents semantically similar to a query), text classification, and cluster analysis. You can download this model with the following command:
ollama pull nomic-embed-text
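To make the idea of semantic similarity concrete, here is a small standalone sketch (not part of the project code) showing how two embedding vectors are typically compared with cosine similarity. The vectors below are made-up three-dimensional placeholders; real nomic-embed-text embeddings have several hundred dimensions.

// Toy illustration: semantically close texts produce vectors with higher cosine similarity.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    let cat = vec![0.9_f32, 0.1, 0.3];    // pretend embedding of "cat"
    let kitten = vec![0.85_f32, 0.15, 0.25]; // pretend embedding of "kitten"
    let car = vec![0.1_f32, 0.9, 0.4];    // pretend embedding of "car"
    println!("cat vs kitten: {:.3}", cosine_similarity(&cat, &kitten));
    println!("cat vs car:    {:.3}", cosine_similarity(&cat, &car));
}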
Add crates dependency
cargo add serde
Implementing the RAG logic
Edit the main.rs file and update it to the following code:
use rig::{
    completion::Prompt, embeddings::EmbeddingsBuilder, providers,
    vector_store::in_memory_store::InMemoryVectorStore, Embed,
};
use serde::Serialize;

// Data to be used for RAG. We need to perform a vector search on the `definitions` field,
// so we derive the `Embed` trait for `WordDefinition` and mark that field with the `#[embed]` attribute.
#[derive(Embed, Serialize, Clone, Debug, Eq, PartialEq, Default)]
struct WordDefinition {
    id: String,
    word: String,
    #[embed]
    definitions: Vec<String>,
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn core::error::Error>> {
    const MODEL_NAME: &str = "qwen2.5";
    const EMBEDDING_MODEL: &str = "nomic-embed-text";
    let client = providers::openai::Client::from_url("ollama", "http://localhost:11434/v1");
    let embedding_model = client.embedding_model(EMBEDDING_MODEL);

    // Generate embedding vectors for all document definitions using the specified embedding model
    let embeddings = EmbeddingsBuilder::new(embedding_model.clone())
        .documents(vec![
            WordDefinition {
                id: "doc0".to_string(),
                word: "flurbo".to_string(),
                definitions: vec![
                    "1. *flurbo* (noun): A flurbo is a green alien that lives on a cold planet.".to_string(),
                    "2. *flurbo* (noun): A fictional digital currency that originated from the animated series Rick and Morty.".to_string(),
                ],
            },
            WordDefinition {
                id: "doc1".to_string(),
                word: "glarb glarb".to_string(),
                definitions: vec![
                    "1. *glarb glarb* (noun): glarb glarb is an ancient tool used by the ancestors of the inhabitants of the planet Jiro to cultivate the land.".to_string(),
                    "2. *glarb glarb* (noun): A fictional creature found in a remote swamp on the planet Glibbo in the Andromeda Galaxy.".to_string(),
                ],
            },
        ])?
        .build()
        .await?;

    // Create the vector store from these embeddings
    let vector_store = InMemoryVectorStore::from_documents(embeddings);
    // Create a vector store index
    let index = vector_store.index(embedding_model);

    let rag_agent = client
        .agent(MODEL_NAME)
        .preamble(
            "You are a dictionary assistant, helping users understand the meaning of words.
            Below you will find additional non-standard word definitions that may be useful.",
        )
        .dynamic_context(1, index)
        .build();

    // Prompt the agent and print the response
    let response = rag_agent.prompt("What does \"glarb glarb\" mean?").await?;
    println!("{}", response);
    Ok(())
}
Run the program to see the effect; you should get output like the following:
$ cargo run -q
In the definition given, "glarb glarb" has the following two meanings:
1. **Noun**: This is an ancient tool used by the ancestors of the inhabitants of the planet Jiro to cultivate the land.
2. **Noun**: A fictional creature found in a remote swamp on the planet Glibbo in the Andromeda Galaxy.
Note that this is based on the document definition provided, and "glarb glarb" may be two different nouns with different meanings and contexts.
When we comment out the .dynamic_context(1, index) line and run it again, the output is as follows:
$ cargo run -q
Sorry, "glarb glarb" is not a known word or expression with no clear meaning in the standard language. It could be a typo or a made-up phrase for a specific situation. More context is needed to determine the exact meaning. If you see this phrase in a game, book, or special community, you may need to refer to the rules or explanation in that context.
As you can see, the .dynamic_context(1, index) call searches the vector store for the documents most similar to the input prompt and adds them to the prompt, thus achieving the RAG (Retrieval Augmented Generation) effect.
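If you want to inspect what the retrieval step returns before it reaches the model, you can query the index directly. The sketch below is an illustration rather than code from the article: it assumes the VectorStoreIndex::top_n method as exposed by recent rig-core versions, requires WordDefinition to also derive serde::Deserialize, and has to run before building the agent because dynamic_context takes ownership of the index.

use rig::vector_store::VectorStoreIndex;

// Illustrative only: ask the index for the single closest document to the query,
// which is what .dynamic_context(1, index) retrieves before handing it to the model.
let results = index
    .top_n::<WordDefinition>("What does \"glarb glarb\" mean?", 1)
    .await?;
for (score, id, doc) in results {
    println!("score={score:.4} id={id} word={}", doc.word);
}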
Summary
This article briefly introduced how to use the rig-core library with Ollama's local models to implement RAG. This is a basic example; real applications may require adjustments and extensions based on your requirements. Later articles will cover topics such as document parsing (PDF, Word, Excel, PPT) and persistent data storage in more detail. Please stay tuned.