Ollama Deep Researcher is a fully local web research assistant that works with any LLM hosted by Ollama. Give it a topic and it will generate web search queries, collect search results (via Tavily by default), summarize those results, reflect on the summary to find knowledge gaps, generate new search queries to address those gaps, and repeat the search-and-improve cycle for a user-defined number of iterations. It then gives the user a final markdown summary that cites all sources used.
Ollama Deep Researcher was inspired by IterDRAG. That approach decomposes a query into sub-queries, retrieves documents for each sub-query, answers the sub-query, and then builds on that answer by retrieving documents for the next sub-query. Ollama Deep Researcher does something similar:
Given a user-provided topic, generate a web search query using the local LLM (via Ollama)
Use a search engine (DuckDuckGo, Tavily, or Perplexity; Tavily is recommended in this article) to find relevant sources
Use the LLM to summarize the web search results for the user-provided research topic
Use the LLM to reflect on the summary and identify knowledge gaps
Have the LLM generate new search queries that address those gaps
Repeat the search, continually updating the summary with new information from the web results
Run a configurable number of iterations (see the configuration tab)
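To make the flow concrete, here is a minimal conceptual sketch of the loop in Python. It is not the repository's actual code; generate_query, web_search, summarize, and reflect are hypothetical helpers that stand in for the LLM prompts and search-API calls the real graph performs.

```python
# Conceptual sketch of the research loop (not the repository's actual code).
# generate_query, web_search, summarize, and reflect are hypothetical helpers
# standing in for the LLM prompts and search-API calls the graph performs.
def deep_research(topic: str, max_loops: int = 3) -> str:
    summary, sources = "", []
    query = generate_query(topic)                    # LLM writes the first search query
    for _ in range(max_loops):
        results = web_search(query)                  # Tavily / DuckDuckGo / Perplexity
        sources.extend(results)
        summary = summarize(topic, summary, results) # LLM folds new results into the summary
        gap, query = reflect(topic, summary)         # LLM names a knowledge gap and a follow-up query
        if not gap:                                  # nothing left to fill in -> stop early
            break
    return summary + "\n\n## Sources\n" + "\n".join(s["url"] for s in sources)
```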
Download Ollama and prepare the model
Download Ollama from its official site and install it. After installation, start the service:
ollama serve
Prepare the required models: pick a suitable large model from the ModelScope community model page ( https://modelscope.cn/models?name=gguf&page=1) or the Ollama model page ( https://ollama.com/search)
Taking the QwQ-32B model as an example, pull and run it:
ollama run modelscope.cn/Qwen/QwQ-32B-GGUF
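Once the model is pulled and ollama serve is running, a quick sanity check against Ollama's local REST API confirms the server is reachable and the model is installed (GET /api/tags lists locally available models). The snippet below assumes the requests package is installed.

```python
# Sanity check: is the Ollama server up, and is the pulled model listed?
# Uses Ollama's local REST API (GET /api/tags returns the locally installed models).
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
names = [m["name"] for m in resp.json().get("models", [])]
print("Available models:", names)
```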
Download and configure ollama-deep-researcher
Download the ollama-deep-researcher repository:
git clone https://github.com/langchain-ai/ollama-deep-researcher.git
cd ollama-deep-researcher
Create an environment variable file .env and configure environment variables
cp .env.example .env
Fill in the following content in the .env file:
OLLAMA_MODEL: the name of the model to use. Replace it with the model you downloaded through Ollama.
SEARCH_API: the web search engine to use; choose one of duckduckgo, tavily, or perplexity. DuckDuckGo does not require an API key, while the other two require an API key obtained from the corresponding website (due to network restrictions, first check in your browser that these sites can be opened).
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=qwq
SEARCH_API=tavily
TAVILY_API_KEY=tvly-xxxxx
PERPLEXITY_API_KEY=pplx-xxxxx
MAX_WEB_RESEARCH_LOOPS=3
FETCH_FULL_PAGE=
The web search tool recommended in this article is Tavily: https://tavily.com/
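Before starting the server, it can be useful to confirm that the .env values are actually picked up. The sketch below assumes the python-dotenv package and simply echoes the variables from the example above.

```python
# Optional check that the .env values load correctly before starting the server.
# Assumes the python-dotenv package; variable names mirror the .env example above.
import os
from dotenv import load_dotenv

load_dotenv()
for key in ("OLLAMA_BASE_URL", "OLLAMA_MODEL", "SEARCH_API", "MAX_WEB_RESEARCH_LOOPS"):
    print(f"{key} = {os.getenv(key)}")
if os.getenv("SEARCH_API") == "tavily" and not os.getenv("TAVILY_API_KEY"):
    print("Warning: SEARCH_API is tavily but TAVILY_API_KEY is empty")
```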
Run the following commands to install uv and start the LangGraph dev server:
curl -LsSf https://astral.sh/uv/install.sh | sh
uvx --refresh --from "langgraph-cli[inmem]" --with-editable . --python 3.11 langgraph dev
Open the local URL 127.0.0.1:2024 shown in the prompt, enter your question, and start the analysis.
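Besides the Studio UI, the dev server can also be driven programmatically. The sketch below assumes the langgraph-sdk package; the assistant name "ollama_deep_researcher" and the input key "research_topic" are assumptions, so check langgraph.json and the graph's state definition in the repository for the exact names.

```python
# Drive the local dev server programmatically instead of through the Studio UI.
# Assumes the langgraph-sdk package. The assistant/graph name and the input key
# are assumptions - verify them in langgraph.json and the graph's state schema.
import asyncio
from langgraph_sdk import get_client

async def main():
    client = get_client(url="http://127.0.0.1:2024")
    thread = await client.threads.create()
    async for chunk in client.runs.stream(
        thread["thread_id"],
        "ollama_deep_researcher",                      # assistant / graph name (assumed)
        input={"research_topic": "What is quantization in LLMs?"},
        stream_mode="values",
    ):
        print(chunk.event)                             # node-by-node progress events

asyncio.run(main())
```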
The output of the graph is a markdown file containing the research summary and citations to all sources used.
All sources collected during the study are saved to the graph state.
They are stored in the graph state, which can be viewed in LangGraph Studio:
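The same state can also be fetched from the dev server after a run finishes. In the sketch below, the state key "sources_gathered" is an assumption about the graph's state schema and may differ in the repository; the thread id placeholder must be replaced with a real one.

```python
# Fetch the final graph state for a thread and print the collected sources.
# Assumes langgraph-sdk; "sources_gathered" is an assumed state key - verify it
# against the graph's state definition in the repository.
import asyncio
from langgraph_sdk import get_client

async def show_sources(thread_id: str):
    client = get_client(url="http://127.0.0.1:2024")
    state = await client.threads.get_state(thread_id)
    for source in state["values"].get("sources_gathered", []):
        print(source)

asyncio.run(show_sources("your-thread-id"))  # replace with a real thread id
```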