Local Deep Research: A Powerful AI Research Assistant

An AI-driven research assistant with deep iterative analysis, privacy protection, support for multiple LLMs and search integration.
Core content:
1. Project introduction and advanced research capabilities
2. Flexible LLM support and rich output options
3. Privacy priority and enhanced search integration
4. Local document search and example research: fusion energy development
5. Installation guide and dependency installation
Project Introduction
A powerful AI-driven research assistant that uses multiple LLMs and web searches for deep, iterative analysis. The system can run locally to protect privacy or be configured to use cloud-based LLMs for enhanced functionality.
Features
Advanced research capabilities
- Automatic in-depth research with intelligent follow-up questions
- Citation tracking and source verification
- Multiple iterations of analysis for comprehensive coverage
- Full-text web page content analysis (not just snippets)

Flexible LLM support
- Local AI processing using Ollama models
- Cloud LLM support (Claude, GPT)
- Supports all LangChain models
- Configurable model selection based on your needs

Rich output options
- Detailed research findings with citations
- Comprehensive research reports
- Quick summaries for fast insights
- Source tracking and verification

Privacy first
- Runs entirely on your machine using local models
- Configurable search settings
- Transparent data handling

Enhanced search integration
- Automatic search source selection: the smart "auto" search engine analyzes your query and selects the most appropriate search engine based on its content
- Wikipedia integration for factual knowledge
- arXiv integration for scientific papers and academic research
- PubMed integration for biomedical literature and medical research
- DuckDuckGo web search integration (may encounter rate limits)
- SerpAPI integration for Google search results (API key required)
- Google Programmable Search Engine integration for a custom search experience (API key required)
- The Guardian integration for news articles and journalism (API key required)
- Local RAG search over private documents using vector embeddings
- Full-text web content retrieval
- Source filtering and verification
- Configurable search parameters

Local document search (RAG)
- Local document search based on vector embeddings
- Create custom document collections for different topics
- Privacy-preserving: your documents stay on your device
- Smart chunking and retrieval
- Compatible with various document formats (PDF, text, Markdown, etc.)
- Automatic integration with the meta search engine for unified querying
Example Research: Fusion Energy Development
The repository contains complete research examples that showcase the tool's capabilities. For example, our Fusion Energy research analysis provides the following comprehensive overview:
- Latest scientific breakthroughs in nuclear fusion research (2022-2025)
- Private-sector funding exceeding $6 billion
- Expert predictions for the commercial fusion energy timeline
- Regulatory frameworks being developed for fusion deployment
- Technical challenges that must be overcome to achieve commercial viability
This example demonstrates the system’s ability to perform multiple research iterations, tracing evidence trails across scientific and commercial domains and synthesizing information from different sources while maintaining appropriate citations.
Installation
Clone the repository:
```bash
git clone https://github.com/yourusername/local-deep-research.git
cd local-deep-research
```
Install dependencies:
```bash
pip install -r requirements.txt
```
Install Ollama (for local models):
```bash
# Install Ollama from https://ollama.ai
ollama pull mistral  # Default model - many models work well; choose the best one for your hardware (ideally one that fits in your GPU)
```
Configure environment variables:
```bash
# Copy the template
cp .env.template .env

# Edit .env with your API keys (if using cloud LLMs)
ANTHROPIC_API_KEY=your-api-key-here          # For Claude
OPENAI_API_KEY=your-openai-key-here          # For GPT models
GUARDIAN_API_KEY=your-guardian-api-key-here  # For The Guardian search
```
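Since cloud LLMs are optional, the application only needs a given provider's key when that provider is actually used. A minimal sketch of how key presence might be checked (the helper name `get_api_key` is a hypothetical illustration, not the project's actual code):

```python
import os
from typing import Optional


def get_api_key(name: str) -> Optional[str]:
    """Hypothetical helper: return the API key from the environment,
    or None if it is unset or empty (i.e. the provider is disabled)."""
    key = os.environ.get(name)
    return key if key else None


# Local-only runs need no keys; cloud providers are enabled per key
anthropic_key = get_api_key("ANTHROPIC_API_KEY")  # None unless configured
```

With this pattern, a fully local setup (Ollama only) works out of the box with an empty `.env`.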
Usage
Terminal usage (not recommended):
```bash
python main.py
```
Web interface
The project includes a web interface for a more user-friendly experience:
```bash
python app.py
```
This starts a local web server that you can access in your browser at http://127.0.0.1:5000.
Web interface features:
- Dashboard: an intuitive interface for launching and managing research queries
- Live updates: track research progress in real time
- Research history: access and manage past queries
- PDF export: download completed research reports as PDF documents
- Research management: terminate in-progress research or delete past records
Configuration
Please report your best settings in issues so we can improve the default settings.
Key settings in config.py:
```python
# LLM Configuration
DEFAULT_MODEL = "mistral"  # Change based on your needs
DEFAULT_TEMPERATURE = 0.7
MAX_TOKENS = 8000

# Search Configuration
MAX_SEARCH_RESULTS = 40
SEARCH_REGION = "us-en"
TIME_PERIOD = "y"
SAFE_SEARCH = True
SEARCH_SNIPPETS_ONLY = False

# Choose search tool: "wiki", "arxiv", "duckduckgo", "guardian", "serp", "local_all", or "auto"
search_tool = "auto"  # "auto" will intelligently select the best search engine for your query
```
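To give a feel for what `search_tool = "auto"` does, here is a minimal, hypothetical sketch of keyword-based engine routing. The function name and keyword lists are illustrative assumptions; the project's real selector analyzes queries more thoroughly:

```python
def pick_engine(query: str) -> str:
    """Hypothetical sketch of "auto" routing: pick a search engine
    from simple keyword heuristics over the query text."""
    q = query.lower()
    if any(w in q for w in ("paper", "preprint", "arxiv")):
        return "arxiv"       # academic/scientific queries
    if any(w in q for w in ("gene", "clinical", "disease")):
        return "pubmed"      # biomedical queries
    if any(w in q for w in ("news", "today")):
        return "duckduckgo"  # current-events queries
    return "wiki"            # default: general knowledge
```

For example, a query mentioning a preprint would route to arXiv, while a plain factual question would fall through to Wikipedia.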
Local Document Search (RAG)
The system includes powerful local document search capabilities using Retrieval-Augmented Generation (RAG). This allows you to search and retrieve content from your own document collections.
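For intuition, indexing for RAG splits each document into overlapping chunks (compare the `chunk_size` and `chunk_overlap` collection settings below). A minimal sketch of fixed-size chunking with overlap; the project's actual chunker may additionally respect sentence or paragraph boundaries:

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 150) -> list:
    """Split text into fixed-size chunks; consecutive chunks share
    `overlap` characters so no passage is cut off at a boundary."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last chunk reached the end of the text
        start += chunk_size - overlap  # step forward, keeping overlap
    return chunks
```

Each chunk is then embedded with the configured model (e.g. all-MiniLM-L6-v2) and stored for vector search.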
Set up local collection
Create a file called local_collections.py in the project root directory:
```python
# local_collections.py
import os
from typing import Any, Dict

# Registry of local document collections
LOCAL_COLLECTIONS = {
    # Research Papers Collection
    "research_papers": {
        "name": "Research Papers",
        "description": "Academic research papers and articles",
        "paths": [os.path.abspath("local_search_files/research_papers")],  # Use absolute paths
        "enabled": True,
        "embedding_model": "all-MiniLM-L6-v2",
        "embedding_device": "cpu",
        "embedding_model_type": "sentence_transformers",
        "max_results": 20,
        "max_filtered_results": 5,
        "chunk_size": 800,  # Smaller chunks for academic content
        "chunk_overlap": 150,
        "cache_dir": ".cache/local_search/research_papers",
    },
    # Personal Notes Collection
    "personal_notes": {
        "name": "Personal Notes",
        "description": "Personal notes and documents",
        "paths": [os.path.abspath("local_search_files/personal_notes")],  # Use absolute paths
        "enabled": True,
        "embedding_model": "all-MiniLM-L6-v2",
        "embedding_device": "cpu",
        "embedding_model_type": "sentence_transformers",
        "max_results": 30,
        "max_filtered_results": 10,
        "chunk_size": 500,  # Smaller chunks for notes
        "chunk_overlap": 100,
        "cache_dir": ".cache/local_search/personal_notes",
    },
}
```
Create the directories for your collections:
```bash
mkdir -p local_search_files/research_papers
mkdir -p local_search_files/personal_notes
```
Add your documents to these folders and they will be automatically indexed and made searchable.
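Instead of creating directories by hand, you could derive them from the registry itself. A hedged sketch (the helper `ensure_collection_dirs` is hypothetical, not part of the project):

```python
import os


def ensure_collection_dirs(collections: dict) -> list:
    """Hypothetical helper: create any missing directories for enabled
    collections and return the list of paths that were created."""
    created = []
    for cfg in collections.values():
        if not cfg.get("enabled", False):
            continue  # skip disabled collections
        for path in cfg["paths"]:
            if not os.path.isdir(path):
                os.makedirs(path, exist_ok=True)
                created.append(path)
    return created
```

Running this once after editing local_collections.py (e.g. `ensure_collection_dirs(LOCAL_COLLECTIONS)`) keeps the folder layout in sync with the registry.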
Using Local Search
There are several ways to use local search:

1. Automatic selection: with search_tool = "auto" in config.py, the system will automatically use your local collections when the query is appropriate.
2. Explicit collection: set search_tool = "research_papers" (or any collection name) to search only that collection.
3. All local collections: set search_tool = "local_all" to search all of your local document collections.
4. Query syntax: use collection:collection_name your query to target a specific collection from within the query itself.
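The collection: query syntax described above could be parsed roughly as follows; this is an illustrative sketch, and `parse_collection_query` is a hypothetical name, not the project's actual function:

```python
def parse_collection_query(query: str):
    """Split a query of the form 'collection:name rest of query' into
    (collection_name, query); return (None, query) if no prefix is present."""
    if query.startswith("collection:"):
        prefix, _, rest = query.partition(" ")
        return prefix.split(":", 1)[1], rest
    return None, query
```

So `collection:research_papers fusion timeline` targets the research_papers collection with the query "fusion timeline", while an unprefixed query is routed normally.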
Search Engine Options
The system supports multiple search engines, selected via the search_tool variable in config.py:

- Auto (auto): smart search engine selector that analyzes your query and picks the most appropriate source (Wikipedia, arXiv, local collections, etc.)
- Wikipedia (wiki): best for general knowledge, facts, and overview information
- arXiv (arxiv): great for scientific and academic research, with access to preprints and papers
- PubMed (pubmed): ideal for biomedical literature, medical research, and health information
- DuckDuckGo (duckduckgo): general-purpose web search, no API key required
- The Guardian (guardian): high-quality news and articles (API key required)
- SerpAPI (serp): Google search results (API key required)
- Google Programmable Search Engine (google_pse): customized search experience with control over search scope and domains (requires an API key and a search engine ID)
- Local collections: any collection defined in your local_collections.py file