Deploy DeepSeek Locally: Visual Interface, No Lag, Smooth to Use

Written by
Silas Grey
Updated on: July 15, 2025

Easily set up a local DeepSeek and enjoy a low-latency, highly private intelligent-conversation experience.

Core content:
1. DeepSeek API service latency issues and the necessity of local deployment
2. Local deployment hardware requirements for different versions of DeepSeek-R1 models
3. Detailed steps to build a local DeepSeek intelligent assistant using Ollama and Chatbox

Is DeepSeek telling you the server is busy again?
Many users have found that DeepSeek's API service capacity is far from keeping up with its popularity: responses are noticeably slower than those of other Chinese large models such as Doubao, Kimi, and Zhipu, and the service goes down from time to time.
Whether for personal use or enterprise workloads, a low-latency large-model service is needed. And if you have requirements around data privacy or customization (fine-tuning, RAG, multimodality), local deployment becomes even more necessary. For RAG, please refer to LangChain+RAG.

Step 1: Select the model

Below are the hardware resources required to locally deploy the different DeepSeek-R1 distilled models; pick the version that fits your hardware, and we will download the corresponding model in the next step:
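As a rough guide (these tags and the sizing rule are assumptions based on the models published on Ollama, not the original table), the distilled R1 variants and a back-of-the-envelope memory rule:

# DeepSeek-R1 distilled tags on Ollama (4-bit quantized by default):
#   deepseek-r1:1.5b  deepseek-r1:7b   deepseek-r1:8b
#   deepseek-r1:14b   deepseek-r1:32b  deepseek-r1:70b
# Rough rule of thumb: a 4-bit model needs on the order of 0.5-1 GB of
# RAM/VRAM per billion parameters, plus room for the context cache.
ollama pull deepseek-r1:7b    # example: the 7b variant fits in roughly 8 GB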

Step 2: Download and run the model

2.1 Download Ollama (URL: https://ollama.com/download)

Ollama is used to run AI models locally and is very easy to use.

Taking Windows as an example: after installing OllamaSetup.exe, double-click the software icon to start the Ollama service.
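To confirm the service is up (assuming the default install and port), you can run a quick check from a terminal; the second command should print a short "Ollama is running" message:

ollama --version
curl http://localhost:11434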

2.2 Use Ollama to download and run the model. Here we choose the smallest version, 1.5b (1.5 billion parameters):

ollama run deepseek-r1:1.5b

This command runs the model. Since the model is not yet present locally, Ollama first downloads it from the model repository and then runs it, which is equivalent to:

ollama pull deepseek-r1:1.5b
ollama run deepseek-r1:1.5b

At this point the model service is running. The default local service address is http://localhost:11434. With this local API endpoint, we can plug the model into other software to build a local DeepSeek smart assistant, such as a code assistant; see DeepSeek Code Assistant.
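As a minimal sketch (assuming the default port and the 1.5b tag pulled above), you can call the service through Ollama's REST API, for example with curl:

# send one chat turn to the local model and wait for the complete reply
curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-r1:1.5b",
  "messages": [{ "role": "user", "content": "Hello, who are you?" }],
  "stream": false
}'

Setting "stream": false returns a single JSON response instead of a token-by-token stream, which is easier to read when testing by hand.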

You can also use it directly from the command line:
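An interactive session looks roughly like this (the prompt text is just an example; type /bye to exit):

ollama run deepseek-r1:1.5b
>>> Introduce yourself in one sentence.
(the model prints its reasoning and answer here)
>>> /bye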


Step 3: Configure the chat interface

If you want a chat interface like the mainstream large-model apps, you can use Chatbox.

3.1 Install Chatbox (URL: https://chatboxai.app/zh)

After installation completes, you will see the following interface:
No model service has been configured yet. Click "Click here to open settings" in the red-boxed prompt, and then configure our Ollama service:
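The exact menu wording varies between Chatbox versions, but the settings are roughly as follows (an assumed example based on Chatbox's Ollama provider option):
Model provider: Ollama API
API host: http://localhost:11434
Model: deepseek-r1:1.5b
Save the settings and return to the chat window.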
Then you can happily use the local model:
--------THE END-------