Data stays within the intranet: Building an enterprise-specific DeepSeek intelligent middle platform based on Ollama+OneAPI

Notes from deploying a DeepSeek intelligent middle platform in an environment with no external network access, a hands-on Ollama + OneAPI practice.
Core content:
1. Challenges and solutions when deploying DeepSeek on a server with restricted external network access
2. Downloading, uploading, and installing the Ollama offline installation package
3. Adding and managing the Ollama service, including creating a systemd unit file and enabling the service
Preface
I have previously deployed DeepSeek on Linux servers using Ollama.
This time I deployed it on a server with no external network access (even more restricted, really). I hit a few pitfalls and am recording them here.
Ollama
Since the server has no internet access, the online installation script naturally cannot be used.
Following Ollama's documentation, first download the installation package on a local machine, choosing the one that matches the server's OS and CPU architecture:
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
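If the server uses an ARM64 CPU instead, Ollama's documentation provides a corresponding package, for example:
curl -L https://ollama.com/download/ollama-linux-arm64.tgz -o ollama-linux-arm64.tgz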
Then upload it to the server with scp or a similar tool:
scp ollama-linux-amd64.tgz <server-address>:/temp
After connecting to the server, extract and install it, following the Ollama documentation (see the first reference):
sudo tar -C /usr -xzf ollama-linux-amd64.tgz
At this point the ollama binary can be run directly:
ollama serve
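In another terminal, you can verify that it is running (this is what the Ollama docs suggest):
ollama -v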
Next, register it as a systemd service, which is also Ollama's officially recommended way to manage it. First create a dedicated user and group:
sudo useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
sudo usermod -a -G ollama $(whoami)
Create a new ollama.service file under /etc/systemd/system
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"

[Install]
WantedBy=default.target
Then reload systemd and enable the service:
sudo systemctl daemon-reload
sudo systemctl enable ollama
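Enabling only registers the service to start on boot; to start it right away and check that it came up, the usual systemd commands work:
sudo systemctl start ollama
sudo systemctl status ollama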
At this point, the installation of Ollama is complete.
Model deployment
An offline server cannot use ollama pull to fetch models.
Instead, run ollama pull on a local machine that does have internet access.
Then locate the downloaded model file and upload it to the server.
That is the general idea; the details follow.
Find the local model file
Unless configured otherwise, Ollama stores its model files in ~/.ollama/models/blobs.
First run the following command to see the path of a specific model, for example deepseek-r1:32b:
ollama show deepseek-r1:32b --modelfile
Output after executing the command (excerpt)
FROM C:\Users\deali\.ollama\models\blobs\sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49
TEMPLATE """{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<｜User｜>{{ .Content }}
{{- else if eq .Role "assistant" }}<｜Assistant｜>{{ .Content }}{{- if not $last }}<｜end▁of▁sentence｜>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<｜Assistant｜>{{- end }}
{{- end }}"""
PARAMETER stop <｜begin▁of▁sentence｜>
PARAMETER stop <｜end▁of▁sentence｜>
PARAMETER stop <｜User｜>
PARAMETER stop <｜Assistant｜>
You can see this line
FROM C:\Users\deali\.ollama\models\blobs\sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49
This is where Ollama stored the downloaded model on the local machine.
Upload this file to the server
Export Modelfile
Its format is similar to a Dockerfile.
Export it with the following command:
ollama show deepseek-r1:32b --modelfile > Modelfile
This file also needs to be uploaded to the server.
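For example, both files can be pushed with scp (the server address and destination directory here are placeholders; adjust them to your environment):
scp C:\Users\deali\.ollama\models\blobs\sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49 <server-address>:/data/models/
scp Modelfile <server-address>:/data/models/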
Import the model on the server
After uploading, put the model file and the Modelfile in the same directory on the server.
First rename the blob so that the subsequent import is easier to work with:
mv sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49 deepseek-r1_32b.gguf
Next, edit the Modelfile and change the FROM line to point at the renamed model file:
FROM ./deepseek-r1_32b.gguf
Then execute the following command to import
ollama create deepseek-r1:32b -f Modelfile
If nothing goes wrong, the import succeeds; run ollama list to check that the model now appears.
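As an optional smoke test (the prompt here is arbitrary), you can also run the model once to confirm it loads and responds:
ollama run deepseek-r1:32b "Hello"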
one-api
One API is an open-source LLM (Large Language Model) API management and distribution system that provides unified access to multiple large models through the standard OpenAI API format and works out of the box. It supports a wide range of mainstream models, including the OpenAI ChatGPT series, the Claude series, Google PaLM 2/Gemini, the Mistral series, ByteDance Doubao, the Baidu Wenxin Yiyan (ERNIE Bot) series, the Alibaba Tongyi Qianwen series, the iFlytek Spark cognitive model, the Zhipu ChatGLM series, the Tencent Hunyuan model, and more.
Docker deployment
One API is developed in Go with the Gin framework and is easy to deploy. I usually deploy it with Docker, so I won't go into the details; here is the docker-compose file:
services:
  db:
    image: mysql:8.1.0
    container_name: mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: mysql-password
    volumes:
      - ./data:/var/lib/mysql

  one-api:
    image: justsong/one-api
    container_name: one-api
    restart: always
    ports:
      - "3000:3000"
    depends_on:
      - db
    environment:
      - SQL_DSN=root:mysql-password@tcp(db:3306)/one_api
      - TZ=Asia/Shanghai
      - TIKTOKEN_CACHE_DIR=/TIKTOKEN_CACHE_DIR
    volumes:
      - ./data:/data
      - ./TIKTOKEN_CACHE_DIR:/TIKTOKEN_CACHE_DIR

networks:
  default:
    name: one-api
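With the compose file in place, start the stack and watch the logs as usual:
docker compose up -d
docker compose logs -f one-api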
Solving the Tiktoken Problem
The problem is that One API depends on the tiktoken library, which needs an internet connection to download its token encoder files.
The solution is to look at the error log, such as
one-api | [FATAL] 2025/02/17 - 10:47:21 | relay/adaptor/openai/token.go:26 [InitTokenEncoders] failed to get gpt-3.5-turbo token encoder: Get "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken" : dial tcp 57.150.97.129:443: i/o timeout, if you are using in offline environment, please set TIKTOKEN_CACHE_DIR to use exsited files
The log shows we need the file at https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken.
So first download the file locally and upload it to the server.
But that alone is not enough: tiktoken looks up cached files by the SHA-1 hash of the URL, not by the file name.
Generate SHA-1
TIKTOKEN_URL=https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken
echo -n $TIKTOKEN_URL | sha1sum | head -c 40
Or, combined into a one-liner:
echo -n "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken" | sha1sum | head -c 40
In this pipeline, echo -n prints the URL string (the -n flag suppresses the trailing newline), sha1sum computes its SHA-1 hash, and head -c 40 keeps the first 40 characters, i.e. the full hash in hexadecimal.
The execution result is
9b5ad71b2ce5302211f9c61530b329a4922fc6a4
Then rename the cl100k_base.tiktoken file to this output value, 9b5ad71b2ce5302211f9c61530b329a4922fc6a4.
In the docker-compose.yaml above we already set the TIKTOKEN_CACHE_DIR environment variable, so place this 9b5ad71b2ce5302211f9c61530b329a4922fc6a4 file into the TIKTOKEN_CACHE_DIR directory.
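Putting the rename and the move together (./TIKTOKEN_CACHE_DIR here matches the host directory mounted in the compose file above):
mv cl100k_base.tiktoken 9b5ad71b2ce5302211f9c61530b329a4922fc6a4
mv 9b5ad71b2ce5302211f9c61530b329a4922fc6a4 ./TIKTOKEN_CACHE_DIR/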
If you encounter similar errors later, repeat the above steps until no errors are reported.
The version I am currently using only needed two encoders downloaded this way.
Add Ollama channel in OneApi
There is some friction here because of Docker networking.
There are several approaches: one is to run the OneApi container in host network mode; another is to use the address host.docker.internal.
Either way, the prerequisite is that Ollama's host is set to 0.0.0.0.
For that configuration, see my earlier article: LLM Exploration: Local Deployment of the DeepSeek-R1 Model.
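For reference, one way to set this (using Ollama's documented OLLAMA_HOST environment variable) is to add it to the systemd unit created earlier and restart the service:
# in /etc/systemd/system/ollama.service, under the [Service] section
Environment="OLLAMA_HOST=0.0.0.0"

sudo systemctl daemon-reload
sudo systemctl restart ollama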
When adding a channel, select Ollama as the type, fill in the custom model field with the deepseek-r1:32b we just deployed, and set the proxy (base URL) to http://host.docker.internal:11434.
Note: on Linux, host.docker.internal may not resolve. In that case use the host machine's IP address directly; for example, if the host's IP is 192.168.1.100, enter http://192.168.1.100:11434 in OneApi to reach the Ollama service.
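As an aside, on Linux you can also make host.docker.internal resolvable inside the container by adding an extra_hosts entry to the one-api service in docker-compose.yaml (this host-gateway mapping requires a reasonably recent Docker Engine):
# added under the one-api service in docker-compose.yaml
extra_hosts:
  - "host.docker.internal:host-gateway"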