Data stays within the intranet: Building an enterprise-specific DeepSeek intelligent middle platform based on Ollama+OneAPI

Written by
Silas Grey
Updated on: July 14, 2025
Recommendation

Experience notes from deploying a DeepSeek intelligent middle platform in an environment without external network access: a technical practice with Ollama + OneAPI.

Core content:
1. Challenges and solutions for deploying DeepSeek on a server without external network access
2. Steps for downloading, uploading and installing the local installation package of Ollama
3. Adding and managing Ollama services, including creating systemd service files and enabling services

Yang Fangxian, Founder of 53AI / Tencent Cloud Most Valuable Expert (TVP)

Preface

I have previously deployed DeepSeek on Linux servers using Ollama.

This time I deployed it on a server with no external network access (an even more restricted environment, to be precise). I hit a few pitfalls along the way and have recorded them here.

ollama

With no internet access, Ollama's online installation script naturally cannot be used.

Following Ollama's documentation, first download the installation package on a local machine, choosing the build that matches the server's operating system and CPU architecture:

curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
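If you are unsure which build to pick, the server's architecture can be checked first, for example:

uname -m    # x86_64 -> use the amd64 package; aarch64 -> use the arm64 package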

Then use scp or a similar tool to upload it to the server (the server address below is a placeholder):

scp ollama-linux-amd64.tgz <server-address>:/temp

After connecting to the server, extract and install it by simply following the Ollama documentation (see the first reference):

sudo tar -C /usr -xzf ollama-linux-amd64.tgz

At this point the ollama binary can already be run:

ollama serve
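Before wiring it into systemd, it can be sanity-checked from another terminal (assuming the default port 11434):

ollama -v                      # print the Ollama version
curl http://127.0.0.1:11434    # should respond with "Ollama is running"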

Next, register it as a system service, which is also Ollama's officially recommended way to manage it:

sudo useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
sudo usermod -a -G ollama $(whoami)

Create a new ollama.service file under /etc/systemd/system

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"

[Install]
WantedBy=default.target

Then enable the service

sudo systemctl daemon-reload
sudo systemctl enable ollama
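To start the service right away and confirm it is running:

sudo systemctl start ollama
systemctl status ollama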

At this point, the installation of Ollama is complete.

Model deployment

Offline servers cannot use ollama pull to fetch models.

You need to download the model first on a machine that can reach the internet; the ollama pull step can be run on your local computer.

Then find the model file and upload it to the server

This is the general idea. The details are introduced below.

Find the local model file

Unless configured otherwise, Ollama stores its model files in ~/.ollama/models/blobs.

First run the following command to see the path of a specific model, for example the deepseek-r1:32b model:

ollama show deepseek-r1:32b --modelfile

Output after executing the command (excerpt)

FROM C:\Users\deali\.ollama\models\blobs\sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49
TEMPLATE """{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<｜User｜>{{ .Content }}
{{- else if eq .Role "assistant" }}<｜Assistant｜>{{ .Content }}{{- if not $last }}<｜end▁of▁sentence｜>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<｜Assistant｜>{{- end }}
{{- end }}"""
PARAMETER stop <｜begin▁of▁sentence｜>
PARAMETER stop <｜end▁of▁sentence｜>
PARAMETER stop <｜User｜>
PARAMETER stop <｜Assistant｜>

You can see this line

FROM C:\Users\deali\.ollama\models\blobs\sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49

This is the path of the model downloaded to the local computer by Ollama.

Upload this file to the server

Export Modelfile

The format of this file is similar to a Dockerfile.

Export it with the following command:

ollama show deepseek-r1:32b --modelfile > Modelfile

Then this file also needs to be uploaded to the server
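For example, assuming the model blob and the exported Modelfile sit in the current directory on the local machine, both can be pushed up with scp (the server address and target directory are placeholders):

scp sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49 Modelfile <server-address>:/temp/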

Import the model on the server

After uploading the model file and Modelfile, put them in the same directory

Rename the model file first to make the subsequent import easier:

mv sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49 deepseek-r1_32b.gguf

Next, edit the Modelfile and change its FROM line to point at the renamed model file:

FROM ./deepseek-r1_32b.gguf

Then execute the following command to import

ollama create deepseek-r1:32b -f Modelfile

If all goes well, the import succeeds; you can run ollama list to check that the model has been imported.
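For a quick smoke test (the prompt is just an example):

ollama list                           # deepseek-r1:32b should be listed
ollama run deepseek-r1:32b "hello"    # send a simple prompt to confirm the model loads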

one-api

One API is an open-source LLM (Large Language Model) API management and distribution system. It provides unified access to many large models through the standard OpenAI API format and is ready to use out of the box. It supports a range of mainstream large models, including the OpenAI ChatGPT series, the Claude series, Google PaLM2/Gemini, the Mistral series, ByteDance Doubao, the Baidu Wenxin Yiyan series, the Alibaba Tongyi Qianwen series, the iFlytek Spark cognitive model, the Zhipu ChatGLM series, the Tencent Hunyuan model, and more.

Docker deployment

One API is written in Go using the Gin framework and is easy to deploy. I usually deploy it with Docker, so I won't go into much detail here; the docker-compose.yaml I use is shown below.

services:
  db:
    image: mysql:8.1.0
    container_name: mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: mysql-password
    volumes:
      # keep MySQL data and one-api data in separate host directories
      - ./data/mysql:/var/lib/mysql
  one-api:
    image: justsong/one-api
    container_name: one-api
    restart: always
    ports:
      - "3000:3000"
    depends_on:
      - db
    environment:
      - SQL_DSN=root:mysql-password@tcp(db:3306)/one_api
      - TZ=Asia/Shanghai
      - TIKTOKEN_CACHE_DIR=/TIKTOKEN_CACHE_DIR
    volumes:
      - ./data/oneapi:/data
      - ./TIKTOKEN_CACHE_DIR:/TIKTOKEN_CACHE_DIR

networks:
  default:
    name: one-api
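Bring the stack up and watch the logs with the usual Docker Compose commands:

docker compose up -d
docker compose logs -f one-api

The OneApi web console is then reachable on port 3000 of the server.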

Solving the Tiktoken Problem

The problem is that One API depends on the tiktoken library, which needs an internet connection to download its token encoders.

The solution starts with checking the error log, for example:

one-api | [FATAL] 2025/02/17 - 10:47:21 | relay/adaptor/openai/token.go:26 [InitTokenEncoders] failed to get gpt-3.5-turbo token encoder: Get "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken": dial tcp 57.150.97.129:443: i/o timeout, if you are using in offline environment, please set TIKTOKEN_CACHE_DIR to use exsited files

Here the file needs to be downloaded from https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken

As before, download the file locally and then upload it to the server.

But that alone is not enough: tiktoken looks the cached file up by the SHA-1 hash of its URL, not by its filename.

Generate SHA-1

TIKTOKEN_URL=https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken
echo -n "$TIKTOKEN_URL" | sha1sum | head -c 40

You can also combine it into a single command:

echo -n "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken" | sha1sum | head -c 40

In this pipeline, echo -n prints the URL string (the -n flag stops a trailing newline from being added), sha1sum computes its SHA-1 hash, and head -c 40 keeps the first 40 characters, i.e. the full 40-character hex digest.

The execution result is

9b5ad71b2ce5302211f9c61530b329a4922fc6a4

Then rename the downloaded cl100k_base.tiktoken file to that output, 9b5ad71b2ce5302211f9c61530b329a4922fc6a4.

In the earlier docker-compose.yaml we already set the TIKTOKEN_CACHE_DIR environment variable, with ./TIKTOKEN_CACHE_DIR on the host mounted into the container at that path.

Put the renamed 9b5ad71b2ce5302211f9c61530b329a4922fc6a4 file into that host directory.
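Putting the steps together, here is a small sketch to run on a machine with internet access (the upload target path is a placeholder):

# download the encoder and name it after the SHA-1 of its URL
URL=https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken
NAME=$(echo -n "$URL" | sha1sum | head -c 40)
curl -L "$URL" -o "$NAME"

# upload it into the host directory mounted as TIKTOKEN_CACHE_DIR
scp "$NAME" <server-address>:/path/to/one-api/TIKTOKEN_CACHE_DIR/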

If you encounter similar errors later, repeat the above steps until no errors are reported.

The version I am currently running only needed to download two encoders.

Add Ollama channel in OneApi

There is some extra friction here because of Docker networking.

There are several ways around it. One is to run the OneApi container in host network mode; another is to use the address host.docker.internal.

Either way, the prerequisite is that Ollama's host is set to 0.0.0.0. For this configuration, please refer to my previous article: LLM Exploration: Local Deployment of DeepSeek-R1 Model.
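When Ollama runs as the systemd service set up earlier, one way to do this is to add an OLLAMA_HOST line to the [Service] section and restart (OLLAMA_HOST is Ollama's standard listen-address setting):

# in /etc/systemd/system/ollama.service, under [Service]
Environment="OLLAMA_HOST=0.0.0.0"

sudo systemctl daemon-reload
sudo systemctl restart ollama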

When adding a channel, select Ollama as the type

Fill in the custom model field with the deepseek-r1:32b model we deployed.

In the proxy field, fill in http://host.docker.internal:11434

Note: in Linux environments, host.docker.internal may not work, but you can use the host machine's IP address directly instead. For example, if the host machine's IP address is 192.168.1.100, you can configure http://192.168.1.100:11434 in OneApi to access the Ollama service.
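Alternatively, Docker can map host.docker.internal to the host gateway for you; a sketch of the extra lines added to the one-api service in docker-compose.yaml:

    extra_hosts:
      - "host.docker.internal:host-gateway"

Once the channel is saved and an API token has been created in OneApi, the whole chain can be verified with a standard OpenAI-format request (the token is a placeholder, and 192.168.1.100 is the example host IP from above):

curl http://192.168.1.100:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-oneapi-token>" \
  -d '{"model": "deepseek-r1:32b", "messages": [{"role": "user", "content": "hello"}]}'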