Deploying DeepSeek locally with Ollama for translation: "Let him be strong, the breeze blows over the hills; let him be arrogant, the bright moon shines on the river."

Written by
Silas Grey
Updated on: July 14, 2025
Recommendation

Ollama framework: a lightweight, powerful tool for local deployment of large models.
Core content:
1. Introduction to the Ollama framework and advantages of local deployment
2. Ollama installation guide and steps to run large models
3. Testing and effect display of DeepSeek translation model on Windows

1. Meet Ollama
The most common one-line introduction to Ollama goes: "Ollama is an open-source, lightweight, extensible framework for building, running, and managing large models on local machines."
The keywords are lightweight, local, build, run, and manage: Ollama specializes in deploying and running large models on a single machine, which makes it a good fit even for a modestly configured laptop.
2. Install Ollama
The official website is https://ollama.com/, with macOS, Linux, and Windows builds; download and install the one you need.
On Windows you download an .exe installer and run it. On macOS it is even simpler: run the command
brew install ollama
and the installation is complete.
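On Linux, the official install script from the Ollama website does the same job, and on any platform you can check the result with the version flag:
# Linux one-line installer (command from the Ollama download page)
curl -fsSL https://ollama.com/install.sh | sh
# verify the installation
ollama --version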
Then use the command ollama run xxx to run the model named xxx (ollama pull xxx only downloads it without starting a session). Even when everything is local, Ollama works in a client–server (CS) architecture: a background server hosts the models and the CLI acts as the client. If you hit the following error on macOS:
Error: could not connect to ollama app, is it running?
it usually means the Ollama server was not started automatically. Executing ollama serve manually in a terminal resolves it.
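Because of that client–server split, the server also exposes an HTTP API on port 11434, so any local program can talk to it. A minimal sketch with curl (the prompt here is only an illustration):
# start the server if it is not already running in the background
ollama serve &
# send a one-off generation request to the local API
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:14b",
  "prompt": "Translate this sentence into English.",
  "stream": false
}'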
Today, Foreign Minister Wang Yi quoted Jin Yong's famous line, "Let him be strong, the breeze blows over the hills; let him be arrogant, the bright moon shines on the river," to describe China–US relations, a remark that put DeepSeek in the spotlight. So let's use this sentence as the task and see how a locally deployed DeepSeek performs.
3. Translation on Windows
My Windows machine has an i9 CPU, 32 GB of RAM, and a graphics card with 4 GB of video memory. Run the DeepSeek-R1 14B model with the following command:
ollama run deepseek-r1:14b
The first time you execute it there is no model locally, so Ollama downloads it first. The model file is about 9 GB, and the download speed depends only on your network; at its fastest it reached about 8 MB/s for me.
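Once downloaded, the model is cached locally, so subsequent runs start immediately; you can confirm what is on disk with:
# list downloaded models and their sizes
ollama list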
The first translation result is shown below:
Overall it is quite decent, though it reads as a fairly literal translation.
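As an aside, the prompt does not have to be typed in the interactive session; ollama run also accepts a one-shot prompt on the command line (the wording below is only a placeholder, not the exact prompt I used):
ollama run deepseek-r1:14b "Translate the following Chinese couplet into English: <paste the quote here>"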
During execution, it used about 3 GB of dedicated video memory, which is impressive for a 14B model; Ollama runs quantized GGUF models and can offload part of the layers to system RAM, which is presumably how it fits on a 4 GB card at all. I remember that when I ran Qwen models before, 2.5B was about the limit; now DeepSeek runs at 14B, so things are evolving very fast.
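To see how a running model is actually split between GPU and system memory, recent Ollama builds include a status command (the exact output columns may vary by version):
# show loaded models, their memory footprint, and the CPU/GPU split
ollama ps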
4. Translation on macOS
My Mac is an M1 with 16 GB of memory, a 2021 machine. At first I assumed this configuration could not handle a 14B model, so I installed Ollama just to give it a try. The first translation result is shown below:
Compared with the first inference on Windows, the result is quite different. Asking it to retranslate produced this version:
This time it was fairly solid. During inference, GPU usage on macOS sat between 60% and 70% while memory usage soared, because the Mac's GPU has no dedicated video memory and instead shares unified memory with the rest of the system.
Finally, compare the translations from the DeepSeek official website and ChatGPT:
DeepSeek official website:

ChatGPT:

By comparison, the official websites do give a better experience, which is probably because the local deployment runs only the 14B model.