Deployment and maintenance of the SRE-specific large model

A large model built specifically for SRE work is now available: a 7B-parameter model intended to help automate operations and maintenance.
Core content:
1. Introduction to the 7B-parameter SRE-specific large model based on the DeepSeek architecture
2. Guide to setting up the deployment environment on an Alibaba Cloud GPU server
3. Docker image deployment and installation of dependent components
We chose an Alibaba Cloud GPU server as the deployment environment because the model could not run on a local Mac. Recommended configuration for the GPU instance: a system disk of at least 100 GB and 60 GB of memory. The dependent components are packaged and installed through Docker images. The component information is as follows:
The following GPU instance families are supported:
gn6e, ebmgn6e
gn7i, ebmgn7i, ebmgn7ix
gn7e, ebmgn7e, ebmgn7ex
ebmgn8v, ebmgn8is
Image: Ubuntu 20.04 is used as the example operating system. To use the vLLM container image on a GPU instance, the Tesla driver must be installed on the instance in advance, and the driver version should be 535 or higher. It is recommended that you select the option to install the GPU driver when you purchase the GPU instance in the ECS console.
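The driver requirement above can be verified before installing anything else. The following is a minimal sketch, assuming `nvidia-smi` is available on the instance; the helper name `driver_version_ok` is hypothetical and only compares the major version number against 535.

```shell
# Hypothetical helper: succeed if the major part of a driver version
# string (for example "535.161.08") is 535 or higher.
driver_version_ok() {
  major="${1%%.*}"                 # keep everything before the first dot
  case "$major" in
    ''|*[!0-9]*) return 2 ;;       # empty or non-numeric input
  esac
  [ "$major" -ge 535 ]
}

# Example use on the GPU instance (assumes nvidia-smi is installed):
#   ver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
#   driver_version_ok "$ver" && echo "driver $ver is new enough"
```

If the check fails, the driver should be upgraded before continuing with the container setup.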
sudo apt-get update
sudo apt-get -y install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL http://mirrors.cloud.aliyuncs.com/docker-ce/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] http://mirrors.cloud.aliyuncs.com/docker-ce/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
4. Run the following command to check whether Docker was installed successfully.
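The exact command for this step is not reproduced in this text. One common way to perform the check is to print the Docker client version; the sketch below assumes that approach, and the function name `check_docker` is hypothetical.

```shell
# Hypothetical helper: print the Docker client version,
# or fail with a message if the docker binary is not on PATH.
check_docker() {
  if command -v docker >/dev/null 2>&1; then
    docker --version
  else
    echo "docker not found in PATH" >&2
    return 1
  fi
}

# Example use:
#   check_docker
```

A version line in the output suggests the packages were installed correctly.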
5. Run the following commands to install nvidia-container-toolkit.
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
6. Enable Docker to start on boot and restart the Docker service.
sudo systemctl enable docker
sudo systemctl restart docker
sudo systemctl status docker
8. Run the following command to pull the vLLM image.
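The pull command itself is not reproduced in this text. A minimal sketch, assuming the image to pull is the same one referenced by the `docker run` command in this guide; the variable `VLLM_IMAGE` and the function `pull_vllm_image` are hypothetical names.

```shell
# Image reference taken from the docker run command in this guide.
VLLM_IMAGE="egs-registry.cn-hangzhou.cr.aliyuncs.com/egs/vllm:0.8.2-pytorch2.6-cu124-20250328"

# Hypothetical helper: pull the vLLM image ahead of time so that
# starting the container does not have to wait for the download.
pull_vllm_image() {
  sudo docker pull "$VLLM_IMAGE"
}

# Example use:
#   pull_vllm_image
```

The image is large, which is one reason a system disk of at least 100 GB is recommended.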
9. Run the following command to start the vLLM container.
sudo docker run -d -t --net=host --gpus all \
  --privileged \
  --ipc=host \
  --name vllm \
  -v /root:/root \
  egs-registry.cn-hangzhou.cr.aliyuncs.com/egs/vllm:0.8.2-pytorch2.6-cu124-20250328
10. Run the following command to check whether the vLLM container started successfully.
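The check command is not reproduced in this text. One possible check, sketched under the assumption that the container was started with `--name vllm` as in the previous step, is to list running containers and look for that exact name; `vllm_container_running` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if a running container
# is named exactly "vllm".
vllm_container_running() {
  sudo docker ps --filter "name=vllm" --filter "status=running" \
    --format '{{.Names}}' | grep -qx "vllm"
}

# Example use:
#   vllm_container_running && echo "vllm container is up"
```

If the helper fails, `sudo docker logs vllm` is a reasonable place to look for the cause.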
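Next, install Git LFS and download the model into /root, which is mounted into the container. Then enter the container and start the inference service:
apt install git-lfs
cd /root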
git lfs clone https://www.modelscope.cn/phpcool/DeepSeek-R1-Distill-SRE-Qwen-7B.git
docker exec -it vllm bash
vllm serve /root/DeepSeek-R1-Distill-SRE-Qwen-7B --tensor-parallel-size 1 --max-model-len 2048 --enforce-eager
The vLLM inference service is now started.
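Loading a 7B model can take some time, so it may help to wait until the server answers before sending requests. The sketch below polls the `/v1/models` route of vLLM's OpenAI-compatible server; the port 8000 matches the test request in this guide, while the function name `wait_for_vllm` and its retry defaults are assumptions made for illustration.

```shell
# Hypothetical helper: poll the server until it responds.
#   $1  base URL            (default: http://localhost:8000)
#   $2  maximum attempts    (default: 60)
#   $3  seconds between attempts (default: 5)
wait_for_vllm() {
  base_url="${1:-http://localhost:8000}"
  attempts="${2:-60}"
  delay="${3:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; the response body is discarded.
    if curl -fsS -o /dev/null "$base_url/v1/models"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example use:
#   wait_for_vllm && echo "vLLM is ready"
```

Polling with a bounded number of attempts avoids hanging forever if the server failed to start.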
Send a test request to the inference service:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "/root/DeepSeek-R1-Distill-SRE-Qwen-7B",
    "messages": [
      {"role": "system", "content": "You are an intelligent operation and maintenance assistant."},
      {"role": "user", "content": "How to optimize the storage performance of the server to increase data reading and writing speed?"}
    ]
  }'