WeClone, a highly controversial open-source project, has gone viral!

Written by
Iris Vance
Updated on: June 21, 2025
Recommendation

Explore the mystery of digital avatars and experience the novel feeling of conversing with your memories.

Core content:
1. Introduction to the WeClone project: how to clone a digital avatar from WeChat chat records
2. Analysis of core features, including LLM fine-tuning, voice cloning, and multi-platform deployment
3. A look at application scenarios, from personal assistants and content creation to the possibility of digital immortality


Is there a chat window in your WeChat that hasn't received a new message in a long time, yet you still open it in the middle of the night and scroll through it over and over?

If you could use those chat records to clone a "digital avatar" of that person, one that preserves their tone, style, and signature catchphrases, and can even send you voice messages, would you do it?

Recently, a new open-source project called WeClone appeared on GitHub, making it possible for the person in your memory to live on in the digital world.

WeClone fine-tunes a large language model (LLM) on personal WeChat chat records to create a personalized digital avatar.

It provides an end-to-end solution, from text generation to voice cloning and from training to deployment, so the digital avatar not only talks like the person but also sounds like them.

Beyond preserving someone in your memory, you can also create a digital avatar of yourself. Have you ever wondered what it would be like to chat with yourself?

Digital-human technology offers plenty of room to play. As soon as the project launched, it drew attention from netizens at home and abroad, and many let their imaginations run wild.

Project repository: https://github.com/xming521/WeClone

Let’s first take a look at the core functions of WeClone.

Core Features

  • Fine-tuning LLM using WeChat chat records

WeClone supports exporting WeChat chat records and formatting them into question-and-answer format for model fine-tuning.
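
For a sense of what that looks like, here is a hypothetical single training record after formatting; the actual field names in WeClone's dataset may differ:

# Hypothetical example of one formatted training record (field names assumed,
# not taken from WeClone's actual schema): the other person's message becomes
# the instruction, and your historical reply becomes the output.
record = {
    "instruction": "Are you free for dinner tonight?",
    "output": "Can't tonight, overtime again. Tomorrow?",
}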

For model fine-tuning, WeClone supports low-resource, LoRA-based fine-tuning of mainstream models in the 0.5B–7B range, including ChatGLM3-6B and Qwen2.5-7B, effectively capturing the user's language habits and expressions.

Model training requires about 16GB of GPU memory (VRAM). Memory usage is controllable and training is efficient, which suits small-sample, low-resource scenarios.
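
As a rough sanity check on that figure (an estimate, not WeClone's official accounting): fp16 weights alone for a 6B-parameter model take about 11 GB, and activations plus LoRA optimizer state plausibly push the total toward 16 GB.

# Back-of-envelope estimate, assuming fp16 weights (2 bytes per parameter)
params = 6e9                       # e.g. ChatGLM3-6B
weights_gb = params * 2 / 1024**3  # ~11.2 GB for the frozen base weights alone
print(f"base weights: {weights_gb:.1f} GB")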


  • Using WeChat voice messages + the Spark-TTS model for high-quality voice cloning

The project's companion submodule WeClone-audio (https://github.com/xming521/WeClone/tree/master/WeClone-audio) is built on the lightweight Spark-TTS model. Using a 0.5B-parameter model and a 5-second voice sample, it can clone a voice with up to 95% similarity, further enhancing the realism of the digital avatar.

  • Multi-platform deployment

Through the AstrBot framework, the digital avatar can be deployed to multiple chat platforms such as WeChat, QQ, Telegram, WeChat for Business, and Lark. A single command starts a real-time conversation with the digital avatar.

Possible application scenarios

Personal assistant customization: When you are busy, your digital avatar can reply to messages and handle daily affairs on your behalf, such as writing emails and replying to comments.

Content creation: quickly produce personalized text content in a specific style to help you operate multiple accounts with the same style, such as writing tweets, scripts, and commentaries.

Digital immortality: create a digital avatar of yourself or someone else, letting the persona live on in digital form.

Core Module Introduction

WeClone's end-to-end digital-cloning pipeline consists of three core modules:

Data export and preprocessing → LoRA model fine-tuning → Multi-platform deployment

Next, let’s look at the technical highlights of each module.

1. Data export and preprocessing

WeClone first converts the CSV/SQLite files exported from WeChat into standard JSON files. It then performs text cleaning, mainly to remove noise and filter out sensitive information. Finally, it segments the conversations, annotates the chat records segment by segment, and retains the timestamps.
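
A minimal sketch of this conversion step, assuming a simple CSV layout with talker/msg/CreateTime columns (the real export has more fields, and WeClone's scripts handle more cases):

import csv
import json
import re

# Assumed patterns for sensitive information; the real cleaning rules are broader.
PHONE = re.compile(r"\b1\d{10}\b")   # mainland China mobile numbers
EMAIL = re.compile(r"\S+@\S+\.\S+")

def csv_to_json(csv_path: str, json_path: str) -> None:
    records = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            text = row["msg"]
            # Drop messages that contain sensitive information
            if PHONE.search(text) or EMAIL.search(text):
                continue
            records.append({
                "talker": row["talker"],
                "msg": text,
                "CreateTime": row["CreateTime"],  # the timestamp is retained
            })
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)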

2. Model fine-tuning

WeClone uses ChatGLM3-6B as the base model and performs fine-tuning in the SFT (Supervised Fine-Tuning) stage based on the LoRA framework.

Key highlights include:

  • Use low-rank adapters to significantly reduce trainable parameters.

  • Compatible with single-machine and distributed training, with support for multi-GPU acceleration.
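
As an illustration of the low-rank adapter idea (a sketch using the Hugging Face peft library, not WeClone's exact training code):

from transformers import AutoModel
from peft import LoraConfig, get_peft_model

# Load the base model; trust_remote_code is required for ChatGLM3
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)

# Low-rank adapter config; r and dropout correspond to the lora_rank /
# lora_dropout settings mentioned later in this article
config = LoraConfig(r=8, lora_alpha=32, lora_dropout=0.1,
                    target_modules=["query_key_value"])

model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all parameters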

3. Model Deployment

WeClone packages the fine-tuned model with FastAPI/Flask, supports hybrid GPU/CPU deployment and multi-platform login, and allows custom parameters.
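
A minimal sketch of such a wrapper, assuming a fine-tuned ChatGLM3 checkpoint saved at a hypothetical ./model_output path (WeClone's actual api_service.py exposes an OpenAI-compatible interface with more options):

from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModel, AutoTokenizer

MODEL_PATH = "./model_output"  # hypothetical path to the fine-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).cuda().eval()

app = FastAPI()

class ChatRequest(BaseModel):
    prompt: str

@app.post("/chat")
def chat(req: ChatRequest):
    # ChatGLM3's remote code provides a .chat() helper for dialogue
    response, _history = model.chat(tokenizer, req.prompt, history=[])
    return {"response": response}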

Installation and deployment tutorial

Environment Setup

It is recommended to use uv, a very fast Python environment manager. After installing uv, use the following commands to create a new Python environment and install the dependencies. Note that this does not include the dependencies for the xcodec (audio cloning) feature:

git clone https://github.com/xming521/WeClone.git
cd WeClone
uv venv .venv --python=3.9
source .venv/bin/activate
uv pip install --group main -e .
Note

All training- and inference-related configuration lives in the settings.json file.

Data preparation

Please use PyWxDump to extract WeChat chat records. After downloading the software and decrypting the database, click Chat Backup and set the export type to CSV. You can export multiple contacts or group chats. Then place the exported csv folder from wxdump_tmp/export into the ./data directory, so that the chat-record folders of different people sit together under ./data/csv. Sample data is provided at data/example_chat.csv.

Data preprocessing

By default, the project removes mobile phone numbers, ID numbers, email addresses, and URLs from the data. It also provides a blocked_words list, to which you can add words and sentences to filter out (by default, any sentence containing a blocked word is removed in its entirety). The ./make_dataset/csv_to_json.py script performs this processing.
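
The whole-sentence behavior can be pictured like this (a simplified sketch; the real script also strips phone numbers, ID numbers, emails, and URLs):

# Simplified illustration of blocked-word filtering: any message that
# contains a blocked word is dropped in its entirety.
blocked_words = ["password", "bank card"]  # extend with your own entries

def keep_message(msg: str) -> bool:
    return not any(word in msg for word in blocked_words)

messages = ["see you at 8", "my bank card number is 6222..."]
clean = [m for m in messages if keep_message(m)]  # -> ["see you at 8"]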

When the same person sends several messages in a row, there are three ways to handle them.

Model Download

First download the ChatGLM3 model from Hugging Face. If you have trouble downloading from Hugging Face, you can use the ModelScope community instead: before any subsequent training or inference, run export USE_MODELSCOPE_HUB=1 so the model is pulled from ModelScope.

Since the model is large, the download will take a while; please be patient.

export USE_MODELSCOPE_HUB=1  # Windows uses `set USE_MODELSCOPE_HUB=1`
git lfs install
git clone https://www.modelscope.cn/ZhipuAI/chatglm3-6b.git

The modeling_chatglm.py file in the ModelScope download needs to be replaced with the Hugging Face version.

Configure parameters and fine-tune the model

  • (Optional) Modify settings.json to select a different locally downloaded model.

  • Modify per_device_train_batch_size and gradient_accumulation_steps to adjust GPU memory usage.

  • Adjust num_train_epochs, lora_rank, lora_dropout, and other parameters according to the quantity and quality of your dataset (see the sketch after this list).
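
For example, a small helper like the following could tweak those values (illustrative only: the key names come from this article, but the real settings.json may nest them differently):

import json

with open("settings.json", encoding="utf-8") as f:
    settings = json.load(f)

settings["per_device_train_batch_size"] = 2  # smaller batch -> less GPU memory
settings["gradient_accumulation_steps"] = 8  # compensates for the smaller batch
settings["num_train_epochs"] = 3             # tune to dataset size and quality
settings["lora_rank"] = 8
settings["lora_dropout"] = 0.1

with open("settings.json", "w", encoding="utf-8") as f:
    json.dump(settings, f, ensure_ascii=False, indent=2)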

Single-GPU training

Run src/train_sft.py to fine-tune the model in the SFT stage. In the author's run, the loss only dropped to about 3.5; pushing it much lower may cause overfitting. About 20,000 consolidated valid records were used.

python src/train_sft.py

Multi-GPU training

uv pip install deepspeed
deepspeed --num_gpus=<number of GPUs> src/train_sft.py

Simple inference using the browser demo

python ./src/web_demo.py 

Inference via the API

python ./src/api_service.py

Test with common chat questions

python ./src/api_service.py
python ./src/test_model.py

Deploy to a chatbot

AstrBot Solution

AstrBot is an easy-to-use multi-platform LLM chatbot and development framework.

Steps:

  1. Deploy AstrBot
  2. Deploy the messaging platform in AstrBot
  3. Run python ./src/api_service.py to start the API service
  4. Add a new service provider in AstrBot: select OpenAI as the type, set the API Base URL according to how AstrBot is deployed (for a docker deployment it might be http://172.17.0.1:8005/v1), and enter gpt-3.5-turbo as the model name (you can verify the endpoint with the sketch after this list)
  5. Tool calls are not supported after fine-tuning, so first turn off the default tools by sending the command /tool off reminder on the messaging platform; otherwise the fine-tuned style will not come through
  6. Set the system prompt in AstrBot to match the default_system value used during fine-tuning
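
To verify step 4 independently of AstrBot, you can point any OpenAI-compatible client at the service; the base URL below is the deployment-dependent example value from step 4:

from openai import OpenAI

# Example values from step 4; adjust the base URL to your own deployment.
client = OpenAI(base_url="http://172.17.0.1:8005/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the placeholder model name configured in AstrBot
    messages=[{"role": "user", "content": "What are you up to tonight?"}],
)
print(resp.choices[0].message.content)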